1
Merra G, Gualtieri P, La Placa G, Frank G, Della Morte D, De Lorenzo A, Di Renzo L. The Relationship between Exposome and Microbiome. Microorganisms 2024; 12:1386. [PMID: 39065154 PMCID: PMC11278511 DOI: 10.3390/microorganisms12071386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Revised: 07/02/2024] [Accepted: 07/06/2024] [Indexed: 07/28/2024] Open
Abstract
Currently, exposome studies include a raft of different monitoring tools, including remote sensors, smartphones, omics analyses, distributed lag models, etc. The similarity in structure between the exposome and the microbiota plus their functions led us to pose three pertinent questions from this viewpoint, looking at the actual relationship between the exposome and the microbiota. In terms of the exposome, a bistable equilibrium between health and disease depends on constantly dealing with an ever-changing totality of exposures that together shape an individual from conception to death. Regarding scientific knowledge, the exposome is still lagging in certain areas, like the importance of microorganisms in the equation. The human microbiome is defined as an aggregate assemblage of gut commensals that are hosted by our surfaces related to the external environment. Commensals' resistance to a variety of environmental exposures, such as antibiotic administration, confirms that a layer of these organisms is protected within the host. The exposome is a conceptual framework defined as the environmental component of the science-inspired systems ideology that shifts from a specificity-based medical approach to reasoning in terms of complexity. A parallel concept in population health research and precision public health is the human flourishing index, which aims to account for the numerous environmental factors that affect individual and population well-being beyond ambient pollution.
Affiliation(s)
- Giuseppe Merra
  - Department of Biomedicine and Prevention, Section of Clinical Nutrition and Nutrigenomics, University of Rome Tor Vergata, 00133 Rome, Italy
- Paola Gualtieri
  - Department of Biomedicine and Prevention, Section of Clinical Nutrition and Nutrigenomics, University of Rome Tor Vergata, 00133 Rome, Italy
- Giada La Placa
  - Ph.D. School of Applied Medical-Surgical-Sciences, University of Rome Tor Vergata, 00133 Rome, Italy (G.F.)
- Giulia Frank
  - Ph.D. School of Applied Medical-Surgical-Sciences, University of Rome Tor Vergata, 00133 Rome, Italy (G.F.)
- David Della Morte
  - Department of Biomedicine and Prevention, Section of Clinical Nutrition and Nutrigenomics, University of Rome Tor Vergata, 00133 Rome, Italy
- Antonino De Lorenzo
  - Department of Biomedicine and Prevention, Section of Clinical Nutrition and Nutrigenomics, University of Rome Tor Vergata, 00133 Rome, Italy
- Laura Di Renzo
  - Department of Biomedicine and Prevention, Section of Clinical Nutrition and Nutrigenomics, University of Rome Tor Vergata, 00133 Rome, Italy
2
Teichman S, Lee MD, Willis AD. Analyzing microbial evolution through gene and genome phylogenies. Biostatistics 2024; 25:786-800. [PMID: 37897441 PMCID: PMC11247178 DOI: 10.1093/biostatistics/kxad025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Revised: 08/15/2023] [Accepted: 08/27/2023] [Indexed: 10/30/2023] Open
Abstract
Microbiome scientists critically need modern tools to explore and analyze microbial evolution. Often this involves studying the evolution of microbial genomes as a whole. However, different genes in a single genome can be subject to different evolutionary pressures, which can result in distinct gene-level evolutionary histories. To address this challenge, we propose to treat estimated gene-level phylogenies as data objects, and present an interactive method for the analysis of a collection of gene phylogenies. We use a local linear approximation of phylogenetic tree space to visualize estimated gene trees as points in low-dimensional Euclidean space, and address important practical limitations of existing related approaches, allowing an intuitive visualization of complex data objects. We demonstrate the utility of our proposed approach through microbial data analyses, including by identifying outlying gene histories in strains of Prevotella, and by contrasting Streptococcus phylogenies estimated using different gene sets. Our method is available as an open-source R package, and assists with estimating, visualizing, and interacting with a collection of bacterial gene phylogenies.
Affiliation(s)
- Sarah Teichman
  - University of Washington Department of Statistics, Box 354322, Seattle, WA 98195-4322, USA
- Michael D Lee
  - KBR NASA Ames Research Center, PO Box 1, Moffett Field, CA 94035-1000
  - Blue Marble Space Institute of Science, 600 1st Avenue, 1st Floor, Seattle, WA 98104, USA
- Amy D Willis
  - University of Washington Department of Biostatistics, Hans Rosling Center for Population Health, Box 351617, Seattle, WA 98195-1617, USA
3
De Meester L, Vázquez-Domínguez E, Kassen R, Forest F, Bellon MR, Koskella B, Scherson RA, Colli L, Hendry AP, Crandall KA, Faith DP, Starger CJ, Geeta R, Araki H, Dulloo EM, Souffreau C, Schroer S, Johnson MTJ. A link between evolution and society fostering the UN sustainable development goals. Evol Appl 2024; 17:e13728. [PMID: 38884021 PMCID: PMC11178947 DOI: 10.1111/eva.13728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 05/16/2024] [Accepted: 05/17/2024] [Indexed: 06/18/2024] Open
Abstract
Given the multitude of challenges Earth is facing, sustainability science is of key importance to our continued existence. Evolution is the fundamental biological process underlying the origin of all biodiversity. This phylogenetic diversity fosters the resilience of ecosystems to environmental change, and provides numerous resources to society, and options for the future. Genetic diversity within species is also key to the ability of populations to evolve and adapt to environmental change. Yet, the value of evolutionary processes and the consequences of their impairment have not generally been considered in sustainability research. We argue that biological evolution is important for sustainability and that the concepts, theory, data, and methodological approaches used in evolutionary biology can, in crucial ways, contribute to achieving the UN Sustainable Development Goals (SDGs). We discuss how evolutionary principles are relevant to understanding, maintaining, and improving Nature Contributions to People (NCP) and how they contribute to the SDGs. We highlight specific applications of evolution, evolutionary theory, and evolutionary biology's diverse toolbox, grouped into four major routes through which evolution and evolutionary insights can impact sustainability. We argue that information on both within-species evolutionary potential and among-species phylogenetic diversity is necessary to predict population, community, and ecosystem responses to global change and to make informed decisions on sustainable production, health, and well-being. We provide examples of how evolutionary insights and the tools developed by evolutionary biology can not only inspire and enhance progress on the trajectory to sustainability, but also highlight some obstacles that hitherto seem to have impeded an efficient uptake of evolutionary insights in sustainability research and actions to sustain SDGs. 
We call for enhanced collaboration between sustainability science and evolutionary biology to understand how integrating these disciplines can help achieve the sustainable future envisioned by the UN SDGs.
Affiliation(s)
- Luc De Meester
  - Leibniz Institute of Freshwater Ecology and Inland Fisheries (IGB) Berlin Germany
  - Laboratory of Aquatic Ecology, Evolution and Conservation KU Leuven Leuven Belgium
  - Institute of Biology Freie University Berlin Berlin Germany
  - Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB) Berlin Germany
- Ella Vázquez-Domínguez
  - Departamento de Ecología de la Biodiversidad, Instituto de Ecología, Universidad Nacional Autónoma de México Ciudad Universitaria Ciudad de México Mexico
  - Conservation and Evolutionary Genetics Group Estación Biológica de Doñana (EBD-CSIC) Sevilla Spain
- Rees Kassen
  - Department of Biology McGill University Montreal Quebec Canada
- Mauricio R Bellon
  - Comisión Nacional Para el Conocimiento y Uso de la Biodiversidad (CONABIO) México City Mexico
  - Swette Center for Sustainable Food Systems Arizona State University Tempe Arizona USA
- Britt Koskella
  - Department of Integrative Biology University of California Berkeley California USA
- Rosa A Scherson
  - Laboratorio Evolución y Sistemática, Departamento de Silvicultura y Conservación de la Naturaleza Universidad de Chile Santiago Chile
- Licia Colli
  - Dipartimento di Scienze Animali, Della Nutrizione e Degli Alimenti, BioDNA Centro di Ricerca Sulla Biodiversità e Sul DNA Antico, Facoltà di Scienze Agrarie, Alimentari e Ambientali Università Cattolica del Sacro Cuore Piacenza Italy
- Andrew P Hendry
  - Redpath Museum & Department of Biology McGill University Montreal Quebec Canada
- Keith A Crandall
  - Department of Biostatistics and Bioinformatics George Washington University Washington DC USA
  - Department of Invertebrate Zoology, US National Museum of Natural History Smithsonian Institution Washington DC USA
- Craig J Starger
  - School of Global Environmental Sustainability Colorado State University Fort Collins Colorado USA
- R Geeta
  - Department of Botany University of Delhi New Delhi India
- Hitoshi Araki
  - Research Faculty of Agriculture Hokkaido University Sapporo Japan
- Ehsan M Dulloo
  - Effective Genetic Resources Conservation and Use Alliance of Bioversity International and CIAT Rome Italy
- Caroline Souffreau
  - Laboratory of Aquatic Ecology, Evolution and Conservation KU Leuven Leuven Belgium
- Sibylle Schroer
  - Leibniz Institute of Freshwater Ecology and Inland Fisheries (IGB) Berlin Germany
- Marc T J Johnson
  - Department of Biology & Centre for Urban Environments University of Toronto Mississauga Mississauga Ontario Canada
4
Kaale SE, Machangu RS, Lyimo TJ. Molecular characterization and phylogenetic diversity of actinomycetota species isolated from Lake Natron sediments at Arusha, Tanzania. Microbiol Res 2024; 278:127543. [PMID: 37950928 DOI: 10.1016/j.micres.2023.127543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 11/01/2023] [Indexed: 11/13/2023]
Abstract
Soda lakes are naturally occurring ecosystems characterized by extreme environmental conditions especially high pH and salinity levels but harboring valuable microbial communities with medical and biotechnological potentials. Lake Natron is one of the soda lakes situated in eastern branch of the East African Gregory Rift valley, Tanzania. In this study, the taxonomy and phylogenetic diversity of Actinomycetota species were explored in Lake Natron using molecular techniques. The sequencing of their 16S rRNA gene resulted into 13 genera of phylum Actinomycetota namely Streptomyces, Microbacterium, Nocardiopsis, Gordonia, Dietzia, Micromonospora, Microcella, Pseudarthrobacter, Nocardioides, Actinotalea, Cellulomonas, Isoptericola, and Glutamicibacter. We describe for the first time, the isolation of Streptomyces lasalocidi, S. harbinensis, S. anthocyanicus, Microbacterium aureliae, Pseudarthrobacter sp., Nocardioides sp. and Glutamicibacter mishrai from soda lake habitats. It also reports for the first time, the isolation of Gordonia spp., Microcella sp. and Actinotalea sp. from an East African Soda Lake as well as isolation of S. pseudogriseolus, S. calidiresistens and Micromonospora spp. from a Tanzania soda lake. Furthermore, two putative novel species of the phylum Actinomycetota were identified. Given that Actinomycetota are known potential sources of important biotechnological compounds, we recommend the broadening of the scope of bioprospection in future to include the novel species from Lake Natron.
Affiliation(s)
- Sadikiel E Kaale
  - Department of Molecular Biology and Biotechnology, University of Dar es Salaam, Dar es Salaam, Tanzania
  - Department of Biochemistry and Molecular Biology, Saint Francis University College of Health and Allied Sciences, Ifakara-Morogoro, Tanzania
- Robert S Machangu
  - Department of Microbiology, Saint Francis University College of Health and Allied Sciences, Ifakara-Morogoro, Tanzania
- Thomas J Lyimo
  - Department of Molecular Biology and Biotechnology, University of Dar es Salaam, Dar es Salaam, Tanzania
5
Teichman S, Lee MD, Willis AD. Analyzing microbial evolution through gene and genome phylogenies. bioRxiv 2023:2023.08.15.553440. [PMID: 37645842 PMCID: PMC10462103 DOI: 10.1101/2023.08.15.553440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
Microbiome scientists critically need modern tools to explore and analyze microbial evolution. Often this involves studying the evolution of microbial genomes as a whole. However, different genes in a single genome can be subject to different evolutionary pressures, which can result in distinct gene-level evolutionary histories. To address this challenge, we propose to treat estimated gene-level phylogenies as data objects, and present an interactive method for the analysis of a collection of gene phylogenies. We use a local linear approximation of phylogenetic tree space to visualize estimated gene trees as points in low-dimensional Euclidean space, and address important practical limitations of existing related approaches, allowing an intuitive visualization of complex data objects. We demonstrate the utility of our proposed approach through microbial data analyses, including by identifying outlying gene histories in strains of Prevotella, and by contrasting Streptococcus phylogenies estimated using different gene sets. Our method is available as an open-source R package, and assists with estimating, visualizing and interacting with a collection of bacterial gene phylogenies.
Key Words: dimension reduction, microbiome, non-Euclidean, statistical genetics, visualization.
Affiliation(s)
- Michael D Lee
  - NASA Ames Research Center and Blue Marble Space Institute of Science
- Amy D Willis
  - Department of Biostatistics, University of Washington
6
Hasan NB, Balaban M, Biswas A, Bayzid MS, Mirarab S. Distance-Based Phylogenetic Placement with Statistical Support. Biology 2022; 11:biology11081212. [PMID: 36009839 PMCID: PMC9404983 DOI: 10.3390/biology11081212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 07/30/2022] [Accepted: 08/02/2022] [Indexed: 11/16/2022]
Simple Summary
Phylogenetic placement seeks to find the optimal position for a new query species on an existing backbone tree. Fast and accurate distance-based phylogenetic placement methods lack the crucial feature of estimating the support values for various placements of a query sequence. This study presents both parametric and nonparametric methods for measuring the support values of distance-based phylogenetic placements.
Abstract
Phylogenetic identification of unknown sequences by placing them on a tree is routinely attempted in modern ecological studies. Such placements are often obtained from incomplete and noisy data, making it essential to augment the results with some notion of uncertainty. While the standard likelihood-based methods designed for placement naturally provide such measures of uncertainty, the newer and more scalable distance-based methods lack this crucial feature. Here, we adopt several parametric and nonparametric sampling methods for measuring the support of phylogenetic placements that have been obtained with the use of distances. Comparing the alternative strategies, we conclude that nonparametric bootstrapping is more accurate than the alternatives. We go on to show how bootstrapping can be performed efficiently using a linear algebraic formulation that makes it up to 30 times faster and implement this optimized version as part of the distance-based placement software APPLES. By examining a wide range of applications, we show that the relative accuracy of maximum likelihood (ML) support values as compared to distance-based methods depends on the application and the dataset. ML is advantageous for fragmentary queries, while distance-based support values are more accurate for full-length and multi-gene datasets. With the quantification of uncertainty, our work fills a crucial gap that prevents the broader adoption of distance-based placement tools.
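As a rough illustration of the nonparametric bootstrapping idea mentioned in this abstract, the sketch below resamples alignment columns with replacement and records how often each reference sequence ends up closest to the query. It is not the APPLES implementation and does not use the linear algebraic speed-up the abstract describes: the nearest-reference rule stands in for a real placement on tree edges, and the names `hamming` and `bootstrap_support` are hypothetical, chosen only for this example.

```python
import random
from collections import Counter


def hamming(a, b, cols):
    """Fraction of mismatching sites between a and b over the sampled columns."""
    return sum(a[i] != b[i] for i in cols) / len(cols)


def bootstrap_support(query, references, replicates=100, seed=0):
    """Estimate support for each reference as the closest match to the query.

    Assumption: "placement" is simplified to picking the nearest reference,
    which is only a stand-in for placing the query on a backbone tree.
    """
    rng = random.Random(seed)
    n_sites = len(query)
    counts = Counter()
    for _ in range(replicates):
        # Nonparametric bootstrap: draw alignment columns with replacement.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        # Ties go to the alphabetically first reference name.
        best = min(
            sorted(references),
            key=lambda name: hamming(query, references[name], cols),
        )
        counts[best] += 1
    # Support is the fraction of replicates in which a reference was chosen.
    return {name: counts[name] / replicates for name in references}
```

For example, `bootstrap_support("ACGTACGT", {"ref1": "ACGTACGA", "ref2": "TCGAACGT"})` returns one support fraction per reference, and the fractions sum to 1 across references.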
Affiliation(s)
- Navid Bin Hasan
  - Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh
- Metin Balaban
  - Bioinformatics and System Biology Program, UC San Diego, San Diego, CA 92093, USA
- Avijit Biswas
  - Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh
- Md. Shamsuzzoha Bayzid
  - Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh
  - Correspondence: (M.S.B.); (S.M.)
- Siavash Mirarab
  - Electrical and Computer Engineering, UC San Diego, San Diego, CA 92093, USA
  - Correspondence: (M.S.B.); (S.M.)
7
Czech L, Stamatakis A, Dunthorn M, Barbera P. Metagenomic Analysis Using Phylogenetic Placement-A Review of the First Decade. Front Bioinform 2022; 2:871393. [PMID: 36304302 PMCID: PMC9580882 DOI: 10.3389/fbinf.2022.871393] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/08/2022] [Accepted: 04/11/2022] [Indexed: 12/20/2022] Open
Abstract
Phylogenetic placement refers to a family of tools and methods to analyze, visualize, and interpret the tsunami of metagenomic sequencing data generated by high-throughput sequencing. Compared to alternative (e.g., similarity-based) methods, it puts metabarcoding sequences into a phylogenetic context using a set of known reference sequences and taking evolutionary history into account. Thereby, one can increase the accuracy of metagenomic surveys and eliminate the requirement for having exact or close matches with existing sequence databases. Phylogenetic placement constitutes a valuable analysis tool per se, but also entails a plethora of downstream tools to interpret its results. A common use case is to analyze species communities obtained from metagenomic sequencing, for example via taxonomic assignment, diversity quantification, sample comparison, and identification of correlations with environmental variables. In this review, we provide an overview over the methods developed during the first 10 years. In particular, the goals of this review are 1) to motivate the usage of phylogenetic placement and illustrate some of its use cases, 2) to outline the full workflow, from raw sequences to publishable figures, including best practices, 3) to introduce the most common tools and methods and their capabilities, 4) to point out common placement pitfalls and misconceptions, 5) to showcase typical placement-based analyses, and how they can help to analyze, visualize, and interpret phylogenetic placement data.
Affiliation(s)
- Lucas Czech
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, United States
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Micah Dunthorn
  - Natural History Museum, University of Oslo, Oslo, Norway
8
Jiang Y, Balaban M, Zhu Q, Mirarab S. DEPP: Deep Learning Enables Extending Species Trees using Single Genes. Syst Biol 2022; 72:17-34. [PMID: 35485976 DOI: 10.1093/sysbio/syac031] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/18/2021] [Revised: 04/13/2022] [Accepted: 04/22/2022] [Indexed: 11/13/2022] Open
Abstract
Placing new sequences onto reference phylogenies is increasingly used for analyzing environmental samples, especially microbiomes. Existing placement methods assume that query sequences have evolved under specific models directly on the reference phylogeny. For example, they assume single-gene data (e.g., 16S rRNA amplicons) have evolved under the GTR model on a gene tree. Placement, however, often has a more ambitious goal: extending a (genome-wide) species tree given data from individual genes without knowing the evolutionary model. Addressing this challenging problem requires new directions. Here, we introduce Deep-learning Enabled Phylogenetic Placement (DEPP), an algorithm that learns to extend species trees using single genes without pre-specified models. In simulations and on real data, we show that DEPP can match the accuracy of model-based methods without any prior knowledge of the model. We also show that DEPP can update the multi-locus microbial tree-of-life with single genes with high accuracy. We further demonstrate that DEPP can combine 16S and metagenomic data onto a single tree, enabling community structure analyses that take advantage of both sources of data.
Affiliation(s)
- Yueyu Jiang
  - Department of Electrical and Computer Engineering, UC San Diego, CA 92093, USA
- Metin Balaban
  - Bioinformatics and Systems Biology Graduate Program, UC San Diego, CA 92093, USA
- Qiyun Zhu
  - Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ 85281, USA
- Siavash Mirarab
  - Department of Electrical and Computer Engineering, UC San Diego, CA 92093, USA
9
Youngblut ND, de la Cuesta-Zuluaga J, Ley RE. Incorporating genome-based phylogeny and functional similarity into diversity assessments helps to resolve a global collection of human gut metagenomes. Environ Microbiol 2022; 24:3966-3984. [PMID: 35049120 DOI: 10.1111/1462-2920.15910] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/27/2021] [Accepted: 01/15/2022] [Indexed: 11/29/2022]
Abstract
Tree-based diversity measures incorporate phylogenetic or functional relatedness into comparisons of microbial communities. This can improve the identification of explanatory factors compared to tree-agnostic diversity measures. However, applying tree-based diversity measures to metagenome data is more challenging than for single-locus sequencing (e.g., 16S rRNA gene). Utilizing the Genome Taxonomy Database (GTDB) for species-level metagenome profiling allows for functional diversity measures based on genomic content or traits inferred from it. Still, it is unclear how metagenome-based assessments of microbiome diversity benefit from incorporating phylogeny or function into measures of diversity. We assessed this by measuring phylogeny-based, function-based, and tree-agnostic diversity measures from a large, global collection of human gut metagenomes composed of 30 studies and 2943 samples. We found tree-based measures to explain phenotypic variation (e.g., westernization, disease status, and gender) better or equivalent to tree-agnostic measures. Ecophylogenetic and functional diversity measures provided unique insight into how microbiome diversity was partitioned by phenotype. Tree-based measures greatly improved machine learning model performance for predicting westernization, disease status, and gender, relative to models trained solely on tree-agnostic measures. Our findings illustrate the usefulness of tree- and function-based measures for metagenomic assessments of microbial diversity, which is a fundamental component of microbiome science.
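To make the contrast between tree-based and tree-agnostic measures concrete, the sketch below compares species richness with Faith's phylogenetic diversity (PD) on a toy rooted tree. Choosing Faith's PD is an assumption made for illustration, since the abstract does not name a specific measure; the tree, the branch lengths, and the function name `faith_pd` are likewise hypothetical and are not taken from the study.

```python
def faith_pd(present, parent, length):
    """Sum the branch lengths on the paths from the present taxa to the root.

    `parent` maps each non-root node to its parent; `length` maps each
    non-root node to the length of the branch above it. Each branch is
    counted once, however many present taxa share it.
    """
    visited = set()
    total = 0.0
    for taxon in present:
        node = taxon
        # Walk toward the root, stopping at branches already counted.
        while node in parent and node not in visited:
            visited.add(node)
            total += length[node]
            node = parent[node]
    return total


# Hypothetical toy tree: root -> X -> (A, B) and root -> C.
parent = {"X": "root", "A": "X", "B": "X", "C": "root"}
length = {"X": 1.0, "A": 0.5, "B": 0.5, "C": 2.0}

close_pair = {"A", "B"}    # two closely related taxa
distant_pair = {"A", "C"}  # two distantly related taxa

# Richness (a tree-agnostic measure) cannot tell the two communities apart.
richness = (len(close_pair), len(distant_pair))
# Faith's PD is larger for the community spanning more of the tree.
pd_values = (faith_pd(close_pair, parent, length),
             faith_pd(distant_pair, parent, length))
```

Both communities have the same richness, while their PD values differ because the second community covers more total branch length.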
Affiliation(s)
- Nicholas D Youngblut
  - Department of Microbiome Science, Max Planck Institute for Developmental Biology, Max Planck Ring 5, 72076, Tübingen, Germany
- Jacobo de la Cuesta-Zuluaga
  - Department of Microbiome Science, Max Planck Institute for Developmental Biology, Max Planck Ring 5, 72076, Tübingen, Germany
- Ruth E Ley
  - Department of Microbiome Science, Max Planck Institute for Developmental Biology, Max Planck Ring 5, 72076, Tübingen, Germany
10
Balaban M, Jiang Y, Roush D, Zhu Q, Mirarab S. Fast and accurate distance-based phylogenetic placement using divide and conquer. Mol Ecol Resour 2021; 22:1213-1227. [PMID: 34643995 DOI: 10.1111/1755-0998.13527] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 03/24/2021] [Accepted: 10/05/2021] [Indexed: 01/04/2023]
Abstract
Phylogenetic placement of query samples on an existing phylogeny is increasingly used in molecular ecology, including sample identification and microbiome environmental sampling. As the size of available reference trees used in these analyses continues to grow, there is a growing need for methods that place sequences on ultra-large trees with high accuracy. Distance-based placement methods have recently emerged as a path to provide such scalability while allowing flexibility to analyse both assembled and unassembled environmental samples. In this study, we introduce a distance-based phylogenetic placement method, APPLES-2, that is more accurate and scalable than existing distance-based methods and even some of the leading maximum-likelihood methods. This scalability is owed to a divide-and-conquer technique that limits distance calculation and phylogenetic placement to parts of the tree most relevant to each query. The increased scalability and accuracy enables us to study the effectiveness of APPLES-2 for placing microbial genomes on a data set of 10,575 microbial species using subsets of 381 marker genes. APPLES-2 has very high accuracy in this setting, placing 97% of query genomes within three branches of the optimal position in the species tree using 50 marker genes. Our proof-of-concept results show that APPLES-2 can quickly place metagenomic scaffolds on ultra-large backbone trees with high accuracy as long as a scaffold includes tens of marker genes. These results pave the path for a more scalable and widespread use of distance-based placement in various areas of molecular ecology.
Affiliation(s)
- Metin Balaban
  - Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
- Yueyu Jiang
  - Department of Electrical and Computer Engineering, UC San Diego, La Jolla, CA, USA
- Daniel Roush
  - Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ, USA
- Qiyun Zhu
  - Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ, USA
- Siavash Mirarab
  - Department of Electrical and Computer Engineering, UC San Diego, La Jolla, CA, USA
11
Ticlla MR, Hella J, Hiza H, Sasamalo M, Mhimbira F, Rutaihwa LK, Droz S, Schaller S, Reither K, Hilty M, Comas I, Beisel C, Schmid CD, Fenner L, Gagneux S. The Sputum Microbiome in Pulmonary Tuberculosis and Its Association With Disease Manifestations: A Cross-Sectional Study. Front Microbiol 2021; 12:633396. [PMID: 34489876 PMCID: PMC8417804 DOI: 10.3389/fmicb.2021.633396] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/25/2020] [Accepted: 07/09/2021] [Indexed: 12/31/2022] Open
Abstract
Each day, approximately 27,000 people become ill with tuberculosis (TB), and 4,000 die from this disease. Pulmonary TB is the main clinical form of TB, and affects the lungs with a considerably heterogeneous manifestation among patients. Immunomodulation by an interplay of host-, environment-, and pathogen-associated factors partially explains such heterogeneity. Microbial communities residing in the host's airways have immunomodulatory effects, but it is unclear if the inter-individual variability of these microbial communities is associated with the heterogeneity of pulmonary TB. Here, we investigated this possibility by characterizing the microbial composition in the sputum of 334 TB patients from Tanzania, and by assessing its association with three aspects of disease manifestations: sputum mycobacterial load, severe clinical findings, and chest x-ray (CXR) findings. Compositional data analysis of taxonomic profiles based on 16S-rRNA gene amplicon sequencing and on whole metagenome shotgun sequencing, and graph-based inference of microbial associations revealed that the airway microbiome of TB patients was shaped by inverse relationships between Streptococcus and two anaerobes: Selenomonas and Fusobacterium. Specifically, the strength of these microbial associations was negatively correlated with Faith's phylogenetic diversity (PD) and with the accumulation of transient genera. Furthermore, low body mass index (BMI) determined the association between abnormal CXRs and community diversity and composition. These associations were mediated by increased abundance of Selenomonas and Fusobacterium, relative to the abundance of Streptococcus, in underweight patients with lung parenchymal infiltrates and in comparison to those with normal chest x-rays. And last, the detection of herpesviruses and anelloviruses in sputum microbial assemblage was linked to co-infection with HIV. 
Given the anaerobic metabolism of Selenomonas and Fusobacterium, and the hypoxic environment of lung infiltrates, our results suggest that in underweight TB patients, lung tissue remodeling toward anaerobic conditions favors the growth of Selenomonas and Fusobacterium at the expense of Streptococcus. These new insights into the interplay among particular members of the airway microbiome, BMI, and lung parenchymal lesions in TB patients, add a new dimension to the long-known association between low BMI and pulmonary TB. Our results also drive attention to the airways virome in the context of HIV-TB coinfection.
Affiliation(s)
- Monica R Ticlla
  - Swiss Tropical and Public Health Institute, Basel, Switzerland
  - University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Jerry Hella
  - University of Basel, Basel, Switzerland
  - Ifakara Health Institute, Dar es Salaam, Tanzania
- Hellen Hiza
  - Ifakara Health Institute, Dar es Salaam, Tanzania
- Liliana K Rutaihwa
  - Swiss Tropical and Public Health Institute, Basel, Switzerland
  - University of Basel, Basel, Switzerland
  - Ifakara Health Institute, Dar es Salaam, Tanzania
- Sara Droz
  - Institute for Infectious Diseases, University of Bern, Bern, Switzerland
- Sarah Schaller
  - Institute for Infectious Diseases, University of Bern, Bern, Switzerland
- Klaus Reither
  - Swiss Tropical and Public Health Institute, Basel, Switzerland
  - University of Basel, Basel, Switzerland
- Markus Hilty
  - Institute for Infectious Diseases, University of Bern, Bern, Switzerland
- Inaki Comas
  - Tuberculosis Genomics Unit, Biomedicine Institute of Valencia, Valencia, Spain
- Christian Beisel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Christoph D Schmid
  - Swiss Tropical and Public Health Institute, Basel, Switzerland
  - University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Lukas Fenner
  - Swiss Tropical and Public Health Institute, Basel, Switzerland
  - University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
- Sebastien Gagneux
  - Swiss Tropical and Public Health Institute, Basel, Switzerland
  - University of Basel, Basel, Switzerland
12
|
Topological and Thermodynamic Entropy Measures for COVID-19 Pandemic through Graph Theory. Symmetry (Basel) 2020. [DOI: 10.3390/sym12121992] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/18/2022] Open
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused the global coronavirus disease 2019 (COVID-19) pandemic, which has resulted in 60.4 million infections and 1.42 million deaths worldwide. Mathematical models, as an integral part of artificial intelligence, are designed for contact tracing, genetic network analysis for uncovering the biological evolution of the virus, understanding the underlying mechanisms of the observed disease dynamics, evaluating mitigation strategies, and predicting the COVID-19 pandemic dynamics. This paper describes mathematical techniques to exploit and understand the progression of the pandemic through a topological characterization of underlying graphs. We have obtained several topological indices for various graphs of biological interest such as pandemic trees, Cayley trees, Christmas trees, and the corona product of Christmas trees and paths. We have also obtained an analytical expression for the thermodynamic entropies of pandemic trees as a function of R0, the reproduction number, and the level of spread, using the nested wreath product groups. Our plots of entropy and logarithms of topological indices of pandemic trees accentuate the underlying severity of COVID-19 over the 1918 Spanish flu pandemic.
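To make the idea of a tree parameterized by R0 and the level of spread concrete, here is a rough sketch. It assumes a simplified model in which every case infects exactly `r0` others (an integer) at each level; this is an illustrative assumption and not the paper's definition of a pandemic tree. The index computed, the first Zagreb index (the sum of squared vertex degrees), is one standard degree-based topological index and is not claimed to be among those the authors derived.

```python
def cases_per_level(r0, levels):
    """Cases at each level of a complete r0-ary tree; level 0 is the index case."""
    return [r0 ** k for k in range(levels + 1)]

def total_cases(r0, levels):
    """Total number of vertices (cases) in the tree."""
    return sum(cases_per_level(r0, levels))

def first_zagreb_index(r0, levels):
    """Sum of squared vertex degrees over the whole tree."""
    total = 0
    for level, n_cases in enumerate(cases_per_level(r0, levels)):
        children = r0 if level < levels else 0  # the last level has infected no one yet
        parent = 1 if level > 0 else 0          # the index case has no infector
        total += n_cases * (children + parent) ** 2
    return total

# Hypothetical example: r0 = 2 and three levels of spread.
print(cases_per_level(2, 3))
print(total_cases(2, 3))
print(first_zagreb_index(2, 3))
```

Real reproduction numbers are generally not integers, so a sketch like this only shows how tree size and degree-based indices grow with R0 and depth.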
|
13
|
Zhou J, Zhao YT, Dai YY, Jiang YJ, Lin LH, Li H, Li P, Qu YF, Ji X. Captivity affects diversity, abundance, and functional pathways of gut microbiota in the northern grass lizard Takydromus septentrionalis. Microbiologyopen 2020; 9:e1095. [PMID: 32666685 PMCID: PMC7520994 DOI: 10.1002/mbo3.1095] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/29/2020] [Revised: 05/15/2020] [Accepted: 05/30/2020] [Indexed: 12/14/2022] Open
Abstract
Animals in captivity experience a range of environmental conditions that differ from those of wild animals. An increasing number of studies show that captivity significantly affects the abundance and community structure of gut microbiota. The northern grass lizard (Takydromus septentrionalis) is an extensively studied lacertid lizard and has a distributional range covering the central and southeastern parts of China. Nonetheless, little is known about the gut microbiota of this species, which may play a role in nutrient and energy metabolism as well as immune homeostasis. Here, we examined the differences in the gut microbiota between two groups (wild and captive) of lizards through 16S rRNA sequencing using the Illumina HiSeq platform. The results demonstrated that the dominant microbial components in both groups consisted of Proteobacteria, Firmicutes, and Tenericutes. The two groups did not differ in the abundance of these three phyla. Citrobacter was the most dominant genus in wild lizards, while Morganella was the most dominant genus in captive lizards. Moreover, gene function predictions showed that genes at KEGG pathway level 2 were more abundant in wild lizards than in captive lizards but, at KEGG pathway level 1, the differences in gene abundances between wild and captive lizards were not significant. In summary, captivity exerted a significant impact on the gut microbial community structure and diversity in T. septentrionalis, and future work could usefully investigate the causes of these changes using a comparative approach.
Affiliation(s)
- Jin Zhou
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Yu-Tian Zhao
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Ying-Yu Dai
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Yi-Jin Jiang
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Long-Hui Lin
- Hangzhou Key Laboratory for Ecosystem Protection and Restoration, College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China
- Hong Li
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Peng Li
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Yan-Fu Qu
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Xiang Ji
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
|
14
|
Bohmann K, Mirarab S, Bafna V, Gilbert MTP. Beyond DNA barcoding: The unrealized potential of genome skim data in sample identification. Mol Ecol 2020; 29:2521-2534. [PMID: 32542933 PMCID: PMC7496323 DOI: 10.1111/mec.15507] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 02/26/2020] [Revised: 06/03/2020] [Accepted: 06/05/2020] [Indexed: 02/06/2023]
Abstract
Genetic tools are increasingly used to identify and discriminate between species. One key transition in this process was the recognition of the potential of the ca. 658 bp fragment of the organelle cytochrome c oxidase I (COI) as a barcode region, which revolutionized animal bioidentification and led, among other things, to the instigation of the Barcode of Life Database (BOLD), which currently contains barcodes from >7.9 million specimens. Following this discovery, suggestions for other organellar regions and markers, and the primers with which to amplify them, have been continuously proposed. Most recently, the field has taken the leap from PCR-based generation of DNA references into shotgun sequencing-based "genome skimming" alternatives, with the ultimate goal of assembling organellar reference genomes. Unfortunately, in genome skimming approaches, much of the nuclear genome (as much as 99% of the sequence data) is discarded, which is not only wasteful, but can also limit the power of discrimination at, or below, the species level. Here, we advocate that the full shotgun sequence data can be used to assign an identity (that we term for convenience its "DNA-mark") for both voucher and query samples, without requiring any computationally intensive pretreatment (e.g. assembly) of reads. We argue that if reference databases are populated with such "DNA-marks," it will enable future DNA-based taxonomic identification to complement, or even replace, PCR of barcodes with genome skimming, and we discuss how such methodology ultimately could enable identification to population, or even individual, level.
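One common assembly-free way to compare shotgun samples is to summarize each sample by the set of short subsequences (k-mers) in its reads and then compare those sets. The sketch below illustrates that general idea only; it is an assumption made for illustration, not the authors' definition of a "DNA-mark", and the function names, the tiny value of k, and the reads are hypothetical.

```python
# Illustrative sketch: canonical k-mer sets as an assembly-free sample signature.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)

def kmer_signature(reads, k=4):
    """Collect the canonical k-mers seen in a list of unassembled reads."""
    signature = set()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip k-mers with ambiguous bases such as N
                signature.add(canonical(kmer))
    return signature

def jaccard_similarity(sig_a, sig_b):
    """Shared k-mers divided by all k-mers; 1.0 means identical signatures."""
    union = sig_a | sig_b
    if not union:
        return 1.0
    return len(sig_a & sig_b) / len(union)

# Hypothetical reads, not real data.
voucher = kmer_signature(["ACGTTGCA", "TTGCAAGG"])
query = kmer_signature(["ACGTTGCA"])
print(jaccard_similarity(voucher, query))
```

Using canonical k-mers makes the signature independent of which DNA strand a read came from. Practical tools use much larger k and sketching to keep memory manageable.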
Affiliation(s)
- Kristine Bohmann
- Section for Evolutionary Genomics, The GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- Siavash Mirarab
- Department of Electrical and Computer Engineering, University of California, San Diego, CA, USA
- Vineet Bafna
- Department of Computer Science and Engineering, University of California, San Diego, CA, USA
- M. Thomas P. Gilbert
- Section for Evolutionary Genomics, The GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- Center for Evolutionary Hologenomics, The GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- NTNU University Museum, Trondheim, Norway
|
15
|
Sayyari E, Kawas B, Mirarab S. TADA: phylogenetic augmentation of microbiome samples enhances phenotype classification. Bioinformatics 2019; 35:i31-i40. [PMID: 31510701 PMCID: PMC6612822 DOI: 10.1093/bioinformatics/btz394] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 01/07/2023] Open
Abstract
MOTIVATION: Learning associations of traits with the microbial composition of a set of samples is a fundamental goal in microbiome studies. Recently, machine learning methods have been explored for this goal, with some promise. However, in comparison to other fields, microbiome data are high-dimensional and not abundant, leading to a high-dimensional, low-sample-size, under-determined system. Moreover, microbiome data are often unbalanced and biased. Given such training data, machine learning methods often fail to perform a classification task with sufficient accuracy. Lack of signal is especially problematic when classes are represented in an unbalanced way in the training data, with some classes under-represented. The presence of inter-correlations among subsets of observations further compounds these issues. As a result, machine learning methods have had only limited success in predicting many traits from the microbiome. Data augmentation consists of building synthetic samples and adding them to the training data and is a technique that has proved helpful for many machine learning tasks. RESULTS: In this paper, we propose a new data augmentation technique for classifying phenotypes based on the microbiome. Our algorithm, called TADA, uses available data and a statistical generative model to create new samples augmenting existing ones, addressing issues of low sample size. In generating new samples, TADA takes into account phylogenetic relationships between microbial species. On two real datasets, we show that adding these synthetic samples to the training set improves the accuracy of downstream classification, especially when the training data have an unbalanced representation of classes. AVAILABILITY AND IMPLEMENTATION: TADA is available at https://github.com/tada-alg/TADA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Erfan Sayyari
- Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA, USA
- Ban Kawas
- IBM Research - Almaden Research Center, San Jose, CA, USA
- Siavash Mirarab
- Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA, USA
|
16
|
Czech L, Stamatakis A. Scalable methods for analyzing and visualizing phylogenetic placement of metagenomic samples. PLoS One 2019; 14:e0217050. [PMID: 31136592 PMCID: PMC6538146 DOI: 10.1371/journal.pone.0217050] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 06/25/2018] [Accepted: 05/05/2019] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND: The exponential decrease in molecular sequencing cost generates unprecedented amounts of data. Hence, scalable methods to analyze these data are required. Phylogenetic (or Evolutionary) Placement methods identify the evolutionary provenance of anonymous sequences with respect to a given reference phylogeny. This increasingly popular method is deployed for scrutinizing metagenomic samples from environments such as water, soil, or the human gut. NOVEL METHODS: Here, we present novel and, more importantly, highly scalable methods for analyzing phylogenetic placements of metagenomic samples. More specifically, we introduce methods for (a) visualizing differences between samples and their correlation with associated meta-data on the reference phylogeny, (b) clustering similar samples using a variant of the k-means method, and (c) finding phylogenetic factors using an adaptation of the Phylofactorization method. These methods make it possible to interpret metagenomic data in a phylogenetic context, to find patterns in the data, and to identify branches of the phylogeny that are driving these patterns. RESULTS: To demonstrate the scalability and utility of our methods, as well as to provide exemplary interpretations of our methods, we applied them to 3 publicly available datasets comprising 9782 samples with a total of approximately 168 million sequences. The results indicate that new biological insights can be attained via our methods.
Affiliation(s)
- Lucas Czech
- Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Alexandros Stamatakis
- Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
|
17
|
Heintz-Buschart A, Wilmes P. Human Gut Microbiome: Function Matters. Trends Microbiol 2018; 26:563-574. [DOI: 10.1016/j.tim.2017.11.002] [Citation(s) in RCA: 296] [Impact Index Per Article: 49.3] [Received: 09/19/2017] [Revised: 10/29/2017] [Accepted: 11/03/2017] [Indexed: 12/16/2022]
|
18
|
Chen EB, Cason C, Gilbert JA, Ho KJ. Current State of Knowledge on Implications of Gut Microbiome for Surgical Conditions. J Gastrointest Surg 2018; 22:1112-1123. [PMID: 29623674 PMCID: PMC5966332 DOI: 10.1007/s11605-018-3755-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 02/12/2018] [Accepted: 03/20/2018] [Indexed: 02/06/2023]
Abstract
The role of the microbiome in human health has become a central tenet of current medical research, infiltrating a diverse disciplinary base whereby microbiology, computer science, ecology, gastroenterology, immunology, neurophysiology and psychology, metabolism, and cardiovascular medicine all intersect. Traditionally, commensal gut microbiota have been assumed to play a significant role only in the metabolic processing of dietary nutrients and host metabolites, the fortification of gut epithelial barrier function, and the development of mucosal immunity. However, over the last 20 years, new technologies and renewed interest have uncovered considerably broader influences of the microbiota on health maintenance and disease development, many of which are of particular relevance for surgeons. This article provides a broad overview of the current state of knowledge and a review of the technologies that helped to establish it.
Affiliation(s)
- Edmund B Chen
- Department of Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Cori Cason
- Department of Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Jack A Gilbert
- Department of Surgery, University of Chicago, Chicago, IL, USA
- Karen J Ho
- Department of Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Division of Vascular Surgery, Feinberg School of Medicine, Northwestern University, 676 North St. Clair Street, Suite 650, Chicago, IL, 60611, USA
|
19
|
Whidden C, Matsen F. Calculating the Unrooted Subtree Prune-and-Regraft Distance. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 16:898-911. [PMID: 29994585 DOI: 10.1109/tcbb.2018.2802911] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 06/08/2023]
Abstract
The subtree prune-and-regraft (SPR) distance metric is a fundamental way of comparing evolutionary trees. It has wide-ranging applications, such as to study lateral genetic transfer, viral recombination, and Markov chain Monte Carlo phylogenetic inference. Although the rooted version of SPR distance can be computed relatively efficiently between rooted trees using fixed-parameter-tractable maximum agreement forest (MAF) algorithms, no MAF formulation is known for the unrooted case. Correspondingly, previous algorithms are unable to compute unrooted SPR distances larger than 7.
|
20
|
Silverman JD, Washburne AD, Mukherjee S, David LA. A phylogenetic transform enhances analysis of compositional microbiota data. eLife 2017; 6:e21887. [PMID: 28198697 PMCID: PMC5328592 DOI: 10.7554/elife.21887] [Citation(s) in RCA: 171] [Impact Index Per Article: 24.4] [Received: 09/27/2016] [Accepted: 02/13/2017] [Indexed: 12/17/2022] Open
Abstract
Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities.
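The isometric log-ratio (ILR) transform mentioned in this abstract represents a composition through "balances", each contrasting two groups of parts. As a minimal sketch, the standard unweighted balance between two groups of taxa (for example, the two child clades of an internal tree node) is computed below. Any weighting or tree handling that PhILR itself performs is omitted, and the function names and abundances are hypothetical.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def ilr_balance(numerator_group, denominator_group):
    """Unweighted ILR balance between two groups of positive abundances.

    balance = sqrt(r * s / (r + s)) * ln(gm(numerator) / gm(denominator)),
    where r and s are the numbers of taxa in each group.
    """
    r, s = len(numerator_group), len(denominator_group)
    scale = math.sqrt(r * s / (r + s))
    return scale * math.log(geometric_mean(numerator_group) / geometric_mean(denominator_group))

# Hypothetical abundances for the taxa below one internal node:
# two taxa in the left clade, one taxon in the right clade.
left_clade = [2.0, 8.0]
right_clade = [1.0]
print(ilr_balance(left_clade, right_clade))
```

Because a balance depends only on ratios, rescaling every abundance by the same factor (for example, converting counts to proportions) leaves it unchanged, which is why such coordinates can be passed to standard statistical tools. Zeros must be replaced before taking logs.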
Affiliation(s)
- Justin D Silverman
- Program in Computational Biology and Bioinformatics, Duke University, Durham, United States
- Medical Scientist Training Program, Duke University, Durham, United States
- Center for Genomic and Computational Biology, Duke University, Durham, United States
- Alex D Washburne
- Nicholas School of the Environment, Duke University, Durham, United States
- Cooperative Institute for Research in Environmental Sciences (CIRES), University of Colorado, Boulder, United States
- Sayan Mukherjee
- Program in Computational Biology and Bioinformatics, Duke University, Durham, United States
- Department of Statistical Science, Duke University, Durham, United States
- Department of Mathematics, Duke University, Durham, United States
- Department of Biostatistics and Bioinformatics, Duke University, Durham, United States
- Department of Computer Science, Duke University, Durham, United States
- Lawrence A David
- Program in Computational Biology and Bioinformatics, Duke University, Durham, United States
- Center for Genomic and Computational Biology, Duke University, Durham, United States
- Department of Molecular Genetics and Microbiology, Duke University, Durham, United States
|
21
|
Frenkel Z, Kiat Y, Izhaki I, Snir S. Convex recoloring as an evolutionary marker. Mol Phylogenet Evol 2016; 107:209-220. [PMID: 27818264 DOI: 10.1016/j.ympev.2016.10.018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/21/2016] [Revised: 10/16/2016] [Accepted: 10/25/2016] [Indexed: 11/27/2022]
Abstract
With the availability of enormous quantities of genetic data, it has become common to construct very accurate trees describing the evolutionary history of the species under study, as well as every single gene of these species. These trees allow us to examine the evolutionary compliance of given markers (characters). A marker compliant with the history of the species investigated has undergone mutations along the species tree branches, such that every subtree of that tree exhibits a different state. Convex recoloring (CR) uses combinatorial representation to measure the adequacy of a taxonomic classifier to a given tree. Despite its biological origins, research on CR has been almost exclusively dedicated to mathematical properties of the problem, or variants of it with little, if any, relationship to taxonomy. In this work we return to the origins of CR. We put CR in a statistical framework and introduce and learn the notion of the statistical significance of a character. We apply this measure to two data sets, passerine birds and prokaryotes, and four examples. These examples demonstrate various applications of CR, from evolutionary relatedness, through lateral evolution, to supertree construction. The above study was done with new software that we provide, containing algorithmic improvements and a graphical output of an (optimally) recolored tree. AVAILABILITY: Code implementing the features and a README are available at http://research.haifa.ac.il/ssagi/software/convexrecoloring.zip.
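The property that convex recoloring measures deviation from can be checked directly: in the fully colored setting, a coloring of a tree is convex when the vertices of each color form a connected subtree. The sketch below checks only that property for a tree in which every vertex has a color. It does not compute a recoloring or any significance measure, and the function name and example tree are hypothetical.

```python
from collections import defaultdict, deque

def is_convex_coloring(adjacency, color_of):
    """Return True if every color class induces a connected subtree.

    `adjacency` maps each vertex to its neighbours in the tree;
    `color_of` maps each vertex to its color (character state).
    """
    members_by_color = defaultdict(set)
    for vertex, color in color_of.items():
        members_by_color[color].add(vertex)

    for members in members_by_color.values():
        start = next(iter(members))
        seen = {start}
        queue = deque([start])
        while queue:  # breadth-first search restricted to one color class
            vertex = queue.popleft()
            for neighbour in adjacency[vertex]:
                if neighbour in members and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        if seen != members:  # some vertex of this color was unreachable
            return False
    return True

# Hypothetical path-shaped tree a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(is_convex_coloring(path, {"a": "red", "b": "red", "c": "blue", "d": "blue"}))
print(is_convex_coloring(path, {"a": "red", "b": "blue", "c": "red", "d": "blue"}))
```

In the second coloring the two red vertices are separated by a blue one, so at least one vertex would have to be recolored to make the coloring convex.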
Affiliation(s)
- Zeev Frenkel
- Department of Ecology and Evolutionary Biology, University of Haifa, Israel
- Yosef Kiat
- Israeli Bird Ringing Center, Society for the Protection of Nature in Israel, Israel
- Ido Izhaki
- Department of Ecology and Evolutionary Biology, University of Haifa, Israel
- Sagi Snir
- Department of Ecology and Evolutionary Biology, University of Haifa, Israel
|
22
|
Ludvigsen J, Svihus B, Rudi K. Rearing Room Affects the Non-dominant Chicken Cecum Microbiota, While Diet Affects the Dominant Microbiota. Front Vet Sci 2016; 3:16. [PMID: 26942187 PMCID: PMC4766280 DOI: 10.3389/fvets.2016.00016] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 12/01/2015] [Accepted: 02/10/2016] [Indexed: 12/18/2022] Open
Abstract
The combined effect of environment and diet in shaping the gut microbiota remains largely unknown. This knowledge, however, is important for animal welfare and safe food production. For these reasons, we determined the effect of experimental units on the chicken cecum microbiota for a full factorial experiment where we tested the combined effect of room, diet, and antimicrobial treatment. By Illumina Deep sequencing of the 16S rRNA gene, we found that diet mainly affected the dominant microbiota, while the room as a proxy for environment had major effects on the non-dominant microbiota (p = 0.006, Kruskal–Wallis test). We, therefore, propose that the dominant and non-dominant microbiotas are shaped by different experimental units. These findings have implications both for our general understanding of the host-associated microbiota and for setting up experiments related to specific targeting of pathogens.
Affiliation(s)
- Jane Ludvigsen
- Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
- Birger Svihus
- Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Ås, Norway
- Knut Rudi
- Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
|
23
|
Tatusova T. Update on Genomic Databases and Resources at the National Center for Biotechnology Information. Methods Mol Biol 2016; 1415:3-30. [PMID: 27115625 DOI: 10.1007/978-1-4939-3572-7_1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/08/2022]
Abstract
The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website. A text-based search and retrieval system provides a fast and easy way to navigate across diverse biological databases. Comparative genome analysis tools lead to further understanding of evolutionary processes, quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for the information management system and the visualization tools. New strategies have been designed to bring order to this genome sequence shockwave and improve the usability of associated data.
Affiliation(s)
- Tatiana Tatusova
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD, 20894, USA
|
24
|
Molecular Phylogenetics: Concepts for a Newcomer. Advances in Biochemical Engineering/Biotechnology 2016; 160:185-196. [DOI: 10.1007/10_2016_49] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 12/24/2022]
|
25
|
Avershina E, Rudi K. Confusion about the species richness of human gut microbiota. Benef Microbes 2015; 6:657-9. [DOI: 10.3920/bm2015.0007] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 02/03/2023]
Abstract
A key message from a range of high-profile next-generation sequencing studies on the human microbiota is that it comprises a tremendously rich community of more than 1000 species within each one of us. Although more recent studies have shown estimates of between 100 and 200 species per individual, this has not yet been made clear in the literature. Currently, the most widely accepted estimate of species richness is therefore five to ten times too high. Here, we will review the different estimates of species richness in the literature, address potential sources of artefacts and the reluctance to correct these, and provide suggestions for future directions.
Affiliation(s)
- E. Avershina
- Department of Chemistry, Biotechnology and Food Science, Norwegian University for Life Sciences, 1430 Ås, Norway
- K. Rudi
- Department of Chemistry, Biotechnology and Food Science, Norwegian University for Life Sciences, 1430 Ås, Norway
|
26
|
Sanderson MJ, McMahon MM, Stamatakis A, Zwickl DJ, Steel M. Impacts of Terraces on Phylogenetic Inference. Syst Biol 2015; 64:709-26. [DOI: 10.1093/sysbio/syv024] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 09/26/2014] [Accepted: 04/15/2015] [Indexed: 11/14/2022] Open
|