1
|
Sánchez Reyes LL, McTavish EJ, O’Meara B. DateLife: Leveraging Databases and Analytical Tools to Reveal the Dated Tree of Life. Syst Biol 2024; 73:470-485. [PMID: 38507308 PMCID: PMC11282365 DOI: 10.1093/sysbio/syae015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Revised: 03/09/2024] [Accepted: 03/18/2024] [Indexed: 03/22/2024] Open
Abstract
Chronograms (phylogenies with branch lengths proportional to time) represent key data on the timing of evolutionary events, allowing us to study natural processes in many areas of biological research. Chronograms also provide valuable information that can be used for education, science communication, and conservation policy decisions. Yet, achieving a high-quality reconstruction of a chronogram is a difficult and resource-consuming task. Here we present DateLife, phylogenetic software implemented as an R package and an R Shiny web application available at www.datelife.org, that provides services for efficient and easy discovery, summary, reuse, and reanalysis of node age data mined from a curated database of expert, peer-reviewed, and openly available chronograms. The main DateLife workflow starts with one or more scientific taxon names provided by a user. Names are processed and standardized to a unified taxonomy, allowing DateLife to run a name match across its local chronogram database, which is curated from Open Tree of Life's phylogenetic repository, and extract all chronograms that contain at least two queried taxon names, along with their metadata. Finally, node ages from matching chronograms are mapped using the congruification algorithm to corresponding nodes on a tree topology, either extracted from Open Tree of Life's synthetic phylogeny or provided by the user. Congruified node ages are used as secondary calibrations to date the chosen topology, with or without initial branch lengths, using different phylogenetic dating methods such as BLADJ, treePL, PATHd8, and MrBayes.
We performed a cross-validation test to compare node ages resulting from a DateLife analysis (i.e., phylogenetic dating using secondary calibrations) to those from the original chronograms (i.e., obtained with primary calibrations), and found that DateLife's node age estimates are consistent with the age estimates from the original chronograms, with the largest variation in ages occurring around topologically deeper nodes. Because the results from any software for scientific analysis can only be as good as the data used as input, we highlight the importance of considering the results of a DateLife analysis in the context of the input chronograms. DateLife can help to increase awareness of the existing disparities among alternative hypotheses of dates for the same diversification events, and to support exploration of the effect of alternative chronogram hypotheses on downstream analyses, providing a framework for a more informed interpretation of evolutionary results.
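The idea of summarizing node ages for the same diversification event across several source chronograms can be illustrated with a minimal Python sketch. This is not DateLife's API (DateLife is an R package). The study names, the ages, and the helper `summarize_ages` are hypothetical and serve only to show how disparities among alternative date hypotheses might be surfaced.

```python
from statistics import median

# Hypothetical ages (in million years) of one node, i.e. the divergence of one
# taxon pair, as reported by three made-up source chronograms.
# These values are illustrative only and do not come from any real study.
node_ages = {
    "study_A": 12.0,
    "study_B": 15.5,
    "study_C": 9.5,
}


def summarize_ages(ages):
    """Return the minimum, median, and maximum of a mapping of node ages."""
    values = sorted(ages.values())
    return {"min": values[0], "median": median(values), "max": values[-1]}


summary = summarize_ages(node_ages)
print(summary)
```

A wide gap between `min` and `max` in such a summary is one simple signal that the input chronograms disagree about the timing of the event.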
Affiliation(s)
- Luna L Sánchez Reyes
  - Department of Life and Environmental Sciences, University of California, Merced, CA 95343, USA
  - Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, 446 Hesler Biology Building, Knoxville, TN 37996, USA
- Emily Jane McTavish
  - Department of Life and Environmental Sciences, University of California, Merced, CA 95343, USA
- Brian O’Meara
  - Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, 446 Hesler Biology Building, Knoxville, TN 37996, USA
|
2
|
Jhwueng DC, Wu CY. A Novel Phylogenetic Negative Binomial Regression Model for Count-Dependent Variables. Biology 2023; 12:1148. [PMID: 37627032 PMCID: PMC10452298 DOI: 10.3390/biology12081148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2023] [Revised: 08/16/2023] [Accepted: 08/18/2023] [Indexed: 08/27/2023]
Abstract
Regression models are extensively used to explore the relationship between a dependent variable and its covariates. These models work well when the dependent variable is categorical and the data are supposedly independent, as is the case with generalized linear models (GLMs). However, trait data from related species do not operate under these conditions due to their shared common ancestry, leading to dependence that can be illustrated through a phylogenetic tree. In response to the analytical challenges of count-dependent variables in phylogenetically related species, we have developed a novel phylogenetic negative binomial regression model that allows for overdispersion, a limitation present in the phylogenetic Poisson regression model in the literature. This model overcomes limitations of conventional GLMs, which overlook the inherent dependence arising from shared lineage. Instead, our proposed model acknowledges this factor and uses the generalized estimating equation (GEE) framework for precise parameter estimation. The effectiveness of the proposed model was corroborated by a rigorous simulation study, which, despite the need for careful convergence monitoring, demonstrated its reasonable efficacy. The empirical application of the model to lizard egg-laying count and mammalian litter size data further highlighted its practical relevance. In particular, our results identified negative correlations between increases in egg mass, litter size, ovulation rate, and gestation length with respective yearly counts, while a positive correlation was observed with species lifespan. This study underscores the importance of our proposed model in providing nuanced and accurate analyses of count-dependent variables in related species, highlighting the often overlooked impact of shared ancestry. The model represents a critical advance in research methodologies, opening new avenues for interpretation of related species data in the field.
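To make the overdispersion point concrete, the sketch below compares the variance implied by a Poisson model with that of a negative binomial model at the same mean. It assumes the common mean-dispersion ("NB2") parametrisation, in which the variance is mu + mu^2 / size. The paper's own parametrisation may differ, and the function names are illustrative.

```python
def poisson_variance(mu):
    """Under a Poisson model the variance equals the mean."""
    return mu


def negbin_variance(mu, size):
    """Variance of a negative binomial with mean mu and dispersion 'size'.

    Assumes the NB2 form: Var = mu + mu**2 / size. Smaller 'size' means
    stronger overdispersion; as size grows the Poisson case is recovered.
    """
    return mu + mu ** 2 / size


mu = 4.0
for size in (2.0, 10.0, 1000.0):
    print(size, poisson_variance(mu), negbin_variance(mu, size))
```

With a mean of 4 and `size` of 2, the negative binomial variance is three times the Poisson variance, which is the kind of extra spread a Poisson regression cannot accommodate.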
|
3
|
Wong Y, Rosindell J. Dynamic visualisation of million-tip trees: The OneZoom project. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Yan Wong
  - OneZoom CIO, London, UK
  - Big Data Institute, University of Oxford, Oxford, UK
- James Rosindell
  - OneZoom CIO, London, UK
  - Department of Life Sciences, Silwood Park Campus, Imperial College London, London, UK
|
4
|
Mctavish EJ, Sánchez-Reyes LL, Holder MT. OpenTree: A Python Package for Accessing and Analyzing Data from the Open Tree of Life. Syst Biol 2021; 70:1295-1301. [PMID: 33970279 PMCID: PMC8513759 DOI: 10.1093/sysbio/syab033] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/15/2020] [Revised: 04/27/2021] [Accepted: 05/03/2021] [Indexed: 11/14/2022] Open
Abstract
The Open Tree of Life project constructs a comprehensive, dynamic, and digitally available tree of life by synthesizing published phylogenetic trees along with taxonomic data. Open Tree of Life provides web-service application programming interfaces (APIs) to make the tree estimate, unified taxonomy, and input phylogenetic data available to anyone. Here, we describe the Python package opentree, which provides a user-friendly Python wrapper for these APIs and a set of scripts and tutorials for straightforward downstream data analyses. We demonstrate the utility of these tools by generating an estimate of the phylogenetic relationships of all bird families, and by capturing a phylogenetic estimate for all taxa observed at the University of California Merced Vernal Pools and Grassland Reserve. [Evolution; open science; phylogenetics; Python; taxonomy.]
Affiliation(s)
- Emily Jane Mctavish
  - Department of Life and Environmental Sciences, University of California, Merced, CA 95343, USA
- Mark T Holder
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA
  - Biodiversity Institute, University of Kansas, Lawrence, KS 66045, USA
|
5
|
Sánchez-Reyes LL, Kandziora M, McTavish EJ. Physcraper: a Python package for continually updated phylogenetic trees using the Open Tree of Life. BMC Bioinformatics 2021; 22:355. [PMID: 34187366 PMCID: PMC8244228 DOI: 10.1186/s12859-021-04274-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/08/2021] [Accepted: 06/16/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Phylogenies are a key part of research in many areas of biology. Tools that automate some parts of the process of phylogenetic reconstruction, mainly molecular character matrix assembly, have been developed for the benefit of both specialists in the field of phylogenetics and non-specialists. However, interpretation of results, comparison with previously available phylogenetic hypotheses, and selection of one phylogeny for downstream analyses and discussion remain difficult for anyone who is not a specialist in either phylogenetic methods or the particular group under study.

RESULTS: Physcraper is a command-line Python program that automates the update of published phylogenies by adding public DNA sequences to the underlying alignments of previously published phylogenies. It also provides a framework for straightforward comparison of published phylogenies with their updated versions, by leveraging tools from the Open Tree of Life project to link taxonomic information across databases. The program can be used by the non-specialist as a tool to generate phylogenetic hypotheses based on publicly available expert phylogenetic knowledge. Phylogeneticists and taxonomic group specialists will find it useful as a tool to facilitate molecular dataset gathering and comparison of alternative phylogenetic hypotheses (topologies).

CONCLUSION: The Physcraper workflow showcases the benefits of doing open science for phylogenetics, encouraging researchers to strive for better scientific sharing practices. Physcraper can be used with any OS and is released under an open-source license. Detailed instructions for installation and usage are available at https://physcraper.readthedocs.io.
Affiliation(s)
- Martha Kandziora
  - School of Natural Sciences, University of California, Merced, USA
  - Department of Botany, Faculty of Science, Charles University, Prague, Czech Republic
|
6
|
Nguyen VD, Nguyen TH, Tayeen ASM, Laughinghouse HD, Sánchez-Reyes LL, Wiggins J, Pontelli E, Mozzherin D, O’Meara B, Stoltzfus A. Phylotastic: Improving Access to Tree-of-Life Knowledge With Flexible, on-the-Fly Delivery of Trees. Evol Bioinform Online 2020; 16:1176934319899384. [PMID: 32372858 PMCID: PMC7192527 DOI: 10.1177/1176934319899384] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/31/2019] [Accepted: 11/20/2019] [Indexed: 11/15/2022] Open
Abstract
A comprehensive phylogeny of species, i.e., a tree of life, has potential uses in a variety of contexts, including research, education, and public policy. Yet, accessing the tree of life typically requires special knowledge, complex software, or long periods of training. The Phylotastic project aims to make it as easy to get a phylogeny of species as it is to get driving directions from mapping software. In prior work, we presented a design for an open system to validate and manage taxon names, find phylogeny resources, extract subtrees matching a user's taxon list, scale trees to time, and integrate related resources such as species images. Here, we report the implementation of a set of tools that together represent a robust, accessible system for on-the-fly delivery of phylogenetic knowledge. This set of tools includes a web portal to execute several customizable workflows to obtain species phylogenies (scaled by geologic time and decorated with thumbnail images); more than 30 underlying web services (accessible via a common registry); and code toolkits in R and Python (allowing others to develop custom applications using Phylotastic services). The Phylotastic system, accessible via http://www.phylotastic.org, provides a unique resource to access the current state of phylogenetic knowledge, useful for a variety of cases in which a tree extracted quickly from online resources (as distinct from a tree custom-made from character data) is sufficient, as it is for many casual uses of trees identified here.
Affiliation(s)
- Van D Nguyen
  - Department of Computer Science, New Mexico State University, Las Cruces, NM, USA
- Thanh H Nguyen
  - Department of Computer Science, New Mexico State University, Las Cruces, NM, USA
- Abu Saleh Md Tayeen
  - Department of Computer Science, New Mexico State University, Las Cruces, NM, USA
- H Dail Laughinghouse
  - Institute for Bioscience and Biotechnology Research, Rockville, MD, USA
  - Fort Lauderdale Research and Education Center, University of Florida/IFAS, Davie, FL, USA
- Luna L Sánchez-Reyes
  - Department of Ecology and Evolutionary Biology, The University of Tennessee, Knoxville, Knoxville, TN, USA
- Jodie Wiggins
  - Department of Ecology and Evolutionary Biology, The University of Tennessee, Knoxville, Knoxville, TN, USA
- Enrico Pontelli
  - Department of Computer Science, New Mexico State University, Las Cruces, NM, USA
- Dmitry Mozzherin
  - Illinois Natural History Survey, Species File Group, University of Illinois at Urbana–Champaign, Champaign, IL, USA
- Brian O’Meara
  - Department of Ecology and Evolutionary Biology, The University of Tennessee, Knoxville, Knoxville, TN, USA
- Arlin Stoltzfus
  - Institute for Bioscience and Biotechnology Research, Rockville, MD, USA
  - Office of Data and Informatics, Material Measurement Laboratory, NIST, Gaithersburg, MD, USA
|
7
|
Vos RA. DBTree: Very large phylogenies in portable databases. Methods Ecol Evol 2020. [DOI: 10.1111/2041-210x.13337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Affiliation(s)
- Rutger A. Vos
  - Understanding Evolution, Naturalis Biodiversity Center, Leiden, The Netherlands
  - Institute of Biology Leiden, Leiden University, Leiden, The Netherlands
|
8
|
Vos RA, Katayama T, Mishima H, Kawano S, Kawashima S, Kim JD, Moriya Y, Tokimatsu T, Yamaguchi A, Yamamoto Y, Wu H, Amstutz P, Antezana E, Aoki NP, Arakawa K, Bolleman JT, Bolton E, Bonnal RJP, Bono H, Burger K, Chiba H, Cohen KB, Deutsch EW, Fernández-Breis JT, Fu G, Fujisawa T, Fukushima A, García A, Goto N, Groza T, Hercus C, Hoehndorf R, Itaya K, Juty N, Kawashima T, Kim JH, Kinjo AR, Kotera M, Kozaki K, Kumagai S, Kushida T, Lütteke T, Matsubara M, Miyamoto J, Mohsen A, Mori H, Naito Y, Nakazato T, Nguyen-Xuan J, Nishida K, Nishida N, Nishide H, Ogishima S, Ohta T, Okuda S, Paten B, Perret JL, Prathipati P, Prins P, Queralt-Rosinach N, Shinmachi D, Suzuki S, Tabata T, Takatsuki T, Taylor K, Thompson M, Uchiyama I, Vieira B, Wei CH, Wilkinson M, Yamada I, Yamanaka R, Yoshitake K, Yoshizawa AC, Dumontier M, Kosaki K, Takagi T. BioHackathon 2015: Semantics of data for life sciences and reproducible research. F1000Res 2020; 9:136. [PMID: 32308977 PMCID: PMC7141167 DOI: 10.12688/f1000research.18236.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Accepted: 02/05/2020] [Indexed: 01/08/2023] Open
Abstract
We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress to address ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.
Affiliation(s)
- Rutger A. Vos
  - Institute of Biology Leiden, Leiden University, Leiden, The Netherlands
  - Naturalis Biodiversity Center, Leiden, The Netherlands
- Hiroyuki Mishima
  - Department of Human Genetics, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Shin Kawano
  - Database Center for Life Science, Tokyo, Japan
- Yuki Moriya
  - Database Center for Life Science, Tokyo, Japan
- Hongyan Wu
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Erick Antezana
  - Department of Biology, Norwegian University of Science and Technology, Trondheim, Norway
- Nobuyuki P. Aoki
  - Faculty of Science and Engineering, SOKA University, Tokyo, Japan
- Kazuharu Arakawa
  - Institute for Advanced Biosciences, Keio University, Tokyo, Japan
- Jerven T. Bolleman
  - SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Lausanne, Switzerland
- Evan Bolton
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
- Raoul J. P. Bonnal
  - Istituto Nazionale Genetica Molecolare, Romeo ed Enrica Invernizzi, Milan, Italy
- Kees Burger
  - Dutch Techcentre for Life Sciences, Utrecht, The Netherlands
- Hirokazu Chiba
  - National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan
- Kevin B. Cohen
  - Computational Bioscience Program, University of Colorado School of Medicine, Denver, USA
  - Université Paris-Saclay, LIMSI, CNRS, Paris, France
- Gang Fu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
- Naohisa Goto
  - Research Institute for Microbial Diseases, Osaka University, Osaka, Japan
- Tudor Groza
  - St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, Darlinghurst, Australia
  - Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, Australia
- Colin Hercus
  - Novocraft Technologies Sdn. Bhd., Selangor, Malaysia
- Robert Hoehndorf
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Kotone Itaya
  - Institute for Advanced Biosciences, Keio University, Tokyo, Japan
- Nick Juty
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Jee-Hyub Kim
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Akira R. Kinjo
  - Institute for Protein Research, Osaka University, Osaka, Japan
- Masaaki Kotera
  - School of Life Science and Technology, Tokyo Institute of Technology, Tokyo, Japan
- Kouji Kozaki
  - The Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan
- Tatsuya Kushida
  - National Bioscience Database Center, Japan Science and Technology Agency, Tokyo, Japan
- Thomas Lütteke
  - Institute of Veterinary Physiology and Biochemistry, Justus-Liebig University Giessen, Giessen, Germany
  - Gesellschaft für innovative Personalwirtschaftssysteme mbH (GIP GmbH), Offenbach, Germany
- Attayeb Mohsen
  - National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan
- Hiroshi Mori
  - Center for Information Biology, National Institute of Genetics, Mishima, Japan
- Yuki Naito
  - Database Center for Life Science, Tokyo, Japan
- Naoki Nishida
  - Department of Systems Science, Osaka University, Osaka, Japan
- Hiroyo Nishide
  - National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan
- Soichi Ogishima
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan
- Tazro Ohta
  - Database Center for Life Science, Tokyo, Japan
- Shujiro Okuda
  - Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Benedict Paten
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, USA
- Philip Prathipati
  - National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan
- Pjotr Prins
  - University Medical Center Utrecht, Utrecht, The Netherlands
  - University of Tennessee Health Science Center, Memphis, USA
- Núria Queralt-Rosinach
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Shinya Suzuki
  - School of Life Science and Technology, Tokyo Institute of Technology, Tokyo, Japan
- Tsuyosi Tabata
  - Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
- Kieron Taylor
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Mark Thompson
  - Leiden University Medical Center, Leiden, The Netherlands
- Ikuo Uchiyama
  - National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan
- Bruno Vieira
  - WurmLab, School of Biological & Chemical Sciences, Queen Mary University of London, London, UK
- Chih-Hsuan Wei
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
- Mark Wilkinson
  - Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid, Madrid, Spain
- Kazutoshi Yoshitake
  - Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
- Michel Dumontier
  - Institute of Data Science, Maastricht University, Maastricht, The Netherlands
- Kenjiro Kosaki
  - Center for Medical Genetics, Keio University School of Medicine, Tokyo, Japan
- Toshihisa Takagi
  - National Bioscience Database Center, Japan Science and Technology Agency, Tokyo, Japan
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
|
9
|
Jhwueng DC, O'Meara BC. On the Matrix Condition of Phylogenetic Tree. Evol Bioinform Online 2020; 16:1176934320901721. [PMID: 32109980 PMCID: PMC7019399 DOI: 10.1177/1176934320901721] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/28/2018] [Accepted: 12/26/2019] [Indexed: 11/16/2022] Open
Abstract
Phylogenetic comparative analyses use trees of evolutionary relationships between species to understand their evolution and ecology. A phylogenetic tree of n taxa can be algebraically transformed into an n by n squared symmetric phylogenetic covariance matrix C where each element c_ij in C represents the affinity between extant species i and extant species j. This matrix C is used internally in several comparative methods: for example, it is often inverted to compute the likelihood of the data under a model. However, if the matrix is ill-conditioned (ie, if κ, defined by the ratio of the maximum eigenvalue of C to the minimum eigenvalue of C, is too high), this inversion may not be stable, and thus neither will be the calculation of the likelihood or parameter estimates that are based on optimizing the likelihood. We investigate this potential issue and propose several methods to attempt to remedy this issue.
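A toy case shows how κ grows as two species diverge more recently. The sketch assumes a two-taxon ultrametric tree under a Brownian-motion-style covariance, where the diagonal of C is the tree height T and the off-diagonal is the shared branch length s. For that 2 by 2 matrix the eigenvalues are T + s and T - s in closed form. The function name and the inputs are illustrative, not taken from the paper.

```python
def condition_number_two_taxa(tree_height, shared_length):
    """Condition number of C = [[T, s], [s, T]] for a two-taxon tree.

    Assumption: T is the root-to-tip height and s the branch length shared by
    both species, so the eigenvalues of C are T + s and T - s.
    """
    if not 0 <= shared_length < tree_height:
        raise ValueError("require 0 <= shared_length < tree_height")
    largest = tree_height + shared_length
    smallest = tree_height - shared_length
    return largest / smallest


# The closer s is to T (a very recent split), the larger kappa becomes.
for s in (0.0, 0.5, 0.9, 0.999):
    print(s, condition_number_two_taxa(1.0, s))
```

In this toy setting, near-identical tips make C nearly singular, which is one intuitive route to the unstable inversion the abstract describes.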
Affiliation(s)
- Brian C O'Meara
  - Department of Ecology and Evolutionary Biology, The University of Tennessee, Knoxville, Knoxville, TN, USA
|
10
|
Stöver BC, Wiechers S, Müller KF. JPhyloIO: a Java library for event-based reading and writing of different phylogenetic file formats through a common interface. BMC Bioinformatics 2019; 20:402. [PMID: 31331268 PMCID: PMC6647125 DOI: 10.1186/s12859-019-2982-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/17/2019] [Accepted: 07/02/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Today a variety of phylogenetic file formats exists, some of which are well-established but limited in their data model, while other more recently introduced ones offer advanced features for metadata representation. Although most currently available software only supports the classical formats with a limited metadata model, it would be desirable to have support for the more advanced formats. This is necessary for users to produce richly annotated data that can be efficiently reused and make underlying workflows easily reproducible. A programming library that abstracts over the data and metadata models of the different formats and allows supporting all of them in one step would significantly simplify the development of new and the extension of existing software to address the need for better metadata annotation.

RESULTS: We developed the Java library JPhyloIO, which allows event-based reading and writing of the most common alignment and tree/network formats. It allows full access to all features of the nine currently supported formats. By implementing a single JPhyloIO-based reader and writer, application developers can support all of these formats. Due to the event-based architecture, JPhyloIO can be combined with any application data structure, and is memory efficient for large datasets. JPhyloIO is distributed under LGPL. Detailed documentation and example applications (available on http://bioinfweb.info/JPhyloIO/) significantly lower the entry barrier for bioinformaticians who wish to benefit from JPhyloIO's features in their own software.

CONCLUSION: JPhyloIO enables simplified development of new and extension of existing applications that support various standard formats simultaneously. This has the potential to improve interoperability between phylogenetic software tools and at the same time motivate usage of more recent metadata-rich formats such as NeXML or phyloXML.
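The event-based reading style mentioned in the abstract can be sketched as a generator that emits events while scanning a tree string once, so the consumer decides what data structure (if any) to build. The sketch below is written in Python for brevity and is not JPhyloIO's Java API. The event names and the function `newick_events` are hypothetical, and only a bare Newick subset is handled (branch lengths, quoting, and comments are not parsed separately).

```python
def newick_events(text):
    """Yield (event, payload) tuples while scanning a Newick string once.

    Hypothetical event names: 'start_clade', 'end_clade', 'label', 'end_tree'.
    A 'label' is any run of characters between the delimiters '(', ')', ','
    and ';', so a branch length such as ':1.0' stays part of the label here.
    """
    label = []
    for ch in text:
        if ch in "(),;":
            if label:  # flush the label collected before this delimiter
                yield ("label", "".join(label))
                label = []
            if ch == "(":
                yield ("start_clade", None)
            elif ch == ")":
                yield ("end_clade", None)
            elif ch == ";":
                yield ("end_tree", None)
        elif not ch.isspace():
            label.append(ch)


# A consumer can count labels without ever building a tree in memory.
n_labels = sum(1 for event, _ in newick_events("(A,(B,C));") if event == "label")
print(n_labels)
```

Because the generator holds only the current label in memory, the same consumer code works for small and very large inputs, which is the property the abstract attributes to an event-based architecture.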
Affiliation(s)
- Ben C Stöver
  - Institute for Evolution and Biodiversity, WWU Münster, Hüfferstraße 1, 48149, Münster, Germany
- Sarah Wiechers
  - Institute for Evolution and Biodiversity, WWU Münster, Hüfferstraße 1, 48149, Münster, Germany
- Kai F Müller
  - Institute for Evolution and Biodiversity, WWU Münster, Hüfferstraße 1, 48149, Münster, Germany
|
11
|
de Almeida C, Scheer H, Gobert A, Fileccia V, Martinelli F, Zuber H, Gagliardi D. RNA uridylation and decay in plants. Philos Trans R Soc Lond B Biol Sci 2018; 373:rstb.2018.0163. [PMID: 30397100 DOI: 10.1098/rstb.2018.0163] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Accepted: 08/18/2018] [Indexed: 12/13/2022] Open
Abstract
RNA uridylation consists of the untemplated addition of uridines at the 3' extremity of an RNA molecule. RNA uridylation is catalysed by terminal uridylyltransferases (TUTases), which form a subgroup of the terminal nucleotidyltransferase family, to which poly(A) polymerases also belong. The key role of RNA uridylation is to regulate RNA degradation in a variety of eukaryotes, including fission yeast, plants and animals. In plants, RNA uridylation has been mostly studied in two model species, the green alga Chlamydomonas reinhardtii and the flowering plant Arabidopsis thaliana. Plant TUTases target a variety of RNA substrates, differing in size and function. These RNA substrates include microRNAs (miRNAs), small interfering silencing RNAs (siRNAs), ribosomal RNAs (rRNAs), messenger RNAs (mRNAs) and mRNA fragments generated during post-transcriptional gene silencing. Viral RNAs can also get uridylated during plant infection. We describe here the evolutionary history of plant TUTases and we summarize the diverse molecular functions of uridylation during RNA degradation processes in plants. We also outline key points of future research. This article is part of the theme issue '5' and 3' modifications controlling RNA degradation'.
Affiliation(s)
- Caroline de Almeida
  - Institut de biologie moléculaire des plantes (IBMP), Centre national de la recherche scientifique (CNRS), Université de Strasbourg, 12 rue Zimmer, 67000 Strasbourg, France
- Hélène Scheer
  - Institut de biologie moléculaire des plantes (IBMP), Centre national de la recherche scientifique (CNRS), Université de Strasbourg, 12 rue Zimmer, 67000 Strasbourg, France
- Anthony Gobert
  - Institut de biologie moléculaire des plantes (IBMP), Centre national de la recherche scientifique (CNRS), Université de Strasbourg, 12 rue Zimmer, 67000 Strasbourg, France
- Veronica Fileccia
  - Dipartimento di Scienze Agrarie Alimentari Forestali, Università degli Studi di Palermo, viale delle scienze ed. 4, Palermo 90128, Italy
- Federico Martinelli
  - Dipartimento di Scienze Agrarie Alimentari Forestali, Università degli Studi di Palermo, viale delle scienze ed. 4, Palermo 90128, Italy
- Hélène Zuber
  - Institut de biologie moléculaire des plantes (IBMP), Centre national de la recherche scientifique (CNRS), Université de Strasbourg, 12 rue Zimmer, 67000 Strasbourg, France
- Dominique Gagliardi
  - Institut de biologie moléculaire des plantes (IBMP), Centre national de la recherche scientifique (CNRS), Université de Strasbourg, 12 rue Zimmer, 67000 Strasbourg, France
|
12
|
John GP, Henry C, Sack L. Leaf rehydration capacity: Associations with other indices of drought tolerance and environment. Plant, Cell & Environment 2018; 41:2638-2653. [PMID: 29978483 DOI: 10.1111/pce.13390] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 12/28/2017] [Accepted: 06/19/2018] [Indexed: 06/08/2023]
Abstract
Clarifying the mechanisms of leaf and whole plant drought responses is critical to predict the impacts of ongoing climate change. The loss of rehydration capacity has been used for decades as a metric of leaf dehydration tolerance but has not been compared with other aspects of drought tolerance. We refined methods for quantifying the percent loss of rehydration capacity (PLRC), and for 18 Southern California woody species, we determined the relative water content and leaf water potential at PLRC of 10%, 25%, and 50%, and, additionally, the PLRC at important stages of dehydration including stomatal closure and turgor loss. On average, PLRC of 10% occurred below turgor loss point and at similar water status to 80% decline of stomatal conductance. As hypothesized, the sensitivity to loss of leaf rehydration capacity varied across species, leaf habits, and ecosystems and correlated with other drought tolerance traits, including the turgor loss point and structural traits including leaf mass per area. A new database of PLRC for 89 species from the global literature indicated greater leaf rehydration capacity in ecosystems with lower growing season moisture availability, indicating an adaptive role of leaf cell dehydration tolerance within the complex of drought tolerance traits.
Affiliation(s)
- Grace P John
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California, USA
- Department of Integrative Biology, University of Texas at Austin, Austin, Texas, USA
| | - Christian Henry
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California, USA
| | - Lawren Sack
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California, USA
| |
|
13
|
ASP Applications in Bio-informatics: A Short Tour. KÜNSTLICHE INTELLIGENZ 2018. [DOI: 10.1007/s13218-018-0551-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
14
|
Eiserhardt WL, Antonelli A, Bennett DJ, Botigué LR, Burleigh JG, Dodsworth S, Enquist BJ, Forest F, Kim JT, Kozlov AM, Leitch IJ, Maitner BS, Mirarab S, Piel WH, Pérez-Escobar OA, Pokorny L, Rahbek C, Sandel B, Smith SA, Stamatakis A, Vos RA, Warnow T, Baker WJ. A roadmap for global synthesis of the plant tree of life. AMERICAN JOURNAL OF BOTANY 2018; 105:614-622. [PMID: 29603138 DOI: 10.1002/ajb2.1041] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 10/13/2017] [Accepted: 11/08/2017] [Indexed: 06/08/2023]
Abstract
Providing science and society with an integrated, up-to-date, high quality, open, reproducible and sustainable plant tree of life would be a huge service that is now coming within reach. However, synthesizing the growing body of DNA sequence data in the public domain and disseminating the trees to a diverse audience are often not straightforward due to numerous informatics barriers. While big synthetic plant phylogenies are being built, they remain static and become quickly outdated as new data are published and tree-building methods improve. Moreover, the body of existing phylogenetic evidence is hard to navigate and access for non-experts. We propose that our community of botanists, tree builders, and informaticians should converge on a modular framework for data integration and phylogenetic analysis, allowing easy collaboration, updating, data sourcing and flexible analyses. With support from major institutions, this pipeline should be re-run at regular intervals, storing trees and their metadata long-term. Providing the trees to a diverse global audience through user-friendly front ends and application development interfaces should also be a priority. Interactive interfaces could be used to solicit user feedback and thus improve data quality and to coordinate the generation of new data. We conclude by outlining a number of steps that we suggest the scientific community should take to achieve global phylogenetic synthesis.
Affiliation(s)
- Wolf L Eiserhardt
- Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
- Department of Bioscience, Aarhus University, Ny Munkegade 116, 8000, Aarhus C, Denmark
| | - Alexandre Antonelli
- Gothenburg Global Biodiversity Centre, Box 461, 405 30, Gothenburg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 405 30, Gothenburg, Sweden
- Gothenburg Botanical Garden, Carl Skottsbergs Gata 22B, SE-413 19, Gothenburg, Sweden
| | - Dominic J Bennett
- Gothenburg Global Biodiversity Centre, Box 461, 405 30, Gothenburg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 405 30, Gothenburg, Sweden
- Gothenburg Botanical Garden, Carl Skottsbergs Gata 22B, SE-413 19, Gothenburg, Sweden
| | | | | | | | - Brian J Enquist
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, 85721, USA
- The Santa Fe Institute, Santa Fe, NM, 87501, USA
| | - Félix Forest
- Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
| | - Jan T Kim
- Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
| | - Alexey M Kozlov
- Scientific Computing Group, Heidelberg Institute for Theoretical Studies, 69118, Heidelberg, Germany
| | - Ilia J Leitch
- Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
| | - Brian S Maitner
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, 85721, USA
| | - Siavash Mirarab
- Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, CA, 92093, USA
| | - William H Piel
- Yale-NUS College, 16 College Avenue West, Singapore, 138527, Republic of Singapore
| | | | - Lisa Pokorny
- Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
| | - Carsten Rahbek
- Center for Macroecology, Evolution and Climate, University of Copenhagen, Universitetsparken 15, DK-2100, Copenhagen O, Denmark
- Imperial College London, Silwood Park, Buckhurst Road, Ascot, Berkshire, SL5 7PY, UK
| | - Brody Sandel
- Department of Biology, Santa Clara University, Santa Clara, CA, 95053, USA
| | - Stephen A Smith
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Alexandros Stamatakis
- Scientific Computing Group, Heidelberg Institute for Theoretical Studies, 69118, Heidelberg, Germany
- Institute for Theoretical Informatics, Karlsruhe Institute of Technology, 76128, Karlsruhe, Germany
| | - Rutger A Vos
- Naturalis Biodiversity Center, P.O. Box 9517, 2300RA, Leiden, The Netherlands
- Institute of Biology Leiden, P.O. Box 9505, 2300RA, Leiden, The Netherlands
| | - Tandy Warnow
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
| | | |
|
15
|
Antonelli A, Hettling H, Condamine FL, Vos K, Nilsson RH, Sanderson MJ, Sauquet H, Scharn R, Silvestro D, Töpel M, Bacon CD, Oxelman B, Vos RA. Toward a Self-Updating Platform for Estimating Rates of Speciation and Migration, Ages, and Relationships of Taxa. Syst Biol 2018; 66:152-166. [PMID: 27616324 PMCID: PMC5410925 DOI: 10.1093/sysbio/syw066] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/30/2015] [Accepted: 07/19/2016] [Indexed: 01/06/2023] Open
Abstract
Rapidly growing biological data—including molecular sequences and fossils—hold an unprecedented potential to reveal how evolutionary processes generate and maintain biodiversity. However, researchers often have to develop their own idiosyncratic workflows to integrate and analyze these data for reconstructing time-calibrated phylogenies. In addition, divergence times estimated under different methods and assumptions, and based on data of various quality and reliability, should not be combined without proper correction. Here we introduce a modular framework termed SUPERSMART (Self-Updating Platform for Estimating Rates of Speciation and Migration, Ages, and Relationships of Taxa), and provide a proof of concept for dealing with the moving targets of evolutionary and biogeographical research. This framework assembles comprehensive data sets of molecular and fossil data for any taxa and infers dated phylogenies using robust species tree methods, also allowing for the inclusion of genomic data produced through next-generation sequencing techniques. We exemplify the application of our method by presenting phylogenetic and dating analyses for the mammal order Primates and for the plant family Arecaceae (palms). We believe that this framework will provide a valuable tool for a wide range of hypothesis-driven research questions in systematics, biogeography, and evolution. SUPERSMART will also accelerate the inference of a “Dated Tree of Life” where all node ages are directly comparable.
Affiliation(s)
- Alexandre Antonelli
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden; Gothenburg Botanical Garden, Carl Skottsbergs Gata 22A, SE-41319 Göteborg, Sweden
| | - Hannes Hettling
- Naturalis Biodiversity Center, Darwinweg 4, 2333 CR Leiden, The Netherlands
| | - Fabien L Condamine
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden; CNRS, UMR 5554 Institut des Sciences de l'Evolution (Université de Montpellier), Place Eugène Bataillon, 34095 Montpellier, France
| | - Karin Vos
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden
| | - R Henrik Nilsson
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden
| | - Michael J Sanderson
- Department of Ecology and Evolutionary Biology, University of Arizona, 1041 E. Lowell, Tucson, AZ 85721, USA
| | - Hervé Sauquet
- Université Paris-Sud, Laboratoire Écologie, Systématique, Évolution, CNRS UMR 8079, 91405 Orsay, France
| | - Ruud Scharn
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden
| | - Daniele Silvestro
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden; Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
| | - Mats Töpel
- Swedish Bioinformatics Infrastructure for Life Sciences, Department of Biological and Environmental Sciences, University of Gothenburg, Box 463, SE-405 30, Göteborg, Sweden; Department of Marine Sciences, University of Gothenburg, Box 460, SE-405 30 Göteborg, Sweden
| | - Christine D Bacon
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden
| | - Bengt Oxelman
- Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden
| | - Rutger A Vos
- Naturalis Biodiversity Center, Darwinweg 4, 2333 CR Leiden, The Netherlands
| |
|
16
|
Boufahja F, Semprucci F, Beyrem H, Bhadury P. Marine Nematode Taxonomy in Africa: Promising Prospects Against Scarcity of Information. J Nematol 2015; 47:198-206. [PMID: 26527841 PMCID: PMC4612190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2015] [Indexed: 06/05/2023] Open
Abstract
From the late 19th century, Africa has faced heavy exploitation of its natural resources with increasing land/water pollution, and several described species have already become extinct or close to extinction. This could also be the case for marine nematodes, which are the most abundant and diverse benthic group in marine sediments, and play major roles in ecosystem functioning. Compared to Europe and North America, only a handful of investigations on marine nematodes have been conducted to date in Africa. This is due to the scarcity of experienced taxonomists and the absence of identification guides and of appropriate local infrastructure. A pivotal project has started recently between nematologists from Africa (Tunisia), India, and Europe (Italy) to promote taxonomic study and biodiversity estimation of marine nematodes on the African continent. To do this, as a first step, a collection of permanent slides of marine nematodes (235 nominal species and 14 new to science but not yet described) was recently established at the Faculty of Sciences of Bizerte (Tunisia). Capacity building for the next generation of African taxonomists has been carried out at the level of both traditional and molecular taxonomy (DNA barcoding and next-generation sequencing [NGS]), but they need to be implemented. Indeed, the integration of these two approaches appears crucial to overcome the lack of information on the taxonomy, ecology, and biodiversity of marine nematodes from African coastal waters.
Affiliation(s)
- Fehmi Boufahja
- Laboratory of Biomonitoring of the Environment, Coastal Ecology and Ecotoxicology Unit, Carthage University, Faculty of Sciences of Bizerte, Zarzouna 7021, Tunisia
| | - Federica Semprucci
- Dipartimento di Scienze della Terra, della Vita e dell'Ambiente (DiSTeVA), Università di Urbino, Località Crocicchia, 61029 Urbino, Italy
| | - Hamouda Beyrem
- Laboratory of Biomonitoring of the Environment, Coastal Ecology and Ecotoxicology Unit, Carthage University, Faculty of Sciences of Bizerte, Zarzouna 7021, Tunisia
| | - Punyasloke Bhadury
- Integrative Taxonomy and Microbial Ecology Research Group, Department of Biological Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur 741246, Nadia, West Bengal, India
| |
|
17
|
Lind EM, Vincent JB, Weiblen GD, Cavender-Bares J, Borer ET. Trophic phylogenetics: evolutionary influences on body size, feeding, and species associations in grassland arthropods. Ecology 2015; 96:998-1009. [PMID: 26230020 DOI: 10.1890/14-0784.1] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]
Abstract
Contemporary animal-plant interactions such as herbivory are widely understood to be shaped by evolutionary history. Yet questions remain about the role of plant phylogenetic diversity in generating and maintaining herbivore diversity, and whether evolutionary relatedness of producers might predict the composition of consumer communities. We tested for evidence of evolutionary associations among arthropods and the plants on which they were found, using phylogenetic analysis of naturally occurring arthropod assemblages sampled from a plant-diversity manipulation experiment. Considering phylogenetic relationships among more than 900 arthropod consumer taxa and 29 plant species in the experiment, we addressed several interrelated questions. First, our results support the hypothesis that arthropod functional traits such as body size and trophic role are phylogenetically conserved in community ecological samples. Second, herbivores tended to co-occur with closer phylogenetic relatives than would be expected at random, whereas predators and parasitoids did not show phylogenetic association patterns. Consumer specialization, as measured by association through time with monocultures of particular host plant species, showed significant phylogenetic signal, although the strength of this association varied among plant species. Polycultures of phylogenetically dissimilar plant species supported more phylogenetically dissimilar consumer communities than did phylogenetically similar polycultures. Finally, we separated the effects of plant species richness and relatedness in predicting the phylogenetic distribution of the arthropod assemblages in this experiment. The phylogenetic diversity of plant communities predicted the phylogenetic diversity of herbivore communities even after accounting for plant species richness.
The phylogenetic diversity of secondary consumers differed by guild, with predator phylogenetic diversity responding to herbivore relatedness, while parasitoid phylogenetic diversity was driven by plant relatedness. Evolutionary associations between plants and their consumers are apparent in plots only meters apart in a single field, indicating a strong role for host-plant phylogenetic diversity in sustaining landscape consumer biodiversity.
|
18
|
Little SA, Green WA, Wing SL, Wilf P. Reinvestigation of Leaf Rank, an Underappreciated Component of Leo Hickey's Legacy. BULLETIN OF THE PEABODY MUSEUM OF NATURAL HISTORY 2014. [DOI: 10.3374/014.055.0202] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022]
Affiliation(s)
| | - Walton A. Green
- Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge MA 02138 USA
| | - Scott L. Wing
- Department of Paleobiology, National Museum of Natural History, Smithsonian Institution, P.O. Box 37012, MRC 121, Washington, DC 20013-7012 USA
| | - Peter Wilf
- Department of Geosciences, Pennsylvania State University, University Park, PA 16802 USA
| |
|
19
|
Vos RA, Biserkov JV, Balech B, Beard N, Blissett M, Brenninkmeijer C, van Dooren T, Eades D, Gosline G, Groom QJ, Hamann TD, Hettling H, Hoehndorf R, Holleman A, Hovenkamp P, Kelbert P, King D, Kirkup D, Lammers Y, DeMeulemeester T, Mietchen D, Miller JA, Mounce R, Nicolson N, Page R, Pawlik A, Pereira S, Penev L, Richards K, Sautter G, Shorthouse DP, Tähtinen M, Weiland C, Williams AR, Sierra S. Enriched biodiversity data as a resource and service. Biodivers Data J 2014:e1125. [PMID: 25057255 PMCID: PMC4092319 DOI: 10.3897/bdj.2.e1125] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 05/27/2014] [Accepted: 06/11/2014] [Indexed: 11/28/2022] Open
Abstract
Background: Recent years have seen a surge in projects that produce large volumes of structured, machine-readable biodiversity data. To make these data amenable to processing by generic, open source “data enrichment” workflows, they are increasingly being represented in a variety of standards-compliant interchange formats. Here, we report on an initiative in which software developers and taxonomists came together to address the challenges and highlight the opportunities in the enrichment of such biodiversity data by engaging in intensive, collaborative software development: The Biodiversity Data Enrichment Hackathon. Results: The hackathon brought together 37 participants (including developers and taxonomists, i.e. scientific professionals that gather, identify, name and classify species) from 10 countries: Belgium, Bulgaria, Canada, Finland, Germany, Italy, the Netherlands, New Zealand, the UK, and the US. The participants brought expertise in processing structured data, text mining, development of ontologies, digital identification keys, geographic information systems, niche modeling, natural language processing, provenance annotation, semantic integration, taxonomic name resolution, web service interfaces, workflow tools and visualisation. Most use cases and exemplar data were provided by taxonomists. One goal of the meeting was to facilitate re-use and enhancement of biodiversity knowledge by a broad range of stakeholders, such as taxonomists, systematists, ecologists, niche modelers, informaticians and ontologists. The suggested use cases resulted in nine breakout groups addressing three main themes: i) mobilising heritage biodiversity knowledge; ii) formalising and linking concepts; and iii) addressing interoperability between service platforms. 
Another goal was to further foster a community of experts in biodiversity informatics and to build human links between research projects and institutions, in response to recent calls to further such integration in this research domain. Conclusions: Beyond deriving prototype solutions for each use case, areas of inadequacy were discussed and are being pursued further. It was striking how many possible applications for biodiversity data there were and how quickly solutions could be put together when the normal constraints to collaboration were broken down for a week. Conversely, mobilising biodiversity knowledge from their silos in heritage literature and natural history collections will continue to require formalisation of the concepts (and the links between them) that define the research domain, as well as increased interoperability between the software platforms that operate on these concepts.
Affiliation(s)
| | | | - Bachir Balech
- Institute of Biomembranes and Bioenergetics, National Research Council, Bari, Italy
| | - Niall Beard
- University of Manchester, Manchester, United Kingdom
| | | | | | | | - David Eades
- The Illinois Natural History Survey, Champaign, United States of America
| | | | | | | | | | | | | | | | - Patricia Kelbert
- Botanic Garden and Botanical Museum Berlin-Dahlem, Freie Universität Berlin, Berlin, Germany
| | - David King
- The Open University, Milton Keynes, United Kingdom
| | - Don Kirkup
- Royal Botanic Gardens, Kew, United Kingdom
| | | | | | | | | | | | | | - Rod Page
- University Of Glasgow, Glasgow, United Kingdom
| | | | | | | | - Kevin Richards
- Biodiversity Informatics Consultant, Christchurch, New Zealand
| | | | | | | | - Claus Weiland
- Biodiversity and Climate Research Centre, Senckenberg Gesellschaft für Naturforschung, Frankfurt, Germany
| | | | | |
|
20
|
Lammers Y, Peelen T, Vos RA, Gravendeel B. The HTS barcode checker pipeline, a tool for automated detection of illegally traded species from high-throughput sequencing data. BMC Bioinformatics 2014; 15:44. [PMID: 24502833 PMCID: PMC3922334 DOI: 10.1186/1471-2105-15-44] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 11/14/2013] [Accepted: 01/30/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Mixtures of internationally traded organic substances can contain parts of species protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). These mixtures often raise the suspicion of border control and customs offices, which can lead to confiscation, for example in the case of Traditional Chinese medicines (TCMs). High-throughput sequencing of DNA barcoding markers obtained from such samples provides insight into species constituents of mixtures, but manual cross-referencing of results against the CITES appendices is labor intensive. Matching DNA barcodes against NCBI GenBank using BLAST may yield misleading results both as false positives, due to incorrectly annotated sequences, and false negatives, due to spurious taxonomic re-assignment. Incongruence between the taxonomies of CITES and NCBI GenBank can result in erroneous estimates of illegal trade. RESULTS: The HTS barcode checker pipeline is an application for automated processing of sets of 'next generation' barcode sequences to determine whether these contain DNA barcodes obtained from species listed on the CITES appendices. This analytical pipeline builds upon and extends existing open-source applications for BLAST matching against the NCBI GenBank reference database and for taxonomic name reconciliation. In a single operation, reads are converted into taxonomic identifications matched with names on the CITES appendices. By inclusion of a blacklist and additional names databases, the HTS barcode checker pipeline prevents false positives and resolves taxonomic heterogeneity. CONCLUSIONS: The HTS barcode checker pipeline can detect and correctly identify DNA barcodes of CITES-protected species from reads obtained from TCM samples in just a few minutes. The pipeline facilitates and improves molecular monitoring of trade in endangered species, and can aid in safeguarding these species from extinction in the wild.
The HTS barcode checker pipeline is available at https://github.com/naturalis/HTS-barcode-checker.
Affiliation(s)
| | | | | | - Barbara Gravendeel
- Naturalis Biodiversity Center, Darwinweg 4, 2333 CR Leiden, The Netherlands.
| |
|