1
Abstract
Biodiversity research studies the variability and diversity of organisms, including variability within and between species with particular focus on the functional diversity of traits and their relationship to environment. Managing biodiversity data implies dealing with its heterogeneous nature using semantics and tailored ontologies. These are themselves differently conceived, and combining them in semantically enabled applications necessitates an effective alignment between their concepts. This paper describes the ontology matching of biodiversity- and ecology-related ontologies. We illustrate diverse challenges introduced by this kind of ontologies to ontology matching in general. Real use cases requiring pairwise alignments between environment and trait ontologies are introduced. We describe our experience creating a new track at the Ontology Alignment Evaluation Initiative designed for this specific domain and report on the results obtained by state-of-the-art participating systems. The biodiversity and ecology use case turns out to be a strong one for ontology matching, introducing new interesting challenges. Even if most of the matching systems perform relatively well in the proposed matching tasks, there is still room for improvement. We highlight possible directions in that matter and elaborate on our plan to further progress with the track.
2
Vos RA, Katayama T, Mishima H, Kawano S, Kawashima S, Kim JD, Moriya Y, Tokimatsu T, Yamaguchi A, Yamamoto Y, Wu H, Amstutz P, Antezana E, Aoki NP, Arakawa K, Bolleman JT, Bolton E, Bonnal RJP, Bono H, Burger K, Chiba H, Cohen KB, Deutsch EW, Fernández-Breis JT, Fu G, Fujisawa T, Fukushima A, García A, Goto N, Groza T, Hercus C, Hoehndorf R, Itaya K, Juty N, Kawashima T, Kim JH, Kinjo AR, Kotera M, Kozaki K, Kumagai S, Kushida T, Lütteke T, Matsubara M, Miyamoto J, Mohsen A, Mori H, Naito Y, Nakazato T, Nguyen-Xuan J, Nishida K, Nishida N, Nishide H, Ogishima S, Ohta T, Okuda S, Paten B, Perret JL, Prathipati P, Prins P, Queralt-Rosinach N, Shinmachi D, Suzuki S, Tabata T, Takatsuki T, Taylor K, Thompson M, Uchiyama I, Vieira B, Wei CH, Wilkinson M, Yamada I, Yamanaka R, Yoshitake K, Yoshizawa AC, Dumontier M, Kosaki K, Takagi T. BioHackathon 2015: Semantics of data for life sciences and reproducible research. F1000Res 2020; 9:136. [PMID: 32308977 PMCID: PMC7141167 DOI: 10.12688/f1000research.18236.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Accepted: 02/05/2020] [Indexed: 01/08/2023] Open
Abstract
We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress to address ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.
Affiliation(s)
- Rutger A. Vos
- Institute of Biology Leiden, Leiden University, Leiden, The Netherlands
- Naturalis Biodiversity Center, Leiden, The Netherlands
- Hiroyuki Mishima
- Department of Human Genetics, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Shin Kawano
- Database Center for Life Science, Tokyo, Japan
- Yuki Moriya
- Database Center for Life Science, Tokyo, Japan
- Hongyan Wu
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Erick Antezana
- Department of Biology, Norwegian University of Science and Technology, Trondheim, Norway
- Nobuyuki P. Aoki
- Faculty of Science and Engineering, SOKA University, Tokyo, Japan
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Tokyo, Japan
- Jerven T. Bolleman
- SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Lausanne, Switzerland
- Evan Bolton
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
- Raoul J. P. Bonnal
- Istituto Nazionale Genetica Molecolare, Romeo ed Enrica Invernizzi, Milan, Italy
- Kees Burger
- Dutch Techcentre for Life Sciences, Utrecht, The Netherlands
- Hirokazu Chiba
- National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan
- Kevin B. Cohen
- Computational Bioscience Program, University of Colorado School of Medicine, Denver, USA
- Université Paris-Saclay, LIMSI, CNRS, Paris, France
- Gang Fu
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
- Naohisa Goto
- Research Institute for Microbial Diseases, Osaka University, Osaka, Japan
- Tudor Groza
- St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, Darlinghurst, Australia
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, Australia
- Colin Hercus
- Novocraft Technologies Sdn. Bhd., Selangor, Malaysia
- Robert Hoehndorf
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Kotone Itaya
- Institute for Advanced Biosciences, Keio University, Tokyo, Japan
- Nick Juty
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Jee-Hyub Kim
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Akira R. Kinjo
- Institute for Protein Research, Osaka University, Osaka, Japan
- Masaaki Kotera
- School of Life Science and Technology, Tokyo Institute of Technology, Tokyo, Japan
- Kouji Kozaki
- The Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan
- Tatsuya Kushida
- National Bioscience Database Center, Japan Science and Technology Agency, Tokyo, Japan
- Thomas Lütteke
- Institute of Veterinary Physiology and Biochemistry, Justus-Liebig University Giessen, Giessen, Germany
- Gesellschaft für innovative Personalwirtschaftssysteme mbH (GIP GmbH), Offenbach, Germany
- Attayeb Mohsen
- National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan
- Hiroshi Mori
- Center for Information Biology, National Institute of Genetics, Mishima, Japan
- Yuki Naito
- Database Center for Life Science, Tokyo, Japan
- Naoki Nishida
- Department of Systems Science, Osaka University, Osaka, Japan
- Hiroyo Nishide
- National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan
- Soichi Ogishima
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan
- Tazro Ohta
- Database Center for Life Science, Tokyo, Japan
- Shujiro Okuda
- Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, USA
- Philip Prathipati
- National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan
- Pjotr Prins
- University Medical Center Utrecht, Utrecht, The Netherlands
- University of Tennessee Health Science Center, Memphis, USA
- Núria Queralt-Rosinach
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Shinya Suzuki
- School of Life Science and Technology, Tokyo Institute of Technology, Tokyo, Japan
- Tsuyosi Tabata
- Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
- Kieron Taylor
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Mark Thompson
- Leiden University Medical Center, Leiden, The Netherlands
- Ikuo Uchiyama
- National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan
- Bruno Vieira
- WurmLab, School of Biological & Chemical Sciences, Queen Mary University of London, London, UK
- Chih-Hsuan Wei
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
- Mark Wilkinson
- Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid, Madrid, Spain
- Kazutoshi Yoshitake
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
- Michel Dumontier
- Institute of Data Science, Maastricht University, Maastricht, The Netherlands
- Kenjiro Kosaki
- Center for Medical Genetics, Keio University School of Medicine, Tokyo, Japan
- Toshihisa Takagi
- National Bioscience Database Center, Japan Science and Technology Agency, Tokyo, Japan
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
3
Gkoutos GV, Schofield PN, Hoehndorf R. The anatomy of phenotype ontologies: principles, properties and applications. Brief Bioinform 2018; 19:1008-1021. [PMID: 28387809 PMCID: PMC6169674 DOI: 10.1093/bib/bbx035] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 01/01/2017] [Revised: 02/05/2017] [Indexed: 12/14/2022] Open
Abstract
The past decade has seen an explosion in the collection of genotype data in domains as diverse as medicine, ecology, livestock and plant breeding. Along with this comes the challenge of dealing with the related phenotype data, which is not only large but also highly multidimensional. Computational analysis of phenotypes has therefore become critical for our ability to understand the biological meaning of genomic data in the biological sciences. At the heart of computational phenotype analysis are the phenotype ontologies. A large number of these ontologies have been developed across many domains, and we are now at a point where the knowledge captured in the structure of these ontologies can be used for the integration and analysis of large interrelated data sets. The Phenotype And Trait Ontology framework provides a method for formal definitions of phenotypes and associated data sets and has proved to be key to our ability to develop methods for the integration and analysis of phenotype data. Here, we describe the development and products of the ontological approach to phenotype capture, the formal content of phenotype ontologies and how their content can be used computationally.
Affiliation(s)
- Robert Hoehndorf
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal
4
Römbke J, Bernard J, Martin-Laurent F. Standard methods for the assessment of structural and functional diversity of soil organisms: A review. Integr Environ Assess Manag 2018; 14:463-479. [PMID: 29603577 DOI: 10.1002/ieam.4046] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/20/2017] [Revised: 11/14/2017] [Accepted: 03/28/2018] [Indexed: 06/08/2023]
Abstract
The lack of standardized methods to study soil organisms prevents comparisons across data sets and the development of new global and regional experiments and assessments. Moreover, standardized methods are needed to evaluate the impact of anthropogenic stressors, such as chemicals, on soil organism communities in the regulatory context. The goal of this contribution is to summarize current methodological approaches to measure structural and functional diversity of soil organisms, and to identify gaps and methodological improvements so as to cross data sets generated worldwide. This is urgently needed because several currently ongoing regional and global soil biodiversity studies are not coordinated with one another in terms of methodology, including database development. Therefore, we evaluated the standard methods to sample, identify, determine, and assess soil organisms currently applied or proposed, using well-accepted criteria such as ecological relevance; practicability of usage in terms of resources, time, and costs; and the level of standardization. Methods addressing both the structure and the functions of soil organisms (populations or communities) are included, with a special focus on new molecular methods based on nucleic acid extraction and further analyses by polymerase chain reaction (PCR)-based approaches for microorganisms and invertebrates. We particularly highlight the activities of the Technical Committee (TC) 190 of the International Organization for Standardization (ISO) because ISO guidelines are legally accredited by many national or international authorities when they put conservation laws and regulations into practice. Finally, we propose detailed recommendations regarding gaps in the available set of standards, in order to identify a list of new methods to be standardized. 
We propose to organize this whole process under the Global Soil Biodiversity Initiative (GSBI) in order to ensure a truly global approach for the assessment of soil biodiversity. Integr Environ Assess Manag 2018;14:463-479. © 2018 SETAC.
Affiliation(s)
- Jörg Römbke
- ECT Oekotoxikologie GmbH, Flörsheim, Germany
5
Haque MM, Nipperess DA, Gallagher RV, Beaumont LJ. How well documented is Australia's flora? Understanding spatial bias in vouchered plant specimens. Austral Ecol 2017. [DOI: 10.1111/aec.12487] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Affiliation(s)
- Md. Mohasinul Haque
- Department of Biological Sciences; Macquarie University; Sydney New South Wales 2109 Australia
- David A. Nipperess
- Department of Biological Sciences; Macquarie University; Sydney New South Wales 2109 Australia
- Rachael V. Gallagher
- Department of Biological Sciences; Macquarie University; Sydney New South Wales 2109 Australia
- Linda J. Beaumont
- Department of Biological Sciences; Macquarie University; Sydney New South Wales 2109 Australia
6
Hausen J, Scholz-Starke B, Burkhardt U, Lesch S, Rick S, Russell D, Roß-Nickoll M, Ottermanns R. Edaphostat: interactive ecological analysis of soil organism occurrences and preferences from the Edaphobase data warehouse. Database: The Journal of Biological Databases and Curation 2017; 2017:4564813. [PMID: 29220469 PMCID: PMC5737075 DOI: 10.1093/database/bax080] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/02/2017] [Accepted: 10/06/2017] [Indexed: 11/21/2022]
Abstract
The Edaphostat web application allows interactive and dynamic analyses of soil organism data stored in the Edaphobase data warehouse. It is part of the Edaphobase web application and can be accessed by any modern browser. The tool combines data from different sources (publications, field studies and museum collections) and allows species preferences along various environmental gradients (i.e. C/N ratio and pH) and classification systems (habitat type and soil type) to be analyzed. Database URL: Edaphostat is part of the Edaphobase Web Application available at https://portal.edaphobase.org
Affiliation(s)
- Jonas Hausen
- Institute for Environmental Research, RWTH Aachen University, Worringerweg 1, 52074 Aachen, Germany
- Björn Scholz-Starke
- Institute for Environmental Research, RWTH Aachen University, Worringerweg 1, 52074 Aachen, Germany
- Ulrich Burkhardt
- Senckenberg Museum of Natural History Görlitz, P.O. Box 300154, 02806 Görlitz, Germany
- Stephan Lesch
- Senckenberg Museum of Natural History Görlitz, P.O. Box 300154, 02806 Görlitz, Germany
- Sebastian Rick
- Senckenberg Museum of Natural History Görlitz, P.O. Box 300154, 02806 Görlitz, Germany
- David Russell
- Senckenberg Museum of Natural History Görlitz, P.O. Box 300154, 02806 Görlitz, Germany
- Martina Roß-Nickoll
- Institute for Environmental Research, RWTH Aachen University, Worringerweg 1, 52074 Aachen, Germany
- Richard Ottermanns
- Institute for Environmental Research, RWTH Aachen University, Worringerweg 1, 52074 Aachen, Germany
7
Guralnick RP, Zermoglio PF, Wieczorek J, LaFrance R, Bloom D, Russell L. The importance of digitized biocollections as a source of trait data and a new VertNet resource. Database: The Journal of Biological Databases and Curation 2016; 2016:baw158. [PMID: 28025346 PMCID: PMC5199146 DOI: 10.1093/database/baw158] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 06/18/2016] [Revised: 11/06/2016] [Accepted: 11/06/2016] [Indexed: 02/02/2023]
Abstract
For vast areas of the globe and large parts of the tree of life, data needed to inform trait diversity is incomplete. Such trait data, when fully assembled, however, form the link between the evolutionary history of organisms, their assembly into communities, and the nature and functioning of ecosystems. Recent efforts to close data gaps have focused on collating trait-by-species databases, which only provide species-level, aggregated value ranges for traits of interest and often lack the direct observations on which those ranges are based. Perhaps under-appreciated is that digitized biocollection records collectively contain a vast trove of trait data measured directly from individuals, but this content remains hidden and highly heterogeneous, impeding discoverability and use. We developed and deployed a suite of openly accessible software tools in order to collate a full set of trait descriptions and extract two key traits, body length and mass, from >18 million specimen records in VertNet, a global biodiversity data publisher and aggregator. We tested success rate of these tools against hand-checked validation data sets and characterized quality and quantity. A post-processing toolkit was developed to standardize and harmonize data sets, and to integrate this improved content into VertNet for broadest reuse. The result of this work was to add more than 1.5 million harmonized measurements on vertebrate body mass and length directly to specimen records. Rates of false positives and negatives for extracted data were extremely low. We also created new tools for filtering, querying, and assembling this research-ready vertebrate trait content for view and download. Our work has yielded a novel database and platform for harmonized trait content that will grow as tools introduced here become part of publication workflows. We close by noting how this effort extends to new communities already developing similar digitized content. 
Database URL: http://portal.vertnet.org/search?advanced=1
Affiliation(s)
- Robert P Guralnick
- University of Florida Museum of Natural History, University of Florida at Gainesville, Gainesville, FL, USA
- Paula F Zermoglio
- Departamento de Ecología, Genética y Evolución, Instituto IEGEBA (CONICET-UBA), Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261 CNRS, Université François Rabelais, Tours, France
- John Wieczorek
- Museum of Vertebrate Zoology, University of California, Berkeley, CA, USA
- Raphael LaFrance
- University of Florida Museum of Natural History, University of Florida at Gainesville, Gainesville, FL, USA
- David Bloom
- University of Florida Museum of Natural History, University of Florida at Gainesville, Gainesville, FL, USA
- Laura Russell
- University of Florida Museum of Natural History, University of Florida at Gainesville, Gainesville, FL, USA
- Biodiversity Institute, University of Kansas, Lawrence, KS, USA
8
Hoehndorf R, Alshahrani M, Gkoutos GV, Gosline G, Groom Q, Hamann T, Kattge J, de Oliveira SM, Schmidt M, Sierra S, Smets E, Vos RA, Weiland C. The flora phenotype ontology (FLOPO): tool for integrating morphological traits and phenotypes of vascular plants. J Biomed Semantics 2016; 7:65. [PMID: 27842607 PMCID: PMC5109718 DOI: 10.1186/s13326-016-0107-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 12/02/2015] [Accepted: 11/01/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The systematic analysis of a large number of comparable plant trait data can support investigations into phylogenetics and ecological adaptation, with broad applications in evolutionary biology, agriculture, conservation, and the functioning of ecosystems. Floras, i.e., books collecting the information on all known plant species found within a region, are a potentially rich source of such plant trait data. Floras describe plant traits with a focus on morphology and other traits relevant for species identification in addition to other characteristics of plant species, such as ecological affinities, distribution, economic value, health applications, traditional uses, and so on. However, a key limitation in systematically analyzing information in Floras is the lack of a standardized vocabulary for the described traits as well as the difficulties in extracting structured information from free text.
RESULTS We have developed the Flora Phenotype Ontology (FLOPO), an ontology for describing traits of plant species found in Floras. We used the Plant Ontology (PO) and the Phenotype And Trait Ontology (PATO) to extract entity-quality relationships from digitized taxon descriptions in Floras, and used a formal ontological approach based on phenotype description patterns and automated reasoning to generate the FLOPO. The resulting ontology consists of 25,407 classes and is based on the PO and PATO. The classified ontology closely follows the structure of Plant Ontology in that the primary axis of classification is the observed plant anatomical structure, and more specific traits are then classified based on parthood and subclass relations between anatomical structures as well as subclass relations between phenotypic qualities.
CONCLUSIONS The FLOPO is primarily intended as a framework based on which plant traits can be integrated computationally across all species and higher taxa of flowering plants. Importantly, it is not intended to replace established vocabularies or ontologies, but rather serve as an overarching framework based on which different application- and domain-specific ontologies, thesauri and vocabularies of phenotypes observed in flowering plants can be integrated.
Affiliation(s)
- Robert Hoehndorf
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955–6900 Kingdom of Saudi Arabia
- Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955–6900 Kingdom of Saudi Arabia
- Mona Alshahrani
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955–6900 Kingdom of Saudi Arabia
- Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955–6900 Kingdom of Saudi Arabia
- Georgios V. Gkoutos
- College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, B15 2TT United Kingdom
- Institute of Translational Medicine, University Hospitals Birmingham, NHS Foundation Trust, Birmingham, B15 2TT United Kingdom
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, SY23 2AX United Kingdom
- George Gosline
- Royal Botanical Gardens, Kew, Richmond, Surrey, TW9 3AB United Kingdom
- Quentin Groom
- Botanic Garden Meise, Nieuwelaan 38, Meise, 1860 Belgium
- Thomas Hamann
- Naturalis Biodiversity Center, P.O. Box 9517, Leiden, 2300 RA The Netherlands
- Jens Kattge
- Max Planck Institute for Biogeochemistry, Hans Knoell Str. 10, Jena, 07745 Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, Leipzig, 04103 Germany
- Marco Schmidt
- Senckenberg Biodiversity and Climate Research Centre (BiK-F), Senckenberganlage 25, Frankfurt am Main, 60325 Germany
- Soraya Sierra
- Naturalis Biodiversity Center, P.O. Box 9517, Leiden, 2300 RA The Netherlands
- Erik Smets
- Naturalis Biodiversity Center, P.O. Box 9517, Leiden, 2300 RA The Netherlands
- Rutger A. Vos
- Naturalis Biodiversity Center, P.O. Box 9517, Leiden, 2300 RA The Netherlands
- Claus Weiland
- Senckenberg Biodiversity and Climate Research Centre (BiK-F), Senckenberganlage 25, Frankfurt am Main, 60325 Germany
9
Benson EE, Harding K, Mackenzie-Dodds J. A new quality management perspective for biodiversity conservation and research: Investigating Biospecimen Reporting for Improved Study Quality (BRISQ) and the Standard PRE-analytical Code (SPREC) using Natural History Museum and culture collections as case studies. Syst Biodivers 2016. [DOI: 10.1080/14772000.2016.1201167] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/21/2022]
Affiliation(s)
- Erica E. Benson
- Damar Research Scientists, Damar, Drum Road, Cuparmuir, Fife, Scotland KY15 5RJ, UK
- Keith Harding
- Damar Research Scientists, Damar, Drum Road, Cuparmuir, Fife, Scotland KY15 5RJ, UK
- Jacqueline Mackenzie-Dodds
- Molecular Collections, Department of Life Sciences, Natural History Museum, Cromwell Road, London SW7 5BD, UK
10
Thessen AE, Bunker DE, Buttigieg PL, Cooper LD, Dahdul WM, Domisch S, Franz NM, Jaiswal P, Lawrence-Dill CJ, Midford PE, Mungall CJ, Ramírez MJ, Specht CD, Vogt L, Vos RA, Walls RL, White JW, Zhang G, Deans AR, Huala E, Lewis SE, Mabee PM. Emerging semantics to link phenotype and environment. PeerJ 2015; 3:e1470. [PMID: 26713234 PMCID: PMC4690371 DOI: 10.7717/peerj.1470] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 07/15/2015] [Accepted: 11/12/2015] [Indexed: 11/20/2022] Open
Abstract
Understanding the interplay between environmental conditions and phenotypes is a fundamental goal of biology. Unfortunately, data that include observations on phenotype and environment are highly heterogeneous and thus difficult to find and integrate. One approach that is likely to improve the status quo involves the use of ontologies to standardize and link data about phenotypes and environments. Specifying and linking data through ontologies will allow researchers to increase the scope and flexibility of large-scale analyses aided by modern computing methods. Investments in this area would advance diverse fields such as ecology, phylogenetics, and conservation biology. While several biological ontologies are well-developed, using them to link phenotypes and environments is rare because of gaps in ontological coverage and limits to interoperability among ontologies and disciplines. In this manuscript, we present (1) use cases from diverse disciplines to illustrate questions that could be answered more efficiently using a robust linkage between phenotypes and environments, (2) two proof-of-concept analyses that show the value of linking phenotypes to environments in fishes and amphibians, and (3) two proposed example data models for linking phenotypes and environments using the extensible observation ontology (OBOE) and the Biological Collections Ontology (BCO); these provide a starting point for the development of a data model linking phenotypes and environments.
Affiliation(s)
- Anne E. Thessen
- Ronin Institute for Independent Scholarship, Montclair, NJ, United States
- The Data Detektiv, Waltham, MA, United States
- Daniel E. Bunker
- Department of Biological Sciences, New Jersey Institute of Technology, Newark, NJ, United States
- Pier Luigi Buttigieg
- HGF-MPG Group for Deep Sea Ecology and Technology, Alfred-Wegener-Institut, Helmholtz-Zentrum für Polar-und Meeresforschung, Bremerhaven, Germany
- Laurel D. Cooper
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States
- Wasila M. Dahdul
- Department of Biology, University of South Dakota, Vermillion, SD, United States
- Sami Domisch
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, United States
- Nico M. Franz
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States
- Carolyn J. Lawrence-Dill
- Departments of Genetics, Development and Cell Biology and Agronomy, Iowa State University, Ames, IA, United States
- Martín J. Ramírez
- Division of Arachnology, Museo Argentino de Ciencias Naturales–CONICET, Buenos Aires, Argentina
- Chelsea D. Specht
- Departments of Plant and Microbial Biology & Integrative Biology, University of California, Berkeley, CA, United States
- Lars Vogt
- Institut für Evolutionsbiologie und Ökologie, Universität Bonn, Bonn, Germany
- Ramona L. Walls
- iPlant Collaborative, University of Arizona, Tucson, AZ, United States
- Jeffrey W. White
- US Arid Land Agricultural Research Center, United States Department of Agriculture—ARS, Maricopa, AZ, United States
- Guanyang Zhang
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Andrew R. Deans
- Department of Entomology, Pennsylvania State University, University Park, PA, United States
- Eva Huala
- Phoenix Bioinformatics, Redwood City, CA, United States
- Suzanna E. Lewis
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Paula M. Mabee
- Department of Biology, University of South Dakota, Vermillion, SD, United States
11
Hoehndorf R, Schofield PN, Gkoutos GV. The role of ontologies in biological and biomedical research: a functional perspective. Brief Bioinform 2015; 16:1069-80. [PMID: 25863278 PMCID: PMC4652617 DOI: 10.1093/bib/bbv011] [Citation(s) in RCA: 119] [Impact Index Per Article: 13.2] [Received: 11/26/2014] [Revised: 01/20/2015] [Indexed: 12/19/2022] Open
Abstract
Ontologies are widely used in biological and biomedical research. Their success lies in their combination of four main features present in almost all ontologies: provision of standard identifiers for classes and relations that represent the phenomena within a domain; provision of a vocabulary for a domain; provision of metadata that describes the intended meaning of the classes and relations in ontologies; and the provision of machine-readable axioms and definitions that enable computational access to some aspects of the meaning of classes and relations. While each of these features enables applications that facilitate data integration, data access and analysis, a great potential lies in the possibility of combining these four features to support integrative analysis and interpretation of multimodal data. Here, we provide a functional perspective on ontologies in biology and biomedicine, focusing on what ontologies can do and describing how they can be used in support of integrative research. We also outline perspectives for using ontologies in data-driven science, in particular their application in structured data mining and machine learning applications.
|
12
|
Miller JA, Agosti D, Penev L, Sautter G, Georgiev T, Catapano T, Patterson D, King D, Pereira S, Vos RA, Sierra S. Integrating and visualizing primary data from prospective and legacy taxonomic literature. Biodivers Data J 2015; 3:e5063. [PMID: 26023286 PMCID: PMC4442254 DOI: 10.3897/bdj.3.e5063] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 04/09/2015] [Accepted: 05/06/2015] [Indexed: 11/24/2022] Open
Abstract
Specimen data in taxonomic literature are among the highest quality primary biodiversity data. Innovative cybertaxonomic journals are using workflows that maintain data structure and disseminate electronic content to aggregators and other users; such structure is lost in traditional taxonomic publishing. Legacy taxonomic literature is a vast repository of knowledge about biodiversity. Currently, access to that resource is cumbersome, especially for non-specialist data consumers. Markup is a mechanism that makes this content more accessible, and is especially suited to machine analysis. Fine-grained XML (Extensible Markup Language) markup was applied to all (37) open-access articles published in the journal Zootaxa containing treatments on spiders (Order: Araneae). The markup approach was optimized to extract primary specimen data from legacy publications. These data were combined with data from articles containing treatments on spiders published in Biodiversity Data Journal where XML structure is part of the routine publication process. A series of charts was developed to visualize the content of specimen data in XML-tagged taxonomic treatments, either singly or in aggregate. The data can be filtered by several fields (including journal, taxon, institutional collection, collecting country, collector, author, article and treatment) to query particular aspects of the data. We demonstrate here that XML markup using GoldenGATE can address the challenge presented by unstructured legacy data, can extract structured primary biodiversity data which can be aggregated with and jointly queried with data from other Darwin Core-compatible sources, and show how visualization of these data can communicate key information contained in biodiversity literature. We complement recent studies on aspects of biodiversity knowledge using XML structured data to explore 1) the time lag between species discovery and description, and 2) the prevalence of rarity in species descriptions.
Affiliation(s)
- Jeremy A. Miller
- Naturalis Biodiversity Center, Leiden, Netherlands
- www.Plazi.org, Bern, Switzerland
- David King
- The Open University, Milton Keynes, United Kingdom
|
13
|
Paprocki H, França D. Brazilian Trichoptera Checklist II. Biodivers Data J 2014:e1557. [PMID: 25349524 PMCID: PMC4206778 DOI: 10.3897/bdj.2.e1557] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 08/20/2014] [Accepted: 10/02/2014] [Indexed: 11/17/2022] Open
Abstract
A second assessment of Brazilian Trichoptera species records is presented here. A total of 625 species were recorded for Brazil. This represents an increase of 65.34% in the number of species recorded during the last decade. The Hydropsychidae (124 spp.), followed by the Hydroptilidae (102 spp.) and Polycentropodidae (97 spp.), are the families with the greatest richness recorded for Brazil. Knowledge of Trichoptera biodiversity in Brazil is geographically uneven. The majority of the species are recorded for the southeastern region.
Affiliation(s)
- Henrique Paprocki
- Pontifícia Universidade Católica de Minas Gerais, Museu de Ciências Naturais, Coleção de Invertebrados. Av. Dom José Gaspar, 290, sala 104, Coração Eucarístico, CEP 30535-901, Belo Horizonte, Minas Gerais, Brazil
- Diogo França
- Pontifícia Universidade Católica de Minas Gerais, Museu de Ciências Naturais, Coleção de Invertebrados. Av. Dom José Gaspar, 290, sala 104, Coração Eucarístico, CEP 30535-901, Belo Horizonte, Minas Gerais, Brazil
|