1. Rivera-Quiroz FA, Petcharad B, Miller JA. Mining data from legacy taxonomic literature and application for sampling spiders of the Teutamus group (Araneae; Liocranidae) in Southeast Asia. Sci Rep 2020; 10:15787. [PMID: 32978432 PMCID: PMC7519673 DOI: 10.1038/s41598-020-72549-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/18/2020] [Accepted: 09/02/2020] [Indexed: 11/12/2022] Open
Abstract
Taxonomic literature contains information about virtually every known species on Earth. In many cases, all that is known about a taxon is contained in this kind of literature, particularly for the most diverse and understudied groups. Taxonomic publications in the aggregate have documented a vast amount of specimen data. Among other things, these data constitute evidence of the existence of a particular taxon within a spatial and temporal context. When knowledge about a particular taxonomic group is rudimentary, investigators motivated to contribute new knowledge can use legacy records to guide them in their search for new specimens in the field. However, these legacy data are in the form of unstructured text, making it difficult to extract and analyze without a human interpreter. Here, we used a combination of semi-automatic tools to extract and categorize specimen data from taxonomic literature of one family of ground spiders (Liocranidae). We tested the application of these data on fieldwork optimization, using the relative abundance of adult specimens reported in literature as a proxy to find the best times and places for collecting the species (Teutamus politus) and its relatives (Teutamus group, TG) within Southeast Asia. Based on these analyses we decided to collect in three provinces in Thailand during the months of June and August. With our approach, we were able to collect more specimens of T. politus (188 specimens, 95 adults) than all the previous records in literature combined (102 specimens). Our approach was also effective for sampling other representatives of the TG, yielding at least one representative of every TG genus previously reported for Thailand. In total, our samples contributed 231 specimens (134 adults) to the 351 specimens previously reported in the literature for this country.
Our results exemplify one application of mined literature data that allows investigators to more efficiently allocate effort and resources for the study of neglected, endangered, or interesting taxa and geographic areas. Furthermore, the integrative workflow demonstrated here shares specimen data with global online resources like Plazi and GBIF, meaning that others can freely reuse these data and contribute to them in the future. The contributions of the present study represent an increase of more than 35% on the taxonomic coverage of the TG in GBIF based on the number of species. Also, our extracted data represents 72% of the occurrences now available through GBIF for the TG and more than 85% of occurrences of T. politus. Taxonomic literature is a key source of undigitized biodiversity data for taxonomic groups that are underrepresented in the current biodiversity data sphere. Mobilizing these data is key to understanding and protecting some of the less well-known domains of biodiversity.
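The abstract above describes using the relative abundance of adult specimens in legacy records as a proxy for choosing collecting times and places. A minimal sketch of that kind of tally is given here, assuming the records have already been extracted into simple tuples. The province labels, counts, and the function name `rank_by_adult_abundance` are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter

# Hypothetical legacy records: (province, month number, adult specimens reported).
# These values are invented for illustration only.
records = [
    ("Province A", 6, 12),
    ("Province A", 8, 9),
    ("Province B", 6, 4),
    ("Province B", 1, 1),
    ("Province C", 8, 7),
]


def rank_by_adult_abundance(records):
    """Rank (province, month) pairs by their share of all adults reported."""
    totals = Counter()
    for province, month, adults in records:
        totals[(province, month)] += adults
    grand_total = sum(totals.values())
    # most_common() sorts pairs from the highest adult count to the lowest.
    return [(key, count / grand_total) for key, count in totals.most_common()]


ranking = rank_by_adult_abundance(records)
for (province, month), share in ranking:
    print(f"{province}, month {month}: {share:.2f} of reported adults")
```

Pairs near the top of the ranking would be candidate times and places for fieldwork. A real analysis would also need to account for uneven sampling effort in the source literature.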
Affiliation(s)
- F Andres Rivera-Quiroz
- Department of Terrestrial Zoology, Understanding Evolution group, Naturalis Biodiversity Center, Darwinweg 2, 2333CR, Leiden, The Netherlands.
- Institute of Biology Leiden (IBL), Leiden University, Sylviusweg 72, 2333BE, Leiden, The Netherlands.
- Booppa Petcharad
- Faculty of Science and Technology, Thammasat University, Rangsit, 12121, Pathum Thani, Thailand
- Jeremy A Miller
- Department of Terrestrial Zoology, Understanding Evolution group, Naturalis Biodiversity Center, Darwinweg 2, 2333CR, Leiden, The Netherlands
- Plazi, Zinggstrasse 16, CH 3007, Bern, Switzerland
2. Silva TSR, Feitosa RM. Using controlled vocabularies in anatomical terminology: A case study with Strumigenys (Hymenoptera: Formicidae). ARTHROPOD STRUCTURE & DEVELOPMENT 2019; 52:100877. [PMID: 31357032 DOI: 10.1016/j.asd.2019.100877] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/11/2019] [Revised: 07/23/2019] [Accepted: 07/24/2019] [Indexed: 06/10/2023]
Abstract
Morphological studies of insects can help us to understand the concomitant or sequential functionality of complex structures and may be used to hypothesize distinct levels of phylogenetic relationship among groups. Traditional morphological works, generally, have encompassed a set of elements, including descriptions of structures and their respective conditions, literature references and images, all combined in a single document. Fast forward to the digital era, it is now possible to release this information simultaneously but also independently as data sets linked to the original publication in an external environment. In order to link data from various fields of knowledge, disseminating morphological information in an open environment, it is important to use tools that enhance interoperability. For example, semantic annotations facilitate the dissemination and retrieval of phenotypic data in digital environments. The integration of semantic (i.e. web-based) components with anatomic treatments can be used to generate a traditional description in natural language along with a set of semantic annotations. The ant genus Strumigenys currently comprises about 840 described species distributed worldwide. In the Neotropical region, almost 200 species are currently known, but it is possible that much of the species' diversity there remains unexplored and undescribed. The morphological diversity in the genus is high, reflecting an extreme generic reclassification that occurred in the late 20th and early 21st centuries. Here we define the anatomical concepts in this highly diverse group of ants using semantic annotations to enrich the anatomical ontologies available online, focussing on the definition of terms through subjacent conceptualization.
Affiliation(s)
- Thiago S R Silva
- Department of Zoology, Universidade Federal do Paraná, Francisco Heráclito dos Santos Ave., Curitiba, PR, Brazil.
- Rodrigo M Feitosa
- Department of Zoology, Universidade Federal do Paraná, Francisco Heráclito dos Santos Ave., Curitiba, PR, Brazil.
3. OpenBiodiv: A Knowledge Graph for Literature-Extracted Linked Open Data in Biodiversity Science. PUBLICATIONS 2019. [DOI: 10.3390/publications7020038] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/16/2022] Open
Abstract
Hundreds of years of biodiversity research have resulted in the accumulation of a substantial pool of communal knowledge; however, most of it is stored in silos isolated from each other, such as published articles or monographs. The need for a system to store and manage collective biodiversity knowledge in a community-agreed and interoperable open format has evolved into the concept of the Open Biodiversity Knowledge Management System (OBKMS). This paper presents OpenBiodiv: An OBKMS that utilizes semantic publishing workflows, text and data mining, common standards, ontology modelling and graph database technologies to establish a robust infrastructure for managing biodiversity knowledge. It is presented as a Linked Open Dataset generated from scientific literature. OpenBiodiv encompasses data extracted from more than 5000 scholarly articles published by Pensoft and many more taxonomic treatments extracted by Plazi from journals of other publishers. The data from both sources are converted to Resource Description Framework (RDF) and integrated in a graph database using the OpenBiodiv-O ontology and an RDF version of the Global Biodiversity Information Facility (GBIF) taxonomic backbone. Through the application of semantic technologies, the project showcases the value of open publishing of Findable, Accessible, Interoperable, Reusable (FAIR) data towards the establishment of open science practices in the biodiversity domain.
4. Muñoz G, Kissling WD, van Loon EE. Biodiversity Observations Miner: A web application to unlock primary biodiversity data from published literature. Biodivers Data J 2019:e28737. [PMID: 30692868 PMCID: PMC6344444 DOI: 10.3897/bdj.7.e28737] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/30/2018] [Accepted: 12/19/2018] [Indexed: 11/28/2022] Open
Abstract
Background: A considerable portion of primary biodiversity data is digitally locked inside published literature, which is often stored as PDF files. Large-scale approaches to biodiversity science could benefit from retrieving this information and making it digitally accessible and machine-readable. Nonetheless, the amount and diversity of digitally published literature pose many challenges for knowledge discovery and retrieval. Text mining has been extensively used for data discovery tasks in large quantities of documents. However, text mining approaches for knowledge discovery and retrieval have been limited in biodiversity science compared to other disciplines. New information: Here, we present a novel, open source text mining tool, the Biodiversity Observations Miner (BOM). This web application, written in R, allows the semi-automated discovery of punctual biodiversity observations (e.g. biotic interactions, functional or behavioural traits and natural history descriptions) associated with the scientific names present inside a corpus of scientific literature. Furthermore, BOM enables users to rapidly screen large quantities of literature based on word co-occurrences that match custom biodiversity dictionaries. This tool aims to increase the digital mobilisation of primary biodiversity data and is freely accessible via GitHub or through a web server.
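The screening idea mentioned in the abstract above, matching scientific names that co-occur with terms from a custom dictionary, can be sketched roughly as follows. BOM itself is written in R, so this Python snippet is not its implementation. The function name `screen_sentences`, the placeholder taxon names, and the dictionary terms are all assumptions made for illustration.

```python
import re


def screen_sentences(text, names, dictionary):
    """Return sentences that mention a scientific name and a dictionary term."""
    # Naive sentence split on end punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = []
    for sentence in sentences:
        lower = sentence.lower()
        found_names = [n for n in names if n.lower() in lower]
        found_terms = [
            t for t in dictionary
            if re.search(r"\b" + re.escape(t.lower()) + r"\b", lower)
        ]
        # Keep only sentences where a name and a term co-occur.
        if found_names and found_terms:
            hits.append((sentence, found_names, found_terms))
    return hits


# Hypothetical input: placeholder names and an invented interaction dictionary.
text = (
    "Aus bus preys on small ants. "
    "The site was visited in June. "
    "Cus dus was found under bark."
)
hits = screen_sentences(text, ["Aus bus", "Cus dus"], ["preys", "feeds", "pollinates"])
for sentence, found_names, found_terms in hits:
    print(found_names, found_terms, sentence)
```

Co-occurrence within a sentence is only one possible window. A paragraph-level or fixed-width word window would trade precision for recall.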
Affiliation(s)
- Gabriel Muñoz
- NASUA, Biodiversity research and conservation section, Quito, Ecuador
- Faculty of Arts and Science, Department of Biology, Concordia University, Montreal, Canada
- W Daniel Kissling
- Faculty of Science, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, Netherlands
- E Emiel van Loon
- Faculty of Science, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, Netherlands
5. Côtez E, Mabille A, Chester C, Rocklin E, Deroin T, Desutter-Grandcolas L, Lesur J, Merle D, Robillard T, Bénichou L. 1802–2018: 220 ans d'histoire des périodiques au Muséum. ZOOSYSTEMA 2018. [DOI: 10.5252/zoosystema2018v40a1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/05/2022]
Affiliation(s)
- Emmanuel Côtez
- Muséum national d'Histoire naturelle, Service des Publications scientifiques, case postale 41, 57 ru
- Anne Mabille
- Muséum national d'Histoire naturelle, Service des Publications scientifiques, case postale 41, 57 ru
- Chloë Chester
- Muséum national d'Histoire naturelle, Service des Publications scientifiques, case postale 41, 57 ru
- Emmanuelle Rocklin
- Muséum national d'Histoire naturelle, Service des Publications scientifiques, case postale 41, 57 ru
- Thierry Deroin
- Muséum national d'Histoire naturelle, Service des Publications scientifiques, case postale 41, 57 ru
- Laure Desutter-Grandcolas
- Muséum national d'Histoire naturelle, Service des Publications scientifiques, case postale 41, 57 ru
- Joséphine Lesur
- Muséum national d'Histoire naturelle, Service des Publications scientifiques, case postale 41, 57 ru
- Didier Merle
- Muséum national d'Histoire naturelle, Service des Publications scientifiques, case postale 41, 57 ru
- Tony Robillard
- Muséum national d'Histoire naturelle, Service des Publications scientifiques, case postale 41, 57 ru
- Laurence Bénichou
- Muséum national d'Histoire naturelle, Service des Publications scientifiques, case postale 41, 57 ru
6. Faulwetter S, Pafilis E, Fanini L, Bailly N, Agosti D, Arvanitidis C, Boicenco L, Capatano T, Claus S, Dekeyzer S, Georgiev T, Legaki A, Mavraki D, Oulas A, Papastefanou G, Penev L, Sautter G, Schigel D, Senderov V, Teaca A, Tsompanou M. EMODnet Workshop on mechanisms and guidelines to mobilise historical data into biogeographic databases. RESEARCH IDEAS AND OUTCOMES 2016. [DOI: 10.3897/rio.2.e9774] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022] Open
7. Senderov V, Penev L. The Open Biodiversity Knowledge Management System in Scholarly Publishing. RESEARCH IDEAS AND OUTCOMES 2016. [DOI: 10.3897/rio.2.e7757] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 01/03/2023] Open
8. Dikow T, Agosti D. Utilizing online resources for taxonomy: a cybercatalog of Afrotropical apiocerid flies (Insecta: Diptera: Apioceridae). Biodivers Data J 2015:e5707. [PMID: 26491392 PMCID: PMC4609823 DOI: 10.3897/bdj.3.e5707] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/21/2015] [Accepted: 09/30/2015] [Indexed: 11/12/2022] Open
Abstract
A cybercatalog to the Apioceridae (apiocerid flies) of the Afrotropical Region is provided. Each taxon entry includes links to open-access, online repositories such as ZooBank, BHL/BioStor/BLR, Plazi, GBIF, Morphbank, EoL, and a research web-site to access taxonomic information, digitized literature, morphological descriptions, specimen occurrence data, and images. Cybercatalogs such as the one presented here will need to become the future of taxonomic catalogs, taking advantage of the growing number of online repositories and linked data, and being easily updatable. Comments on the deposition of the holotype of Apiocera braunsi Melander, 1907 are made.
Affiliation(s)
- Torsten Dikow
- National Museum of Natural History, Smithsonian Institution, Washington, DC, United States of America
9. The application of “-omics” technologies for the classification and identification of animals. ORG DIVERS EVOL 2015. [DOI: 10.1007/s13127-015-0234-6] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Indexed: 11/26/2022]
10. Miller JA, Agosti D, Penev L, Sautter G, Georgiev T, Catapano T, Patterson D, King D, Pereira S, Vos RA, Sierra S. Integrating and visualizing primary data from prospective and legacy taxonomic literature. Biodivers Data J 2015; 3:e5063. [PMID: 26023286 PMCID: PMC4442254 DOI: 10.3897/bdj.3.e5063] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 04/09/2015] [Accepted: 05/06/2015] [Indexed: 11/24/2022] Open
Abstract
Specimen data in taxonomic literature are among the highest quality primary biodiversity data. Innovative cybertaxonomic journals are using workflows that maintain data structure and disseminate electronic content to aggregators and other users; such structure is lost in traditional taxonomic publishing. Legacy taxonomic literature is a vast repository of knowledge about biodiversity. Currently, access to that resource is cumbersome, especially for non-specialist data consumers. Markup is a mechanism that makes this content more accessible, and is especially suited to machine analysis. Fine-grained XML (Extensible Markup Language) markup was applied to all (37) open-access articles published in the journal Zootaxa containing treatments on spiders (Order: Araneae). The markup approach was optimized to extract primary specimen data from legacy publications. These data were combined with data from articles containing treatments on spiders published in Biodiversity Data Journal where XML structure is part of the routine publication process. A series of charts was developed to visualize the content of specimen data in XML-tagged taxonomic treatments, either singly or in aggregate. The data can be filtered by several fields (including journal, taxon, institutional collection, collecting country, collector, author, article and treatment) to query particular aspects of the data. We demonstrate here that XML markup using GoldenGATE can address the challenge presented by unstructured legacy data, can extract structured primary biodiversity data which can be aggregated with and jointly queried with data from other Darwin Core-compatible sources, and show how visualization of these data can communicate key information contained in biodiversity literature. We complement recent studies on aspects of biodiversity knowledge using XML structured data to explore 1) the time lag between species discovery and description, and 2) the prevalence of rarity in species descriptions.
Affiliation(s)
- Jeremy A. Miller
- Naturalis Biodiversity Center, Leiden, Netherlands
- www.Plazi.org, Bern, Switzerland
- David King
- The Open University, Milton Keynes, United Kingdom
11. Miller JA, Georgiev T, Stoev P, Sautter G, Penev L. Corrected data re-harvested: curating literature in the era of networked biodiversity informatics. Biodivers Data J 2015:e4552. [PMID: 25632264 PMCID: PMC4304254 DOI: 10.3897/bdj.3.e4552] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/20/2015] [Accepted: 01/21/2015] [Indexed: 11/12/2022] Open
Affiliation(s)
- Jeremy A Miller
- Naturalis Biodiversity Center, Leiden, Netherlands ; www.Plazi.org, Bern, Switzerland
- Pavel Stoev
- National Museum of Natural History and Pensoft Publishers, Sofia, Bulgaria
- Lyubomir Penev
- Institute of Biodiversity & Ecosystem Research, Bulgarian Academy of Sciences and Pensoft Publishers, Sofia, Bulgaria
12. Thomas WW, Tulig M. Hard Copy to Digital: Flora Neotropica and the World Flora Online. RODRIGUÉSIA 2015. [DOI: 10.1590/2175-7860201566404] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
One of the greatest challenges in achieving the goals of the World Flora Online (WFO) will be to make available the huge amount of botanical information that is not yet available digitally. The New York Botanical Garden is using the Flora Neotropica monograph series as a model for digitization. We describe our efforts at digitizing Flora Neotropica monographs and why digitization of hardcopy descriptions must be a priority for the WFO project.
13. Rosenberg MS. Contextual cross-referencing of species names for fiddler crabs (genus Uca): an experiment in cyber-taxonomy. PLoS One 2014; 9:e101704. [PMID: 25004097 PMCID: PMC4086947 DOI: 10.1371/journal.pone.0101704] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 01/15/2014] [Accepted: 06/10/2014] [Indexed: 11/19/2022] Open
Abstract
Cyber-taxonomy of name usage has focused primarily on producing authoritative lists of names or cross-linking names and data across disparate databases. A feature missing from much of this work is the recording and analysis of the context in which a name was used—context which can be critical for understanding not only what name an author used, but to which currently recognized species they actually refer. An experiment on recording contextual information associated with name usage was conducted for the fiddler crabs (genus Uca). Data from approximately one quarter of all publications that mention fiddler crabs, including 95% of those published prior to 1924 and 67% of those published prior to 1976, have currently been recorded in a database. Approaches and difficulties in recording and analyzing the context of name use are discussed. These results are not meant to be a full solution, rather to highlight problems which have not been previously investigated and may act as a springboard for broader approaches and discussion. Some data on the accessibility of the literature, including in particular electronic forms of publication, are also presented. The resulting data has been integrated for general browsing into the website http://www.fiddlercrab.info; the raw data and code used to construct the website is available at https://github.com/msrosenberg/fiddlercrab.info.
Affiliation(s)
- Michael S Rosenberg
- School of Life Sciences and Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, Arizona, United States of America
14. Parr CS, Wilson N, Leary P, Schulz KS, Lans K, Walley L, Hammock JA, Goddard A, Rice J, Studer M, Holmes JTG, Corrigan, Jr. RJ. The Encyclopedia of Life v2: Providing Global Access to Knowledge About Life on Earth. Biodivers Data J 2014; 2:e1079. [PMID: 24891832 PMCID: PMC4031434 DOI: 10.3897/bdj.2.e1079] [Citation(s) in RCA: 93] [Impact Index Per Article: 9.3] [Received: 03/19/2014] [Accepted: 04/24/2014] [Indexed: 11/24/2022] Open
Abstract
The Encyclopedia of Life (EOL, http://eol.org) aims to provide unprecedented global access to a broad range of information about life on Earth. It currently contains 3.5 million distinct pages for taxa and provides content for 1.3 million of those pages. The content is primarily contributed by EOL content partners (providers) that have a more limited geographic, taxonomic or topical scope. EOL aggregates these data and automatically integrates them based on associated scientific names and other classification information. EOL also provides interfaces for curation and direct content addition. All materials in EOL are either in the public domain or licensed under a Creative Commons license. In addition to the web interface, EOL is also accessible through an Application Programming Interface. In this paper, we review recent developments added for Version 2 of the web site and subsequent releases through Version 2.2, which have made EOL more engaging, personal, accessible and internationalizable. We outline the core features and technical architecture of the system. We summarize milestones achieved so far by EOL to present results of the current system implementation and establish benchmarks upon which to judge future improvements. We have shown that it is possible to successfully integrate large amounts of descriptive biodiversity data from diverse sources into a robust, standards-based, dynamic, and scalable infrastructure. Increasing global participation and the emergence of EOL-powered applications demonstrate that EOL is becoming a significant resource for anyone interested in biological diversity.
Affiliation(s)
- Cynthia S. Parr
- National Museum of Natural History, Smithsonian Institution, Washington DC, United States of America
- Nathan Wilson
- Marine Biological Laboratory, Woods Hole, MA, United States of America
- Patrick Leary
- Marine Biological Laboratory, Woods Hole, MA, United States of America
- Katja S. Schulz
- Smithsonian Institution, Washington, DC, United States of America
- Kristen Lans
- Marine Biological Laboratory, Woods Hole, MA, United States of America
- Lisa Walley
- Marine Biological Laboratory, Woods Hole, MA, United States of America
- Anthony Goddard
- Marine Biological Laboratory, Woods Hole, MA, United States of America
- Jeremy Rice
- Marine Biological Laboratory, Woods Hole, MA, United States of America
- Marie Studer
- Harvard University, Cambridge, MA, United States of America
15. Liew TS, Vermeulen JJ, Marzuki MEB, Schilthuizen M. A cybertaxonomic revision of the micro-landsnail genus Plectostoma Adam (Mollusca, Caenogastropoda, Diplommatinidae), from Peninsular Malaysia, Sumatra and Indochina. Zookeys 2014:1-107. [PMID: 24715783 PMCID: PMC3974427 DOI: 10.3897/zookeys.393.6717] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 11/29/2013] [Accepted: 02/27/2014] [Indexed: 11/12/2022] Open
Abstract
Plectostoma is a micro land snail restricted to limestone outcrops in Southeast Asia. Plectostoma was previously classified as a subgenus of Opisthostoma because of the deviation from regular coiling in many species in both taxa. This paper is the first of a two-part revision of the genus Plectostoma, and includes all non-Borneo species. In the present paper, we examined 214 collection samples of 31 species, and obtained 62 references, 290 pictures, and 155 3D-models of 29 Plectostoma species and 51 COI sequences of 19 species. To work with such a variety of taxonomic data, and then to represent it in an integrated, scalable and accessible manner, we adopted up-to-date cybertaxonomic tools. All the taxonomic information, such as references, classification, species descriptions, specimen images, genetic data, and distribution data, were tagged and linked with cyber tools and web servers (e.g. Lifedesks, Google Earth, and Barcoding of Life Database). We elevated Plectostoma from subgenus to genus level based on morphological, ecological and genetic evidence. We revised the existing 21 Plectostoma species and described 10 new species, namely, P. dindingensis sp. n., P. mengaburensis sp. n., P. whitteni sp. n., P. kayiani sp. n., P. davisoni sp. n., P. relauensis sp. n., P. kubuensis sp. n., P. tohchinyawi sp. n., P. tenggekensis sp. n., and P. ikanensis sp. n. All the synthesised, semantic-tagged, and linked taxonomic information is made freely and publicly available online.
Affiliation(s)
- Thor-Seng Liew
- Naturalis Biodiversity Center, P.O. Box 9517, 2300 RA Leiden, The Netherlands ; Institute Biology Leiden, Leiden University, P.O. Box 9516, 2300 RA Leiden, The Netherlands ; Institute for Tropical Biology and Conservation, Universiti Malaysia Sabah, Jalan UMS, 88400, Kota Kinabalu, Sabah, Malaysia ; Rimba, 4 Jalan 1/9D, 43650, Bandar Baru Bangi, Selangor, Malaysia
- Jaap Jan Vermeulen
- Naturalis Biodiversity Center, P.O. Box 9517, 2300 RA Leiden, The Netherlands ; jk.artandscience, Lauwerbes 8, 2318 AT, Leiden, The Netherlands
- Menno Schilthuizen
- Naturalis Biodiversity Center, P.O. Box 9517, 2300 RA Leiden, The Netherlands ; Institute Biology Leiden, Leiden University, P.O. Box 9516, 2300 RA Leiden, The Netherlands ; Institute for Tropical Biology and Conservation, Universiti Malaysia Sabah, Jalan UMS, 88400, Kota Kinabalu, Sabah, Malaysia
16. Thessen AE, Parr CS. Knowledge extraction and semantic annotation of text from the encyclopedia of life. PLoS One 2014; 9:e89550. [PMID: 24594988 PMCID: PMC3940440 DOI: 10.1371/journal.pone.0089550] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 09/04/2013] [Accepted: 01/21/2014] [Indexed: 11/19/2022] Open
Abstract
Numerous digitization and ontological initiatives have focused on translating biological knowledge from narrative text to machine-readable formats. In this paper, we describe two workflows for knowledge extraction and semantic annotation of text data objects featured in an online biodiversity aggregator, the Encyclopedia of Life. One workflow tags text with DBpedia URIs based on keywords. Another workflow finds taxon names in text using GNRD for the purpose of building a species association network. Both workflows work well: the annotation workflow has an F1 Score of 0.941 and the association algorithm has an F1 Score of 0.885. Existing text annotators such as Terminizer and DBpedia Spotlight performed well, but require some optimization to be useful in the ecology and evolution domain. Important future work includes scaling up and improving accuracy through the use of distributional semantics.
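The F1 scores quoted in the abstract above are the harmonic mean of precision and recall. As a brief sketch of how such a score is computed from annotation counts, the snippet below uses the standard definition. The counts and the function name `f1_score` are hypothetical and are not taken from the paper.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall from raw counts."""
    if true_pos == 0:
        # No correct annotations: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 correct tags, 10 spurious tags, 10 missed tags.
score = f1_score(80, 10, 10)
print(round(score, 3))
```

Because F1 is a harmonic mean, it stays low if either precision or recall is low, which is why it is a common single-number summary for annotation workflows.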
Affiliation(s)
- Anne E. Thessen
- Arizona State University, School of Life Sciences, Tempe, Arizona, United States of America
- Cynthia Sims Parr
- National Museum of Natural History, Smithsonian Institution, Washington, District of Columbia, United States of America
17. Hormiga G, Griswold CE. Systematics, phylogeny, and evolution of orb-weaving spiders. ANNUAL REVIEW OF ENTOMOLOGY 2013; 59:487-512. [PMID: 24160416 DOI: 10.1146/annurev-ento-011613-162046] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.9] [Indexed: 05/28/2023]
Abstract
The orb-weaving spiders (Orbiculariae) comprise more than 25% of the approximately 44,000 known living spider species and produce a remarkable variety of webs. The wheel-shaped orb web is primitive to this clade, but most Orbiculariae make webs hardly recognizable as orbs. Orb-weavers date at least to the Jurassic. With no evidence for convergence of the orb web, the monophyly of the two typical orb web taxa, the cribellate Deinopoidea and ecribellate Araneoidea, remains problematic, supported only weakly by molecular studies. The sister group of the Orbiculariae also remains elusive. Despite more than 15 years of phylogenetic scrutiny, a fully resolved cladogram of the Orbiculariae families is not yet possible. More comprehensive taxon sampling, comparative morphology, and new molecular markers are required for a better understanding of orb-weaver evolution.
Affiliation(s)
- Gustavo Hormiga
- Department of Biological Sciences, The George Washington University, Washington, DC 20052