Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Patterson DJ, Remsen D, Marino WA, Norton C. Taxonomic indexing--extending the role of taxonomy. Syst Biol 2006;55:367-73. [PMID: 16861205 DOI: 10.1080/10635150500541680] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022] Open

For:	Patterson DJ, Remsen D, Marino WA, Norton C. Taxonomic indexing--extending the role of taxonomy. Syst Biol 2006;55:367-73. [PMID: 16861205 DOI: 10.1080/10635150500541680] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022] Open

Number

Cited by Other Article(s)

Remsen D. The use and limits of scientific names in biological informatics. Zookeys 2016:207-23. [PMID: 26877660 PMCID: PMC4741222 DOI: 10.3897/zookeys.550.9546] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2015] [Accepted: 03/09/2015] [Indexed: 11/21/2022] Open

Abstract

Scientific names serve to label biodiversity information: information related to species. Names, and their underlying taxonomic definitions, however, are unstable and ambiguous. This negatively impacts the utility of names as identifiers and as effective indexing tools in biological informatics where names are commonly utilized for searching, retrieving and integrating information about species. Semiotics provides a general model for describing the relationship between taxon names and taxon concepts. It distinguishes syntactics, which governs relationships among names, from semantics, which represents the relations between those labels and the taxa to which they refer. In the semiotic context, changes in semantics (i.e., taxonomic circumscription) do not consistently result in a corresponding and reflective change in syntax. Further, when syntactic changes do occur, they may be in response to semantic changes or in response to syntactic rules. This lack of consistency in the cardinal relationship between names and taxa places limits on how scientific names may be used in biological informatics in initially anchoring, and in the subsequent retrieval and integration, of relevant biodiversity information. Precision and recall are two measures of relevance. In biological taxonomy, recall is negatively impacted by changes or ambiguity in syntax while precision is negatively impacted when there are changes or ambiguity in semantics. Because changes in syntax are not correlated with changes in semantics, scientific names may be used, singly or conflated into synonymous sets, to improve recall in pattern recognition or search and retrieval. Names cannot be used, however, to improve precision. This is because changes in syntax do not uniquely identify changes in circumscription.

These observations place limits on the utility of scientific names within biological informatics applications that rely on names as identifiers for taxa. Taxonomic systems and services used to organize and integrate information about taxa must accommodate the inherent semantic ambiguity of scientific names. The capture and articulation of circumscription differences (i.e., multiple taxon concepts) within such systems must be accompanied with distinct concept identifiers that can be employed in association with, or in replacement of, traditional scientific names.

Collapse

Akella LM, Norton CN, Miller H. NetiNeti: discovery of scientific names from text using machine learning methods. BMC Bioinformatics 2012;13:211. [PMID: 22913485 PMCID: PMC3542245 DOI: 10.1186/1471-2105-13-211] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2010] [Accepted: 08/06/2012] [Indexed: 12/12/2022] Open

Abstract

Background

A scientific name for an organism can be associated with almost all biological data. Name identification is an important step in many text mining tasks aiming to extract useful information from biological, biomedical and biodiversity text sources. A scientific name acts as an important metadata element to link biological information.

Results

We present NetiNeti (Name Extraction from Textual Information-Name Extraction for Taxonomic Indexing), a machine learning based approach for recognition of scientific names including the discovery of new species names from text that will also handle misspellings, OCR errors and other variations in names. The system generates candidate names using rules for scientific names and applies probabilistic machine learning methods to classify names based on structural features of candidate names and features derived from their contexts. NetiNeti can also disambiguate scientific names from other names using the contextual information. We evaluated NetiNeti on legacy biodiversity texts and biomedical literature (MEDLINE). NetiNeti performs better (precision = 98.9% and recall = 70.5%) compared to a popular dictionary based approach (precision = 97.5% and recall = 54.3%) on a 600-page biodiversity book that was manually marked by an annotator. On a small set of PubMed Central’s full text articles annotated with scientific names, the precision and recall values are 98.5% and 96.2% respectively. NetiNeti found more than 190,000 unique binomial and trinomial names in more than 1,880,000 PubMed records when used on the full MEDLINE database. NetiNeti also successfully identifies almost all of the new species names mentioned within web pages.

Conclusions

We present NetiNeti, a machine learning based approach for identification and discovery of scientific names. The system implementing the approach can be accessed at http://namefinding.ubio.org.

Collapse

Wheeler Q, Bourgoin T, Coddington J, Gostony T, Hamilton A, Larimer R, Polaszek A, Schauff M, Solis MA. Nomenclatural benchmarking: the roles of digital typification and telemicroscopy. Zookeys 2012;209:193-202. [PMID: 22859888 PMCID: PMC3406476 DOI: 10.3897/zookeys.209.3486] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2012] [Accepted: 07/13/2012] [Indexed: 11/12/2022] Open

Abstract

Nomenclatural benchmarking is the periodic realignment of species names with species theories and is necessary for the accurate and uniform use of Linnaean binominals in the face of changing species limits. Gaining access to types, often for little more than a cursory examination by an expert, is a major bottleneck in the advance and availability of biodiversity informatics. For the nearly two million described species it has been estimated that five to six million name-bearing type specimens exist, including those for synonymized binominals. Recognizing that examination of types in person will remain necessary in special cases, we propose a four-part strategy for opening access to types that relies heavily on digitization and that would eliminate much of the bottleneck: (1) modify codes of nomenclature to create registries of nomenclatural acts, such as the proposed ZooBank, that include a requirement for digital representations (e-types) for all newly described species to avoid adding to backlog; (2) an "r" strategy that would engineer and deploy a network of automated instruments capable of rapidly creating 3-D images of type specimens not requiring participation of taxon experts; (3) a "K" strategy using remotely operable microscopes to engage taxon experts in targeting and annotating informative characters of types to supplement and extend information content of rapidly acquired e-types, a process that can be done on an as-needed basis as in the normal course of revisionary taxonomy; and (4) creation of a global e-type archive associated with the commissions on nomenclature and species registries providing one-stop-shopping for e-types. We describe a first generation implementation of the "K" strategy that adapts current technology to create a network of Remotely Operable Benchmarkers Of Types (ROBOT) specifically engineered to handle the largest backlog of types, pinned insect specimens. The three initial instruments will be in the Smithsonian Institution(Washington, DC), Natural History Museum (London), and Museum National d'Histoire Naturelle (Paris), networking the three largest insect collections in the world with entomologists worldwide. These three instruments make possible remote examination, manipulation, and photography of types for more than 600,000 species. This is a cybertaxonomy demonstration project that we anticipate will lead to similar instruments for a wide range of museum specimens and objects as well as revolutionary changes in collaborative taxonomy and formal and public taxonomic education.

Collapse

Ningthoujam SS, Talukdar AD, Potsangbam KS, Choudhury MD. Challenges in developing medicinal plant databases for sharing ethnopharmacological knowledge. JOURNAL OF ETHNOPHARMACOLOGY 2012;141:9-32. [PMID: 22401841 DOI: 10.1016/j.jep.2012.02.042] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/04/2011] [Revised: 02/19/2012] [Accepted: 02/25/2012] [Indexed: 05/31/2023]

Abstract

ETHNOPHARMACOLOGICAL RELEVANCE

Major research contributions in ethnopharmacology have generated vast amount of data associated with medicinal plants. Computerized databases facilitate data management and analysis making coherent information available to researchers, planners and other users. Web-based databases also facilitate knowledge transmission and feed the circle of information exchange between the ethnopharmacological studies and public audience. However, despite the development of many medicinal plant databases, a lack of uniformity is still discernible. Therefore, it calls for defining a common standard to achieve the common objectives of ethnopharmacology.

AIM OF THE STUDY

The aim of the study is to review the diversity of approaches in storing ethnopharmacological information in databases and to provide some minimal standards for these databases.

MATERIALS AND METHODS

Survey for articles on medicinal plant databases was done on the Internet by using selective keywords. Grey literatures and printed materials were also searched for information. Listed resources were critically analyzed for their approaches in content type, focus area and software technology.

RESULTS

Necessity for rapid incorporation of traditional knowledge by compiling primary data has been felt. While citation collection is common approach for information compilation, it could not fully assimilate local literatures which reflect traditional knowledge. Need for defining standards for systematic evaluation, checking quality and authenticity of the data is felt. Databases focussing on thematic areas, viz., traditional medicine system, regional aspect, disease and phytochemical information are analyzed. Issues pertaining to data standard, data linking and unique identification need to be addressed in addition to general issues like lack of update and sustainability. In the background of the present study, suggestions have been made on some minimum standards for development of medicinal plant database.

CONCLUSION

In spite of variations in approaches, existence of many overlapping features indicates redundancy of resources and efforts. As the development of global data in a single database may not be possible in view of the culture-specific differences, efforts can be given to specific regional areas. Existing scenario calls for collaborative approach for defining a common standard in medicinal plant database for knowledge sharing and scientific advancement.

Collapse

Wheeler QD, Knapp S, Stevenson DW, Stevenson J, Blum SD, Boom BM, Borisy GG, Buizer JL, De Carvalho MR, Cibrian A, Donoghue MJ, Doyle V, Gerson EM, Graham CH, Graves P, Graves SJ, Guralnick RP, Hamilton AL, Hanken J, Law W, Lipscomb DL, Lovejoy TE, Miller H, Miller JS, Naeem S, Novacek MJ, Page LM, Platnick NI, Porter-Morgan H, Raven PH, Solis MA, Valdecasas AG, Van Der Leeuw S, Vasco A, Vermeulen N, Vogel J, Walls RL, Wilson EO, Woolley JB. Mapping the biosphere: exploring species to understand the origin, organization and sustainability of biodiversity. SYST BIODIVERS 2012. [DOI: 10.1080/14772000.2012.665095] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]

Huang X, Qiao G. Biodiversity data sharing is not just about species names: response to Santos and Branco. Trends Ecol Evol 2012. [DOI: 10.1016/j.tree.2011.10.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]

Jones AC, White RJ, Orme ER. Identifying and relating biological concepts in the Catalogue of Life. J Biomed Semantics 2011;2:7. [PMID: 22004596 PMCID: PMC3245425 DOI: 10.1186/2041-1480-2-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2011] [Accepted: 10/17/2011] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

In this paper we describe our experience of adding globally unique identifiers to the Species 2000 and ITIS Catalogue of Life, an on-line index of organisms which is intended, ultimately, to cover all the world's known species. The scientific species names held in the Catalogue are names that already play an extensive role as terms in the organisation of information about living organisms in bioinformatics and other domains, but the effectiveness of their use is hindered by variation in individuals' opinions and understanding of these terms; indeed, in some cases more than one name will have been used to refer to the same organism. This means that it is desirable to be able to give unique labels to each of these differing concepts within the catalogue and to be able to determine which concepts are being used in other systems, in order that they can be associated with the concepts in the catalogue. Not only is this needed, but it is also necessary to know the relationships between alternative concepts that scientists might have employed, as these determine what can be inferred when data associated with related concepts is being processed. A further complication is that the catalogue itself is evolving as scientific opinion changes due to an increasing understanding of life.

RESULTS

We describe how we are using Life Science Identifiers (LSIDs) as globally unique identifiers in the Catalogue of Life, explaining how the mapping to species concepts is performed, how concepts are associated with specific editions of the catalogue, and how the Taxon Concept Schema has been adopted in order to express information about concepts and their relationships. We explore the implications of using globally unique identifiers in order to refer to abstract concepts such as species, which incorporate at least a measure of subjectivity in their definition, in contrast with the more traditional use of such identifiers to refer to more tangible entities, events, documents, observations, etc.

CONCLUSIONS

A major reason for adopting identifiers such as LSIDs is to facilitate data integration. We have demonstrated the incorporation of LSIDs into the Catalogue of Life, in a manner consistent with the biodiversity informatics community's conventions for LSID use. The Catalogue of Life is therefore available as a taxonomy of organisms for use within various disciplines, including biomedical research, by software written with an awareness of these conventions.

Collapse

Gerner M, Nenadic G, Bergman CM. LINNAEUS: a species name identification system for biomedical literature. BMC Bioinformatics 2010;11:85. [PMID: 20149233 PMCID: PMC2836304 DOI: 10.1186/1471-2105-11-85] [Citation(s) in RCA: 153] [Impact Index Per Article: 10.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2009] [Accepted: 02/11/2010] [Indexed: 11/25/2022] Open

Lughadha EN, Miller C. Accelerating global access to plant diversity information. TRENDS IN PLANT SCIENCE 2009;14:622-628. [PMID: 19836991 DOI: 10.1016/j.tplants.2009.08.014] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/12/2009] [Revised: 08/26/2009] [Accepted: 08/27/2009] [Indexed: 05/28/2023]

Paton A. Biodiversity informatics and the plant conservation baseline. TRENDS IN PLANT SCIENCE 2009;14:629-637. [PMID: 19783196 DOI: 10.1016/j.tplants.2009.08.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/26/2009] [Revised: 07/24/2009] [Accepted: 08/12/2009] [Indexed: 05/28/2023]

Clark BR, Godfray HCJ, Kitching IJ, Mayo SJ, Scoble MJ. Taxonomy as an eScience. PHILOSOPHICAL TRANSACTIONS. SERIES A, MATHEMATICAL, PHYSICAL, AND ENGINEERING SCIENCES 2009;367:953-966. [PMID: 19087937 DOI: 10.1098/rsta.2008.0190] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]

The future role of bio-ontologies for developing a general data standard in biology: chance and challenge for zoo-morphology. ZOOMORPHOLOGY 2008. [DOI: 10.1007/s00435-008-0081-5] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Sarkar IN, Schenk R, Norton CN. Exploring historical trends using taxonomic name metadata. BMC Evol Biol 2008;8:144. [PMID: 18477399 PMCID: PMC2408592 DOI: 10.1186/1471-2148-8-144] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2007] [Accepted: 05/13/2008] [Indexed: 11/10/2022] Open

Page RDM. Biodiversity informatics: the challenge of linking data and the role of shared identifiers. Brief Bioinform 2008;9:345-54. [PMID: 18445641 DOI: 10.1093/bib/bbn022] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Godfray HCJ, Clark BR, Kitching IJ, Mayo SJ, Scoble MJ. The Web and the Structure of Taxonomy. Syst Biol 2007;56:943-55. [DOI: 10.1080/10635150701777521] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022] Open

Page RDM. TBMap: a taxonomic perspective on the phylogenetic database TreeBASE. BMC Bioinformatics 2007;8:158. [PMID: 17511869 PMCID: PMC1885449 DOI: 10.1186/1471-2105-8-158] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2007] [Accepted: 05/18/2007] [Indexed: 11/10/2022] Open

Leary PR, Remsen DP, Norton CN, Patterson DJ, Sarkar IN. uBioRSS: Tracking taxonomic literature using RSS. Bioinformatics 2007;23:1434-6. [PMID: 17392332 DOI: 10.1093/bioinformatics/btm109] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Padial JM, de la Riva I. Taxonomic Inflation and the Stability of Species Lists: The Perils of Ostrich's Behavior. Syst Biol 2006;55:859-67. [PMID: 17060206 DOI: 10.1080/1063515060081588] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022] Open