1
Jing X. The Unified Medical Language System at 30 Years and How It Is Used and Published: Systematic Review and Content Analysis. JMIR Med Inform 2021; 9:e20675. [PMID: 34236337 PMCID: PMC8433943 DOI: 10.2196/20675] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/25/2020] [Revised: 11/25/2020] [Accepted: 07/02/2021] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND The Unified Medical Language System (UMLS) has been a critical tool in biomedical and health informatics, and the year 2021 marks its 30th anniversary. The UMLS brings together many broadly used vocabularies and standards in the biomedical field to facilitate interoperability among different computer systems and applications. OBJECTIVE Despite its longevity, there is no comprehensive publication analysis of the use of the UMLS. Thus, this review and analysis is conducted to provide an overview of the UMLS and its use in English-language peer-reviewed publications, with the objective of providing a comprehensive understanding of how the UMLS has been used in English-language peer-reviewed publications over the last 30 years. METHODS PubMed, ACM Digital Library, and the Nursing & Allied Health Database were used to search for studies. The primary search strategy was as follows: UMLS was used as a Medical Subject Headings term or a keyword or appeared in the title or abstract. Only English-language publications were considered. The publications were screened first, then coded and categorized iteratively, following the grounded theory. The review process followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. RESULTS A total of 943 publications were included in the final analysis. Moreover, 32 publications were categorized into 2 categories; hence the total number of publications before duplicates are removed is 975. 
After analysis and categorization of the publications, UMLS was found to be used in the following emerging themes or areas (the number of publications and their respective percentages are given in parentheses): natural language processing (230/975, 23.6%), information retrieval (125/975, 12.8%), terminology study (90/975, 9.2%), ontology and modeling (80/975, 8.2%), medical subdomains (76/975, 7.8%), other language studies (53/975, 5.4%), artificial intelligence tools and applications (46/975, 4.7%), patient care (35/975, 3.6%), data mining and knowledge discovery (25/975, 2.6%), medical education (20/975, 2.1%), degree-related theses (13/975, 1.3%), digital library (5/975, 0.5%), and the UMLS itself (150/975, 15.4%), as well as the UMLS for other purposes (27/975, 2.8%). CONCLUSIONS The UMLS has been used successfully in patient care, medical education, digital libraries, and software development, as originally planned, as well as in degree-related theses, the building of artificial intelligence tools, data mining and knowledge discovery, foundational work in methodology, and middle layers that may lead to advanced products. Natural language processing, the UMLS itself, and information retrieval are the 3 most common themes that emerged among the included publications. The results, although largely related to academia, demonstrate that UMLS achieves its intended uses successfully, in addition to achieving uses broadly beyond its original intentions.
Affiliation(s)
- Xia Jing
- Department of Public Health Sciences, College of Behavioral, Social and Health Sciences, Clemson University, Clemson, SC, United States
2
Boyd AD, Dunn Lopez K, Lugaresi C, Macieira T, Sousa V, Acharya S, Balasubramanian A, Roussi K, Keenan GM, Lussier YA, Li J'J, Burton M, Di Eugenio B. Physician nurse care: A new use of UMLS to measure professional contribution: Are we talking about the same patient a new graph matching algorithm? Int J Med Inform 2018; 113:63-71. [PMID: 29602435 PMCID: PMC5909845 DOI: 10.1016/j.ijmedinf.2018.02.002] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 10/06/2017] [Revised: 12/22/2017] [Accepted: 02/03/2018] [Indexed: 02/07/2023]
Abstract
BACKGROUND Physicians and nurses have worked together for generations; however, their language and training are vastly different, which makes it difficult to compare and contrast their work and their joint impact on patient outcomes. At the same time, the EHR only includes the physician perspective via the physician-authored discharge summary, but not nurse documentation. Prior research in this area has focused on collaboration and the usage of similar terminology. OBJECTIVE The objective of the study is to gain insight into interprofessional care by developing a computational metric to identify similarities, related concepts and differences in physician and nurse work. METHODS 58 physician discharge summaries and the corresponding nurse plans of care were transformed into Unified Medical Language System (UMLS) Concept Unique Identifiers (CUIs). MedLEE, a Natural Language Processing (NLP) program, extracted "physician terms" from free-text physician summaries. The nursing plans of care were constructed using the HANDS© nursing documentation software. HANDS© utilizes structured terminologies: nursing diagnosis (NANDA-I), outcomes (NOC), and interventions (NIC) to create "nursing terms". The physician and nurse terms were compared for relatedness by overlaying them on the UMLS network. Our overarching goal is to provide insight into the care by innovatively applying graph algorithms to the UMLS network. We reveal the relationships between the care provided by each professional at the level of the individual patient. RESULTS We found that only 26% of patients had synonyms (identical UMLS CUIs) between the two professions' documentation. On average, physicians' discharge summaries contain 27 terms and nurses' documentation, 18. Traversing the UMLS network, we found an average of 4 terms related (distance less than 2) between the professions, leaving most concepts as unrelated between nurse and physician care. CONCLUSION Our hypothesis that physicians' and nurses' practice domains are markedly different is supported by the preliminary, quantitative evidence we found. Leveraging the UMLS network and graph traversal algorithms allows us to compare and contrast nursing and physician care on a single patient, enabling a more complete picture of patient care. We can differentiate professional contributions to patient outcomes and related and divergent concepts by each profession.
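The comparison described in this abstract (identical concepts versus concepts within a short graph distance) can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify; it is a plain breadth-first search over a toy undirected graph. The graph, the concept names (placeholders, not real CUIs), and the helper names `bfs_distances` and `compare_terms` are all assumptions made for illustration.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return the hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def compare_terms(graph, physician_terms, nurse_terms, max_distance=1):
    """Return (shared concepts, related physician/nurse pairs within max_distance)."""
    physician, nurse = set(physician_terms), set(nurse_terms)
    shared = physician & nurse
    related = set()
    for p in physician - shared:
        dist = bfs_distances(graph, p)
        for n in nurse - shared:
            if n in dist and dist[n] <= max_distance:
                related.add((p, n))
    return shared, related


# Hypothetical concept graph; node names are placeholders, not real UMLS CUIs.
TOY_GRAPH = {
    "PAIN": ["ACUTE_PAIN"],
    "ACUTE_PAIN": ["PAIN"],
    "INFECTION": ["INFECTION_RISK"],
    "INFECTION_RISK": ["INFECTION"],
}

shared, related = compare_terms(
    TOY_GRAPH,
    physician_terms=["PAIN", "INFECTION"],
    nurse_terms=["PAIN", "INFECTION_RISK"],
)
print(shared)   # concepts documented by both professions
print(related)  # distinct concepts that are direct neighbors
```

With `max_distance=1` the sketch treats "distance less than 2" as direct neighbors; whether the study counted distances this way is not stated in the abstract.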
Affiliation(s)
- Andrew D Boyd
- Department of Biomedical and Health Information Sciences, College of Applied Health Sciences, University of Illinois at Chicago, 1919 W Taylor St., Chicago, IL 60612, United States
- Karen Dunn Lopez
- Department of Health System Science, College of Nursing, University of Illinois at Chicago, 845 South Damen Ave, Chicago, IL 60612, United States
- Camillo Lugaresi
- Department of Computer Science, College of Engineering, University of Illinois at Chicago, 851 South Morgan Street, Chicago, IL 60607, United States
- Tamara Macieira
- Department of Health System Science, College of Nursing, University of Illinois at Chicago, 845 South Damen Ave, Chicago, IL 60612, United States
- Vanessa Sousa
- Department of Health System Science, College of Nursing, University of Illinois at Chicago, 845 South Damen Ave, Chicago, IL 60612, United States
- Sabita Acharya
- Department of Computer Science, College of Engineering, University of Illinois at Chicago, 851 South Morgan Street, Chicago, IL 60607, United States
- Abhinaya Balasubramanian
- Department of Computer Science, College of Engineering, University of Illinois at Chicago, 851 South Morgan Street, Chicago, IL 60607, United States
- Khawllah Roussi
- Department of Biomedical and Health Information Sciences, College of Applied Health Sciences, University of Illinois at Chicago, 1919 W Taylor St., Chicago, IL 60612, United States
- Gail M Keenan
- Department of Health Care Environments and Systems, College of Nursing, University of Florida, PO Box 100187, Gainesville, FL 32610, United States
- Yves A Lussier
- Department of Medicine, College of Medicine, University of Arizona, 1501 N. Campbell Dr, Tucson, AZ 85724, United States; The University of Arizona Health Sciences Center, 1295 North Martin Ave, Tucson, AZ 85721, United States
- Jianrong 'John' Li
- Department of Medicine, College of Medicine, University of Arizona, 1501 N. Campbell Dr, Tucson, AZ 85724, United States; The University of Arizona Health Sciences Center, 1295 North Martin Ave, Tucson, AZ 85721, United States
- Michel Burton
- Department of Biomedical and Health Information Sciences, College of Applied Health Sciences, University of Illinois at Chicago, 1919 W Taylor St., Chicago, IL 60612, United States
- Barbara Di Eugenio
- Department of Computer Science, College of Engineering, University of Illinois at Chicago, 851 South Morgan Street, Chicago, IL 60607, United States
3
Madkour M, Benhaddou D, Tao C. Temporal data representation, normalization, extraction, and reasoning: A review from clinical domain. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2016; 128:52-68. [PMID: 27040831 PMCID: PMC4837648 DOI: 10.1016/j.cmpb.2016.02.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/18/2015] [Accepted: 02/16/2016] [Indexed: 05/04/2023]
Abstract
BACKGROUND AND OBJECTIVE We live our lives by the calendar and the clock, but time is also an abstraction, even an illusion. The sense of time can be both domain-specific and complex, and is often left implicit, requiring significant domain knowledge to accurately recognize and harness. In the clinical domain, the momentum gained from recent advances in infrastructure and governance practices has enabled the collection of a tremendous amount of data at each moment in time. Electronic health records (EHRs) have paved the way to making these data available for practitioners and researchers. However, temporal data representation, normalization, extraction and reasoning are essential for mining such massive data and therefore for constructing the clinical timeline. The objective of this work is to provide an overview of the problem of constructing a timeline at the clinical point of care and to summarize the state of the art in processing temporal information in clinical narratives. METHODS This review surveys the methods used in three important areas: modeling and representing time, medical NLP methods for extracting time, and methods for time reasoning and processing. The review emphasizes the existing gap between present methods and semantic web technologies and considers possible combinations of the two. RESULTS The main findings of this review reveal the importance of time processing not only in constructing timelines and clinical decision support systems but also as a vital component of EHR data models and operations. CONCLUSIONS Extracting temporal information from clinical narratives is a challenging task. The inclusion of ontologies and the semantic web will lead to better assessment of the annotation task and, together with medical NLP techniques, will help resolve granularity and co-reference resolution problems.
Affiliation(s)
- Mohcine Madkour
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin St, Houston, TX 77030, United States
- Driss Benhaddou
- Department of Engineering Technology, University of Houston, 4800 Calhoun Rd, Houston, TX 77004, United States
- Cui Tao
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin St, Houston, TX 77030, United States
4
Building Integrated Ontological Knowledge Structures with Efficient Approximation Algorithms. BIOMED RESEARCH INTERNATIONAL 2015; 2015:501528. [PMID: 26550571 PMCID: PMC4621328 DOI: 10.1155/2015/501528] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/22/2014] [Revised: 12/30/2014] [Accepted: 01/01/2015] [Indexed: 11/29/2022]
Abstract
The integration of ontologies builds knowledge structures which bring new understanding of existing terminologies and their associations. With the steady increase in the number of ontologies, automatic integration of ontologies is preferable over manual solutions in many applications. However, available works on ontology integration are largely heuristic without guarantees on the quality of the integration results. In this work, we focus on the integration of ontologies with hierarchical structures. We identified optimal structures in this problem and proposed optimal and efficient approximation algorithms for integrating a pair of ontologies. Furthermore, we extended the basic problem to address the integration of a large number of ontologies, and correspondingly we proposed an efficient approximation algorithm for integrating multiple ontologies. The empirical study on both real ontologies and synthetic data demonstrates the effectiveness of our proposed approaches. In addition, the results of integration between gene ontology and National Drug File Reference Terminology suggest that our method provides a novel way to perform association studies between biomedical terms.
5
Albin A, Ji X, Borlawsky TB, Ye Z, Lin S, Payne PR, Huang K, Xiang Y. Enabling online studies of conceptual relationships between medical terms: developing an efficient web platform. JMIR Med Inform 2014; 2:e23. [PMID: 25600290 PMCID: PMC4288067 DOI: 10.2196/medinform.3387] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 03/11/2014] [Revised: 08/15/2014] [Accepted: 08/16/2014] [Indexed: 11/13/2022] Open
Abstract
Background The Unified Medical Language System (UMLS) contains many important ontologies in which terms are connected by semantic relations. For many studies on the relationships between biomedical concepts, the use of transitively associated information from ontologies and the UMLS has been shown to be effective. Although there are a few tools and methods available for extracting transitive relationships from the UMLS, they usually have major restrictions on the length of transitive relations or on the number of data sources. Objective Our goal was to design an efficient online platform that enables efficient studies on the conceptual relationships between any medical terms. Methods To overcome the restrictions of available methods and to facilitate studies on the conceptual relationships between medical terms, we developed a Web platform, onGrid, that supports efficient transitive queries and conceptual relationship studies using the UMLS. This framework uses the latest technique in converting natural language queries into UMLS concepts, performs efficient transitive queries, and visualizes the result paths. It also dynamically builds a relationship matrix for two sets of input biomedical terms. We are thus able to perform effective studies on conceptual relationships between medical terms based on their relationship matrix. Results The advantage of onGrid is that it can be applied to study any two sets of biomedical concept relations and the relations within one set of biomedical concepts. We use onGrid to study the disease-disease relationships in the Online Mendelian Inheritance in Man (OMIM). By crossvalidating our results with an external database, the Comparative Toxicogenomics Database (CTD), we demonstrated that onGrid is effective for the study of conceptual relationships between medical terms. Conclusions onGrid is an efficient tool for querying the UMLS for transitive relations, studying the relationship between medical terms, and generating hypotheses.
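The idea of a relationship matrix built from transitive queries can be sketched as follows. This is only an illustration under stated assumptions, not onGrid's implementation: the directed relation table, the concept names, the hop limit, and the helpers `reachable_within` and `relationship_matrix` are all hypothetical.

```python
def reachable_within(relations, start, max_hops):
    """Map each concept reachable from start to its minimum hop count, up to max_hops."""
    hops = {start: 0}
    frontier = [start]
    for depth in range(1, max_hops + 1):
        next_frontier = []
        for node in frontier:
            for target in relations.get(node, ()):
                if target not in hops:
                    hops[target] = depth
                    next_frontier.append(target)
        frontier = next_frontier
    return hops


def relationship_matrix(relations, rows, cols, max_hops=3):
    """Cell [i][j] holds the hop count from rows[i] to cols[j], or None if unreachable."""
    matrix = []
    for row_term in rows:
        hops = reachable_within(relations, row_term, max_hops)
        matrix.append([hops.get(col_term) for col_term in cols])
    return matrix


# Hypothetical directed relations between placeholder concepts.
TOY_RELATIONS = {
    "disease_a": ["gene_x"],
    "gene_x": ["disease_b"],
    "disease_b": ["symptom_y"],
    "disease_c": [],
}

rows = ["disease_a", "disease_c"]
cols = ["disease_b", "symptom_y"]
short_matrix = relationship_matrix(TOY_RELATIONS, rows, cols, max_hops=2)
long_matrix = relationship_matrix(TOY_RELATIONS, rows, cols, max_hops=3)
```

Raising `max_hops` from 2 to 3 fills in the `disease_a` to `symptom_y` cell, which shows why limits on the length of transitive relations, the restriction the abstract mentions, change what a study can detect.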
Affiliation(s)
- Aaron Albin
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
6
Ren K, Lai AM, Mukhopadhyay A, Machiraju R, Huang K, Xiang Y. Effectively processing medical term queries on the UMLS Metathesaurus by layered dynamic programming. BMC Med Genomics 2014; 7 Suppl 1:S11. [PMID: 25079259 PMCID: PMC4101532 DOI: 10.1186/1755-8794-7-s1-s11] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 01/29/2023] Open
Abstract
Background Mapping medical terms to standardized UMLS concepts is a basic step for leveraging biomedical texts in data management and analysis. However, available methods and tools have major limitations in handling queries over the UMLS Metathesaurus that contain inaccurate query terms, which frequently appear in real world applications. Methods To provide a practical solution for this task, we propose a layered dynamic programming mapping (LDPMap) approach, which can efficiently handle these queries. LDPMap uses indexing and two layers of dynamic programming techniques to efficiently map a biomedical term to a UMLS concept. Results Our empirical study shows that LDPMap achieves much faster query speeds than LCS. In comparison to the UMLS Metathesaurus Browser and MetaMap, LDPMap is much more effective in querying the UMLS Metathesaurus for inaccurately spelled medical terms, long medical terms, and medical terms with special characters. Conclusions These results demonstrate that LDPMap is an efficient and effective method for mapping medical terms to the UMLS Metathesaurus.
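To illustrate why dynamic programming helps with inaccurately spelled query terms, the sketch below uses ordinary Levenshtein edit distance to pick the closest entry from a tiny vocabulary. It is deliberately not LDPMap: the abstract does not describe the indexing or the two layers in enough detail to reproduce them, so a standard technique is substituted here. The vocabulary and the helper names `edit_distance` and `best_match` are assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def best_match(query, vocabulary):
    """Return the vocabulary term with the smallest edit distance to the query."""
    normalized = query.lower()
    return min(vocabulary, key=lambda term: edit_distance(normalized, term.lower()))


# Hypothetical three-term vocabulary standing in for a concept dictionary.
VOCAB = ["myocardial infarction", "diabetes mellitus", "hypertension"]
match = best_match("diabetis melitus", VOCAB)
print(match)
```

A linear scan like `best_match` compares the query against every term, which would be slow on a vocabulary the size of the Metathesaurus; the indexing step mentioned in the abstract presumably exists to avoid that cost.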
7
Smith J, Denny J, Chen Q, Nian H, Spickard III A, Rosenbloom ST, Miller RA. Lessons learned from developing a drug evidence base to support pharmacovigilance. Appl Clin Inform 2013; 4:596-617. [PMID: 24454585 PMCID: PMC3885918 DOI: 10.4338/aci-2013-08-ra-0062] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 08/23/2013] [Accepted: 11/06/2013] [Indexed: 12/14/2022] Open
Abstract
OBJECTIVE This work identified challenges associated with extraction and representation of medication-related information from publicly available electronic sources. METHODS We gained direct observational experience through creating and evaluating the Drug Evidence Base (DEB), a repository of drug indications and adverse effects (ADEs), and supplemented this through literature review. We extracted DEB content from the National Drug File Reference Terminology, from aggregated MEDLINE co-occurrence data, and from the National Library of Medicine's DailyMed. To understand better the similarities, differences and problems with the content of DEB and the SIDER Side Effect Resource, and Vanderbilt's MEDI Indication Resource, we carried out statistical evaluations and human expert reviews. RESULTS While DEB, SIDER, and MEDI often agreed on medication indications and side effects, cross-system shortcomings limit their current utility. The drug information resources we evaluated frequently employed multiple, disparate vaguely related UMLS concepts to represent a single specific clinical drug indication or adverse effect. Thus, evaluations comparing drug-indication and drug-ADE coverage for such resources will encounter substantial numbers of false negative and false positive matches. Furthermore, our review found that many indication and ADE relationships are too complex - logically and temporally - to represent within existing systems. CONCLUSION To enhance applicability and utility, future drug information systems deriving indications and ADEs from public resources must represent clinical concepts uniformly and as precisely as possible. Future systems must also better represent the inherent complexity of indications and ADEs.
Affiliation(s)
- J.C. Smith
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- H. Nian
- Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA; School of Nursing, Vanderbilt University, Nashville, Tennessee, USA
8
Leveraging concept-based approaches to identify potential phyto-therapies. J Biomed Inform 2013; 46:602-14. [PMID: 23665360 DOI: 10.1016/j.jbi.2013.04.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/29/2012] [Revised: 04/24/2013] [Accepted: 04/25/2013] [Indexed: 02/04/2023]
Abstract
The potential of plant-based remedies has been documented in both traditional and contemporary biomedical literature. Such types of text sources may thus be sources from which one might identify potential plant-based therapies ("phyto-therapies"). Concept-based analytic approaches have been shown to uncover knowledge embedded within biomedical literature. However, to date there has been limited attention towards leveraging such techniques for the identification of potential phyto-therapies. This study presents concept-based analytic approaches for the retrieval and ranking of associations between plants and human diseases. Focusing on identification of phyto-therapies described in MEDLINE, both MeSH descriptors used for indexing and MetaMap inferred UMLS concepts are considered. Furthermore, the identification and ranking consider both direct (i.e., plant concepts directly correlated with disease concepts) and inferred (i.e., plant concepts associated with disease concepts based on shared signs and symptoms) relationships. Based on the two scoring methodologies used in this study, it was found that a Vector Space Model approach outperformed probabilistic reliability based inferences. An evaluation of the approach is provided based on therapeutic interventions catalogued in both ClinicalTrials.gov and NDF-RT. The promising findings from this feasibility study highlight the challenges and applicability of concept-based analytic strategies for distilling phyto-therapeutic knowledge from text based knowledge sources like MEDLINE.
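A Vector Space Model ranking of the kind this abstract mentions can be sketched with cosine similarity over concept-count vectors. The study's actual features and weighting are not given in the abstract, so everything below is assumed for illustration: the plant names, the symptom counts, and the helpers `cosine` and `rank_plants`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_plants(disease_vector, plant_vectors):
    """Rank plants by similarity of their symptom profile to the disease profile."""
    scores = {name: cosine(vec, disease_vector) for name, vec in plant_vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical co-occurrence counts between concepts and sign/symptom concepts.
DISEASE = {"fever": 3, "cough": 2}
PLANTS = {
    "plant_a": {"fever": 2, "cough": 1},
    "plant_b": {"rash": 4},
    "plant_c": {"cough": 1, "rash": 1},
}

ranking = rank_plants(DISEASE, PLANTS)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Here a plant is linked to the disease only through shared symptom dimensions, which corresponds to the "inferred" relationships the abstract describes; a direct relationship would need plant-disease co-occurrence counts instead.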