Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

104
(from Reference Citation Analysis)

Article PDFs (7)

Cited by > 0 (71)

Searched Name

Yehoshua Perl

Zheng L, Perl Y, He Y. Big knowledge visualization of the COVID-19 CIDO ontology evolution. BMC Med Inform Decis Mak 2023;23:88. [PMID: 37161560 PMCID: PMC10169115 DOI: 10.1186/s12911-023-02184-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/05/2022] [Accepted: 04/20/2023] [Indexed: 05/11/2023] Open

Abstract

BACKGROUND

The extensive international research on medications and vaccines for the devastating COVID-19 pandemic requires a standard reference ontology. Among the current COVID-19 ontologies, the Coronavirus Infectious Disease Ontology (CIDO) is the largest. Furthermore, it is updated frequently and keeps growing. Researchers using CIDO as a reference ontology need a quick update on the content added in a recent release to know how relevant the new concepts are to their research needs. Although CIDO is only a medium-sized ontology, it is still a large knowledge base, posing a challenge for a user interested in obtaining the "big picture" of content changes between releases. Both a theoretical framework and a proper visualization are required to provide such a "big picture".

METHODS

The child-of-based layout of the weighted aggregate partial-area taxonomy summarization network (WAT) provides a convenient "big picture" visualization of the content of an ontology. In this paper, we address the "big picture" of content changes between two releases of an ontology. We introduce a new DIFF framework named Diff Weighted Aggregate Taxonomy (DWAT) to display the differences between the WATs of two releases of an ontology. We use a layered approach, which first builds a DWAT of the major subjects in CIDO and then drills down into a major subject of interest in the top-level DWAT to obtain a DWAT of secondary subjects and even further refined layers.
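
As a rough illustration of the diff idea described above, the sketch below compares two releases that have each already been summarized as a mapping from node name to weight (the number of concepts a node summarizes). The function name `diff_weighted_nodes`, the status labels, and the sample node names and counts are all hypothetical. This is not the authors' DWAT implementation, only a minimal sketch of diffing two weighted summaries.

```python
def diff_weighted_nodes(old, new):
    """Compare two {node_name: weight} summaries of an ontology release.

    Returns {node_name: (status, old_weight, new_weight)}, where status is
    "added", "removed", "changed", or "unchanged" (labels are assumptions).
    """
    result = {}
    for name in sorted(set(old) | set(new)):
        before = old.get(name, 0)
        after = new.get(name, 0)
        if name not in old:
            status = "added"
        elif name not in new:
            status = "removed"
        elif before != after:
            status = "changed"
        else:
            status = "unchanged"
        result[name] = (status, before, after)
    return result


# Illustrative data only; not taken from any CIDO release.
release_old = {"drug": 120, "vaccine": 40, "symptom": 75}
release_new = {"drug": 180, "vaccine": 40, "diagnostic kit": 300}

diff = diff_weighted_nodes(release_old, release_new)
for name, (status, before, after) in diff.items():
    print(f"{name}: {status} ({before} -> {after})")
```

A layered view could, in principle, apply the same comparison again to the children of one node of interest, which is the spirit of the drill-down step mentioned in the abstract.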

RESULTS

A visualization of the Diff Weighted Aggregate Taxonomy is demonstrated on the CIDO ontology. The evolution of CIDO between 2020 and 2022 is demonstrated in two perspectives. Drilling down for a DWAT of secondary subject networks is also demonstrated. We illustrate how the DWAT of CIDO provides insight into its evolution.

CONCLUSIONS

The new Diff Weighted Aggregate Taxonomy enables a layered approach to view the "big picture" of the changes in the content between two releases of an ontology.

Keloth VK, Zhou S, Lindemann L, Zheng L, Elhanan G, Einstein AJ, Geller J, Perl Y. Mining of EHR for interface terminology concepts for annotating EHRs of COVID patients. BMC Med Inform Decis Mak 2023;23:40. [PMID: 36829139 PMCID: PMC9951157 DOI: 10.1186/s12911-023-02136-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/22/2022] [Accepted: 02/09/2023] [Indexed: 02/26/2023] Open

Abstract

BACKGROUND

Two years into the COVID-19 pandemic and with more than five million deaths worldwide, the healthcare establishment continues to struggle with every new wave of the pandemic resulting from a new coronavirus variant. Research has demonstrated that there are variations in the symptoms, and even in the order of symptom presentations, in COVID-19 patients infected by different SARS-CoV-2 variants (e.g., Alpha and Omicron). Textual data in the form of admission notes and physician notes in the Electronic Health Records (EHRs) is rich in information regarding the symptoms and their orders of presentation. Unstructured EHR data is often underutilized in research due to the lack of annotations that enable automatic extraction of useful information from the available extensive volumes of textual data.

METHODS

We present the design of a COVID Interface Terminology (CIT), not just a generic COVID-19 terminology, but one serving a specific purpose of enabling automatic annotation of EHRs of COVID-19 patients. CIT was constructed by integrating existing COVID-related ontologies and mining additional fine granularity concepts from clinical notes. The iterative mining approach utilized the techniques of 'anchoring' and 'concatenation' to identify potential fine granularity concepts to be added to the CIT. We also tested the generalizability of our approach on a hold-out dataset and compared the annotation coverage to the coverage obtained for the dataset used to build the CIT.

RESULTS

Our experiments demonstrate that this approach results in higher annotation coverage compared to existing ontologies such as SNOMED CT and Coronavirus Infectious Disease Ontology (CIDO). The final version of CIT achieved about 20% more coverage than SNOMED CT and 50% more coverage than CIDO. In the future, the concepts mined and added into CIT could be used as training data for machine learning models for mining even more concepts into CIT and further increasing the annotation coverage.
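
To make the notion of annotation coverage concrete, here is a minimal sketch that measures coverage as the fraction of words in a note that fall inside at least one matched concept. That definition, the helper names `tokenize` and `annotation_coverage`, and the sample note and concept lists are assumptions for illustration. The abstract does not state how the authors computed coverage.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def annotation_coverage(note, concepts):
    """Fraction of note tokens covered by at least one concept match.

    Assumed definition: a token is covered if it lies inside an exact,
    contiguous match of some concept's word sequence.
    """
    tokens = tokenize(note)
    if not tokens:
        return 0.0
    covered = [False] * len(tokens)
    for concept in concepts:
        words = tokenize(concept)
        n = len(words)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == words:
                for j in range(i, i + n):
                    covered[j] = True
    return sum(covered) / len(tokens)


# Hypothetical note and terminologies, for illustration only.
note = "Patient reports dry cough and shortness of breath"
base_terms = ["cough"]
extended_terms = base_terms + ["dry cough", "shortness of breath"]

base_coverage = annotation_coverage(note, base_terms)
extended_coverage = annotation_coverage(note, extended_terms)
print(base_coverage, extended_coverage)
```

Adding finer-granularity phrases to the term list raises the covered fraction in this toy example, which mirrors the direction of the effect the abstract reports.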

CONCLUSION

In this paper, we demonstrated the construction of a COVID interface terminology that can be utilized for automatically annotating EHRs of COVID-19 patients. The techniques presented can identify frequently documented fine granularity concepts that are missing in other ontologies thereby increasing the annotation coverage.

He Y, Yu H, Huffman A, Lin AY, Natale DA, Beverley J, Zheng L, Perl Y, Wang Z, Liu Y, Ong E, Wang Y, Huang P, Tran L, Du J, Shah Z, Shah E, Desai R, Huang HH, Tian Y, Merrell E, Duncan WD, Arabandi S, Schriml LM, Zheng J, Masci AM, Wang L, Liu H, Smaili FZ, Hoehndorf R, Pendlington ZM, Roncaglia P, Ye X, Xie J, Tang YW, Yang X, Peng S, Zhang L, Chen L, Hur J, Omenn GS, Athey B, Smith B. A comprehensive update on CIDO: the community-based coronavirus infectious disease ontology. J Biomed Semantics 2022;13:25. [PMID: 36271389 PMCID: PMC9585694 DOI: 10.1186/s13326-022-00279-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/17/2022] [Accepted: 09/13/2022] [Indexed: 11/24/2022] Open

Abstract

Background

The current COVID-19 pandemic and the previous SARS/MERS outbreaks of 2003 and 2012 have resulted in a series of major global public health crises. We argue that, in the interest of developing effective and safe vaccines and drugs and of better understanding coronaviruses and associated disease mechanisms, it is necessary to integrate the large and exponentially growing body of heterogeneous coronavirus data. Ontologies play an important role in standard-based knowledge and data representation, integration, sharing, and analysis. Accordingly, we initiated the development of the community-based Coronavirus Infectious Disease Ontology (CIDO) in early 2020.

Results

As an Open Biomedical Ontology (OBO) library ontology, CIDO is open source and interoperable with other existing OBO ontologies. CIDO is aligned with the Basic Formal Ontology and Viral Infectious Disease Ontology. CIDO has imported terms from over 30 OBO ontologies. For example, CIDO imports all SARS-CoV-2 protein terms from the Protein Ontology, COVID-19-related phenotype terms from the Human Phenotype Ontology, and over 100 COVID-19 terms for vaccines (both authorized and in clinical trial) from the Vaccine Ontology. CIDO systematically represents variants of SARS-CoV-2 viruses and over 300 amino acid substitutions therein, along with over 300 diagnostic kits and methods. CIDO also describes hundreds of host-coronavirus protein-protein interactions (PPIs) and the drugs that target proteins in these PPIs. CIDO has been used to model COVID-19 related phenomena in areas such as epidemiology. The scope of CIDO was evaluated by visual analysis supported by a summarization network method. CIDO has been used in various applications such as term standardization, inference, natural language processing (NLP) and clinical data integration. We have applied the amino acid variant knowledge present in CIDO to analyze differences between SARS-CoV-2 Delta and Omicron variants. CIDO's integrative host-coronavirus PPIs and drug-target knowledge has also been used to support drug repurposing for COVID-19 treatment.

Conclusion

CIDO represents entities and relations in the domain of coronavirus diseases with a special focus on COVID-19. It supports shared knowledge representation, data and metadata standardization and integration, and has been used in a range of applications.

Supplementary Information

The online version contains supplementary material available at 10.1186/s13326-022-00279-z.

Affiliation(s)

Yongqun He University of Michigan Medical School, Ann Arbor, MI, USA
Hong Yu People's Hospital of Guizhou Province, Guiyang, Guizhou, China
Anthony Huffman University of Michigan Medical School, Ann Arbor, MI, USA
Asiyah Yu Lin National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA; National Center for Ontological Research, Buffalo, NY, USA
Darren A Natale Georgetown University Medical Center, Washington, DC, USA
John Beverley National Center for Ontological Research, Buffalo, NY, USA; The Johns Hopkins University Applied Physics Laboratory, Laurel, MD, USA
Ling Zheng Computer Science and Software Engineering Department, Monmouth University, West Long Branch, NJ, USA
Yehoshua Perl Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
Zhigang Wang Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College, Beijing, China
Yingtong Liu University of Michigan Medical School, Ann Arbor, MI, USA
Edison Ong University of Michigan Medical School, Ann Arbor, MI, USA
Yang Wang University of Michigan Medical School, Ann Arbor, MI, USA; People's Hospital of Guizhou Province, Guiyang, Guizhou, China
Philip Huang University of Michigan Medical School, Ann Arbor, MI, USA
Long Tran University of Michigan Medical School, Ann Arbor, MI, USA
Jinyang Du University of Michigan Medical School, Ann Arbor, MI, USA
Zalan Shah University of Michigan Medical School, Ann Arbor, MI, USA
Easheta Shah University of Michigan Medical School, Ann Arbor, MI, USA
Roshan Desai University of Michigan Medical School, Ann Arbor, MI, USA
Hsin-Hui Huang University of Michigan Medical School, Ann Arbor, MI, USA; National Yang-Ming University, Taipei, Taiwan
Yujia Tian Rutgers University, New Brunswick, NJ, USA
Eric Merrell University at Buffalo, Buffalo, NY, 14260, USA
William D Duncan University of Florida, Gainesville, FL, USA
Sivaram Arabandi OntoPro LLC, Houston, TX, USA
Lynn M Schriml University of Maryland School of Medicine, Baltimore, MD, USA
Jie Zheng Department of Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Anna Maria Masci Office of Data Science, National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA
Liwei Wang Mayo Clinic, Rochester, MN, USA
Hongfang Liu Mayo Clinic, Rochester, MN, USA
Fatima Zohra Smaili King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Robert Hoehndorf King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Zoë May Pendlington European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
Paola Roncaglia European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
Xianwei Ye People's Hospital of Guizhou Province, Guiyang, Guizhou, China
Jiangan Xie School of Bioinformatics, Chongqing University of Posts and Telecommunications, Chongqing, China
Yi-Wei Tang Cepheid, Danaher Diagnostic Platform, Shanghai, China
Xiaolin Yang Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College, Beijing, China
Suyuan Peng National Institute of Health Data Science, Peking University, Beijing, China
Luxia Zhang National Institute of Health Data Science, Peking University, Beijing, China
Luonan Chen Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
Junguk Hur University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND, USA
Gilbert S Omenn University of Michigan Medical School, Ann Arbor, MI, USA
Brian Athey University of Michigan Medical School, Ann Arbor, MI, USA
Barry Smith National Center for Ontological Research, Buffalo, NY, USA; University at Buffalo, Buffalo, NY, 14260, USA

Liu C, Lee J, Ta C, Soroush A, Rogers JR, Kim JH, Natarajan K, Zucker J, Perl Y, Weng C. Risk Factors Associated With SARS-CoV-2 Breakthrough Infections in Fully mRNA-Vaccinated Individuals: Retrospective Analysis. JMIR Public Health Surveill 2022;8:e35311. [PMID: 35486806 PMCID: PMC9132195 DOI: 10.2196/35311] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/02/2021] [Revised: 03/29/2022] [Accepted: 04/27/2022] [Indexed: 01/13/2023] Open

Abstract

BACKGROUND

COVID-19 messenger RNA (mRNA) vaccines have demonstrated efficacy and effectiveness in preventing symptomatic COVID-19, while being relatively safe in trial studies. However, vaccine breakthrough infections have been reported.

OBJECTIVE

This study aims to identify risk factors associated with COVID-19 breakthrough infections among fully mRNA-vaccinated individuals.

METHODS

We conducted a series of observational retrospective analyses using the electronic health records (EHRs) of the Columbia University Irving Medical Center/New York Presbyterian (CUIMC/NYP) up to September 21, 2021. New York City (NYC) adult residents with at least 1 polymerase chain reaction (PCR) record were included in this analysis. Poisson regression was performed to assess the association between the breakthrough infection rate in vaccinated individuals and multiple risk factors (including vaccine brand, demographics, and underlying conditions) while adjusting for calendar month, prior number of visits, and observational days in the EHR.
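
For readers unfamiliar with incidence rate ratios, the sketch below computes a crude (unadjusted) IRR between two groups with a Wald confidence interval on the log scale. This is the quantity a Poisson regression with a single binary covariate and a log person-time offset would estimate. It does not reproduce the study's adjusted model, and the function name `incidence_rate_ratio` and the event counts and person-time values are hypothetical.

```python
import math


def incidence_rate_ratio(events_a, time_a, events_b, time_b, z=1.96):
    """Crude incidence rate ratio of group A versus group B.

    events_* are event counts and time_* are person-time at risk.
    The interval is a Wald interval on the log scale, using the standard
    approximation SE(log IRR) = sqrt(1/events_a + 1/events_b).
    """
    rate_a = events_a / time_a
    rate_b = events_b / time_b
    irr = rate_a / rate_b
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Made-up counts for illustration; these are not the study's data.
irr, lower, upper = incidence_rate_ratio(
    events_a=30, time_a=10_000, events_b=20, time_b=10_000
)
print(f"IRR = {irr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 suggests a difference in rates between the two groups; adjusting for covariates such as calendar month requires a full regression model instead of this two-group calculation.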

RESULTS

The overall estimated breakthrough infection rate was 0.16 (95% CI 0.14-0.18). Individuals who were vaccinated with Pfizer/BNT162b2 (incidence rate ratio [IRR] against Moderna/mRNA-1273=1.66, 95% CI 1.17-2.35), were male (IRR against female=1.47, 95% CI 1.11-1.94), and had compromised immune systems (IRR=1.48, 95% CI 1.09-2.00) were at the highest risk for breakthrough infections. Among all underlying conditions, those with primary immunodeficiency, a history of organ transplant, an active tumor, use of immunosuppressant medications, or Alzheimer disease were at the highest risk.

CONCLUSIONS

Although we found both mRNA vaccines were effective, Moderna/mRNA-1273 had a lower incidence rate of breakthrough infections. Immunocompromised and male individuals were among the highest risk groups experiencing breakthrough infections. Given the rapidly changing nature of the SARS-CoV-2 pandemic, continued monitoring and a generalizable analysis pipeline are warranted to inform quick updates on vaccine effectiveness in real time.

Zheng L, Perl Y, He Y, Ochs C, Geller J, Liu H, Keloth VK. Visual comprehension and orientation into the COVID-19 CIDO ontology. J Biomed Inform 2021;120:103861. [PMID: 34224898 PMCID: PMC8252699 DOI: 10.1016/j.jbi.2021.103861] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/23/2020] [Revised: 05/11/2021] [Accepted: 06/30/2021] [Indexed: 12/12/2022]

Abstract

The current intensive research on potential remedies and vaccinations for COVID-19 would greatly benefit from an ontology of standardized COVID terms. The Coronavirus Infectious Disease Ontology (CIDO) is the largest among several COVID ontologies, and it keeps growing, but it is still a medium sized ontology. Sophisticated CIDO users, who need more than searching for a specific concept, require orientation and comprehension of CIDO. In previous research, we designed a summarization network called "partial-area taxonomy" to support comprehension of ontologies. The partial-area taxonomy for CIDO is of smaller magnitude than CIDO, but is still too large for comprehension. We present here the "weighted aggregate taxonomy" of CIDO, designed to provide compact views at various granularities of our partial-area taxonomy (and the CIDO ontology). Such a compact view provides a "big picture" of the content of an ontology. In previous work, in the visualization patterns used for partial-area taxonomies, the nodes were arranged in levels according to the numbers of relationships of their concepts. Applying this visualization pattern to CIDO's weighted aggregate taxonomy resulted in an overly long and narrow layout that does not support orientation and comprehension since the names of nodes are barely readable. Thus, we introduce in this paper an innovative visualization of the weighted aggregate taxonomy for better orientation and comprehension of CIDO (and other ontologies). A measure for the efficiency of a layout is introduced and is used to demonstrate the advantage of the new layout over the previous one. With this new visualization, the user can "see the forest for the trees" of the ontology. Benefits of this visualization in highlighting insights into CIDO's content are provided. Generality of the new layout is demonstrated.

Zheng L, Min H, Chen Y, Keloth V, Geller J, Perl Y, Hripcsak G. Outlier concepts auditing methodology for a large family of biomedical ontologies. BMC Med Inform Decis Mak 2020;20:296. [PMID: 33319713 PMCID: PMC7737254 DOI: 10.1186/s12911-020-01311-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/26/2020] [Accepted: 10/28/2020] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Summarization networks are compact summaries of ontologies. The "Big Picture" view offered by summarization networks makes it possible to identify sets of concepts that are more likely to have errors than control concepts. For ontologies that have outgoing lateral relationships, we have developed the "partial-area taxonomy" summarization network. Prior research has identified one kind of outlier concept: concepts of small partial-areas within partial-area taxonomies. Previously, we have shown that the small partial-area technique works successfully for four ontologies (or their hierarchies).

METHODS

To improve the Quality Assurance (QA) scalability, a family-based QA framework, where one QA technique is potentially applicable to a whole family of ontologies with similar structural features, was developed. The 373 ontologies hosted at the NCBO BioPortal in 2015 were classified into a collection of families based on structural features. A meta-ontology represents this family collection, including one family of ontologies having outgoing lateral relationships. The process of updating the current meta-ontology is described. To conclude that one QA technique is applicable to at least half of the members of a family F, this technique should be demonstrated as successful for six out of six ontologies in F. We describe a hypothesis setting the condition required for a technique to be successful for a given ontology. The process of a study to demonstrate such success is described. This paper intends to prove the scalability of the small partial-area technique.
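
The abstract does not explain where the "six out of six" requirement comes from. One possible reading, which is an assumption here and not something the source states, is a sign-test argument: if the technique succeeded for an ontology with probability only 0.5, observing n successes in n trials would have a two-sided p-value of 2 × 0.5^n, and the sketch below finds the smallest n for which that value drops below 0.05. The function name `min_all_success_trials` and the choice of significance level are hypothetical.

```python
def min_all_success_trials(alpha=0.05, two_sided=True):
    """Smallest n such that n successes out of n trials is significant.

    Assumes a null success probability of 0.5 (a sign test). Returns the
    pair (n, p_value). This is an illustrative reading, not the authors'
    stated derivation.
    """
    n = 1
    while True:
        p_value = 0.5 ** n
        if two_sided:
            p_value *= 2
        if p_value < alpha:
            return n, p_value
        n += 1


n_two_sided, p_two_sided = min_all_success_trials(two_sided=True)
n_one_sided, p_one_sided = min_all_success_trials(two_sided=False)
print(n_two_sided, p_two_sided)
print(n_one_sided, p_one_sided)
```

Under this reading, the two-sided version is the one that lands on six trials; the one-sided version would already be satisfied with fewer, so the authors' actual reasoning may differ.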

RESULTS

We first updated the meta-ontology classifying 566 BioPortal ontologies. There were 371 ontologies in the family with outgoing lateral relationships. We demonstrated the success of the small partial-area technique for two ontology hierarchies which belong to this family, SNOMED CT's Specimen hierarchy and NCIt's Gene hierarchy. Together with the four previous ontologies from the same family, we fulfilled the "six out of six" condition required to show the scalability for the whole family.

CONCLUSIONS

We have shown that the small partial-area technique can potentially be successful for the family of ontologies with outgoing lateral relationships in BioPortal, thus improving the scalability of this QA technique.

Zheng L, Chen Y, Min H, Hildebrand PL, Liu H, Halper M, Geller J, de Coronado S, Perl Y. Missing lateral relationships in top-level concepts of an ontology. BMC Med Inform Decis Mak 2020;20:305. [PMID: 33319709 PMCID: PMC7737264 DOI: 10.1186/s12911-020-01319-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/29/2020] [Accepted: 11/09/2020] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Ontologies house various kinds of domain knowledge in formal structures, primarily in the form of concepts and the associative relationships between them. Ontologies have become integral components of many health information processing environments. Hence, quality assurance of the conceptual content of any ontology is critical. Relationships are foundational to the definition of concepts. Missing relationship errors (i.e., unintended omissions of important definitional relationships) can have a deleterious effect on the quality of an ontology. An abstraction network is a structure that overlays an ontology and provides an alternate, summarization view of its contents. One kind of abstraction network is called an area taxonomy, and a variation of it is called a subtaxonomy. A methodology based on these taxonomies for more readily finding missing relationship errors is explored.

METHODS

The area taxonomy and the subtaxonomy are deployed to help reveal concepts that have a high likelihood of exhibiting missing relationship errors. A specific top-level grouping unit found within the area taxonomy and subtaxonomy, when deemed to be anomalous, is used as an indicator that missing relationship errors are likely to be found among certain concepts. Two hypotheses pertaining to the effectiveness of our Quality Assurance approach are studied.

RESULTS

Our Quality Assurance methodology was applied to the Biological Process hierarchy of the National Cancer Institute thesaurus (NCIt) and SNOMED CT's Eye/vision finding subhierarchy within its Clinical finding hierarchy. Many missing relationship errors were discovered and confirmed in our analysis. For both test-bed hierarchies, our Quality Assurance methodology yielded a statistically significantly higher number of concepts with missing relationship errors in comparison to a control sample of concepts. Two hypotheses are confirmed by these findings.

CONCLUSIONS

Quality assurance is a critical part of an ontology's lifecycle, and automated or semi-automated tools for supporting this process are invaluable. We introduced a Quality Assurance methodology targeted at missing relationship errors. Its successful application to the NCIt's Biological Process hierarchy and SNOMED CT's Eye/vision finding subhierarchy indicates that it can be a useful addition to the arsenal of tools available to ontology maintenance personnel.

Liu H, Perl Y, Geller J. Concept placement using BERT trained by transforming and summarizing biomedical ontology structure. J Biomed Inform 2020;112:103607. [PMID: 33098987 DOI: 10.1016/j.jbi.2020.103607] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/18/2020] [Revised: 09/07/2020] [Accepted: 10/17/2020] [Indexed: 11/17/2022]

Zheng L, He Z, Wei D, Keloth V, Fan JW, Lindemann L, Zhu X, Cimino JJ, Perl Y. A review of auditing techniques for the Unified Medical Language System. J Am Med Inform Assoc 2020;27:1625-1638. [PMID: 32766692 PMCID: PMC7566540 DOI: 10.1093/jamia/ocaa108] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/10/2020] [Revised: 05/05/2020] [Accepted: 05/13/2020] [Indexed: 11/12/2022] Open

Liu H, Perl Y, Geller J. Transfer Learning from BERT to Support Insertion of New Concepts into SNOMED CT. AMIA Annu Symp Proc 2020;2019:1129-1138. [PMID: 32308910 PMCID: PMC7153142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]

Zheng L, Liu H, Perl Y, Geller J. Training a Convolutional Neural Network with Terminology Summarization Data Improves SNOMED CT Enrichment. AMIA Annu Symp Proc 2020;2019:972-981. [PMID: 32308894 PMCID: PMC7153126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]

Zheng L, Liu H, Perl Y, Geller J, Ochs C, Case JT. Overlapping Complex Concepts Have More Commission Errors, Especially in Intensive Terminology Auditing. AMIA Annu Symp Proc 2018;2018:1157-1166. [PMID: 30815158 PMCID: PMC6371375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]

Liu H, Geller J, Halper M, Perl Y. Using Convolutional Neural Networks to Support Insertion of New Concepts into SNOMED CT. AMIA Annu Symp Proc 2018;2018:750-759. [PMID: 30815117 PMCID: PMC6371320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]

Zheng L, Chen Y, Elhanan G, Perl Y, Geller J, Ochs C. Complex overlapping concepts: An effective auditing methodology for families of similarly structured BioPortal ontologies. J Biomed Inform 2018;83:135-149. [PMID: 29852316 DOI: 10.1016/j.jbi.2018.05.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 10/10/2017] [Revised: 05/25/2018] [Accepted: 05/26/2018] [Indexed: 11/30/2022]

Abstract

In previous research, we have demonstrated for a number of ontologies that structurally complex concepts (for different definitions of "complex") in an ontology are more likely to exhibit errors than other concepts. Thus, such complex concepts often become fertile ground for quality assurance (QA) in ontologies. They should be audited first. One example of complex concepts is given by "overlapping concepts" (to be defined below). Historically, a different auditing methodology had to be developed for every single ontology. For better scalability and efficiency, it is desirable to identify family-wide QA methodologies. Each such methodology would be applicable to a whole family of similar ontologies. In past research, we divided the 685 ontologies of BioPortal into families of structurally similar ontologies. We showed for four ontologies of the same large family in BioPortal that "overlapping concepts" are indeed statistically significantly more likely to exhibit errors. In order to make an authoritative statement concerning the success of "overlapping concepts" as a methodology for a whole family of similar ontologies (or of large subhierarchies of ontologies), it is necessary to show that "overlapping concepts" have a higher likelihood of errors for six out of six ontologies of the family. In this paper, we demonstrate for two more ontologies that "overlapping concepts" can successfully predict groups of concepts with a higher error rate than concepts from a control group. The fifth ontology is the Neoplasm subhierarchy of the National Cancer Institute thesaurus (NCIt). The sixth ontology is the Infectious Disease subhierarchy of SNOMED CT. We demonstrate quality assurance results for both of them. Furthermore, in this paper we observe two novel, important, and useful phenomena during quality assurance of "overlapping concepts." First, an erroneous "overlapping concept" can help with discovering other erroneous "non-overlapping concepts" in its vicinity. Second, correcting erroneous "overlapping concepts" may turn them into "non-overlapping concepts." We demonstrate that this may reduce the complexity of parts of the ontology, which in turn makes the ontology more comprehensible, simplifying maintenance and use of the ontology.

Perl Y, Halper M, Geller J, Kuo F, Cimino JJ, Gu H. Partitioning an Object-Oriented Terminology Schema. Methods Inf Med 2018. [DOI: 10.1055/s-0038-1634167] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 10/17/2022]

He Z, Perl Y, Elhanan G, Chen Y, Geller J, Bian J. Auditing the Assignments of Top-Level Semantic Types in the UMLS Semantic Network to UMLS Concepts. Proceedings (IEEE Int Conf Bioinformatics Biomed) 2017;2017:1262-1269. [PMID: 29375930 DOI: 10.1109/bibm.2017.8217840] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]

Zheng L, Yumak H, Chen L, Ochs C, Geller J, Kapusnik-Uner J, Perl Y. Quality assurance of chemical ingredient classification for the National Drug File - Reference Terminology. J Biomed Inform 2017;73:30-42. [PMID: 28723580 DOI: 10.1016/j.jbi.2017.07.013] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 01/27/2017] [Revised: 07/13/2017] [Accepted: 07/14/2017] [Indexed: 02/04/2023]

Elhanan G, Ochs C, Mejino JLV, Liu H, Mungall CJ, Perl Y. From SNOMED CT to Uberon: Transferability of evaluation methodology between similarly structured ontologies. Artif Intell Med 2017;79:9-14. [PMID: 28532962 DOI: 10.1016/j.artmed.2017.05.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 09/13/2016] [Revised: 05/03/2017] [Accepted: 05/04/2017] [Indexed: 12/29/2022]

Min H, Zheng L, Perl Y, Halper M, De Coronado S, Ochs C. Relating Complexity and Error Rates of Ontology Concepts. More Complex NCIt Concepts Have More Errors. Methods Inf Med 2017;56:200-208. [PMID: 28244549 DOI: 10.3414/me16-01-0085] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 07/13/2016] [Accepted: 01/19/2017] [Indexed: 11/09/2022]

Abstract

OBJECTIVES

Ontologies are knowledge structures that lend support to many health-information systems. A study is carried out to assess the quality of ontological concepts based on a measure of their complexity. The results show a relation between complexity of concepts and error rates of concepts.

METHODS

A measure of lateral complexity defined as the number of exhibited role types is used to distinguish between more complex and simpler concepts. Using a framework called an area taxonomy, a kind of abstraction network that summarizes the structural organization of an ontology, concepts are divided into two groups along these lines. Various concepts from each group are then subjected to a two-phase QA analysis to uncover and verify errors and inconsistencies in their modeling. A hierarchy of the National Cancer Institute thesaurus (NCIt) is used as our test-bed. A hypothesis pertaining to the expected error rates of the complex and simple concepts is tested.
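
The grouping by lateral complexity described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each concept is given as the set of role types it exhibits, and the concept names, role names, and the `threshold` cutoff are hypothetical (the abstract does not state the cutoff, and the study itself derives its groups from an area taxonomy).

```python
from typing import Dict, List, Set, Tuple


def lateral_complexity(role_types: Set[str]) -> int:
    """Lateral complexity: the number of distinct role types a concept exhibits."""
    return len(role_types)


def split_by_complexity(
    concept_roles: Dict[str, Set[str]], threshold: int
) -> Tuple[List[str], List[str]]:
    """Split concepts into (more complex, simpler) groups.

    `threshold` is a hypothetical cutoff: concepts with at least this many
    role types are placed in the more complex group.
    """
    complex_group = sorted(
        c for c, roles in concept_roles.items()
        if lateral_complexity(roles) >= threshold
    )
    simple_group = sorted(
        c for c, roles in concept_roles.items()
        if lateral_complexity(roles) < threshold
    )
    return complex_group, simple_group


# Hypothetical toy data, not taken from the NCIt.
example_roles = {
    "concept_a": {"role_1", "role_2", "role_3"},
    "concept_b": {"role_1"},
    "concept_c": set(),
}
more_complex, simpler = split_by_complexity(example_roles, threshold=2)
print(more_complex, simpler)
```

Samples for auditing could then be drawn from each of the two lists, which is the point at which the QA analysis described in the abstract would begin.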

RESULTS

Our study was done on the NCIt's Biological Process hierarchy. Various errors, including missing roles, incorrect role targets, and incorrectly assigned roles, were discovered and verified in the two phases of our QA analysis. The overall findings confirmed our hypothesis by showing a statistically significant difference between the numbers of errors exhibited by more laterally complex concepts vis-à-vis simpler concepts.

CONCLUSIONS

QA is an essential part of any ontology's maintenance regimen. In this paper, we reported on the results of a QA study targeting two groups of ontology concepts distinguished by their level of complexity, defined in terms of the number of exhibited role types. The study was carried out on a major component of an important ontology, the NCIt. The findings suggest that more complex concepts tend to have a higher error rate than simpler concepts. These findings can be utilized to guide ongoing efforts in ontology QA.

Ochs C, Case JT, Perl Y. Analyzing structural changes in SNOMED CT's Bacterial infectious diseases using a visual semantic delta. J Biomed Inform 2017;67:101-116. [PMID: 28215561 DOI: 10.1016/j.jbi.2017.02.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/29/2016] [Revised: 02/08/2017] [Accepted: 02/09/2017] [Indexed: 12/23/2022]

Ochs C, Case JT, Perl Y. Tracking the Remodeling of SNOMED CT's Bacterial Infectious Diseases. AMIA Annu Symp Proc 2017;2016:974-983. [PMID: 28269894 PMCID: PMC5333319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]

Zheng L, Perl Y, Elhanan G, Ochs C, Geller J, Halper M. Summarizing an Ontology: A "Big Knowledge" Coverage Approach. Stud Health Technol Inform 2017;245:978-982. [PMID: 29295246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]

Liu H, Zheng L, Perl Y, Chen Y, Elhanan G. Correcting Ontology Errors Simplifies Visual Complexity. Stud Health Technol Inform 2017;245:1330. [PMID: 29295411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]

Perl Y, Geller J, Halper M, Ochs C, Zheng L, Kapusnik-Uner J. Introducing the Big Knowledge to Use (BK2U) challenge. Ann N Y Acad Sci 2016;1387:12-24. [PMID: 27750400 DOI: 10.1111/nyas.13225] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 04/27/2016] [Revised: 07/07/2016] [Accepted: 08/11/2016] [Indexed: 12/26/2022]

Ochs C, Geller J, Perl Y, Musen MA. A unified software framework for deriving, visualizing, and exploring abstraction networks for ontologies. J Biomed Inform 2016;62:90-105. [PMID: 27345947 DOI: 10.1016/j.jbi.2016.06.008] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 04/07/2016] [Revised: 06/02/2016] [Accepted: 06/22/2016] [Indexed: 11/27/2022]

Agrawal A, Perl Y, Ochs C, Elhanan G. A contextual auditing method for SNOMED CT concepts. INT J DATA MIN BIOIN 2016. [DOI: 10.1504/ijdmb.2016.078153] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/21/2022]

Ochs C, Perl Y, Halper M, Geller J, Lomax J. Quality assurance of the gene ontology using abstraction networks. J Bioinform Comput Biol 2015;14:1642001. [PMID: 27301779 DOI: 10.1142/s0219720016420014] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022]

Ochs C, Zheng L, Gu H, Perl Y, Geller J, Kapusnik-Uner J, Zakharchenko A. Drug-drug Interaction Discovery Using Abstraction Networks for "National Drug File - Reference Terminology" Chemical Ingredients. AMIA Annu Symp Proc 2015;2015:973-982. [PMID: 26958234 PMCID: PMC4765653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]

Wei D, Helen Gu H, Perl Y, Halper M, Ochs C, Elhanan G, Chen Y. Structural measures to track the evolution of SNOMED CT hierarchies. J Biomed Inform 2015;57:278-87. [PMID: 26260003 DOI: 10.1016/j.jbi.2015.08.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/31/2015] [Revised: 08/01/2015] [Accepted: 08/01/2015] [Indexed: 11/28/2022]

Ochs C, Perl Y, Geller J, Haendel M, Brush M, Arabandi S, Tu S. Summarizing and visualizing structural changes during the evolution of biomedical ontologies using a Diff Abstraction Network. J Biomed Inform 2015;56:127-44. [PMID: 26048076 DOI: 10.1016/j.jbi.2015.05.018] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 09/25/2014] [Revised: 04/01/2015] [Accepted: 05/27/2015] [Indexed: 10/23/2022]

Halper M, Gu H, Perl Y, Ochs C. Abstraction networks for terminologies: Supporting management of "big knowledge". Artif Intell Med 2015;64:1-16. [PMID: 25890687 DOI: 10.1016/j.artmed.2015.03.005] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 11/07/2014] [Revised: 02/24/2015] [Accepted: 03/25/2015] [Indexed: 11/16/2022]

Abstract

OBJECTIVE

Terminologies and terminological systems have assumed important roles in many medical information processing environments, giving rise to the "big knowledge" challenge when terminological content comprises tens of thousands to millions of concepts arranged in a tangled web of relationships. Use and maintenance of knowledge structures on that scale can be daunting. The notion of abstraction network is presented as a means of facilitating the usability, comprehensibility, visualization, and quality assurance of terminologies.

METHODS AND MATERIALS

An abstraction network overlays a terminology's underlying network structure at a higher level of abstraction. In particular, it provides a more compact view of the terminology's content, avoiding the display of minutiae. General abstraction network characteristics are discussed. Moreover, the notion of meta-abstraction network, existing at an even higher level of abstraction than a typical abstraction network, is described for cases where even the abstraction network itself represents a case of "big knowledge." Various features in the design of abstraction networks are demonstrated in a methodological survey of some existing abstraction networks previously developed and deployed for a variety of terminologies.

RESULTS

The applicability of the general abstraction-network framework is shown through use-cases of various terminologies, including the Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT), the Medical Entities Dictionary (MED), and the Unified Medical Language System (UMLS). Important characteristics of the surveyed abstraction networks are provided, e.g., the magnitude of the respective size reduction referred to as the abstraction ratio. Specific benefits of these alternative terminology-network views, particularly their use in terminology quality assurance, are discussed. Examples of meta-abstraction networks are presented.

CONCLUSIONS

The "big knowledge" challenge constitutes the use and maintenance of terminological structures that comprise tens of thousands to millions of concepts and their attendant complexity. The notion of abstraction network has been introduced as a tool in helping to overcome this challenge, thus enhancing the usefulness of terminologies. Abstraction networks have been shown to be applicable to a variety of existing biomedical terminologies, and these alternative structural views hold promise for future expanded use with additional terminologies.

Ochs C, Geller J, Perl Y, Chen Y, Xu J, Min H, Case JT, Wei Z. Scalable quality assurance for large SNOMED CT hierarchies using subject-based subtaxonomies. J Am Med Inform Assoc 2014;22:507-18. [PMID: 25336594 DOI: 10.1136/amiajnl-2014-003151] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 07/13/2014] [Accepted: 09/27/2014] [Indexed: 11/04/2022] Open

Ochs C, Geller J, Perl Y, Chen Y, Agrawal A, Case JT, Hripcsak G. A tribal abstraction network for SNOMED CT target hierarchies without attribute relationships. J Am Med Inform Assoc 2014;22:628-39. [PMID: 25332354 DOI: 10.1136/amiajnl-2014-003173] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 07/18/2014] [Accepted: 09/20/2014] [Indexed: 11/03/2022] Open

He Z, Ochs C, Agrawal A, Perl Y, Zeginis D, Tarabanis K, Elhanan G, Halper M, Noy N, Geller J. A family-based framework for supporting quality assurance of biomedical ontologies in BioPortal. AMIA Annu Symp Proc 2013;2013:581-590. [PMID: 24551360 PMCID: PMC3900201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/03/2023]

Agrawal A, Perl Y, Chen Y, Elhanan G, Liu M. Identifying inconsistencies in SNOMED CT problem lists using structural indicators. AMIA Annu Symp Proc 2013;2013:17-26. [PMID: 24551319 PMCID: PMC3900119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/03/2023]

Ochs C, Perl Y, Geller J, Halper M, Gu H, Chen Y, Elhanan G. Scalability of abstraction-network-based quality assurance to large SNOMED hierarchies. AMIA Annu Symp Proc 2013;2013:1071-1080. [PMID: 24551393 PMCID: PMC3900129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/03/2023]

Agrawal A, He Z, Perl Y, Wei D, Halper M, Elhanan G, Chen Y. The readiness of SNOMED problem list concepts for meaningful use of electronic health records. Artif Intell Med 2013;58:73-80. [PMID: 23602702 DOI: 10.1016/j.artmed.2013.03.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 05/04/2012] [Revised: 03/05/2013] [Accepted: 03/17/2013] [Indexed: 11/24/2022]

Abstract

OBJECTIVE

By 2015, SNOMED CT (SCT) will become the USA's standard for encoding diagnoses and problem lists in electronic health records (EHRs). To facilitate this effort, the National Library of Medicine has published the "SCT Clinical Observations Recording and Encoding" and the "Veterans Health Administration and Kaiser Permanente" problem lists (collectively, the "PL"). The PL is studied in regard to its readiness to support meaningful use of EHRs. In particular, we wish to determine if inconsistencies appearing in SCT, in general, occur as frequently in the PL, and whether further quality-assurance (QA) efforts on the PL are required.

METHODS AND MATERIALS

A study is conducted where two random samples of SCT concepts are compared. The first consists of concepts strictly from the PL and the second contains general SCT concepts distributed proportionally to the PL's in terms of their hierarchies. Each sample is analyzed for its percentage of primitive concepts and for frequency of modeling errors of various severity levels as quality measures. A simple structural indicator, namely, the number of parents, is suggested to locate high likelihood inconsistencies in hierarchical relationships. The effectiveness of this indicator is evaluated.

RESULTS

PL concepts are found to be slightly better than other concepts in the respective SCT hierarchies with regard to two quality measures: the percentage of primitive concepts and the frequency of modeling errors. There were 58% primitive concepts in the PL sample versus 62% in the control sample. The structural indicator of number of parents is shown to be statistically significant in its ability to identify concepts having a higher likelihood of inconsistencies in their hierarchical relationships. The absolute number of errors in the group of concepts having 1-3 parents was shown to be significantly lower than that for concepts with 4-6 parents and those with 7 or more parents based on Chi-squared analyses.
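
The bucketing by number of parents and the Chi-squared comparison can be illustrated as follows. This is a minimal sketch, not the authors' analysis: the helper names are hypothetical, the counts in the example table are invented for illustration, and only the Pearson statistic is computed (a p-value would additionally require the chi-squared distribution, for example from a statistics library).

```python
from typing import List, Sequence


def parent_bucket(n_parents: int) -> str:
    """Map a concept's number of parents to the groups named in the abstract."""
    if n_parents < 1:
        raise ValueError("expected a concept with at least one parent")
    if n_parents <= 3:
        return "1-3"
    if n_parents <= 6:
        return "4-6"
    return "7+"


def chi_squared_statistic(table: Sequence[Sequence[float]]) -> float:
    """Pearson chi-squared statistic for an r x c contingency table.

    Assumes every row total and column total is nonzero.
    """
    row_totals: List[float] = [sum(row) for row in table]
    col_totals: List[float] = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Invented counts for illustration only: rows are parent-count groups,
# columns are (concepts with errors, concepts without errors).
example_table = [
    [10, 20],  # hypothetical "1-3" group
    [20, 10],  # hypothetical "4-6" group
]
print(parent_bucket(5), chi_squared_statistic(example_table))
```

For a 2 x 2 table like the example, the statistic has one degree of freedom; comparing all three groups at once would use a 3 x 2 table with two degrees of freedom.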

CONCLUSION

PL concepts suffer from the same issues as general SCT concepts, although to a slightly lesser extent, and do require further QA efforts to promote meaningful use of EHRs. To support such efforts, a structural indicator is shown to effectively ferret out potentially problematic concepts where those QA efforts should be focused.

Agrawal A, Perl Y, Elhanan G. Identifying problematic concepts in SNOMED CT using a lexical approach. Stud Health Technol Inform 2013;192:773-777. [PMID: 23920662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]

Geller J, Ochs C, Perl Y, Xu J. New abstraction networks and a new visualization tool in support of auditing the SNOMED CT content. AMIA Annu Symp Proc 2012;2012:237-246. [PMID: 23304293 PMCID: PMC3540556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]

Ochs C, Agrawal A, Perl Y, Halper M, Tu SW, Carini S, Sim I, Noy N, Musen M, Geller J. Deriving an abstraction network to support quality assurance in OCRe. AMIA Annu Symp Proc 2012;2012:681-689. [PMID: 23304341 PMCID: PMC3540580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]

Gu HH, Elhanan G, Perl Y, Hripcsak G, Cimino JJ, Xu J, Chen Y, Geller J, Paul Morrey C. A study of terminology auditors' performance for UMLS semantic type assignments. J Biomed Inform 2012;45:1042-8. [PMID: 22687822 DOI: 10.1016/j.jbi.2012.05.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 02/20/2012] [Revised: 05/26/2012] [Accepted: 05/31/2012] [Indexed: 11/30/2022]

Morrey CP, Perl Y, Halper M, Chen L, Gu HH. A chemical specialty semantic network for the Unified Medical Language System. J Cheminform 2012;4:9. [PMID: 22577759 PMCID: PMC3428652 DOI: 10.1186/1758-2946-4-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 01/08/2012] [Accepted: 05/11/2012] [Indexed: 11/26/2022] Open

Wang Y, Halper M, Wei D, Gu H, Perl Y, Xu J, Elhanan G, Chen Y, Spackman KA, Case JT, Hripcsak G. Auditing complex concepts of SNOMED using a refined hierarchical abstraction network. J Biomed Inform 2012;45:1-14. [PMID: 21907827 PMCID: PMC3313651 DOI: 10.1016/j.jbi.2011.08.016] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 04/18/2011] [Revised: 08/25/2011] [Accepted: 08/26/2011] [Indexed: 10/17/2022]

Abstract

Auditors of a large terminology, such as SNOMED CT, face a daunting challenge. To aid them in their efforts, it is essential to devise techniques that can automatically identify concepts warranting special attention. "Complex" concepts, which by their very nature are more difficult to model, fall neatly into this category. A special kind of grouping, called a partial-area, is utilized in the characterization of complex concepts. In particular, the complex concepts that are the focus of this work are those appearing in intersections of multiple partial-areas and are thus referred to as overlapping concepts. In a companion paper, an automatic methodology for identifying and partitioning the entire collection of overlapping concepts into disjoint, singly-rooted groups, that are more manageable to work with and comprehend, has been presented. The partitioning methodology formed the foundation for the development of an abstraction network for the overlapping concepts called a disjoint partial-area taxonomy. This new disjoint partial-area taxonomy offers a collection of semantically uniform partial-areas and is exploited herein as the basis for a novel auditing methodology. The review of the overlapping concepts is done in a top-down order within semantically uniform groups. These groups are themselves reviewed in a top-down order, which proceeds from the less complex to the more complex overlapping concepts. The results of applying the methodology to SNOMED's Specimen hierarchy are presented. Hypotheses regarding error ratios for overlapping concepts and between different kinds of overlapping concepts are formulated. Two phases of auditing the Specimen hierarchy for two releases of SNOMED are reported on. With the use of the double bootstrap and Fisher's exact test (two-tailed), the auditing of concepts and especially roots of overlapping partial-areas is shown to yield a statistically significant higher proportion of errors.
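
The notion of an overlapping concept (one that lies in the intersection of multiple partial-areas) can be made concrete with a short sketch. It assumes the partial-areas are already available as sets of concept identifiers; the function name and the sample data are hypothetical, and the code does not reproduce the partitioning into disjoint, singly-rooted groups presented in the companion paper.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Set


def overlapping_concepts(
    partial_areas: Dict[str, Set[str]]
) -> Dict[str, FrozenSet[str]]:
    """Return each concept that belongs to more than one partial-area,
    mapped to the set of partial-areas it belongs to."""
    membership: Dict[str, Set[str]] = defaultdict(set)
    for area_name, concepts in partial_areas.items():
        for concept in concepts:
            membership[concept].add(area_name)
    return {
        concept: frozenset(areas)
        for concept, areas in membership.items()
        if len(areas) > 1
    }


# Hypothetical partial-areas and concept identifiers.
example_areas = {
    "partial_area_1": {"c1", "c2"},
    "partial_area_2": {"c2", "c3"},
    "partial_area_3": {"c2", "c3", "c4"},
}
for concept, areas in sorted(overlapping_concepts(example_areas).items()):
    print(concept, sorted(areas))
```

Concepts that share the same set of partial-areas could serve as a rough first grouping for review, although this is a simplification of the disjoint partial-area taxonomy described in the abstract.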

Chen Y, Gu H, Perl Y, Geller J. Overcoming an obstacle in expanding a UMLS semantic type extent. J Biomed Inform 2012;45:61-70. [PMID: 21925287 PMCID: PMC3272131 DOI: 10.1016/j.jbi.2011.08.021] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 04/28/2011] [Revised: 08/30/2011] [Accepted: 08/31/2011] [Indexed: 11/25/2022]

Wang Y, Halper M, Wei D, Perl Y, Geller J. Abstraction of complex concepts with a refined partial-area taxonomy of SNOMED. J Biomed Inform 2012;45:15-29. [PMID: 21878396 PMCID: PMC3313654 DOI: 10.1016/j.jbi.2011.08.013] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 06/20/2010] [Revised: 08/22/2011] [Accepted: 08/23/2011] [Indexed: 10/17/2022]

He Z, Halper M, Perl Y, Elhanan G. Clinical Clarity versus Terminological Order - The Readiness of SNOMED CT Concept Descriptors for Primary Care. MIXHS 12 (2012) 2012;2012:1-6. [PMID: 26870837 DOI: 10.1145/2389672.2389674] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/27/2022]

Elhanan G, Perl Y, Geller J. A survey of SNOMED CT direct users, 2010: impressions and preferences regarding content and quality. J Am Med Inform Assoc 2011;18 Suppl 1:i36-44. [PMID: 21836159 PMCID: PMC3241171 DOI: 10.1136/amiajnl-2011-000341] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 04/27/2011] [Accepted: 07/11/2011] [Indexed: 11/04/2022] Open

Abstract

OBJECTIVE

Little information exists concerning SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms) users. This report describes current impressions and preferences of direct SNOMED CT users regarding coverage, quality, concept details, and the change request mechanism.

DESIGN

A 43-question anonymous survey distributed electronically to relevant online communities.

MEASUREMENTS

Data on user demographic characteristics, modes and purposes of use, means and frequencies of access, satisfaction with SNOMED CT content coverage and quality and with the change request mechanism were recorded.

RESULTS

The survey was conducted in January 2010 and elicited 215 responses. Details regarding users' profiles, modes of use and access were reported elsewhere. The coverage of SNOMED CT was perceived to be at least 85% complete by 42% of responders, and 60% were at least satisfied with its quality. Various deficiencies were encountered at least 'somewhat often' by 28-61% of responders. Incorrect data were more bothersome than missing data. Users indicated that significant resources should be allocated to more consistent and complete conceptual representations and to further enhance content coverage. Enhanced synonym coverage and the introduction of textual definitions were important to users (54% and 63%, respectively).

LIMITATIONS

A survey format with limited control over recruitment and selection bias. Lack of information regarding the SNOMED CT version used by responders.

CONCLUSION

Despite overall satisfaction, direct users indicated a strong desire to improve consistency, quality, and completeness of conceptual representations and concept details, as well as a continued desire to expand coverage. The survey provides much needed data for informed decisions regarding the use and development goals of SNOMED CT. Focused periodical surveys are warranted.

Halper M, Morrey CP, Chen Y, Elhanan G, Hripcsak G, Perl Y. Auditing hierarchical cycles to locate other inconsistencies in the UMLS. AMIA Annu Symp Proc 2011;2011:529-36. [PMID: 22195107 PMCID: PMC3243212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]

Morrey CP, Chen L, Halper M, Perl Y. Resolution of redundant semantic type assignments for organic chemicals in the UMLS. Artif Intell Med 2011;52:141-51. [PMID: 21646001 DOI: 10.1016/j.artmed.2011.05.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 10/29/2009] [Revised: 05/03/2011] [Accepted: 05/09/2011] [Indexed: 11/27/2022]

Abstract

OBJECTIVE

The Unified Medical Language System (UMLS) integrates terms from different sources into concepts and supplements these with the assignment of one or more high-level semantic types (STs) from its Semantic Network (SN). For a composite organic chemical concept, multiple assignments of organic chemical STs often serve to enumerate the types of the composite's underlying chemical constituents. This practice sometimes leads to the introduction of a forbidden redundant ST assignment, where both an ST and one of its descendants are assigned to the same concept. A methodology for resolving redundant ST assignments for organic chemicals, better capturing the essence of such composite chemicals than the typical omission of the more general ST, is presented.

MATERIALS AND METHODS

The typical SN resolution of a redundant ST assignment is to retain only the more specific ST assignment and omit the more general one. However, with organic chemicals, that is not always the correct strategy. A methodology for properly dealing with the redundancy based on the relative sizes of the chemical components is presented. It is more accurate to use the ST of the larger chemical component for capturing the category of the concept, even if that means using the more general ST.
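
A redundant ST assignment, and the typical resolution of keeping only the more specific ST, can be sketched as below. The sketch assumes the semantic-type hierarchy is an acyclic child-to-parent map; the function names and the placeholder type names are hypothetical. It illustrates only the typical resolution, not the size-based methodology of the paper, which may instead retain the more general ST.

```python
from typing import Dict, List, Set, Tuple


def ancestors(st: str, parent_of: Dict[str, str]) -> Set[str]:
    """All ancestors of a semantic type in a child -> parent map (assumed acyclic)."""
    found: Set[str] = set()
    current = parent_of.get(st)
    while current is not None:
        found.add(current)
        current = parent_of.get(current)
    return found


def redundant_pairs(
    assigned: Set[str], parent_of: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Pairs (ancestor, descendant) where both types are assigned to one concept."""
    return sorted(
        (anc, st)
        for st in assigned
        for anc in ancestors(st, parent_of)
        if anc in assigned
    )


def typical_resolution(assigned: Set[str], parent_of: Dict[str, str]) -> Set[str]:
    """Drop every assigned type that is an ancestor of another assigned type."""
    too_general = {anc for anc, _ in redundant_pairs(assigned, parent_of)}
    return set(assigned) - too_general


# Placeholder hierarchy and assignment, not actual UMLS Semantic Network content.
example_parent_of = {
    "SpecificType": "GeneralType",
    "GeneralType": "RootType",
    "OtherType": "RootType",
}
example_assigned = {"GeneralType", "SpecificType", "OtherType"}
print(redundant_pairs(example_assigned, example_parent_of))
print(sorted(typical_resolution(example_assigned, example_parent_of)))
```

A size-based variant would need additional input about the relative sizes of a concept's chemical components, which this sketch does not model.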

RESULTS

A sample of 254 chemical concepts having redundant ST assignments in older UMLS releases was audited to analyze the accuracy of current ST assignments. For 81 (32%) of them, our chemical analysis-based approach yielded a different recommendation from the UMLS (2009AA). New UMLS usage notes capturing rules of this methodology are proffered.

CONCLUSIONS

Redundant ST assignments have typically arisen for organic composite chemical concepts. A methodology for dealing with this kind of erroneous configuration, capturing the proper category for a composite chemical, is presented and demonstrated.

Huang KC, Geller J, Elhanan G, Perl Y, Halper M. Auditing SNOMED Integration into the UMLS for Duplicate Concepts. AMIA Annu Symp Proc 2010;2010:321-325. [PMID: 21346993 PMCID: PMC3041353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2023]