1
Kanbar LJ, Dexheimer JW, Zahner J, Burrows EK, Chatburn R, Messinger A, Baker CD, Schuler CL, Benscoter D, Amin R, Pajor N. Standardizing electronic health record ventilation data in the pediatric long-term mechanical ventilator-dependent population. Pediatr Pulmonol 2023; 58:433-440. [PMID: 36226360 DOI: 10.1002/ppul.26204] [Received: 06/16/2022] [Revised: 09/22/2022] [Accepted: 10/08/2022] [Indexed: 01/25/2023]
Abstract
BACKGROUND Sharing data across institutions is critical to improving care for children who are using long-term mechanical ventilation (LTMV). Mechanical ventilation data are complex and poorly standardized. This lack of data standardization is a major barrier to data sharing. OBJECTIVE We aimed to describe current ventilator data in the electronic health record (EHR) and propose a framework for standardizing these data using a common data model (CDM) across multiple populations and sites. METHODS We focused on a cohort of patients with LTMV dependence who were weaned from mechanical ventilation (MV). We extracted and described relevant EHR ventilation data. We identified the minimum necessary components, termed "Clinical Ideas," to describe MV from time of initiation to liberation. We then utilized existing resources and partnered with informatics collaborators to develop a framework for incorporating Clinical Ideas into the PEDSnet CDM based on the Observational Medical Outcomes Partnership (OMOP). RESULTS We identified 78 children with LTMV dependence who weaned from ventilator support. There were 25 unique device names and 28 unique ventilation mode names used in the cohort. We identified multiple Clinical Ideas necessary to describe ventilator support over time: device, interface, ventilation mode, settings, measurements, and duration of ventilation usage per day. We used Concepts from the SNOMED-CT vocabulary and integrated an existing ventilator mode taxonomy to create a framework for CDM and OMOP integration. CONCLUSION The proposed framework standardizes mechanical ventilation terminology and may facilitate efficient data exchange in a multisite network. Rapid data sharing is necessary to improve research and clinical care for children with LTMV dependence.
Affiliation(s)
- Lara J Kanbar
  - Division of Pulmonary Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA; Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Judith W Dexheimer
  - Division of Emergency Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Janet Zahner
  - Department of Information Services, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Evanette K Burrows
  - Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Robert Chatburn
  - Program Manager Enterprise Research for Respiratory Care, Cleveland Clinic, Cleveland, Ohio, USA
- Amanda Messinger
  - Department of Pediatrics, Section of Pulmonary and Sleep Medicine, University of Colorado, Denver, Colorado, USA
- Christopher D Baker
  - Department of Pediatrics, Section of Pulmonary and Sleep Medicine, University of Colorado, Denver, Colorado, USA
- Christine L Schuler
  - Division of Pulmonary Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA; Division of Hospital Medicine, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA
- Dan Benscoter
  - Division of Pulmonary Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA; Division of Emergency Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Raouf Amin
  - Division of Pulmonary Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA
- Nathan Pajor
  - Division of Pulmonary Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA; Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA
2
Hao X, Abeysinghe R, Shi J, Cui L. A substring replacement approach for identifying missing IS-A relations in SNOMED CT. Proceedings. IEEE International Conference on Bioinformatics and Biomedicine 2022; 2022:2611-2618. [PMID: 36776766 PMCID: PMC9918377 DOI: 10.1109/bibm55620.2022.9995595] [Indexed: 06/18/2023]
Abstract
Biomedical ontologies provide formalized information and knowledge in the biomedical domain. Over the years, biomedical ontologies have played an important role in facilitating biomedical research and applications. Common quality issues of biomedical ontologies include inconsistent naming of concepts, redundant concepts, redundant relations, incomplete/incorrect concept definitions, and incomplete/incorrect class hierarchies. In this work, we focus on addressing the incompleteness of the class hierarchy in SNOMED CT. We develop a substring replacement approach, leveraging concepts' lexical features and existing IS-A relations to identify potential missing IS-A relations in SNOMED CT. To evaluate the effectiveness of our approach, we performed both automated and manual validation. For the automated evaluation, we leverage relations from external terminologies in the Unified Medical Language System (UMLS) to validate the identified missing IS-A relations. For the manual validation, 100 randomly selected samples from the results were reviewed by a domain expert. Applying our approach to the March 2022 release of SNOMED CT US Edition, we identified 3,228 potential missing IS-A relations, among which 63 were validated through the UMLS. The evaluation by the domain expert revealed that 89 out of 100 (a precision of 89%) missing IS-A relations are valid cases, showing the effectiveness of this substring replacement approach to facilitate the quality assurance of IS-A relations in SNOMED CT.
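The substring replacement idea in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: the concept names, the `is_a` pairs, and the function `suggest_missing_is_a` are hypothetical, and the rule used here (replace a phrase inside a concept name with the name of that phrase's parent, then check whether the result names another existing concept) is only one plausible reading of the abstract.

```python
def suggest_missing_is_a(concepts, is_a):
    """Suggest candidate IS-A pairs by substring replacement (toy version).

    concepts: set of lowercase concept names (hypothetical toy data).
    is_a:     set of (child_name, parent_name) pairs already asserted.
    """
    suggestions = set()
    for name in concepts:
        padded = f" {name} "  # pad so that only whole words/phrases match
        for child, parent in is_a:
            token = f" {child} "
            if name == child or token not in padded:
                continue
            # Replace the child phrase with its parent phrase.
            candidate = padded.replace(token, f" {parent} ", 1).strip()
            # Keep the pair only if the new name is an existing concept
            # and the relation is not already asserted directly.
            if candidate in concepts and (name, candidate) not in is_a:
                suggestions.add((name, candidate))
    return suggestions


# Toy vocabulary, invented for illustration only.
concepts = {"femur", "bone", "fracture of femur", "fracture of bone"}
is_a = {("femur", "bone")}

print(suggest_missing_is_a(concepts, is_a))
# prints {('fracture of femur', 'fracture of bone')}
```

A realistic version would presumably check the transitive closure of existing IS-A relations rather than only direct pairs, and would send candidates to validation such as the UMLS comparison and expert review the abstract describes.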
Affiliation(s)
- Xubing Hao
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Rashmie Abeysinghe
  - Department of Neurology, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Jay Shi
  - SCL Health Medical Group, Denver, Colorado, USA
- Licong Cui
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
3
Post AR, Burningham Z, Halwani AS. Electronic Health Record Data in Cancer Learning Health Systems: Challenges and Opportunities. JCO Clin Cancer Inform 2022; 6:e2100158. [PMID: 35353547 PMCID: PMC9005105 DOI: 10.1200/cci.21.00158] [Received: 09/28/2021] [Revised: 01/04/2022] [Accepted: 02/18/2022] [Indexed: 12/21/2022]
Affiliation(s)
- Andrew R. Post
  - Research Informatics Shared Resource, Huntsman Cancer Institute, University of Utah, Salt Lake City, UT
  - Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, UT
- Zachary Burningham
  - Division of Epidemiology, Department of Internal Medicine, University of Utah, Salt Lake City, UT
- Ahmad S. Halwani
  - Division of Hematology and Hematologic Malignancies, Department of Internal Medicine, University of Utah, Salt Lake City, UT
4
Gaudet-Blavignac C, Rudaz A, Lovis C. Building a Shared, Scalable, and Sustainable Source for the Problem-Oriented Medical Record: Developmental Study. JMIR Med Inform 2021; 9:e29174. [PMID: 34643542 PMCID: PMC8552094 DOI: 10.2196/29174] [Received: 03/29/2021] [Revised: 04/30/2021] [Accepted: 09/19/2021] [Indexed: 11/13/2022]
Abstract
Background Since the creation of the problem-oriented medical record, the building of problem lists has been the focus of many studies. To date, this issue is not well resolved, and building an appropriate contextualized problem list is still a challenge. Objective This paper aims to present the process of building a shared multipurpose common problem list at the Geneva University Hospitals. This list aims to bridge the gap between clinicians’ language expressed in free text and secondary uses requiring structured information. Methods We focused on the needs of clinicians by building a list of uniquely identified expressions to support their daily activities. In the second stage, these expressions were connected to additional information to build a complex graph of information. A list of 45,946 expressions manually extracted from clinical documents was manually curated and encoded in multiple semantic dimensions, such as International Classification of Diseases, 10th revision; International Classification of Primary Care 2nd edition; Systematized Nomenclature of Medicine Clinical Terms; or dimensions dictated by specific usages, such as identifying expressions specific to a domain, a gender, or an intervention. The list was progressively deployed for clinicians with an iterative process of quality control, maintenance, and improvements, including the addition of new expressions or dimensions for specific needs. The problem management of the electronic health record allowed the measurement and correction of encoding based on real-world use. Results The list was deployed in production in January 2017 and was regularly updated and deployed in new divisions of the hospital. Over 4 years, 684,102 problems were created using the list. The proportion of free-text entries decreased progressively from 37.47% (8321/22,206) in December 2017 to 18.38% (4547/24,738) in December 2020. 
In the last version of the list, over 14 dimensions were mapped to expressions, among which 5 were international classifications and 8 were other classifications for specific uses. The list became a central axis in the electronic health record, being used for many different purposes linked to care, such as surgical planning or emergency wards, or in research, for various predictions using machine learning techniques. Conclusions This study breaks with common approaches primarily by focusing on real clinicians’ language when expressing patients’ problems and secondarily by mapping whatever is required, including controlled vocabularies to answer specific needs. This approach improves the quality of the expression of patients’ problems while allowing the building of as many structured dimensions as needed to convey semantics according to specific contexts. The method is shown to be scalable, sustainable, and efficient at hiding the complexity of semantics or the burden of constraint-structured problem list entry for clinicians. Ongoing work is analyzing the impact of this approach on how clinicians express patients’ problems.
Affiliation(s)
- Christophe Gaudet-Blavignac
  - Division of Medical Information Sciences, Geneva University Hospitals, Geneva, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
- Andrea Rudaz
  - Medical and Quality Directorate, Geneva University Hospitals, Geneva, Switzerland
- Christian Lovis
  - Division of Medical Information Sciences, Geneva University Hospitals, Geneva, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
5
Gaudet-Blavignac C, Foufi V, Bjelogrlic M, Lovis C. Use of the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) for Processing Free Text in Health Care: Systematic Scoping Review. J Med Internet Res 2021; 23:e24594. [PMID: 33496673 PMCID: PMC7872838 DOI: 10.2196/24594] [Received: 09/28/2020] [Revised: 11/24/2020] [Accepted: 11/30/2020] [Indexed: 12/19/2022]
Abstract
Background Interoperability and secondary use of data is a challenge in health care. Specifically, the reuse of clinical free text remains an unresolved problem. The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) has become the universal language of health care and presents characteristics of a natural language. Its use to represent clinical free text could constitute a solution to improve interoperability. Objective Although the use of SNOMED and SNOMED CT has already been reviewed, its specific use in processing and representing unstructured data such as clinical free text has not. This review aims to better understand SNOMED CT's use for representing free text in medicine. Methods A scoping review was performed on the topic by searching MEDLINE, Embase, and Web of Science for publications featuring free-text processing and SNOMED CT. A recursive reference review was conducted to broaden the scope of research. The review covered the type of processed data, the targeted language, the goal of the terminology binding, the method used and, when appropriate, the specific software used. Results In total, 76 publications were selected for an extensive study. The language targeted by publications was 91% (n=69) English. The most frequent types of documents for which the terminology was used are complementary exam reports (n=18, 24%) and narrative notes (n=16, 21%). Mapping to SNOMED CT was the final goal of the research in 21% (n=16) of publications and a part of the final goal in 33% (n=25). The main objectives of mapping are information extraction (n=44, 39%), feature in a classification task (n=26, 23%), and data normalization (n=23, 20%). The method used was rule-based in 70% (n=53) of publications, hybrid in 11% (n=8), and machine learning in 5% (n=4). 
In total, 12 different software packages were used to map text to SNOMED CT concepts, the most frequent being Medtex, Mayo Clinic Vocabulary Server, and Medical Text Extraction Reasoning and Mapping System. Full terminology was used in 64% (n=49) of publications, whereas only a subset was used in 30% (n=23) of publications. Postcoordination was proposed in 17% (n=13) of publications, and only 5% (n=4) of publications specifically mentioned the use of the compositional grammar. Conclusions SNOMED CT has been largely used to represent free-text data, most frequently with rule-based approaches, in English. However, currently, there is no easy solution for mapping free text to this terminology and to perform automatic postcoordination. Most solutions conceive SNOMED CT as a simple terminology rather than as a compositional bag of ontologies. Since 2012, the number of publications on this subject per year has decreased. However, the need for formal semantic representation of free text in health care is high, and automatic encoding into a compositional ontology could be a solution.
Affiliation(s)
- Christophe Gaudet-Blavignac
  - Division of Medical Information Sciences, Geneva University Hospitals, Geneva, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
- Vasiliki Foufi
  - Division of Medical Information Sciences, Geneva University Hospitals, Geneva, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
- Mina Bjelogrlic
  - Division of Medical Information Sciences, Geneva University Hospitals, Geneva, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
- Christian Lovis
  - Division of Medical Information Sciences, Geneva University Hospitals, Geneva, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
6
Liu H, Perl Y, Geller J. Concept placement using BERT trained by transforming and summarizing biomedical ontology structure. J Biomed Inform 2020; 112:103607. [PMID: 33098987 DOI: 10.1016/j.jbi.2020.103607] [Received: 02/18/2020] [Revised: 09/07/2020] [Accepted: 10/17/2020] [Indexed: 11/17/2022]
Abstract
The comprehensive modeling and hierarchical positioning of a new concept in an ontology heavily relies on its set of proper subsumption relationships (IS-As) to other concepts. Identifying a concept's IS-A relationships is a laborious task requiring curators to have both domain knowledge and terminology skills. In this work, we propose a method to automatically predict the presence of IS-A relationships between a new concept and pre-existing concepts based on the language representation model BERT. This method converts the neighborhood network of a concept into "sentences" and harnesses BERT's Next Sentence Prediction (NSP) capability of predicting the adjacency of two sentences. To augment our method's performance, we refined the training data by employing an ontology summarization technique. We trained our model with the two largest hierarchies of the SNOMED CT 2017 July release and applied it to predicting the parents of new concepts added in the SNOMED CT 2018 January release. The results showed that our method achieved an average F1 score of 0.88, and the average Recall score improves slightly from 0.94 to 0.96 by using the ontology summarization technique.
Affiliation(s)
- Hao Liu
  - Dept of Computer Science, NJIT, Newark, NJ, USA
7
Agrawal A, Qazi K. Detecting modeling inconsistencies in SNOMED CT using a machine learning technique. Methods 2020; 179:111-118. [PMID: 32442671 DOI: 10.1016/j.ymeth.2020.05.019] [Received: 02/15/2020] [Revised: 05/07/2020] [Accepted: 05/18/2020] [Indexed: 11/19/2022]
Abstract
SNOMED CT is a comprehensive and evolving clinical reference terminology that has been widely adopted as a common vocabulary to promote interoperability between Electronic Health Records. Owing to its importance in healthcare, quality assurance becomes an integral part of the lifecycle of SNOMED CT. While manual auditing of every concept in SNOMED CT is difficult and labor intensive, identifying inconsistencies in the modeling of concepts without any context can be challenging. Algorithmic techniques are needed to identify modeling inconsistencies, if any, in SNOMED CT. This study proposes a context-based, machine learning quality assurance technique to identify concepts in SNOMED CT that may be in need of auditing. The Clinical Finding and the Procedure hierarchies are used as a testbed to check the efficacy of the method. Results of auditing show that the method identified inconsistencies in 72% of the concept pairs that were deemed inconsistent by the algorithm. The method is shown to be effective both in maximizing the yield of correction and in providing a context to identify the inconsistencies. Such methods, along with SNOMED International's own efforts, can greatly help reduce inconsistencies in SNOMED CT.
Affiliation(s)
- Ankur Agrawal
  - Department of Computer Science, Manhattan College, NY, USA
8
Yu B, He Z, Xing A, Lustria MLA. An Informatics Framework to Assess Consumer Health Language Complexity Differences: Proof-of-Concept Study. J Med Internet Res 2020; 22:e16795. [PMID: 32436849 PMCID: PMC7273233 DOI: 10.2196/16795] [Received: 10/31/2019] [Revised: 01/21/2020] [Accepted: 02/21/2020] [Indexed: 11/23/2022]
Abstract
Background The language gap between health consumers and health professionals has been long recognized as the main hindrance to effective health information comprehension. Although providing health information access in consumer health language (CHL) is widely accepted as the solution to the problem, health consumers are found to have varying health language preferences and proficiencies. To simplify health documents for heterogeneous consumer groups, it is important to quantify how CHLs are different in terms of complexity among various consumer groups. Objective This study aimed to propose an informatics framework (consumer health language complexity [CHELC]) to assess the complexity differences of CHL using syntax-level, text-level, term-level, and semantic-level complexity metrics. Specifically, we identified 8 language complexity metrics validated in previous literature and combined them into a 4-faceted framework. Through a rank-based algorithm, we developed unifying scores (CHELC scores [CHELCS]) to quantify syntax-level, text-level, term-level, semantic-level, and overall CHL complexity. We applied CHELCS to compare posts of each individual on online health forums designed for (1) the general public, (2) deaf and hearing-impaired people, and (3) people with autism spectrum disorder (ASD). Methods We examined posts with more than 4 sentences of each user from 3 health forums to understand CHL complexity differences among these groups: 12,560 posts from 3756 users in Yahoo! Answers, 25,545 posts from 1623 users in AllDeaf, and 26,484 posts from 2751 users in Wrong Planet. We calculated CHELCS for each user and compared the scores of 3 user groups (ie, deaf and hearing-impaired people, people with ASD, and the public) through 2-sample Kolmogorov-Smirnov tests and analysis of covariance tests. 
Results The results suggest that users in the public forum used more complex CHL, particularly more diverse semantics and more complex health terms compared with users in the ASD and deaf and hearing-impaired user forums. However, between the latter 2 groups, people with ASD used more complex words, and deaf and hearing-impaired users used more complex syntax. Conclusions Our results show that the users in 3 online forums had significantly different CHL complexities in different facets. The proposed framework and detailed measurements help to quantify these CHL complexity differences comprehensively. The results emphasize the importance of tailoring health-related content for different consumer groups with varying CHL complexities.
Affiliation(s)
- Biyang Yu
  - Florida State University, School of Information, Tallahassee, FL, United States
- Zhe He
  - Florida State University, School of Information, Tallahassee, FL, United States
- Aiwen Xing
  - Florida State University, Department of Statistics, Tallahassee, FL, United States
- Mia Liza A Lustria
  - Florida State University, School of Information, Tallahassee, FL, United States
9
Wang L, Wang Y, Shen F, Rastegar-Mojarad M, Liu H. Discovering associations between problem list and practice setting. BMC Med Inform Decis Mak 2019; 19:69. [PMID: 30943957 PMCID: PMC6448189 DOI: 10.1186/s12911-019-0779-y] [Indexed: 11/30/2022]
Abstract
Background The Health Information Technology for Economic and Clinical Health Act (HITECH) has greatly accelerated the adoption of electronic health records (EHRs) with the promise of better clinical decisions and patients’ outcomes. One of the core criteria for “Meaningful Use” of EHRs is to have a problem list that shows the most important health problems faced by a patient. The implementation of problem lists in EHRs has a potential to help practitioners to provide customized care to patients. However, it remains an open question on how to leverage problem lists in different practice settings to provide tailored care, of which the bottleneck lies in the associations between problem list and practice setting. Methods In this study, using sampled clinical documents associated with a cohort of patients who received their primary care at Mayo Clinic, we investigated the associations between problem list and practice setting through natural language processing (NLP) and topic modeling techniques. Specifically, after practice settings and problem lists were normalized, statistical χ2 test, term frequency-inverse document frequency (TF-IDF) and enrichment analysis were used to choose representative concepts for each setting. Then Latent Dirichlet Allocations (LDA) were used to train topic models and predict potential practice settings using similarity metrics based on the problem concepts representative of practice settings. Evaluation was conducted through 5-fold cross validation and Recall@k, Precision@k and F1@k were calculated. Results Our method can generate prioritized and meaningful problem lists corresponding to specific practice settings. For practice setting prediction, recall increases from 0.719 (k = 2) to 0.931 (k = 10), precision increases from 0.882 (k = 2) to 0.931 (k = 10) and F1 increases from 0.790 (k = 2) to 0.931 (k = 10). 
Conclusion To our best knowledge, our study is the first attempting to discover the association between the problem lists and hospital practice settings. In the future, we plan to investigate how to provide more tailored care by utilizing the association between problem list and practice setting revealed in this study. Electronic supplementary material The online version of this article (10.1186/s12911-019-0779-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Liwei Wang
  - Division of Digital Health Sciences, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, 55905, USA
- Yanshan Wang
  - Division of Digital Health Sciences, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, 55905, USA
- Feichen Shen
  - Division of Digital Health Sciences, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, 55905, USA
- Majid Rastegar-Mojarad
  - Division of Digital Health Sciences, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, 55905, USA
- Hongfang Liu
  - Division of Digital Health Sciences, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, 55905, USA
10
Amith M, He Z, Bian J, Lossio-Ventura JA, Tao C. Assessing the practice of biomedical ontology evaluation: Gaps and opportunities. J Biomed Inform 2018; 80:1-13. [PMID: 29462669 PMCID: PMC5882531 DOI: 10.1016/j.jbi.2018.02.010] [Received: 08/31/2017] [Revised: 02/12/2018] [Accepted: 02/16/2018] [Indexed: 11/26/2022]
Abstract
With the proliferation of heterogeneous health care data in the last three decades, biomedical ontologies and controlled biomedical terminologies play a more and more important role in knowledge representation and management, data integration, natural language processing, as well as decision support for health information systems and biomedical research. Biomedical ontologies and controlled terminologies are intended to assure interoperability. Nevertheless, the quality of biomedical ontologies has hindered their applicability and subsequent adoption in real-world applications. Ontology evaluation is an integral part of ontology development and maintenance. In the biomedicine domain, ontology evaluation is often conducted by third parties as a quality assurance (or auditing) effort that focuses on identifying modeling errors and inconsistencies. In this work, we first organized four categorical schemes of ontology evaluation methods in the existing literature to create an integrated taxonomy. Further, to understand the ontology evaluation practice in the biomedicine domain, we reviewed a sample of 200 ontologies from the National Center for Biomedical Ontology (NCBO) BioPortal (the largest repository for biomedical ontologies) and observed that only 15 of these ontologies have documented evaluation in their corresponding inception papers. We then surveyed the recent quality assurance approaches for biomedical ontologies and their use. We also mapped these quality assurance approaches to the ontology evaluation criteria. It is our anticipation that ontology evaluation and quality assurance approaches will be more widely adopted in the development life cycle of biomedical ontologies.
Affiliation(s)
- Muhammad Amith
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Zhe He
  - School of Information, Florida State University, Tallahassee, FL, USA
- Jiang Bian
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Cui Tao
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
11
Semantics-Powered Healthcare Engineering and Data Analytics. Journal of Healthcare Engineering 2017; 2017:7983473. [PMID: 29214005 PMCID: PMC5682067 DOI: 10.1155/2017/7983473] [Received: 08/17/2017] [Accepted: 08/20/2017] [Indexed: 11/17/2022]
12
Zhang R, Liu J, Huang Y, Wang M, Shi Q, Chen J, Zeng Z. Enriching the international clinical nomenclature with Chinese daily used synonyms and concept recognition in physician notes. BMC Med Inform Decis Mak 2017; 17:54. [PMID: 28464923 PMCID: PMC5414139 DOI: 10.1186/s12911-017-0455-z] [Received: 01/05/2017] [Accepted: 04/26/2017] [Indexed: 02/05/2023]
Abstract
BACKGROUND It has been shown that the entities in everyday clinical text are often expressed in a way that varies from how they are expressed in the nomenclature. Owing to the many synonyms, abbreviations, medical jargon, and even misspellings in the physician notes used daily in clinical information systems (CIS), a terminology without enough synonyms may not be suitable for the task of Chinese clinical term recognition. METHODS This paper demonstrates a validated system to retrieve Chinese terms of clinical finding (CTCF) from CIS and map them to the corresponding concepts of an international clinical nomenclature, such as SNOMED CT. The system focuses on SNOMED CT with Chinese synonyms enrichment (SCCSE). The literal similarity and the diagnosis-related similarity metrics were used for concept mapping. Two CTCF recognition methods, the rule- and terminology-based approach (RTBA) and the conditional random field machine learner (CRF), were adopted to identify the concepts in physician notes. The system was validated against the history of present illness annotated by clinical experts. The RTBA and CRF could be combined to predict new CTCFs besides SCCSE persistently. RESULTS Around 59,000 CTCF candidates were accepted as valid and 39,000 of them occurred at least once in the history of present illness. 3,729 of them were accordant with the description in the referenced Chinese clinical nomenclature, which could cross map to other international nomenclatures such as SNOMED CT. With the hybrid similarity metrics, another 7,454 valid CTCFs (synonyms) were successfully mapped to concepts. For CTCF recognition in physician notes, a series of experiments were performed to find the best CRF feature set, which gained an F-score of 0.887. The RTBA achieved a better F-score of 0.919 using the CTCF dictionary created in this research. 
CONCLUSIONS This research demonstrated that it is feasible to help the SNOMED CT with Chinese synonyms enrichment based on physician notes in CIS. With continuous maintenance of SCCSE, the CTCFs could be precisely retrieved from free text, and the CTCFs arranged in semantic hierarchy of SNOMED CT could greatly improve the meaningful use of electronic health record in China. The methodology is also useful for clinical synonyms enrichment in other languages.
Affiliation(s)
- Rui Zhang
  - Department of Medical Informatics, West China School of Medicine/West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
- Jialin Liu
  - Department of Medical Informatics, West China School of Medicine/West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
  - Information Center, West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
  - Department of Otorhinolaryngology, West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
- Yong Huang
  - Information Center, West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
- Miye Wang
  - Information Center, West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
- Qingke Shi
  - Information Center, West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
- Jun Chen
  - Department of Ophthalmology, West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
- Zhi Zeng
  - Department of Medical Informatics, West China School of Medicine/West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
  - Department of Cardiology, West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
|
13
|
Abstract
The use of standard terminologies is an essential component for using data to inform practice and conduct research; perinatal nursing data standardization is needed. This study explored whether 76 distinct process elements important for perinatal nursing were present in four American Nurses Association-recognized standard terminologies. The 76 process elements were taken from a valid paper-based perinatal nursing process measurement tool. Using terminology-supported browsers, the elements were manually mapped to the selected terminologies by the researcher. A five-member expert panel validated 100% of the mapping findings. The majority of the process elements (n = 63, 83%) were present in SNOMED-CT, 28% (n = 21) in LOINC, 34% (n = 26) in ICNP, and 15% (n = 11) in CCC. SNOMED-CT and LOINC are terminologies currently recommended for use to facilitate interoperability in the capture of assessment and problem data in certified electronic medical records. Study results suggest that SNOMED-CT and LOINC contain perinatal nursing process elements and are useful standard terminologies to support perinatal nursing practice in electronic health records. Terminology mapping is the first step toward incorporating traditional paper-based tools into electronic systems.
|
14
|
He Z, Chen Z, Oh S, Hou J, Bian J. Enriching consumer health vocabulary through mining a social Q&A site: A similarity-based approach. J Biomed Inform 2017; 69:75-85. [PMID: 28359728 DOI: 10.1016/j.jbi.2017.03.016] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 11/17/2016] [Revised: 03/21/2017] [Accepted: 03/24/2017] [Indexed: 11/29/2022]
Abstract
The widely known vocabulary gap between health consumers and healthcare professionals hinders consumers' information seeking and health dialogue on end-user health applications. The Open Access and Collaborative Consumer Health Vocabulary (OAC CHV), which contains health-related terms used by lay consumers, was created to bridge this gap. Specifically, the OAC CHV facilitates consumers' health information retrieval by enabling consumer-facing health applications to translate between professional language and consumer-friendly language. To keep up with constantly evolving medical knowledge and language use, new terms need to be identified and added to the OAC CHV. User-generated content on social media, including social question and answer (social Q&A) sites, affords an enormous opportunity for mining consumer health terms. Existing methods of identifying new consumer terms from text typically use ad-hoc lexical syntactic patterns and human review. Our study extends an existing method by extracting n-grams from a social Q&A textual corpus and representing them with a rich set of contextual and syntactic features. Using K-means clustering, our method, simiTerm, was able to identify terms that are both contextually and syntactically similar to the existing OAC CHV terms. We tested our method on social Q&A corpora in two disease domains: diabetes and cancer. Our method outperformed three baseline ranking methods. A post-hoc qualitative evaluation by human experts further validated that our method can effectively identify meaningful new consumer terms on social Q&A.
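The first step the abstract mentions, extracting n-grams from a social Q&A corpus, can be sketched as below. This is only an illustration under stated assumptions, not the simiTerm implementation: the function names, the toy posts, the tokenization rule, and the `min_count` threshold are all hypothetical, and the feature representation and K-means clustering steps are not shown.

```python
import re
from collections import Counter


def extract_ngrams(text, max_n=3):
    """Count all word n-grams of length 1..max_n in one piece of text."""
    # Hypothetical tokenizer: lowercase alphanumeric runs (plus apostrophes).
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def candidate_terms(corpus, known_terms, max_n=3, min_count=2):
    """Return n-grams seen at least min_count times that are not already known."""
    total = Counter()
    for post in corpus:
        total += extract_ngrams(post, max_n)
    return sorted(
        term for term, count in total.items()
        if count >= min_count and term not in known_terms
    )


# Toy posts and a toy "existing vocabulary" (invented for illustration).
posts = ["My blood sugar is high", "High blood sugar after meals"]
known = {"blood", "sugar", "high"}
print(candidate_terms(posts, known))
```

In a fuller pipeline, each surviving candidate would then be described by contextual and syntactic features before clustering; that part depends on details the abstract does not give.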
Affiliation(s)
- Zhe He
  - School of Information, Florida State University, Tallahassee, FL 32306, USA; Institute for Successful Longevity, Florida State University, Tallahassee, FL 32306, USA
- Zhiwei Chen
  - Department of Computer Science, Florida State University, Tallahassee, FL 32306, USA
- Sanghee Oh
  - Department of Library and Information Science, Chungnam National University, South Korea
- Jinghui Hou
  - School of Communication, Florida State University, Tallahassee, FL 32306, USA
- Jiang Bian
  - Department of Health Outcomes and Policy, University of Florida, Gainesville, FL 32608, USA
|
15
|
Chen CC, Chang CH, Peng YC, Poon SK, Huang SC, Li YCJ. Effect of implementation of a coded problem list entry subsystem. Computer Methods and Programs in Biomedicine 2016; 134:1-9. [PMID: 27480728 DOI: 10.1016/j.cmpb.2016.05.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/01/2016] [Revised: 05/25/2016] [Accepted: 05/25/2016] [Indexed: 06/06/2023]
Abstract
OBJECTIVES Complete patient problem lists may improve the quality of care. To improve the completeness of the lists at our institution, we implemented the coded problem list entry subsystem (CPLES) in our electronic medical record system. Subsequently, physicians used the CPLES instead of handwritten notes to document coded problem lists and progress notes. We evaluated the effect of implementing the CPLES on the completeness of problem lists. METHODS We compared the completeness of coded problem lists input after CPLES implementation with that of problem lists handwritten before CPLES implementation and determined the differences. Moreover, the efficiency and usability of the CPLES were evaluated. RESULTS The efficiency and usability of CPLES were acceptable. However, the completeness of problem lists was reduced after CPLES implementation. The possible reasons for this reduction, namely system usability, efficacy, incentives, leadership, and education, were crucial for successful CPLES implementation and are discussed in the text. CONCLUSION CPLES implementation reduced the completeness of problem lists. Institutions may learn from our experience and carefully implement their own coded problem list systems to avoid this consequence.
Affiliation(s)
- Chia-Chang Chen
  - Division of Gastroenterology and Hepatology, Department of Internal Medicine, Taichung Veterans General Hospital, Taichung, Taiwan; College of Medicine Science and Technology, Graduate Institute of Biomedical Informatics, Taipei Medical University, Taipei, Taiwan
- Chung-Hsin Chang
  - Division of Gastroenterology and Hepatology, Department of Internal Medicine, Taichung Veterans General Hospital, Taichung, Taiwan
- Yen-Chun Peng
  - Division of Gastroenterology and Hepatology, Department of Internal Medicine, Taichung Veterans General Hospital, Taichung, Taiwan; School of Medicine, National Yang-Ming University, Taipei, Taiwan
- Sek-Kwong Poon
  - Division of Gastroenterology and Hepatology, Department of Internal Medicine, Taichung Veterans General Hospital, Taichung, Taiwan
- Shih-Che Huang
  - Department of Emergency Medicine, Taichung Veterans General Hospital, Taichung, Taiwan
- Yu-Chuan Jack Li
  - College of Medicine Science and Technology, Graduate Institute of Biomedical Informatics, Taipei Medical University, Taipei, Taiwan; Department of Dermatology, Taipei Medical University-Wan Fang Hospital, Taipei, Taiwan
|
16
|
Telang PR, Kalia AK, Singh MP. Modeling Healthcare Processes Using Commitments: An Empirical Evaluation. PLoS One 2015; 10:e0141202. [PMID: 26539985 PMCID: PMC4634947 DOI: 10.1371/journal.pone.0141202] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/16/2014] [Accepted: 10/06/2015] [Indexed: 11/18/2022]
Abstract
The two primary objectives of this paper are: (a) to demonstrate how Comma, a business modeling methodology based on commitments, can be applied in healthcare process modeling, and (b) to evaluate the effectiveness of such an approach in producing healthcare process models. We apply the Comma approach to a breast cancer diagnosis process adapted from an HHS committee report, and present the results of an empirical study that compares Comma with a traditional approach based on the HL7 Messaging Standard (Traditional-HL7). Our empirical study involved 47 subjects and two phases. In the first phase, we partitioned the subjects into two approximately equal groups. We gave each group the same requirements based on a process scenario for breast cancer diagnosis. Members of one group first applied Traditional-HL7 and then Comma, whereas members of the second group first applied Comma and then Traditional-HL7, each on the above-mentioned requirements. Thus, each subject produced two models, each model being a set of UML Sequence Diagrams. In the second phase, we repartitioned the subjects into two groups with approximately equal distributions from both original groups. We developed exemplar Traditional-HL7 and Comma models; we gave one repartitioned group our Traditional-HL7 model and the other repartitioned group our Comma model. We provided the same changed set of requirements to all subjects and asked them to modify the provided exemplar model to satisfy the new requirements. We assessed solutions produced by subjects in both phases with respect to measures of flexibility, time, difficulty, objective quality, and subjective quality. Our study found that Comma is superior to Traditional-HL7 in flexibility and objective quality, as validated via Student's t-test at the 10% level of significance. Comma is a promising new approach for modeling healthcare processes. Further gains could be made through improved tooling and enhanced training of modeling personnel.
Affiliation(s)
- Pankaj R. Telang
  - Cisco Systems Inc., Research Triangle Park, Durham, North Carolina, United States of America
- Anup K. Kalia
  - Department of Computer Science, NC State University, Raleigh, North Carolina, United States of America
- Munindar P. Singh
  - Department of Computer Science, NC State University, Raleigh, North Carolina, United States of America
|
17
|
Fung KW, Xu J. An exploration of the properties of the CORE problem list subset and how it facilitates the implementation of SNOMED CT. J Am Med Inform Assoc 2015; 22:649-58. [PMID: 25725003 PMCID: PMC5566198 DOI: 10.1093/jamia/ocu022] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 06/25/2014] [Revised: 10/16/2014] [Accepted: 11/06/2014] [Indexed: 11/14/2022]
Abstract
OBJECTIVE Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is the emergent international health terminology standard for encoding clinical information in electronic health records. The CORE Problem List Subset was created to facilitate the terminology's implementation. This study evaluates the CORE Subset's coverage and examines its growth pattern as source datasets are being incorporated. METHODS Coverage of frequently used terms and the corresponding usage of the covered terms were assessed by "leave-one-out" analysis of the eight datasets constituting the current CORE Subset. The growth pattern was studied using a retrospective experiment, growing the Subset one dataset at a time and examining the relationship between the size of the starting subset and the coverage of frequently used terms in the incoming dataset. Linear regression was used to model that relationship. RESULTS On average, the CORE Subset covered 80.3% of the frequently used terms of the left-out dataset, and the covered terms accounted for 83.7% of term usage. There was a significant positive correlation between the CORE Subset's size and the coverage of the frequently used terms in an incoming dataset. This implies that the CORE Subset will grow at a progressively slower pace as it gets bigger. CONCLUSION The CORE Problem List Subset is a useful resource for the implementation of Systematized Nomenclature of Medicine Clinical Terms in electronic health records. It offers good coverage of frequently used terms, which account for a high proportion of term usage. If future datasets are incorporated into the CORE Subset, it is likely that its size will remain small and manageable.
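The "leave-one-out" analysis described in the METHODS can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the dataset names, terms, and usage counts are invented, and the subset for each round is assumed to be simply the union of the terms in the remaining datasets. It reports two figures per left-out dataset: the share of its terms covered, and the share of its usage those covered terms account for.

```python
def leave_one_out_coverage(datasets):
    """For each dataset, measure coverage by the union of all other datasets.

    `datasets` maps a dataset name to a dict of {term: usage_count}.
    Returns {name: (term_coverage, usage_coverage)} as fractions in [0, 1].
    """
    results = {}
    for name, usage in datasets.items():
        # Build the subset from every dataset except the one left out.
        subset = set()
        for other_name, other_usage in datasets.items():
            if other_name != name:
                subset.update(other_usage)
        covered = [term for term in usage if term in subset]
        term_coverage = len(covered) / len(usage)
        usage_coverage = sum(usage[t] for t in covered) / sum(usage.values())
        results[name] = (term_coverage, usage_coverage)
    return results


# Invented toy data: three institutions' frequently used problem-list terms.
toy = {
    "A": {"asthma": 50, "gout": 10},
    "B": {"asthma": 30, "hypertension": 70},
    "C": {"hypertension": 20, "gout": 5, "rare disorder": 1},
}
for site, (terms, use) in leave_one_out_coverage(toy).items():
    print(f"{site}: {terms:.1%} of terms, {use:.1%} of usage")
```

In the toy data, the one term unique to dataset C is rarely used, so usage coverage stays high even though term coverage drops. That is the same kind of gap the abstract reports between term coverage and usage coverage.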
Affiliation(s)
- Julia Xu
  - National Library of Medicine, Bethesda, MD, USA
|
18
|
He Z, Geller J, Chen Y. A comparative analysis of the density of the SNOMED CT conceptual content for semantic harmonization. Artif Intell Med 2015; 64:29-40. [PMID: 25890688 DOI: 10.1016/j.artmed.2015.03.002] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 05/09/2014] [Revised: 03/20/2015] [Accepted: 03/25/2015] [Indexed: 11/17/2022]
Abstract
OBJECTIVES Medical terminologies vary in the amount of concept information (the "density") represented, even in the same sub-domains. This causes problems in terminology mapping, semantic harmonization and terminology integration. Moreover, complex clinical scenarios need to be encoded by a medical terminology with comprehensive content. SNOMED Clinical Terms (SNOMED CT), a leading clinical terminology, was reported to lack concepts and synonyms, problems that cannot be fully alleviated by using post-coordination. Therefore, a scalable solution is needed to enrich the conceptual content of SNOMED CT. We are developing a structure-based, algorithmic method to identify potential concepts for enriching the conceptual content of SNOMED CT and to support semantic harmonization of SNOMED CT with selected other Unified Medical Language System (UMLS) terminologies. METHODS We first identified a subset of English terminologies in the UMLS that have 'PAR' relationships labeled with 'IS_A' and over 10% overlap with one or more of the 19 hierarchies of SNOMED CT. We call these "reference terminologies" and we note that our use of this name is different from the standard use. Next, we defined a set of topological patterns across pairs of terminologies, with SNOMED CT being one terminology in each pair and the other being one of the reference terminologies. We then explored how often these topological patterns appear between SNOMED CT and each reference terminology, and how to interpret them. RESULTS Four viable reference terminologies were identified. Large density differences between terminologies were found. Expected interpretations of these differences were indeed observed, as follows. A random sample of 299 instances of special topological patterns ("2:3 and 3:2 trapezoids") showed that 39.1% and 59.5% of analyzed concepts in SNOMED CT and in a reference terminology, respectively, were deemed to be alternative classifications of the same conceptual content. In 30.5% and 17.6% of the cases, it was found that intermediate concepts could be imported into SNOMED CT or into the reference terminology, respectively, to enhance their conceptual content, if approved by a human curator. Other cases included synonymy and errors in one of the terminologies. CONCLUSION These results show that structure-based algorithmic methods can be used to identify potential concepts to enrich SNOMED CT and the four reference terminologies. The comparative analysis has the future potential of supporting terminology authoring by suggesting new content to improve content coverage and semantic harmonization between terminologies.
Affiliation(s)
- Zhe He
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- James Geller
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA
- Yan Chen
  - Department of Computer Information Systems, Borough of Manhattan Community College, City University New York, New York, NY 10007, USA
|
19
|
Ochs C, Geller J, Perl Y, Chen Y, Agrawal A, Case JT, Hripcsak G. A tribal abstraction network for SNOMED CT target hierarchies without attribute relationships. J Am Med Inform Assoc 2014; 22:628-39. [PMID: 25332354 DOI: 10.1136/amiajnl-2014-003173] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 07/18/2014] [Accepted: 09/20/2014] [Indexed: 11/03/2022]
Abstract
OBJECTIVE Large and complex terminologies, such as Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT), are prone to errors and inconsistencies. Abstraction networks are compact summarizations of the content and structure of a terminology. Abstraction networks have been shown to support terminology quality assurance. In this paper, we introduce an abstraction network derivation methodology which can be applied to SNOMED CT target hierarchies whose classes are defined using only hierarchical relationships (ie, without attribute relationships) and to similar description-logic-based terminologies. METHODS We introduce the tribal abstraction network (TAN), based on the notion of a tribe: a subhierarchy rooted at a child of a hierarchy root, assuming only the existence of concepts with multiple parents. The TAN summarizes a hierarchy that does not have attribute relationships using sets of concepts, called tribal units, that belong to exactly the same multiple tribes. Tribal units are further divided into refined tribal units, which contain closely related concepts. A quality assurance methodology that utilizes TAN summarizations is introduced. RESULTS A TAN is derived for the Observable entity hierarchy of SNOMED CT, summarizing its content. A TAN-based quality assurance review of the concepts of the hierarchy is performed, and erroneous concepts are shown to appear more frequently in large refined tribal units than in small refined tribal units. Furthermore, more erroneous concepts appear in large refined tribal units of more tribes than of fewer tribes. CONCLUSIONS In this paper we introduce the TAN for summarizing SNOMED CT target hierarchies. A TAN was derived for the Observable entity hierarchy of SNOMED CT. A quality assurance methodology utilizing the TAN was introduced and demonstrated.
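The notions of tribe and tribal unit in the METHODS can be illustrated with a small sketch. It is a reading of the abstract only, not the authors' algorithm: the hierarchy, concept names, and function names are invented, the hierarchy is assumed to be an acyclic child-to-parents map, and refined tribal units are not modeled. Each concept is labeled with the set of root children (tribes) it descends from, and concepts sharing exactly the same set of two or more tribes are grouped together.

```python
from collections import defaultdict


def tribes_of(concept, parents, root, cache):
    """Return the set of root children (tribe roots) that `concept` descends from."""
    if concept in cache:
        return cache[concept]
    found = set()
    for parent in parents.get(concept, []):
        if parent == root:
            found.add(concept)  # a child of the root heads its own tribe
        else:
            found |= tribes_of(parent, parents, root, cache)
    cache[concept] = frozenset(found)
    return cache[concept]


def tribal_units(parents, root):
    """Group concepts that belong to exactly the same set of multiple tribes."""
    cache = {}
    units = defaultdict(list)
    for concept in parents:
        tribes = tribes_of(concept, parents, root, cache)
        if len(tribes) >= 2:  # only concepts in more than one tribe
            units[tribes].append(concept)
    return {tribes: sorted(members) for tribes, members in units.items()}


# Invented toy hierarchy: child -> list of parents, with root "R".
toy_parents = {
    "A": ["R"], "B": ["R"], "C": ["R"],
    "x": ["A"],
    "y": ["A", "B"],
    "z": ["A", "B"],
    "w": ["y", "C"],
}
for tribes, members in tribal_units(toy_parents, "R").items():
    print(sorted(tribes), "->", members)
```

Here `y` and `z` fall into the same unit because both descend from tribes `A` and `B`, while `w` forms its own unit because it also descends from `C`. Concept `x` belongs to a single tribe and is left out of the grouping.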
Affiliation(s)
- Christopher Ochs
  - Computer Science Department, New Jersey Institute of Technology, Newark, New Jersey, USA
- James Geller
  - Computer Science Department, New Jersey Institute of Technology, Newark, New Jersey, USA
- Yehoshua Perl
  - Computer Science Department, New Jersey Institute of Technology, Newark, New Jersey, USA
- Yan Chen
  - Computer Information Systems Department, BMCC, CUNY, New York, New York, USA
- Ankur Agrawal
  - Department of Computer Science, Manhattan College, Riverdale, New York, USA
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
|