1
Kim HH, Park YR, Lee KH, Song YS, Kim JH. Clinical MetaData ontology: a simple classification scheme for data elements of clinical data based on semantics. BMC Med Inform Decis Mak 2019; 19:166. [PMID: 31429750 PMCID: PMC6701018 DOI: 10.1186/s12911-019-0877-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/17/2018] [Accepted: 07/24/2019] [Indexed: 11/26/2022]
Abstract
Background: The increasing use of common data elements (CDEs) in numerous research projects and clinical applications has made it imperative to create an effective classification scheme for the efficient management of these data elements. We applied high-level integrative modeling of entire clinical documents from real-world practice to create the Clinical MetaData Ontology (CMDO) for the appropriate classification and integration of CDEs that are in practical use in current clinical documents.
Methods: CMDO was developed using the General Formal Ontology method with a manual iterative process comprising five steps: (1) defining the scope of CMDO by conceptualizing its first-level terms based on an analysis of clinical-practice procedures, (2) identifying CMDO concepts for representing clinical data of general CDEs by examining how and what clinical data are generated with flows of clinical care practices, (3) assigning hierarchical relationships for CMDO concepts, (4) developing CMDO properties (e.g., synonyms, preferred terms, and definitions) for each CMDO concept, and (5) evaluating the utility of CMDO.
Results: We created CMDO comprising 189 concepts under the 4 first-level classes of Description, Event, Finding, and Procedure. CMDO has 256 definitions that cover the 189 CMDO concepts, with 459 synonyms for 139 (74.0%) of the concepts. All of the CDEs extracted from 6 HL7 templates, 25 clinical documents of 5 teaching hospitals, and 1 personal health record specification were successfully annotated by 41 (21.9%), 89 (47.6%), and 13 (7.0%) of the CMDO concepts, respectively. We created a CMDO Browser to facilitate navigation of the CMDO concept hierarchy and a CMDO-enabled CDE Browser for displaying the relationships between CMDO concepts and the CDEs extracted from the clinical documents that are used in current practice.
Conclusions: CMDO is an ontology and classification scheme for CDEs used in clinical documents. Given the increasing use of CDEs in many studies and real-world clinical documentation, CMDO will be a useful tool for integrating numerous CDEs from different research projects and clinical documents. The CMDO Browser and CMDO-enabled CDE Browser make it easy to search, share, and reuse CDEs, and also effectively integrate and manage CDEs from different studies and clinical documents.
Electronic supplementary material: The online version of this article (10.1186/s12911-019-0877-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Hye Hyeon Kim
  - Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Seoul National University Hospital Biomedical Research Institute, Seoul National University Hospital, Seoul, 03080, Republic of Korea
- Yu Rang Park
  - Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, 03722, Republic of Korea
- Kye Hwa Lee
  - Precision Medicine Center, Seoul National University Hospital, Seoul, 03080, Republic of Korea
- Young Soo Song
  - Department of Pathology, Hanyang University College of Medicine, Seoul, 04763, Republic of Korea
- Ju Han Kim
  - Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Division of Biomedical Informatics, Seoul National University College of Medicine, 103 Daehak-ro Jongno-gu, Seoul, 03080, Republic of Korea
2
Maarouf H, Taboada M, Rodriguez H, Arias M, Sesar Á, Sobrido MJ. An ontology-aware integration of clinical models, terminologies and guidelines: an exploratory study of the Scale for the Assessment and Rating of Ataxia (SARA). BMC Med Inform Decis Mak 2017; 17:159. [PMID: 29207981 PMCID: PMC5718136 DOI: 10.1186/s12911-017-0568-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 05/29/2017] [Accepted: 11/24/2017] [Indexed: 11/30/2022]
Abstract
Background: Electronic rating scales represent an important resource for standardized data collection. However, the ability to exploit reasoning on rating scale data is still limited. The objective of this work is to facilitate the integration of the semantics required to automatically interpret collections of standardized clinical data. We developed an electronic prototype for the Scale for the Assessment and Rating of Ataxia (SARA), broadly used in neurology. In order to address the modeling challenges of the SARA, we propose to combine the best performances from OpenEHR clinical archetypes, guidelines and ontologies.
Methods: A scaled-down version of the Human Phenotype Ontology (HPO) was built, extracting the terms that describe the SARA tests from free-text sources. This version of the HPO was then used as a backbone to normalize the content of the SARA through clinical archetypes. The knowledge required to exploit reasoning on the SARA data was modeled as separate information-processing units interconnected via the defined archetypes. Each unit used the most appropriate technology to formally represent the required knowledge.
Results: Based on this approach, we implemented a prototype named SARA Management System, to be used for both the assessment of cerebellar syndrome and the production of a clinical synopsis. For validation purposes, we used recorded SARA data from 28 anonymous subjects affected by Spinocerebellar Ataxia Type 36 (SCA36). When comparing the performance of our prototype with that of two independent experts, weighted kappa scores ranged from 0.62 to 0.86.
Conclusions: The combination of archetypes, phenotype ontologies and electronic information-processing rules can be used to automate the extraction of relevant clinical knowledge from plain scores of rating scales. Our results reveal a substantial degree of agreement between the results achieved by an ontology-aware system and the human experts.
Electronic supplementary material: The online version of this article (10.1186/s12911-017-0568-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Haitham Maarouf
  - Department of Electronics & Computer Science, Campus Vida, University of Santiago de Compostela, Santiago de Compostela, Spain
- María Taboada
  - Department of Electronics & Computer Science, Campus Vida, University of Santiago de Compostela, Santiago de Compostela, Spain
- Hadriana Rodriguez
  - Department of Electronics & Computer Science, Campus Vida, University of Santiago de Compostela, Santiago de Compostela, Spain
- Manuel Arias
  - Department of Neurology, University Hospital of Santiago de Compostela, Santiago de Compostela, Spain
- Ángel Sesar
  - Department of Neurology, University Hospital of Santiago de Compostela, Santiago de Compostela, Spain
- María Jesús Sobrido
  - Instituto de Investigación Sanitaria (IDIS), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Santiago de Compostela, Spain
3
Park YR, Kim JJ, Yoon YJ, Yoon YK, Koo HY, Hong YM, Jang GY, Shin SY, Lee JK. Establishment of Kawasaki disease database based on metadata standard. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2017; 2016:baw109. [PMID: 27630202 PMCID: PMC4962667 DOI: 10.1093/database/baw109] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 02/28/2016] [Accepted: 06/29/2016] [Indexed: 12/17/2022]
Abstract
Kawasaki disease (KD) is a rare disease that occurs predominantly in infants and young children. To identify KD susceptibility genes and to develop a diagnostic test, a specific therapy, or prevention method, collecting KD patients’ clinical and genomic data is one of the major issues. For this purpose, Kawasaki Disease Database (KDD) was developed based on the efforts of Korean Kawasaki Disease Genetics Consortium (KKDGC). KDD is a collection of 1292 clinical data and genomic samples of 1283 patients from 13 KKDGC-participating hospitals. Each sample contains the relevant clinical data, genomic DNA and plasma samples isolated from patients’ blood, omics data and KD-associated genotype data. Clinical data was collected and saved using the common data elements based on the ISO/IEC 11179 metadata standard. Data from two genome-wide association studies (482 samples in total) and whole exome sequencing data of 12 samples were also collected. In addition, KDD includes the rare cases of KD (16 cases with family history, 46 cases with recurrence, 119 cases with intravenous immunoglobulin non-responsiveness, and 52 cases with coronary artery aneurysm). As the first public database for KD, KDD can significantly facilitate KD studies. All data in KDD are searchable and downloadable. KDD was implemented in PHP, MySQL and Apache, with all major browsers supported. Database URL: http://www.kawasakidisease.kr
Affiliation(s)
- Yu Rang Park
  - Clinical Research Center, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea
  - Department of Biomedical Informatics, Asan Medical Center, Seoul, Korea
- Jae-Jung Kim
  - Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea
- Young Jo Yoon
  - Clinical Research Center, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea
- Young-Kwang Yoon
  - Clinical Research Center, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea
- Ha Yeong Koo
  - Clinical Research Center, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea
- Young Mi Hong
  - Department of Pediatrics, Ewha Womans University Hospital, Seoul, Korea
- Gi Young Jang
  - Department of Pediatrics, Korea University Hospital, Seoul, Korea
- Soo-Yong Shin
  - Department of Biomedical Informatics, Asan Medical Center, Seoul, Korea
- Jong-Keuk Lee
  - Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea
4
Farber GK. Can data repositories help find effective treatments for complex diseases? Prog Neurobiol 2017; 152:200-212. [PMID: 27018167 PMCID: PMC5035561 DOI: 10.1016/j.pneurobio.2016.03.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 06/09/2015] [Revised: 12/31/2015] [Accepted: 03/22/2016] [Indexed: 01/28/2023]
Abstract
There are many challenges to developing treatments for complex diseases. This review explores the question of whether it is possible to imagine a data repository that would increase the pace of understanding complex diseases sufficiently well to facilitate the development of effective treatments. First, consideration is given to the amount of data that might be needed for such a data repository and whether the existing data storage infrastructure is enough. Several successful data repositories are then examined to see if they have common characteristics. An area of science where attempts to develop a data infrastructure were unsuccessful is then described to see what lessons could be learned for a data repository devoted to complex disease. Then, a variety of issues related to sharing data are discussed. In some of these areas, it is reasonably clear how to move forward. In other areas, there are significant open questions that need to be addressed by all data repositories. Using that baseline information, the question of whether data archives can be effective in understanding a complex disease is explored. The major goal of such a data archive is likely to be identifying biomarkers that define sub-populations of the disease.
Affiliation(s)
- Gregory K Farber
  - Office of Technology Development and Coordination, National Institute of Mental Health, National Institutes of Health, 6001 Executive Boulevard, Room 7162, Rockville, MD 20892-9640, USA
5
6
Lin CH, Fann YC, Liou DM. An exploratory study using an openEHR 2-level modeling approach to represent common data elements. J Am Med Inform Assoc 2016; 23:956-67. [PMID: 26911823 DOI: 10.1093/jamia/ocv137] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 11/13/2014] [Accepted: 07/28/2015] [Indexed: 11/13/2022]
Abstract
BACKGROUND AND OBJECTIVE: In order to facilitate clinical research across multiple institutions, data harmonization is a critical requirement. Common data elements (CDEs) collect data uniformly, allowing data interoperability between research studies. However, structural limitations have hindered the application of CDEs. An advanced modeling structure is needed to rectify such limitations. The openEHR 2-level modeling approach has been widely implemented in the medical informatics domain. The aim of our study is to explore the feasibility of applying an openEHR approach to model the CDE concept.
MATERIALS AND METHODS: Using the National Institute of Neurological Disorders and Stroke General CDEs as material, we developed a semiautomatic mapping tool to assist domain experts in mapping CDEs to existing openEHR archetypes in order to evaluate their coverage and to allow further analysis. In addition, we modeled a set of CDEs using the openEHR approach to evaluate the ability of archetypes to structurally represent any type of CDE content.
RESULTS: Among 184 CDEs, 28% (51) of the archetypes could be directly used to represent CDEs, while 53% (98) of the archetypes required further development (extension or specialization). A comprehensive comparison between CDEs and openEHR archetypes was conducted based on the lessons learnt from the practical modeling.
DISCUSSION: CDEs and archetypes have dissimilar modeling approaches, but the data structures of both models are essentially similar. This study proposes to develop a comprehensive structure to model CDE concepts instead of improving the structure of CDE.
CONCLUSION: The findings from this research show that the openEHR archetype has structural coverage for the CDEs, namely the openEHR archetype is able to represent the CDEs and meet the functional expectations of the CDEs. This work can be used as a reference when improving CDE structure using an advanced modeling approach.
Affiliation(s)
- Ching-Heng Lin
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan
- Yang-Cheng Fann
  - National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland, USA
- Der-Ming Liou
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan
7
Hesse BW, Moser RP, Riley WT. From Big Data to Knowledge in the Social Sciences. THE ANNALS OF THE AMERICAN ACADEMY OF POLITICAL AND SOCIAL SCIENCE 2015; 659:16-32. [PMID: 26294799 PMCID: PMC4539961 DOI: 10.1177/0002716215570007] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 05/30/2023]
Abstract
One of the challenges associated with high-volume, diverse datasets is whether synthesis of open data streams can translate into actionable knowledge. Recognizing that challenge and other issues related to these types of data, the National Institutes of Health developed the Big Data to Knowledge or BD2K initiative. The concept of translating "big data to knowledge" is important to the social and behavioral sciences in several respects. First, a general shift to data-intensive science will exert an influence on all scientific disciplines, but particularly on the behavioral and social sciences given the wealth of behavior and related constructs captured by big data sources. Second, science is itself a social enterprise; by applying principles from the social sciences to the conduct of research, it should be possible to ameliorate some of the systemic problems that plague the scientific enterprise in the age of big data. We explore the feasibility of recalibrating the basic mechanisms of the scientific enterprise so that they are more transparent and cumulative; more integrative and cohesive; and more rapid, relevant, and responsive.
8
Chow M, Beene M, O’Brien A, Greim P, Cromwell T, DuLong D, Bedecarré D. A nursing information model process for interoperability. J Am Med Inform Assoc 2015; 22:608-14. [DOI: 10.1093/jamia/ocu026] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 07/01/2014] [Accepted: 11/14/2014] [Indexed: 11/12/2022]
Abstract
The ability to share nursing data across organizations and electronic health records is a key component of improving care coordination and quality outcomes. Currently, substantial organizational and technical barriers limit the ability to share and compare essential patient data that inform nursing care. Nursing leaders at Kaiser Permanente and the U.S. Department of Veterans Affairs collaborated on the development of an evidence-based information model driven by nursing practice to enable data capture, re-use, and sharing between organizations and disparate electronic health records. This article describes a framework with repeatable steps and processes to enable the semantic interoperability of relevant and contextual nursing data. Hospital-acquired pressure ulcer prevention was selected as the prototype nurse-sensitive quality measure to develop and test the model. In a Health 2.0 Developer Challenge program from the Office of the National Coordinator for Health, mobile applications implemented the model to help nurses assess the risk of hospital-acquired pressure ulcers and reduce their severity. The common information model can be applied to other nurse-sensitive measures to enable data standardization supporting patient transitions between care settings, quality reporting, and research.
Affiliation(s)
- Marilyn Chow
  - National Patient Care Services, Kaiser Permanente, Oakland, CA, USA
- Ann O’Brien
  - Kaiser Permanente-Information Technology, Pleasanton, CA, USA
- Tim Cromwell
  - Department of Veterans Affairs, Washington, DC, USA