1
Shafiee F, Sarbaz M, Marouzi P, Banaye Yazdipour A, Kimiafar K. Providing a framework for evaluation disease registry and health outcomes Software: Updating the CIPROS checklist. J Biomed Inform 2024; 149:104574. [PMID: 38101688 DOI: 10.1016/j.jbi.2023.104574] [Received: 06/08/2023] [Revised: 11/27/2023] [Accepted: 12/08/2023] [Indexed: 12/17/2023]
Abstract
BACKGROUND AND AIMS Properly designed and implemented registry systems play an important role in improving health outcomes and reducing care costs, and can provide a true representation of clinical practice, disease outcomes, safety, and efficacy. Therefore, the aim of this study was to redesign and develop the checklist with items for patient registry software systems (CIPROS). METHOD This was a descriptive cross-sectional study. The data elements of the checklist were first extracted through a comprehensive literature review of the PubMed, Science Direct, and Scopus databases, retrieving articles related to the evaluation of registry systems. Based on the extracted data, a five-point Likert scale questionnaire was created, and 30 experts in this field were asked for their opinions using the two-step Delphi method. RESULTS A total of 100 information items were selected for the registry software evaluation checklist. The checklist comprises 12 groups of factors: software architecture, development, interfaces and interactivity, semantics and standardization, internationality, data management, data quality and usability, data analysis, security, privacy, organizational, education and public factors. CONCLUSION The results of this research make it possible to identify the defects and strengths of registry software and to put them at the disposal of the relevant officials for decision-making. In this way, the software best suited to the needs of registry programs can be selected from among the products of designers and developers.
Affiliation(s)
- Fatemeh Shafiee
  - Department of Health Information Technology, School of Paramedical and Rehabilitation Sciences, Mashhad University of Medical Sciences, Mashhad, Iran.
- Masoume Sarbaz
  - Department of Health Information Technology, School of Paramedical and Rehabilitation Sciences, Mashhad University of Medical Sciences, Mashhad, Iran.
- Parviz Marouzi
  - Department of Health Information Technology, School of Paramedical and Rehabilitation Sciences, Mashhad University of Medical Sciences, Mashhad, Iran.
- Alireza Banaye Yazdipour
  - Department of Health Information Technology, School of Paramedical and Rehabilitation Sciences, Mashhad University of Medical Sciences, Mashhad, Iran; Department of Health Information Management and Medical Informatics, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran; Students' Scientific Research Center (SSRC), Tehran University of Medical Sciences, Tehran, Iran.
- Khalil Kimiafar
  - Department of Health Information Technology, School of Paramedical and Rehabilitation Sciences, Mashhad University of Medical Sciences, Mashhad, Iran.
2
Adams MCB, Hurley RW, Siddons A, Topaloglu U, Wandner LD. NIH HEAL Clinical Data Elements (CDE) implementation: NIH HEAL Initiative IMPOWR network IDEA-CC. Pain Medicine (Malden, Mass.) 2023; 24:743-749. [PMID: 36799548 PMCID: PMC10321760 DOI: 10.1093/pm/pnad018] [Received: 01/28/2023] [Revised: 02/14/2023] [Accepted: 02/15/2023] [Indexed: 02/18/2023]
Abstract
OBJECTIVE The National Institutes of Health (NIH) HEAL Initiative is making data findable, accessible, interoperable, and reusable (FAIR) to maximize the value of the unprecedented federal investment in pain and opioid-use disorder research. This involves standardizing the use of common data elements (CDE) for clinical research. METHODS This work describes the process of the selection, processing, harmonization, and design constraints of CDE across a pain and opioid use disorder clinical trials network (NIH HEAL IMPOWR). RESULTS The network alignment allowed for incorporation of newer data standards across the clinical trials. Specific advances included geographic coding (RUCA), deidentified patient identifiers (GUID), shareable clinical survey libraries (REDCap), and concept mapping to standardized concepts (UMLS). CONCLUSIONS While complex, harmonization across a network of chronic pain and opioid use disorder clinical trials with separate interventions can be optimized through use of CDEs and data standardization processes. This standardization process will support the robust secondary data analyses. Scaling this process could standardize CDE results across interventions or disease state which could help inform insurance companies or government organizations about coverage determinations. The development of the HEAL CDE program supports connecting isolated studies and solutions to each other, but the practical aspects may be challenging for some studies to implement. Leveraging tools and technology to simplify process and create ready to use resources may support wider adoption of consistent data standards.
Affiliation(s)
- Meredith C B Adams
  - Departments of Anesthesiology, Biomedical Informatics, and Public Health Sciences, Wake Forest University School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, United States
- Robert W Hurley
  - Departments of Anesthesiology, Translational Neuroscience, and Public Health Sciences, Wake Forest University School of Medicine, Winston-Salem, NC 27157, United States
- Andrew Siddons
  - National Institute of Neurological Disorders and Stroke, Bethesda, MD, United States
- Umit Topaloglu
  - Department of Cancer Biology, Wake Forest University School of Medicine, Winston-Salem, NC 27157, United States
- Laura D Wandner
  - National Institute of Neurological Disorders and Stroke, Bethesda, MD, United States
3
Jing X. The Unified Medical Language System at 30 Years and How It Is Used and Published: Systematic Review and Content Analysis. JMIR Med Inform 2021; 9:e20675. [PMID: 34236337 PMCID: PMC8433943 DOI: 10.2196/20675] [Received: 05/25/2020] [Revised: 11/25/2020] [Accepted: 07/02/2021] [Indexed: 01/22/2023]
Abstract
BACKGROUND The Unified Medical Language System (UMLS) has been a critical tool in biomedical and health informatics, and the year 2021 marks its 30th anniversary. The UMLS brings together many broadly used vocabularies and standards in the biomedical field to facilitate interoperability among different computer systems and applications. OBJECTIVE Despite its longevity, there is no comprehensive publication analysis of the use of the UMLS. Thus, this review and analysis is conducted to provide an overview of the UMLS and its use in English-language peer-reviewed publications, with the objective of providing a comprehensive understanding of how the UMLS has been used in English-language peer-reviewed publications over the last 30 years. METHODS PubMed, ACM Digital Library, and the Nursing & Allied Health Database were used to search for studies. The primary search strategy was as follows: UMLS was used as a Medical Subject Headings term or a keyword or appeared in the title or abstract. Only English-language publications were considered. The publications were screened first, then coded and categorized iteratively, following the grounded theory. The review process followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. RESULTS A total of 943 publications were included in the final analysis. Moreover, 32 publications were categorized into 2 categories; hence the total number of publications before duplicates are removed is 975. 
After analysis and categorization of the publications, UMLS was found to be used in the following emerging themes or areas (the number of publications and their respective percentages are given in parentheses): natural language processing (230/975, 23.6%), information retrieval (125/975, 12.8%), terminology study (90/975, 9.2%), ontology and modeling (80/975, 8.2%), medical subdomains (76/975, 7.8%), other language studies (53/975, 5.4%), artificial intelligence tools and applications (46/975, 4.7%), patient care (35/975, 3.6%), data mining and knowledge discovery (25/975, 2.6%), medical education (20/975, 2.1%), degree-related theses (13/975, 1.3%), digital library (5/975, 0.5%), and the UMLS itself (150/975, 15.4%), as well as the UMLS for other purposes (27/975, 2.8%). CONCLUSIONS The UMLS has been used successfully in patient care, medical education, digital libraries, and software development, as originally planned, as well as in degree-related theses, the building of artificial intelligence tools, data mining and knowledge discovery, foundational work in methodology, and middle layers that may lead to advanced products. Natural language processing, the UMLS itself, and information retrieval are the 3 most common themes that emerged among the included publications. The results, although largely related to academia, demonstrate that UMLS achieves its intended uses successfully, in addition to achieving uses broadly beyond its original intentions.
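The tallies reported in this abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the cited study: the counts are copied from the abstract, while the variable names are illustrative assumptions. It confirms that the theme counts sum to 975 (943 included publications plus 32 counted twice) and recomputes the percentages and the three most common themes.

```python
# Theme counts as reported in the abstract (publications per theme).
# Variable names are illustrative; only the numbers come from the abstract.
theme_counts = {
    "natural language processing": 230,
    "information retrieval": 125,
    "terminology study": 90,
    "ontology and modeling": 80,
    "medical subdomains": 76,
    "other language studies": 53,
    "artificial intelligence tools and applications": 46,
    "patient care": 35,
    "data mining and knowledge discovery": 25,
    "medical education": 20,
    "degree-related theses": 13,
    "digital library": 5,
    "the UMLS itself": 150,
    "the UMLS for other purposes": 27,
}

included = 943       # publications in the final analysis
double_counted = 32  # publications assigned to 2 categories

# The per-theme counts should add up to the stated denominator.
total = sum(theme_counts.values())
assert total == included + double_counted

# Recompute each percentage to one decimal place.
percentages = {theme: round(100 * n / total, 1) for theme, n in theme_counts.items()}

# Rank themes by publication count and keep the top three.
top3 = sorted(theme_counts, key=theme_counts.get, reverse=True)[:3]

print(total)
print(percentages["natural language processing"])
print(top3)
```

The ranking agrees with the abstract's conclusion that natural language processing, the UMLS itself, and information retrieval are the three most common themes.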
Affiliation(s)
- Xia Jing
  - Department of Public Health Sciences, College of Behavioral, Social and Health Sciences, Clemson University, Clemson, SC, United States
4
O'Connor MJ, Warzel DB, Martínez-Romero M, Hardi J, Willrett D, Egyedi AL, Eftekhari A, Graybeal J, Musen MA. Unleashing the value of Common Data Elements through the CEDAR Workbench. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2020; 2019:681-690. [PMID: 32308863 PMCID: PMC7153094] [Indexed: 06/11/2023]
Abstract
Developing promising treatments in biomedicine often requires aggregation and analysis of data from disparate sources across the healthcare and research spectrum. To facilitate these approaches, there is a growing focus on supporting interoperation of datasets by standardizing data-capture and reporting requirements. Common Data Elements (CDEs), precise specifications of questions and the set of allowable answers to each question, are increasingly being adopted to help meet these standardization goals. While CDEs can provide a strong conceptual foundation for interoperation, there are no widely recognized serialization or interchange formats to describe and exchange their definitions. As a result, CDEs defined in one system cannot easily be reused by other systems. An additional problem is that current CDE-based systems tend to be rather heavyweight and cannot be easily adopted and used by third parties. To address these problems, we developed extensions to a metadata management system called the CEDAR Workbench to provide a platform to simplify the creation, exchange, and use of CDEs. We show how the resulting system allows users to quickly define and share CDEs and to immediately use these CDEs to build and deploy Web-based forms to acquire conforming metadata. We also show how we incorporated a large CDE library from the National Cancer Institute's caDSR system and made these CDEs publicly available for general use.
Affiliation(s)
- Martin J O'Connor
  - Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Denise B Warzel
  - Cancer Informatics Branch, National Cancer Institute, Bethesda, MD, USA
- Josef Hardi
  - Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Debra Willrett
  - Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Attila L Egyedi
  - Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- John Graybeal
  - Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Mark A Musen
  - Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
5
Fung KW, Xu J, Gold S. The Use of Inter-terminology Maps for the Creation and Maintenance of Value Sets. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2020; 2019:438-447. [PMID: 32308837 PMCID: PMC7153132] [Indexed: 06/11/2023]
Abstract
Value sets are essential in activities such as electronic clinical quality measures (eCQM) and patient cohort definition. Creation and maintenance of value sets is labor intensive and error prone. Our method aims to use existing inter-terminology maps to improve the quality of value sets that are defined in more than one terminology. For 197 eCQM value sets defined in SNOMED CT plus ICD-9-CM and/or ICD-10-CM, the map-generated codes showed good overlap with the value set codes. Manual review showed that some new codes identified by mapping should probably be included in the value sets. This could potentially augment the ICD-9-CM codes by 45% (1.5 codes), ICD-10-CM codes by 25% (1.8 codes) and SNOMED CT codes by up to 42% (4.8 codes) per value set on average. The mapping between SNOMED CT and ICD-10-PCS did not perform as well because of the granularity discrepancy in the map.
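The general idea of checking a multi-terminology value set against an inter-terminology map can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' implementation: the function names, the dictionary-based map, and the placeholder codes (`S1`, `I1`, and so on) are all hypothetical and do not correspond to real SNOMED CT or ICD-10-CM codes.

```python
from typing import Dict, Set


def map_generated_codes(source_codes: Set[str], term_map: Dict[str, Set[str]]) -> Set[str]:
    """Collect every target code reachable from the source codes via the map."""
    generated: Set[str] = set()
    for code in source_codes:
        generated |= term_map.get(code, set())
    return generated


def compare_with_value_set(value_set_codes: Set[str], generated: Set[str]) -> dict:
    """Report the overlap and the map-generated codes missing from the value set."""
    overlap = value_set_codes & generated
    candidates = generated - value_set_codes  # would still need manual review
    pct = 100 * len(candidates) / len(value_set_codes) if value_set_codes else 0.0
    return {"overlap": overlap, "candidates": candidates, "augmentation_pct": pct}


# Hypothetical value set defined in two terminologies (placeholder codes only).
snomed_codes = {"S1", "S2", "S3"}
icd10_codes = {"I1", "I2", "I3", "I4"}

# Hypothetical source-to-target map; S2 maps to a code the value set lacks.
snomed_to_icd10 = {"S1": {"I1"}, "S2": {"I2", "I5"}, "S3": {"I3"}}

generated = map_generated_codes(snomed_codes, snomed_to_icd10)
result = compare_with_value_set(icd10_codes, generated)
print(sorted(result["overlap"]), sorted(result["candidates"]), result["augmentation_pct"])
```

In this toy case the map surfaces one candidate code (`I5`) for a four-code list, a 25% potential augmentation. As the abstract notes, such candidates are suggestions for manual review rather than automatic additions.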
Affiliation(s)
- Kin Wah Fung
  - National Library of Medicine, National Institutes of Health, Bethesda, MD
- Julia Xu
  - National Library of Medicine, National Institutes of Health, Bethesda, MD
- Sigfried Gold
  - National Library of Medicine, National Institutes of Health, Bethesda, MD
6
Kim HH, Park YR, Lee KH, Song YS, Kim JH. Clinical MetaData ontology: a simple classification scheme for data elements of clinical data based on semantics. BMC Med Inform Decis Mak 2019; 19:166. [PMID: 31429750 PMCID: PMC6701018 DOI: 10.1186/s12911-019-0877-x] [Received: 09/17/2018] [Accepted: 07/24/2019] [Indexed: 11/26/2022]
Abstract
Background The increasing use of common data elements (CDEs) in numerous research projects and clinical applications has made it imperative to create an effective classification scheme for the efficient management of these data elements. We applied high-level integrative modeling of entire clinical documents from real-world practice to create the Clinical MetaData Ontology (CMDO) for the appropriate classification and integration of CDEs that are in practical use in current clinical documents. Methods CMDO was developed using the General Formal Ontology method with a manual iterative process comprising five steps: (1) defining the scope of CMDO by conceptualizing its first-level terms based on an analysis of clinical-practice procedures, (2) identifying CMDO concepts for representing clinical data of general CDEs by examining how and what clinical data are generated with flows of clinical care practices, (3) assigning hierarchical relationships for CMDO concepts, (4) developing CMDO properties (e.g., synonyms, preferred terms, and definitions) for each CMDO concept, and (5) evaluating the utility of CMDO. Results We created CMDO comprising 189 concepts under the 4 first-level classes of Description, Event, Finding, and Procedure. CMDO has 256 definitions that cover the 189 CMDO concepts, with 459 synonyms for 139 (74.0%) of the concepts. All of the CDEs extracted from 6 HL7 templates, 25 clinical documents of 5 teaching hospitals, and 1 personal health record specification were successfully annotated by 41 (21.9%), 89 (47.6%), and 13 (7.0%) of the CMDO concepts, respectively. We created a CMDO Browser to facilitate navigation of the CMDO concept hierarchy and a CMDO-enabled CDE Browser for displaying the relationships between CMDO concepts and the CDEs extracted from the clinical documents that are used in current practice. Conclusions CMDO is an ontology and classification scheme for CDEs used in clinical documents. 
Given the increasing use of CDEs in many studies and real-world clinical documentation, CMDO will be a useful tool for integrating numerous CDEs from different research projects and clinical documents. The CMDO Browser and CMDO-enabled CDE Browser make it easy to search, share, and reuse CDEs, and also effectively integrate and manage CDEs from different studies and clinical documents.
Affiliation(s)
- Hye Hyeon Kim
  - Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Seoul National University Hospital Biomedical Research Institute, Seoul National University Hospital, Seoul, 03080, Republic of Korea
- Yu Rang Park
  - Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, 03722, Republic of Korea
- Kye Hwa Lee
  - Precision Medicine Center, Seoul National University Hospital, Seoul, 03080, Republic of Korea
- Young Soo Song
  - Department of Pathology, Hanyang University College of Medicine, Seoul, 04763, Republic of Korea
- Ju Han Kim
  - Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Division of Biomedical Informatics, Seoul National University College of Medicine, 103 Daehak-ro Jongno-gu, Seoul, 03080, Republic of Korea
7
He Z, Keloth VK, Chen Y, Geller J. Extended Analysis of Topological-Pattern-Based Ontology Enrichment. Proceedings. IEEE International Conference on Bioinformatics and Biomedicine 2019; 2018:1641-1648. [PMID: 30854243 DOI: 10.1109/bibm.2018.8621564] [Indexed: 11/08/2022]
Abstract
Maintenance of biomedical ontologies is difficult. We have previously developed a topological-pattern-based method to deal with the problem of identifying concepts in a reference ontology that could be of interest for insertion into a target ontology. Assuming that both ontologies are parts of the Unified Medical Language System (UMLS), the method suggests approximate locations where the target ontology could be extended with new concepts from the reference ontology. However, the final decision about each concept has to be made by a human expert. In this paper, we describe the universe of cross-ontology topological patterns in quantitative terms. We then present a theoretical analysis of the number of potential placements of reference concepts in a path in a target ontology, allowing for new cross-ontology synonyms. This provides a rough estimate of what expert resources need to be allocated for the task. One insight in previous work on this topic was the large percentage of cases where importing concepts was impossible, due to a configuration called "alternative classification." In this paper, we confirm this observation. Our target ontology is the National Cancer Institute thesaurus (NCIt). However, the methods can be applied to other pairs of ontologies with hierarchical relationships from the UMLS.
Affiliation(s)
- Zhe He
  - School of Information, Florida State University, Tallahassee, Florida, USA
- Yan Chen
  - Department of Computer Information Systems, BMCC, CUNY, New York, NY, USA
- James Geller
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
8
Gold S, Batch A, McClure R, Jiang G, Kharrazi H, Saripalle R, Huser V, Weng C, Roderer N, Szarfman A, Elmqvist N, Gotz D. Clinical Concept Value Sets and Interoperability in Health Data Analytics. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2018; 2018:480-489. [PMID: 30815088 PMCID: PMC6371254] [Indexed: 06/09/2023]
Abstract
This paper focuses on value sets as an essential component in the health analytics ecosystem. We discuss shared repositories of reusable value sets and offer recommendations for their further development and adoption. In order to motivate these contributions, we explain how value sets fit into specific analytic tasks and the health analytics landscape more broadly; their growing importance and ubiquity with the advent of Common Data Models, Distributed Research Networks, and the availability of higher order, reusable analytic resources like electronic phenotypes and electronic clinical quality measures; the formidable barriers to value set reuse; and our introduction of a concept-agnostic orientation to vocabulary collections. The costs of ad hoc value set management and the benefits of value set reuse are described or implied throughout. Our standards, infrastructure, and design recommendations are not systematic or comprehensive but invite further work to support value set reuse for health analytics. The views represented in the paper do not necessarily represent the views of the institutions or of all the co-authors.
Affiliation(s)
- Sigfried Gold
  - University of Maryland, College Park
  - Observational Health Data Sciences and Informatics
- Guoqian Jiang
  - Mayo Clinic, Rochester, MN
  - Observational Health Data Sciences and Informatics
- Vojtech Huser
  - National Library of Medicine, Bethesda, MD
  - Observational Health Data Sciences and Informatics
- Chunhua Weng
  - Columbia University, New York, NY
  - Observational Health Data Sciences and Informatics
- Ana Szarfman
  - Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD
- David Gotz
  - University of North Carolina, Chapel Hill, NC
9
Haendel MA, McMurry JA, Relevo R, Mungall CJ, Robinson PN, Chute CG. A Census of Disease Ontologies. Annu Rev Biomed Data Sci 2018. [DOI: 10.1146/annurev-biodatasci-080917-013459] [Indexed: 11/09/2022]
Abstract
For centuries, humans have sought to classify diseases based on phenotypic presentation and available treatments. Today, a wide landscape of strategies, resources, and tools exist to classify patients and diseases. Ontologies can provide a robust foundation of logic for precise stratification and classification along diverse axes such as etiology, development, treatment, and genetics. Disease and phenotype ontologies are used in four primary ways: (a) search, retrieval, and annotation of knowledge; (b) data integration and analysis; (c) clinical decision support; and (d) knowledge discovery. Computational inference can connect existing knowledge and generate new insights and hypotheses about drug targets, prognosis prediction, or diagnosis. In this review, we examine the rise of disease and phenotype ontologies and the diverse ways they are represented and applied in biomedicine.
Affiliation(s)
- Melissa A. Haendel
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon 97239, USA
  - Linus Pauling Institute, Oregon State University, Corvallis, Oregon 97331, USA
- Julie A. McMurry
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon 97239, USA
- Rose Relevo
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon 97239, USA
- Christopher J. Mungall
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Christopher G. Chute
  - School of Medicine, School of Public Health, and School of Nursing, Johns Hopkins University, Baltimore, Maryland 21205, USA
10
Chen HW, Du J, Song HY, Liu X, Jiang G, Tao C. Representation of Time-Relevant Common Data Elements in the Cancer Data Standards Repository: Statistical Evaluation of an Ontological Approach. JMIR Med Inform 2018; 6:e7. [PMID: 29472179 PMCID: PMC5843793 DOI: 10.2196/medinform.8175] [Received: 06/09/2017] [Revised: 09/17/2017] [Accepted: 11/16/2017] [Indexed: 11/20/2022]
Abstract
Background Today, there is an increasing need to centralize and standardize electronic health data within clinical research as the volume of data continues to balloon. Domain-specific common data elements (CDEs) are emerging as a standard approach to clinical research data capturing and reporting. Recent efforts to standardize clinical study CDEs have been of great benefit in facilitating data integration and data sharing. The importance of the temporal dimension of clinical research studies has been well recognized; however, very few studies have focused on the formal representation of temporal constraints and temporal relationships within clinical research data in the biomedical research community. In particular, temporal information can be extremely powerful to enable high-quality cancer research. Objective The objective of the study was to develop and evaluate an ontological approach to represent the temporal aspects of cancer study CDEs. Methods We used CDEs recorded in the National Cancer Institute (NCI) Cancer Data Standards Repository (caDSR) and created a CDE parser to extract time-relevant CDEs from the caDSR. Using the Web Ontology Language (OWL)–based Time Event Ontology (TEO), we manually derived representative patterns to semantically model the temporal components of the CDEs using an observing set of randomly selected time-related CDEs (n=600) to create a set of TEO ontological representation patterns. In evaluating TEO’s ability to represent the temporal components of the CDEs, this set of representation patterns was tested against two test sets of randomly selected time-related CDEs (n=425). Results It was found that 94.2% (801/850) of the CDEs in the test sets could be represented by the TEO representation patterns. Conclusions In conclusion, TEO is a good ontological model for representing the temporal components of the CDEs recorded in caDSR. 
Our representative model can harness the Semantic Web reasoning and inferencing functionalities and present a means for temporal CDEs to be machine-readable, streamlining meaningful searches.
Affiliation(s)
- Henry W Chen
  - The University of Texas at Austin, Austin, TX, United States
- Jingcheng Du
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Hsing-Yi Song
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Xiangyu Liu
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Guoqian Jiang
  - Mayo Clinic College of Medicine, Rochester, MN, United States
- Cui Tao
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
11
He Z, Chen Y, de Coronado S, Piskorski K, Geller J. Topological-Pattern-Based Recommendation of UMLS Concepts for National Cancer Institute Thesaurus. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2017; 2016:618-627. [PMID: 28269858 PMCID: PMC5333219] [Indexed: 06/06/2023]
Abstract
The National Cancer Institute Thesaurus (NCIt) is a reference terminology used to support clinical, translational and basic research as well as administrative activities. As medical knowledge evolves, concepts that might be missing from a particular needed subdomain are regularly added to the NCIt. However, terminology development is known to be labor-intensive and error-prone. Therefore, cost-effective semi-automated methods for identifying potentially missing concepts would be useful to terminology curators. Previously, we have developed a structural method leveraging the native term mappings of the Unified Medical Language System to identify potential concepts in several of its source vocabularies to enrich the SNOMED CT. In this paper, we tested an analogous method for NCIt. Concepts from eight UMLS source terminologies were identified as possibilities to enrich NCIt's conceptual content.
Affiliation(s)
- Zhe He
  - School of Information, Florida State University, Tallahassee, FL; Institute for Successful Longevity, Florida State University, Tallahassee, FL
- Yan Chen
  - Department of Computer Information Systems, Borough of Manhattan Community College, City University of New York, New York, NY
- James Geller
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ
12
He Z, Geller J. Preliminary Analysis of Difficulty of Importing Pattern-Based Concepts into the National Cancer Institute Thesaurus. Stud Health Technol Inform 2016; 228:389-93. [PMID: 27577410 PMCID: PMC5785234] [Indexed: 11/01/2022]
Abstract
Maintenance of biomedical ontologies is difficult. We have developed a pattern-based method for dealing with the problem of identifying missing concepts in the National Cancer Institute thesaurus (NCIt). Specifically, we are mining patterns connecting NCIt concepts with concepts in other ontologies to identify candidate missing concepts. However, the final decision about a concept insertion is always up to a human ontology curator. In this paper, we are estimating the difficulty of this task for a domain expert by counting possible choices for a pattern-based insertion. We conclude that even with support of our mining algorithm, the insertion task is challenging.
Affiliation(s)
- Zhe He
  - School of Information, Florida State University
- James Geller
  - Department of Computer Science, New Jersey Institute of Technology. Corresponding author: CS Dept., NJIT, Newark NJ 07102, USA.
13
Jiang G, Solbrig HR, Prud'hommeaux E, Tao C, Weng C, Chute CG. Quality Assurance of Cancer Study Common Data Elements Using A Post-Coordination Approach. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2015; 2015:659-668. [PMID: 26958201 PMCID: PMC4765658] [Indexed: 06/05/2023]
Abstract
Domain-specific common data elements (CDEs) are emerging as an effective approach to standards-based clinical research data storage and retrieval. A limiting factor, however, is the lack of robust automated quality assurance (QA) tools for the CDEs in clinical study domains. The objectives of the present study are to prototype and evaluate a QA tool for the study of cancer CDEs using a post-coordination approach. The study starts by integrating the NCI caDSR CDEs and The Cancer Genome Atlas (TCGA) data dictionaries in a single Resource Description Framework (RDF) data store. We designed a compositional expression pattern based on the Data Element Concept model structure informed by ISO/IEC 11179, and developed a transformation tool that converts the pattern-based compositional expressions into the Web Ontology Language (OWL) syntax. Invoking reasoning and explanation services, we tested the system utilizing the CDEs extracted from two TCGA clinical cancer study domains. The system could automatically identify duplicate CDEs, and detect CDE modeling errors. In conclusion, compositional expressions not only enable reuse of existing ontology codes to define new domain concepts, but also provide an automated mechanism for QA of terminological annotations for CDEs.
Affiliation(s)
- Guoqian Jiang
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN
- Harold R Solbrig
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN
- Cui Tao
  - University of Texas Health Science Center at Houston, Houston, TX
|
14
|
Noor AM, Holmberg L, Gillett C, Grigoriadis A. Big Data: the challenge for small research groups in the era of cancer genomics. Br J Cancer 2015; 113:1405-12. [PMID: 26492224 PMCID: PMC4815885 DOI: 10.1038/bjc.2015.341] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Received: 04/02/2015] [Revised: 08/04/2015] [Accepted: 08/09/2015] [Indexed: 01/06/2023] Open
Abstract
In the past decade, cancer research has seen an increasing trend towards high-throughput techniques and translational approaches. The increasing availability of assays that utilise smaller quantities of source material and produce higher volumes of data output has resulted in the necessity for data storage solutions beyond those previously used. Multifactorial data, both large in sample size and heterogeneous in context, needs to be integrated in a standardised, cost-effective and secure manner. This requires technical solutions and administrative support not normally financially accounted for in small- to moderate-sized research groups. In this review, we highlight the Big Data challenges faced by translational research groups in the precision medicine era; an era in which the genomes of over 75 000 patients will be sequenced by the National Health Service over the next 3 years to advance healthcare. In particular, we have looked at three main themes of data management in relation to cancer research, namely (1) cancer ontology management, (2) IT infrastructures that have been developed to support data management and (3) the unique ethical challenges introduced by utilising Big Data in research.
Affiliation(s)
- Aisyah Mohd Noor
  - Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK
- Lars Holmberg
  - Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK; Department of Surgical Sciences, Uppsala University, Uppsala 751 85, Sweden
- Cheryl Gillett
  - Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK; Faculty of Life Sciences and Medicine, King's Health Partners Cancer Biobank, King's College London, Research Oncology, Guy's Hospital, London SE1 9RT, UK
- Anita Grigoriadis
  - Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK; Breast Cancer Now Research Unit, Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK
|
15
|
Rance B, Le T, Bodenreider O. Fingerprinting Biomedical Terminologies--Automatic Classification and Visualization of Biomedical Vocabularies through UMLS Semantic Group Profiles. Stud Health Technol Inform 2015; 216:771-5. [PMID: 26262156 PMCID: PMC5881385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022]
Abstract
OBJECTIVES To explore automatic methods for the classification of biomedical vocabularies based on their content. METHODS We create semantic group profiles for each source vocabulary in the UMLS and compare the vectors using a Euclidean distance. We explore several techniques for visualizing individual semantic group profiles and the entire distance matrix, including donut pie charts, heatmaps, dendrograms and networks. RESULTS We provide donut pie charts for individual source vocabularies, as well as a heatmap, dendrogram and network for a subset of 78 vocabularies from the UMLS. CONCLUSIONS Our approach to fingerprinting biomedical terminologies is completely automated and can easily be applied to all source vocabularies in the UMLS, including upcoming versions of the UMLS. It supports the exploration, selection and comparison of the biomedical terminologies integrated into the UMLS. The visualizations are available at http://mor.nlm.nih.gov/pubs/supp/2015-medinfo-br/index.html.
Affiliation(s)
- Bastien Rance
  - AP-HP, University Hospital Georges Pompidou; INSERM, UMR_S 1138, Centre de Recherche des Cordeliers, Paris, France
- Thai Le
  - Biomedical and Health Informatics, University of Washington, Seattle, WA, USA
- Olivier Bodenreider
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
|
16
|
|
17
|
Jiang G, Evans J, Oniki TA, Coyle JF, Bain L, Huff SM, Kush RD, Chute CG. Harmonization of detailed clinical models with clinical study data standards. Methods Inf Med 2014; 54:65-74. [PMID: 25426730 DOI: 10.3414/me13-02-0019] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 06/07/2013] [Accepted: 04/23/2014] [Indexed: 11/09/2022]
Abstract
INTRODUCTION This article is part of the Focus Theme of METHODS of Information in Medicine on "Managing Interoperability and Complexity in Health Systems". BACKGROUND Data sharing and integration between the clinical research data management system and the electronic health record system remain a challenging issue. To approach the issue, there is emerging interest in utilizing the Detailed Clinical Model (DCM) approach across a variety of contexts. The Intermountain Healthcare Clinical Element Models (CEMs) have been adopted by the Strategic Health IT Advanced Research Projects for normalization (SHARPn) project, awarded by the Office of the National Coordinator, for normalizing patient data from electronic health records (EHRs). OBJECTIVE The objective of the present study is to describe our preliminary efforts toward harmonization of the SHARPn CEMs with CDISC (Clinical Data Interchange Standards Consortium) clinical study data standards. METHODS We focused on three generic domains: demographics, lab tests, and medications. We performed a panel review on each data element extracted from the CDISC templates and SHARPn CEMs. RESULTS We have identified a set of data elements that are common to the context of both clinical study and broad secondary use of EHR data and discussed outstanding harmonization issues. CONCLUSIONS We consider that the outcomes would be useful for defining new requirements for the DCM modeling community and ultimately facilitating the semantic interoperability between systems for both clinical study and broad secondary use domains.
Affiliation(s)
- G Jiang
  - Guoqian Jiang, MD, PhD, Department of Health Sciences Research, Mayo Clinic, 200 First St SW, Rochester, MN 55905, USA
|
18
|
Vawdrey DK, Weng C, Herion D, Cimino JJ. Enhancing electronic health records to support clinical research. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2014; 2014:102-8. [PMID: 25954585 PMCID: PMC4419762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
The "Learning Health System" has been described as an environment that drives research and innovation as a natural outgrowth of patient care. Electronic health records (EHRs) are necessary to enable the Learning Health System; however, a source of frustration is that current systems fail to adequately support research needs. We propose a model for enhancing EHRs to collect structured and standards-based clinical research data during clinical encounters that promotes efficiency and computational reuse of quality data for both care and research. The model integrates Common Data Elements (CDEs) for clinical research into existing clinical documentation workflows, leveraging executable documentation guidance within the EHR to support coordinated, standardized data collection for both patient care and clinical research.
Affiliation(s)
- David K Vawdrey
  - Columbia University Department of Biomedical Informatics, New York, NY
- Chunhua Weng
  - Columbia University Department of Biomedical Informatics, New York, NY
- David Herion
  - Department of Clinical Research Informatics, NIH Clinical Center, Bethesda, MD
- James J Cimino
  - Laboratory for Informatics Development, NIH Clinical Center, Bethesda, MD; Columbia University Department of Biomedical Informatics, New York, NY
|
19
|
Abstract
Clinical research informatics is the rapidly evolving sub-discipline within biomedical informatics that focuses on developing new informatics theories, tools, and solutions to accelerate the full translational continuum: basic research to clinical trials (T1), clinical trials to academic health center practice (T2), diffusion and implementation to community practice (T3), and ‘real world’ outcomes (T4). We present a conceptual model based on an informatics-enabled clinical research workflow, integration across heterogeneous data sources, and core informatics tools and platforms. We use this conceptual model to highlight 18 new articles in the JAMIA special issue on clinical research informatics.
Affiliation(s)
- Michael G Kahn
  - Department of Pediatrics, University of Colorado, Aurora, Colorado 80045, USA.
|