1
Hu Z, Wang A, Duan Y, Zhou J, Hu W, Wu S. Toward Better Semantic Interoperability of Data Element Repositories in Medicine: Analysis Study. JMIR Med Inform 2024; 12:e60293. [PMID: 39348178 DOI: 10.2196/60293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Revised: 07/07/2024] [Accepted: 07/21/2024] [Indexed: 10/01/2024]
Abstract
BACKGROUND Data element repositories facilitate high-quality medical data sharing by standardizing data and enhancing semantic interoperability. However, the application of repositories is confined to specific projects and institutions. OBJECTIVE This study aims to explore potential issues and promote broader application of data element repositories within the medical field by evaluating and analyzing typical repositories. METHODS Following the inclusion of 5 data element repositories through a literature review, a novel analysis framework consisting of 7 dimensions and 36 secondary indicators was constructed and used for evaluation and analysis. RESULTS The study's results delineate the unique characteristics of different repositories and uncover specific issues in their construction. These issues include the absence of data reuse protocols and insufficient information regarding the application scenarios and efficacy of data elements. The repositories fully comply with only 45% (9/20) of the subprinciples for Findable and Reusable in the FAIR principles, while achieving a 95% (19/20 subprinciples) compliance rate for Accessible and 67% (10/15 subprinciples) for Interoperable. CONCLUSIONS The recommendations proposed in this study address the issues to improve the construction and application of repositories, offering valuable insights to data managers, computer experts, and other pertinent stakeholders.
Affiliation(s)
- Zhengyong Hu
  - Institute of Medical Information/Medical Library, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Anran Wang
  - Institute of Medical Information/Medical Library, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Yifan Duan
  - Institute of Medical Information/Medical Library, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Jiayin Zhou
  - Institute of Medical Information/Medical Library, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Wanfei Hu
  - Institute of Medical Information/Medical Library, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Sizhu Wu
  - Institute of Medical Information/Medical Library, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China

2
Riepenhausen S, Blumenstock M, Niklas C, Hegselmann S, Neuhaus P, Meidt A, Püttmann C, Storck M, Ganzinger M, Varghese J, Dugas M. Europe's Largest Research Infrastructure for Curated Medical Data Models with Semantic Annotations. Methods Inf Med 2024. [PMID: 38740374 DOI: 10.1055/s-0044-1786839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
BACKGROUND Structural metadata from the majority of clinical studies and routine health care systems is currently not yet available to the scientific community. OBJECTIVE To provide an overview of available contents in the Portal of Medical Data Models (MDM Portal). METHODS The MDM Portal is a registered European information infrastructure for research and health care, and its contents are curated and semantically annotated by medical experts. It enables users to search, view, discuss, and download existing medical data models. RESULTS The most frequent keyword is "clinical trial" (n = 18,777), and the most frequent disease-specific keyword is "breast neoplasms" (n = 1,943). Most data items are available in English (n = 545,749) and German (n = 109,267). Manually curated semantic annotations are available for 805,308 elements (554,352 items, 58,101 item groups, and 192,855 code list items), which were derived from 25,257 data models. In total, 1,609,225 Unified Medical Language System (UMLS) codes have been assigned, with 66,373 unique UMLS codes. CONCLUSION To our knowledge, the MDM Portal constitutes Europe's largest collection of medical data models with semantically annotated elements. As such, it can be used to increase compatibility of medical datasets and can be utilized as a large expert-annotated medical text corpus for natural language processing.
Affiliation(s)
- Sarah Riepenhausen
  - Institute of Medical Informatics, University of Münster, Münster, Nordrhein-Westfalen, Germany
- Max Blumenstock
  - Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
- Christian Niklas
  - Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
- Stefan Hegselmann
  - Institute of Medical Informatics, University of Münster, Münster, Nordrhein-Westfalen, Germany
- Philipp Neuhaus
  - Institute of Medical Informatics, University of Münster, Münster, Nordrhein-Westfalen, Germany
- Alexandra Meidt
  - Institute of Medical Informatics, University of Münster, Münster, Nordrhein-Westfalen, Germany
- Cornelia Püttmann
  - Institute of Medical Informatics, University of Münster, Münster, Nordrhein-Westfalen, Germany
- Michael Storck
  - Institute of Medical Informatics, University of Münster, Münster, Nordrhein-Westfalen, Germany
- Matthias Ganzinger
  - Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
- Julian Varghese
  - Institute of Medical Informatics, University of Münster, Münster, Nordrhein-Westfalen, Germany
- Martin Dugas
  - Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
  - European Research Center for Information Systems (ERCIS), Münster, Nordrhein-Westfalen, Germany

3
Maier-Hein L, Eisenmann M, Sarikaya D, März K, Collins T, Malpani A, Fallert J, Feussner H, Giannarou S, Mascagni P, Nakawala H, Park A, Pugh C, Stoyanov D, Vedula SS, Cleary K, Fichtinger G, Forestier G, Gibaud B, Grantcharov T, Hashizume M, Heckmann-Nötzel D, Kenngott HG, Kikinis R, Mündermann L, Navab N, Onogur S, Roß T, Sznitman R, Taylor RH, Tizabi MD, Wagner M, Hager GD, Neumuth T, Padoy N, Collins J, Gockel I, Goedeke J, Hashimoto DA, Joyeux L, Lam K, Leff DR, Madani A, Marcus HJ, Meireles O, Seitel A, Teber D, Ückert F, Müller-Stich BP, Jannin P, Speidel S. Surgical data science - from concepts toward clinical translation. Med Image Anal 2022; 76:102306. [PMID: 34879287 PMCID: PMC9135051 DOI: 10.1016/j.media.2021.102306] [Citation(s) in RCA: 86] [Impact Index Per Article: 43.0] [Received: 11/10/2020] [Revised: 11/03/2021] [Accepted: 11/08/2021] [Indexed: 02/06/2023]
Abstract
Recent developments in data science in general and machine learning in particular have transformed the way experts envision the future of surgery. Surgical Data Science (SDS) is a new research field that aims to improve the quality of interventional healthcare through the capture, organization, analysis and modeling of data. While an increasing number of data-driven approaches and clinical applications have been studied in the fields of radiological and clinical data science, translational success stories are still lacking in surgery. In this publication, we shed light on the underlying reasons and provide a roadmap for future advances in the field. Based on an international workshop involving leading researchers in the field of SDS, we review current practice, key achievements and initiatives as well as available standards and tools for a number of topics relevant to the field, namely (1) infrastructure for data acquisition, storage and access in the presence of regulatory constraints, (2) data annotation and sharing and (3) data analytics. We further complement this technical perspective with (4) a review of currently available SDS products and the translational progress from academia and (5) a roadmap for faster clinical translation and exploitation of the full potential of SDS, based on an international multi-round Delphi process.
Affiliation(s)
- Lena Maier-Hein
  - Division of Computer Assisted Medical Interventions (CAMI), German Cancer Research Center (DKFZ), Heidelberg, Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany; Medical Faculty, Heidelberg University, Heidelberg, Germany
- Matthias Eisenmann
  - Division of Computer Assisted Medical Interventions (CAMI), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Duygu Sarikaya
  - Department of Computer Engineering, Faculty of Engineering, Gazi University, Ankara, Turkey; LTSI, Inserm UMR 1099, University of Rennes 1, Rennes, France
- Keno März
  - Division of Computer Assisted Medical Interventions (CAMI), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Anand Malpani
  - The Malone Center for Engineering in Healthcare, The Johns Hopkins University, Baltimore, Maryland, USA
- Hubertus Feussner
  - Department of Surgery, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Stamatia Giannarou
  - The Hamlyn Centre for Robotic Surgery, Imperial College London, London, United Kingdom
- Pietro Mascagni
  - ICube, University of Strasbourg, CNRS, France; IHU Strasbourg, Strasbourg, France
- Adrian Park
  - Department of Surgery, Anne Arundel Health System, Annapolis, Maryland, USA; Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Carla Pugh
  - Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
- Danail Stoyanov
  - Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, United Kingdom
- Swaroop S Vedula
  - The Malone Center for Engineering in Healthcare, The Johns Hopkins University, Baltimore, Maryland, USA
- Kevin Cleary
  - The Sheikh Zayed Institute for Pediatric Surgical Innovation, Children's National Hospital, Washington, D.C., USA
- Germain Forestier
  - L'Institut de Recherche en Informatique, Mathématiques, Automatique et Signal (IRIMAS), University of Haute-Alsace, Mulhouse, France; Faculty of Information Technology, Monash University, Clayton, Victoria, Australia
- Bernard Gibaud
  - LTSI, Inserm UMR 1099, University of Rennes 1, Rennes, France
- Teodor Grantcharov
  - University of Toronto, Toronto, Ontario, Canada; The Li Ka Shing Knowledge Institute of St. Michael's Hospital, Toronto, Ontario, Canada
- Makoto Hashizume
  - Kyushu University, Fukuoka, Japan; Kitakyushu Koga Hospital, Fukuoka, Japan
- Doreen Heckmann-Nötzel
  - Division of Computer Assisted Medical Interventions (CAMI), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Hannes G Kenngott
  - Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Heidelberg, Germany
- Ron Kikinis
  - Department of Radiology, Brigham and Women's Hospital, and Harvard Medical School, Boston, Massachusetts, USA
- Nassir Navab
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany; Department of Computer Science, The Johns Hopkins University, Baltimore, Maryland, USA
- Sinan Onogur
  - Division of Computer Assisted Medical Interventions (CAMI), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Tobias Roß
  - Division of Computer Assisted Medical Interventions (CAMI), German Cancer Research Center (DKFZ), Heidelberg, Germany; Medical Faculty, Heidelberg University, Heidelberg, Germany
- Raphael Sznitman
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
- Russell H Taylor
  - Department of Computer Science, The Johns Hopkins University, Baltimore, Maryland, USA
- Minu D Tizabi
  - Division of Computer Assisted Medical Interventions (CAMI), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Martin Wagner
  - Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Heidelberg, Germany
- Gregory D Hager
  - The Malone Center for Engineering in Healthcare, The Johns Hopkins University, Baltimore, Maryland, USA; Department of Computer Science, The Johns Hopkins University, Baltimore, Maryland, USA
- Thomas Neumuth
  - Innovation Center Computer Assisted Surgery (ICCAS), University of Leipzig, Leipzig, Germany
- Nicolas Padoy
  - ICube, University of Strasbourg, CNRS, France; IHU Strasbourg, Strasbourg, France
- Justin Collins
  - Division of Surgery and Interventional Science, University College London, London, United Kingdom
- Ines Gockel
  - Department of Visceral, Transplant, Thoracic and Vascular Surgery, Leipzig University Hospital, Leipzig, Germany
- Jan Goedeke
  - Pediatric Surgery, Dr. von Hauner Children's Hospital, Ludwig-Maximilians-University, Munich, Germany
- Daniel A Hashimoto
  - University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, Ohio, USA; Surgical AI and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Luc Joyeux
  - My FetUZ Fetal Research Center, Department of Development and Regeneration, Biomedical Sciences, KU Leuven, Leuven, Belgium; Center for Surgical Technologies, Faculty of Medicine, KU Leuven, Leuven, Belgium; Department of Obstetrics and Gynecology, Division Woman and Child, Fetal Medicine Unit, University Hospitals Leuven, Leuven, Belgium; Michael E. DeBakey Department of Surgery, Texas Children's Hospital and Baylor College of Medicine, Houston, Texas, USA
- Kyle Lam
  - Department of Surgery and Cancer, Imperial College London, London, United Kingdom
- Daniel R Leff
  - Department of BioSurgery and Surgical Technology, Imperial College London, London, United Kingdom; Hamlyn Centre for Robotic Surgery, Imperial College London, London, United Kingdom; Breast Unit, Imperial Healthcare NHS Trust, London, United Kingdom
- Amin Madani
  - Department of Surgery, University Health Network, Toronto, Ontario, Canada
- Hani J Marcus
  - National Hospital for Neurology and Neurosurgery, and UCL Queen Square Institute of Neurology, London, United Kingdom
- Ozanan Meireles
  - Massachusetts General Hospital, and Harvard Medical School, Boston, Massachusetts, USA
- Alexander Seitel
  - Division of Computer Assisted Medical Interventions (CAMI), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Dogu Teber
  - Department of Urology, City Hospital Karlsruhe, Karlsruhe, Germany
- Frank Ückert
  - Institute for Applied Medical Informatics, Hamburg University Hospital, Hamburg, Germany
- Beat P Müller-Stich
  - Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Heidelberg, Germany
- Pierre Jannin
  - LTSI, Inserm UMR 1099, University of Rennes 1, Rennes, France
- Stefanie Speidel
  - Division of Translational Surgical Oncology, National Center for Tumor Diseases (NCT/UCC) Dresden, Dresden, Germany; Centre for Tactile Internet with Human-in-the-Loop (CeTI), TU Dresden, Dresden, Germany

4
Ulrich H, Kock-Schoppenhauer AK, Deppenwiese N, Gött R, Kern J, Lablans M, Majeed RW, Stöhr MR, Stausberg J, Varghese J, Dugas M, Ingenerf J. Understanding the Nature of Metadata: Systematic Review. J Med Internet Res 2022; 24:e25440. [PMID: 35014967 PMCID: PMC8790684 DOI: 10.2196/25440] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2020] [Revised: 01/28/2021] [Accepted: 10/14/2021] [Indexed: 01/11/2023]
Abstract
Background Metadata are created to describe the corresponding data in a detailed and unambiguous way and are used for various applications in different research areas, for example, data identification and classification. However, a clear definition of metadata is crucial for further use. Unfortunately, extensive experience with the processing and management of metadata has shown that the term “metadata” and its use are not always unambiguous. Objective This study aimed to understand the definition of metadata and the challenges resulting from metadata reuse. Methods A systematic literature search was performed in this study following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines for reporting on systematic reviews. Five research questions were identified to streamline the review process, addressing metadata characteristics, metadata standards, use cases, and problems encountered. This review was preceded by a harmonization process to achieve a general understanding of the terms used. Results The harmonization process resulted in a clear set of definitions for metadata processing focusing on data integration. The following literature review was conducted by 10 reviewers with different backgrounds and using the harmonized definitions. This study included 81 peer-reviewed papers from the last decade after applying various filtering steps to identify the most relevant papers. The 5 research questions could be answered, resulting in a broad overview of the standards, use cases, problems, and corresponding solutions for the application of metadata in different research areas. Conclusions Metadata can be a powerful tool for identifying, describing, and processing information, but its meaningful creation is costly and challenging. This review process uncovered many standards, use cases, problems, and solutions for dealing with metadata.
The presented harmonized definitions and the new schema have the potential to improve the classification and generation of metadata by creating a shared understanding of metadata and its context.
Affiliation(s)
- Hannes Ulrich
  - IT Center for Clinical Research, University of Lübeck, Lübeck, Germany; Institute of Medical Informatics, University of Lübeck, Lübeck, Germany
- Noemi Deppenwiese
  - Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Robert Gött
  - Department Epidemiology of Health Care and Community Health, Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany
- Jori Kern
  - Federated Information Systems, German Cancer Research Center, Heidelberg, Germany; Complex Data Processing in Medical Informatics, University Medical Center Mannheim, Mannheim, Germany
- Martin Lablans
  - Federated Information Systems, German Cancer Research Center, Heidelberg, Germany; Complex Data Processing in Medical Informatics, University Medical Center Mannheim, Mannheim, Germany
- Raphael W Majeed
  - Universities of Giessen and Marburg Lung Center, German Center for Lung Research, Justus-Liebig-University, Giessen, Germany; Institute of Medical Informatics, University Hospital RWTH Aachen, Aachen, Germany
- Mark R Stöhr
  - Universities of Giessen and Marburg Lung Center, German Center for Lung Research, Justus-Liebig-University, Giessen, Germany
- Jürgen Stausberg
  - Institute of Medical Informatics, Biometry and Epidemiology, Faculty of Medicine, University of Duisburg-Essen, Essen, Germany
- Julian Varghese
  - Institute of Medical Informatics, University of Münster, Münster, Germany
- Martin Dugas
  - Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
- Josef Ingenerf
  - IT Center for Clinical Research, University of Lübeck, Lübeck, Germany; Institute of Medical Informatics, University of Lübeck, Lübeck, Germany

5
Sabermahani F, Almasi-Dooghaee M, Sheikhtaheri A. Development and evaluation of serious games for diagnosis and cognitive improvement of patients with mild cognitive impairment: A study protocol. Informatics in Medicine Unlocked 2022. [DOI: 10.1016/j.imu.2022.101039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]

6
Berenspöhler S, Minnerup J, Dugas M, Varghese J. Common Data Elements for Meaningful Stroke Documentation in Routine Care and Clinical Research: Retrospective Data Analysis. JMIR Med Inform 2021; 9:e27396. [PMID: 34636733 PMCID: PMC8548969 DOI: 10.2196/27396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2021] [Revised: 07/12/2021] [Accepted: 07/19/2021] [Indexed: 11/13/2022]
Abstract
BACKGROUND Medical information management for stroke patients is currently a very time-consuming endeavor. There are clear guidelines and procedures to treat patients having acute stroke, but it is not known how well these established practices are reflected in patient documentation. OBJECTIVE This study compares a variety of documentation processes regarding stroke. The main objective of this work is to provide an overview of the most commonly occurring medical concepts in stroke documentation and identify overlaps between different documentation contexts to allow for the definition of a core data set that could be used in potential data interfaces. METHODS Medical source documentation forms from different documentation contexts, including hospitals, clinical trials, registries, and international standards, regarding stroke treatment followed by rehabilitation were digitized in the operational data model. Each source data element was semantically annotated using the Unified Medical Language System. The concept codes were analyzed for semantic overlaps. A concept was considered common if it appeared in at least two documentation contexts. The resulting common concepts were extended with implementation details, including data types and permissible values based on frequent patterns of source data elements, using an established expert-based and semiautomatic approach. RESULTS In total, 3287 data elements were identified, and 1051 of these emerged as unique medical concepts. The 100 most frequent medical concepts cover 9.51% (100/1051) of all concept occurrences in stroke documentation, and the 50 most frequent concepts cover 4.75% (50/1051). A list of common data elements was implemented in different standardized machine-readable formats on a public metadata repository for interoperable reuse. CONCLUSIONS Standardization of medical documentation is a prerequisite for data exchange as well as the transferability and reuse of data. 
In the long run, standardization would save time and money and extend the capabilities for which such data could be used. In the context of this work, a lack of standardization was observed regarding current information management. Free-form text fields and intricate questions complicate automated data access and transfer between institutions. This work also revealed the potential of a unified documentation process as a core data set of the 50 most frequent common data elements, accounting for 34% of the documentation in medical information management. Such a data set offers a starting point for standardized and interoperable data collection in routine care, quality management, and clinical research.
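The rule stated in the methods of this abstract, that a concept counts as common when it appears in at least two documentation contexts, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `common_concepts`, the context labels, and the placeholder concept codes are all hypothetical.

```python
from collections import defaultdict


def common_concepts(annotations, min_contexts=2):
    """Return the concept codes seen in at least `min_contexts` distinct contexts.

    `annotations` is an iterable of (context, concept_code) pairs, one pair
    per annotated source data element.
    """
    contexts_per_concept = defaultdict(set)
    for context, code in annotations:
        # A set ignores repeated use of a code inside the same context.
        contexts_per_concept[code].add(context)
    return sorted(
        code
        for code, contexts in contexts_per_concept.items()
        if len(contexts) >= min_contexts
    )


# Hypothetical example data: placeholder codes, not annotations from the study.
example = [
    ("hospital", "C0000001"),
    ("hospital", "C0000002"),
    ("registry", "C0000001"),
    ("clinical_trial", "C0000003"),
    ("clinical_trial", "C0000001"),
    ("registry", "C0000003"),
    ("hospital", "C0000002"),
]
print(common_concepts(example))
```

Counting distinct contexts rather than raw occurrences keeps a concept that is repeated many times within a single form from being labeled common.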
Affiliation(s)
- Sarah Berenspöhler
  - Institute of Medical Informatics, Westfälische Wilhelms-University Münster, Münster, Germany
- Jens Minnerup
  - Department of Neurology with Institute of Translational Neurology, University Hospital Münster, Münster, Germany
- Martin Dugas
  - Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
- Julian Varghese
  - Institute of Medical Informatics, Westfälische Wilhelms-University Münster, Münster, Germany

7
Kentgen M, Varghese J, Samol A, Waltenberger J, Dugas M. Common Data Elements for Acute Coronary Syndrome: Analysis Based on the Unified Medical Language System. JMIR Med Inform 2019; 7:e14107. [PMID: 31444871 PMCID: PMC6729118 DOI: 10.2196/14107] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/22/2019] [Revised: 06/21/2019] [Accepted: 07/04/2019] [Indexed: 01/29/2023]
Abstract
BACKGROUND Standardization in clinical documentation can increase efficiency and can save time and resources. OBJECTIVE The objectives of this work are to compare documentation forms for acute coronary syndrome (ACS), check for standardization, and generate a list of the most common data elements using semantic form annotation with the Unified Medical Language System (UMLS). METHODS Forms from registries, studies, risk scores, quality assurance, official guidelines, and routine documentation from four hospitals in Germany were semantically annotated using UMLS. This allowed for automatic comparison of concept frequencies and the generation of a list of the most common concepts. RESULTS A total of 3710 form items from 86 sources were semantically annotated using 842 unique UMLS concepts. Half of all medical concept occurrences were covered by 60 unique concepts, which suggests the existence of a core dataset of relevant concepts. Overlap percentages between forms were relatively low, hinting at inconsistent documentation structures and lack of standardization. CONCLUSIONS This analysis shows a lack of standardized and semantically enriched documentation for patients with ACS. Efforts made by official institutions like the European Society of Cardiology have not yet been fully implemented. Utilizing a standardized and annotated core dataset of the most important data concepts could make export and automatic reuse of data easier. The generated list of common data elements is an exemplary implementation suggestion of the concepts to use in a standardized approach.
Affiliation(s)
- Markus Kentgen
  - Institute of Medical Informatics, University of Münster, Münster, Germany
- Julian Varghese
  - Institute of Medical Informatics, University of Münster, Münster, Germany
- Alexander Samol
  - Medical Faculty, University Hospital of Münster, Münster, Germany
- Martin Dugas
  - Institute of Medical Informatics, University of Münster, Münster, Germany

8
Holz C, Kessler T, Dugas M, Varghese J. Core Data Elements in Acute Myeloid Leukemia: A Unified Medical Language System-Based Semantic Analysis and Experts' Review. JMIR Med Inform 2019; 7:e13554. [PMID: 31407666 PMCID: PMC6709897 DOI: 10.2196/13554] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/30/2019] [Revised: 05/08/2019] [Accepted: 05/31/2019] [Indexed: 01/27/2023]
Abstract
Background For cancer domains such as acute myeloid leukemia (AML), a large set of data elements is obtained from different institutions with heterogeneous data definitions within one patient course. The lack of clinical data harmonization impedes cross-institutional electronic data exchange and future meta-analyses. Objective This study aimed to identify and harmonize a semantic core of common data elements (CDEs) in clinical routine and research documentation, based on a systematic metadata analysis of existing documentation models. Methods Lists of relevant data items were collected and reviewed by hematologists from two university hospitals regarding routine documentation and several case report forms of clinical trials for AML. In addition, existing registries and international recommendations were included. Data items were coded to medical concepts via the Unified Medical Language System (UMLS) by a physician and reviewed by another physician. On the basis of the coded concepts, the data sources were analyzed for concept overlaps and identification of most frequent concepts. The most frequent concepts were then implemented as data elements in the standardized format of the Operational Data Model by the Clinical Data Interchange Standards Consortium. Results A total of 3265 medical concepts were identified, of which 1414 were unique. Among the 1414 unique medical concepts, the 50 most frequent ones cover 26.98% of all concept occurrences within the collected AML documentation. The top 100 concepts represent 39.48% of all concepts’ occurrences. Implementation of CDEs is available on a European research infrastructure and can be downloaded in different formats for reuse in different electronic data capture systems. Conclusions Information management is a complex process for research-intense disease entities such as AML, which is associated with a large set of lab-based diagnostics and different treatment options.
Our systematic UMLS-based analysis revealed the existence of a core data set, and an exemplary reusable implementation for harmonized data capture is available on an established metadata repository.
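Coverage figures of the kind reported in this abstract (the share of all concept occurrences accounted for by the k most frequent concepts) follow from a simple frequency count. The sketch below is a hedged illustration rather than the authors' code; the function names `top_k_coverage` and `concepts_needed` and the toy codes are assumptions introduced here.

```python
from collections import Counter


def top_k_coverage(codes, k):
    """Fraction of all concept occurrences covered by the k most frequent codes."""
    counts = Counter(codes)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for _code, n in counts.most_common(k))
    return covered / total


def concepts_needed(codes, target):
    """Smallest number of top-ranked codes whose occurrences reach `target` (0..1).

    Returns None when `codes` is empty.
    """
    counts = Counter(codes)
    total = sum(counts.values())
    covered = 0
    for rank, (_code, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= target * total:
            return rank
    return None


# Toy data: one entry per coded data item (placeholder codes only).
toy_codes = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
print(top_k_coverage(toy_codes, 2))    # share covered by the 2 most frequent codes
print(concepts_needed(toy_codes, 0.5))  # codes needed to cover half of all occurrences
```

The denominator here is the number of concept occurrences, not the number of unique concepts, which is the reading under which a small core of concepts can cover a large share of documentation.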
Affiliation(s)
- Christian Holz
  - Institute of Medical Informatics, University of Münster, Münster, Germany
- Torsten Kessler
  - Department of Medicine A, University Hospital of Münster, Münster, Germany
- Martin Dugas
  - Institute of Medical Informatics, University of Münster, Münster, Germany
- Julian Varghese
  - Institute of Medical Informatics, University of Münster, Münster, Germany

9
Abstract
The development of high-throughput, data-intensive biomedical research assays and technologies has created a need for researchers to develop strategies for analyzing, integrating, and interpreting the massive amounts of data they generate. Although a wide variety of statistical methods have been designed to accommodate 'big data,' experiences with the use of artificial intelligence (AI) techniques suggest that they might be particularly appropriate. In addition, the results of the application of these assays reveal a great heterogeneity in the pathophysiologic factors and processes that contribute to disease, suggesting that there is a need to tailor, or 'personalize,' medicines to the nuanced and often unique features possessed by individual patients. Given how important data-intensive assays are to revealing appropriate intervention targets and strategies for treating an individual with a disease, AI can play an important role in the development of personalized medicines. We describe many areas where AI can play such a role and argue that AI's ability to advance personalized medicine will depend critically on not only the refinement of relevant assays, but also on ways of storing, aggregating, accessing, and ultimately integrating, the data they produce. We also point out the limitations of many AI techniques in developing personalized medicines as well as consider areas for further research.
Affiliation(s)
- Nicholas J Schork
  - Department of Quantitative Medicine, The Translational Genomics Research Institute (TGen), Phoenix, AZ, USA.
  - The City of Hope/TGen IMPACT Center, Duarte, CA, USA.
  - The University of California San Diego, La Jolla, CA, USA.

10
Varghese J, Sandmann S, Dugas M. Web-Based Information Infrastructure Increases the Interrater Reliability of Medical Coders: Quasi-Experimental Study. J Med Internet Res 2018; 20:e274. [PMID: 30322834 PMCID: PMC6231825 DOI: 10.2196/jmir.9644] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/13/2017] [Revised: 05/03/2018] [Accepted: 06/28/2018] [Indexed: 01/05/2023]
Abstract
Background Medical coding is essential for standardized communication and integration of clinical data. The Unified Medical Language System by the National Library of Medicine is the largest clinical terminology system for medical coders and Natural Language Processing tools. However, the abundance of ambiguous codes leads to low rates of uniform coding among different coders. Objective The objective of our study was to measure uniform coding among different medical experts in terms of interrater reliability and analyze the effect on interrater reliability using an expert- and Web-based code suggestion system. Methods We conducted a quasi-experimental study in which 6 medical experts coded 602 medical items from structured quality assurance forms or free-text eligibility criteria of 20 different clinical trials. The medical item content was selected on the basis of mortality-leading diseases according to World Health Organization data. The intervention comprised using a semiautomatic code suggestion tool that is linked to a European information infrastructure providing a large medical text corpus of >300,000 medical form items with expert-assigned semantic codes. Krippendorff alpha (Kalpha) with bootstrap analysis was used for the interrater reliability analysis, and coding times were measured before and after the intervention. Results The intervention improved interrater reliability in structured quality assurance form items (from Kalpha=0.50, 95% CI 0.43-0.57 to Kalpha=0.62, 95% CI 0.55-0.69) and free-text eligibility criteria (from Kalpha=0.19, 95% CI 0.14-0.24 to Kalpha=0.43, 95% CI 0.37-0.50) while preserving or slightly reducing the mean coding time per item for all 6 coders.
Regardless of the intervention, precoordination and structured items were associated with significantly high interrater reliability, but the proportion of items that were precoordinated significantly increased after intervention (eligibility criteria: OR 4.92, 95% CI 2.78-8.72; quality assurance: OR 1.96, 95% CI 1.19-3.25). Conclusions The Web-based code suggestion mechanism improved interrater reliability toward moderate or even substantial intercoder agreement. Precoordination and the use of structured versus free-text data elements are key drivers of higher interrater reliability.
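For readers unfamiliar with the Krippendorff alpha statistic named in this abstract, the following is a minimal sketch of the point estimate for nominal codes, built from the standard coincidence-matrix definition. It is an illustration only: the function name `krippendorff_alpha_nominal` and the input layout are assumptions made here, the bootstrap confidence intervals reported in the study are not reproduced, and nothing in the abstract says how the authors computed the statistic.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of lists; each inner list holds the codes that the
    coders assigned to one item, with None marking a missing code.
    """
    coincidences = Counter()
    for codes in units:
        values = [c for c in codes if c is not None]
        m = len(values)
        if m < 2:
            continue  # an item coded by fewer than two coders cannot be paired
        # Every ordered pair of codes from different coders adds 1 / (m - 1).
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one item coded by two or more coders")

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")
    return 1.0 - observed / expected


# Two coders, ten items, binary codes (a small made-up example).
coder_a = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
coder_b = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
print(round(krippendorff_alpha_nominal(list(zip(coder_a, coder_b))), 3))
```

An alpha of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement.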
Affiliation(s)
- Julian Varghese
  - Institute of Medical Informatics, University of Münster, Münster, Germany
- Sarah Sandmann
  - Institute of Medical Informatics, University of Münster, Münster, Germany
- Martin Dugas
  - Institute of Medical Informatics, European Research Center for Information Systems, Münster, Germany