1
Yang X, Huang K, Yang D, Zhao W, Zhou X. Biomedical Big Data Technologies, Applications, and Challenges for Precision Medicine: A Review. GLOBAL CHALLENGES (HOBOKEN, NJ) 2024; 8:2300163. [PMID: 38223896 PMCID: PMC10784210 DOI: 10.1002/gch2.202300163] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/02/2023] [Revised: 09/20/2023] [Indexed: 01/16/2024]
Abstract
The explosive growth of biomedical Big Data presents both significant opportunities and challenges in the realm of knowledge discovery and translational applications within precision medicine. Efficient management, analysis, and interpretation of big data can pave the way for groundbreaking advancements in precision medicine. However, the unprecedented strides in the automated collection of large-scale molecular and clinical data have also introduced formidable challenges in terms of data analysis and interpretation, necessitating the development of novel computational approaches. Some potential challenges include the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues. This overview article focuses on the recent progress and breakthroughs in the application of big data within precision medicine. Key aspects are summarized, including content, data sources, technologies, tools, challenges, and existing gaps. Nine fields are discussed: data warehouse and data management, electronic medical records, biomedical imaging informatics, artificial intelligence-aided surgical design and surgery optimization, omics data, health monitoring data, knowledge graphs, public health informatics, and security and privacy.
Affiliation(s)
- Xue Yang
- Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Kexin Huang
- Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Dewei Yang
- College of Advanced Manufacturing Engineering, Chongqing University of Posts and Telecommunications, Chongqing, Chongqing 400000, China
- Weiling Zhao
- Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
- Xiaobo Zhou
- Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
2
Scheible R, Thomczyk F, Blum M, Rautenberg M, Prunotto A, Yazijy S, Boeker M. Integrating row level security in i2b2: segregation of medical records into data marts without data replication and synchronization. JAMIA Open 2023; 6:ooad068. [PMID: 37583654 PMCID: PMC10425194 DOI: 10.1093/jamiaopen/ooad068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 07/28/2023] [Accepted: 08/03/2023] [Indexed: 08/17/2023]
Abstract
Objective i2b2 offers the possibility to store biomedical data of different projects in subject oriented data marts of the data warehouse, which potentially requires data replication between different projects and also data synchronization in case of data changes. We present an approach that can save this effort and assess its query performance in a case study that reflects real-world scenarios. Material and Methods For data segregation, we used PostgreSQL's row level security (RLS) feature, the unit test framework pgTAP for validation and testing as well as the i2b2 application. No change of the i2b2 code was required. Instead, to leverage orchestration and deployment, we additionally implemented a command line interface (CLI). We evaluated performance using 3 different queries generated by i2b2, which we performed on an enlarged Harvard demo dataset. Results We introduce the open source Python CLI i2b2rls, which orchestrates and manages security roles to implement data marts so that they do not need to be replicated and synchronized as different i2b2 projects. Our evaluation showed that our approach is on average 3.55 and on median 2.71 times slower compared to classic i2b2 data marts, but has more flexibility and easier setup. Conclusion The RLS-based approach is particularly useful in a scenario with many projects, where data is constantly updated, user and group requirements change frequently or complex user authorization requirements have to be defined. The approach applies to both the i2b2 interface and direct database access.
Affiliation(s)
- Raphael Scheible
- Institute of Artificial Intelligence and Informatics in Medicine (AIIM), Chair of Medical Informatics, University Hospital rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
- Center for Chronic Immunodeficiency (CCI), Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Fabian Thomczyk
- Data Integration Center (DIC), University of Freiburg, Freiburg, Germany
- Marco Blum
- Data Integration Center (DIC), University of Freiburg, Freiburg, Germany
- Micha Rautenberg
- Institute of Medical Biometry and Statistics, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Zentrum für Digitalisierung und Informationstechnologie (ZDI), Medical Center, University of Freiburg, Freiburg, Germany
- Andrea Prunotto
- Data Integration Center (DIC), University of Freiburg, Freiburg, Germany
- Suhail Yazijy
- Institute of Medical Biometry and Statistics, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Martin Boeker
- Institute of Artificial Intelligence and Informatics in Medicine (AIIM), Chair of Medical Informatics, University Hospital rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
3
Zhang GQ, Li X, Huang Y, Cui L. Temporal Cohort Logic. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2023; 2022:1237-1246. [PMID: 37128360 PMCID: PMC10148298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
We introduce a new logic, called Temporal Cohort Logic (TCL), for cohort specification and discovery in clinical and population health research. TCL is created to fill a conceptual gap in formalizing temporal reasoning in biomedicine, in a similar role that temporal logics play for computer science and its applications. We provide formal syntax and semantics for TCL and illustrate the various logical constructs using examples related to human health. Relationships and distinctions with existing temporal logical frameworks are discussed. Applications in electronic health record (EHR) and in neurophysiological data resource are provided. Our approach differs from existing temporal logics, in that we explicitly capture Allen's interval algebra as modal operators in a language of temporal logic (rather than addressing it in the semantic structure). This has two major implications. First, it provides a formal logical framework for reasoning about time in biomedicine, allowing general (i.e., higher-levels of abstraction) investigation into the properties of this approach (such as proof systems, completeness, expressiveness, and decidability) independent of a specific query language or a database system. Second, it puts our approach in the context of logical developments in computer science, allowing potential translation of existing results into the setting of TCL and its variants or subsystems so as to illuminate opportunities and computational challenges involved in temporal reasoning for biomedicine.
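The abstract builds on Allen's interval algebra, which classifies how two time intervals can relate. As a rough illustration only, and not TCL itself (which treats these relations as modal operators in a logic), the following minimal Python sketch names the 13 base relations for a pair of intervals. The `Interval` type and the `allen_relation` function are hypothetical names introduced here, and the sketch assumes every interval has `start < end`.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A closed time interval; assumed to satisfy start < end."""
    start: float
    end: float


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the name of the Allen relation that holds from a to b."""
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if b.end < a.start:
        return "after"
    if b.end == a.start:
        return "met-by"
    # From here on the two intervals share more than a single point.
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished-by"
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"
    # Remaining case: a partial overlap with distinct endpoints.
    return "overlaps" if a.start < b.start else "overlapped-by"


# Hypothetical example: a medication course and a hospital stay, in study days.
medication = Interval(2, 4)
hospital_stay = Interval(1, 6)
print(allen_relation(medication, hospital_stay))
```

A cohort condition such as "medication given during a hospital stay" would then correspond to checking for the `during` relation between the two event intervals.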
Affiliation(s)
- Guo-Qiang Zhang
- McGovern Medical School
- School of Biomedical Informatics
- Texas Institute for Restorative Neurotechnologies, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA
- Xiaojin Li
- McGovern Medical School
- Texas Institute for Restorative Neurotechnologies, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA
- Yan Huang
- McGovern Medical School
- Texas Institute for Restorative Neurotechnologies, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA
- Licong Cui
- School of Biomedical Informatics
- Texas Institute for Restorative Neurotechnologies, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA
4
Mora S, Giannini B, Di Biagio A, Cenderello G, Nicolini LA, Taramasso L, Dentone C, Bassetti M, Giacomini M. Ten Years of Medical Informatics and Standards Support for Clinical Research in an Infectious Diseases Network. Appl Clin Inform 2023; 14:16-27. [PMID: 36631000 PMCID: PMC9833953 DOI: 10.1055/s-0042-1760081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2023]
Abstract
BACKGROUND It is 30 years since evidence-based medicine became a great support for individual clinical expertise in daily practice and scientific research. Electronic systems can be used to achieve the goal of collecting data from heterogeneous datasets and to support multicenter clinical trials. The Ligurian Infectious Diseases Network (LIDN) is a web-based platform for data collection and reuse originating from a regional effort and involving many professionals from different fields. OBJECTIVES The objective of this work is to present an integrated system of ad hoc interfaces and tools that we use to perform pseudonymous clinical data collection, both manually and automatically, to support clinical trials. METHODS The project comprises different scenarios of data collection systems, according to the degree of information technology of the involved centers. To be compliant with national regulations, the most recently developed connection is based on the standard Clinical Document Architecture Release 2 by Health Level 7 guidelines; interoperability is supported by the involvement of a terminology service. RESULTS Since 2011, the LIDN platform has involved more than 8,000 patients from eight different hospitals, treated or under treatment for at least one infectious disease among human immunodeficiency virus (HIV), hepatitis C virus, severe acute respiratory syndrome coronavirus 2, and tuberculosis. Since 2013, systems for the automatic transfer of laboratory data have been updating patients' information for three centers, daily. Direct communication was set up between the LIDN architecture and three of the main national cohorts of HIV-infected patients. CONCLUSION The LIDN was originally developed to support clinicians involved in the project in the management of data from HIV-infected patients through a web-based tool that could be easily used in primary-care units. Then, the developed system grew modularly to respond to the specific needs that arose over a time span of more than 10 years.
Affiliation(s)
- Sara Mora
- Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy
- Address for correspondence: Sara Mora, Eng, Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Italy
- Barbara Giannini
- Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy
- Antonio Di Biagio
- Infectious Diseases Unit, Policlinico San Martino Hospital, IRCCS for Oncology and Neuroscience, Genoa, Italy
- Department of Infectious Disease, IRCCS AOU San Martino IST (DISSAL), University of Genoa, Italy
- Laura Ambra Nicolini
- Infectious Diseases Unit, Policlinico San Martino Hospital, IRCCS for Oncology and Neuroscience, Genoa, Italy
- Lucia Taramasso
- Infectious Diseases Unit, Policlinico San Martino Hospital, IRCCS for Oncology and Neuroscience, Genoa, Italy
- Chiara Dentone
- Infectious Diseases Unit, Policlinico San Martino Hospital, IRCCS for Oncology and Neuroscience, Genoa, Italy
- Matteo Bassetti
- Infectious Diseases Unit, Policlinico San Martino Hospital, IRCCS for Oncology and Neuroscience, Genoa, Italy
- Department of Infectious Disease, IRCCS AOU San Martino IST (DISSAL), University of Genoa, Italy
- Mauro Giacomini
- Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy
5
Oh S, Sung M, Rhee Y, Hong N, Park YR. Evaluation of the Privacy Risks of Personal Health Identifiers and Quasi-Identifiers in a Distributed Research Network: Development and Validation Study. JMIR Med Inform 2021; 9:e24940. [PMID: 34057426 PMCID: PMC8204238 DOI: 10.2196/24940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2020] [Revised: 12/27/2020] [Accepted: 04/11/2021] [Indexed: 11/23/2022]
Abstract
Background Privacy should be protected in medical data that include patient information. A distributed research network (DRN) is one of the challenges in privacy protection and in the encouragement of multi-institutional clinical research. A DRN standardizes multi-institutional data into a common structure and terminology called a common data model (CDM), and it only shares analysis results. It is necessary to measure how a DRN protects patient information privacy even without sharing data in practice. Objective This study aimed to quantify the privacy risk of a DRN by comparing different deidentification levels focusing on personal health identifiers (PHIs) and quasi-identifiers (QIs). Methods We detected PHIs and QIs in an Observational Medical Outcomes Partnership (OMOP) CDM as threatening privacy, based on 18 Health Insurance Portability and Accountability Act of 1996 (HIPAA) identifiers and previous studies. To compare the privacy risk according to the different privacy policies, we generated limited and safe harbor data sets based on 16 PHIs and 12 QIs as threatening privacy from the Synthetic Public Use File 5 Percent (SynPUF5PCT) data set, which is a public data set of the OMOP CDM. With minimum cell size and equivalence class methods, we measured the privacy risk reduction with a trust differential gap obtained by comparing the two data sets. We also measured the gap in randomly sampled records from the two data sets to adjust the number of PHI or QI records. Results The gaps averaged 31.448% and 73.798% for PHIs and QIs, respectively, with a minimum cell size of one, which represents a unique record in a data set. Among PHIs, the national provider identifier had the highest gap of 71.236% (71.244% and 0.007% in the limited and safe harbor data sets, respectively). The maximum size of the equivalence class, which has the largest size of an indistinguishable set of records, averaged 771. In 1000 random samples of PHIs, Device_exposure_start_date had the highest gap of 33.730% (87.705% and 53.975% in the data sets). Among QIs, Death had the highest gap of 99.212% (99.997% and 0.784% in the data sets). In 1000, 10,000, and 100,000 random samples of QIs, Device_treatment had the highest gaps of 12.980% (99.980% and 87.000% in the data sets), 60.118% (99.831% and 39.713%), and 93.597% (98.805% and 5.207%), respectively, and in 1 million random samples, Death had the highest gap of 99.063% (99.998% and 0.934% in the data sets). Conclusions In this study, we verified and quantified the privacy risk of PHIs and QIs in the DRN. Although this study used limited PHIs and QIs for verification, the privacy limitations found in this study could be used as a quality measurement index for deidentification of multi-institutional collaboration research, thereby increasing DRN safety.
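The minimum cell size and equivalence class measures mentioned in the abstract can be sketched in a few lines of Python. This is a simplified reading, not the study's code: records that share the same quasi-identifier values form one equivalence class, a record counts as at risk when its class is no larger than the minimum cell size, and the gap is taken here as the difference in at-risk share between two data sets. The function names, the field names, and the four sample records are all hypothetical.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def share_at_risk(records, quasi_identifiers, min_cell_size=1):
    """Fraction of records whose equivalence class has at most min_cell_size members."""
    if not records:
        return 0.0
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    at_risk = sum(n for n in sizes.values() if n <= min_cell_size)
    return at_risk / len(records)


# Hypothetical data: exact birth years versus birth years coarsened to decades.
detailed = [
    {"birth_year": 1950, "sex": "F"},
    {"birth_year": 1951, "sex": "F"},
    {"birth_year": 1962, "sex": "M"},
    {"birth_year": 1963, "sex": "M"},
]
coarsened = [{**r, "birth_year": r["birth_year"] // 10 * 10} for r in detailed]

fields = ["birth_year", "sex"]
# Assumed definition of the gap: at-risk share before minus at-risk share after.
gap = share_at_risk(detailed, fields) - share_at_risk(coarsened, fields)
largest_class = max(equivalence_class_sizes(coarsened, fields).values())
```

With a minimum cell size of one, the at-risk share is simply the fraction of records that are unique on the chosen fields, which matches the abstract's description of a unique record in a data set.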
Affiliation(s)
- SeHee Oh
- Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
- MinDong Sung
- Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
- Yumie Rhee
- Department of Internal Medicine, Endocrine Research Institute, Yonsei University College of Medicine, Seoul, Republic of Korea
- Namki Hong
- Department of Internal Medicine, Endocrine Research Institute, Yonsei University College of Medicine, Seoul, Republic of Korea
- Yu Rang Park
- Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
6
Huang Y, Li X, Zhang GQ. ELII: A novel inverted index for fast temporal query, with application to a large Covid-19 EHR dataset. J Biomed Inform 2021; 117:103744. [PMID: 33775815 PMCID: PMC9759789 DOI: 10.1016/j.jbi.2021.103744] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/15/2020] [Revised: 12/10/2020] [Accepted: 03/05/2021] [Indexed: 02/08/2023]
Abstract
Fast temporal query on large EHR-derived data sources presents an emerging big data challenge, as this query modality is intractable using conventional strategies that have not focused on addressing Covid-19-related research needs at scale. We introduce a novel approach called Event-level Inverted Index (ELII) to optimize time trade-offs between one-time batch preprocessing and subsequent open-ended, user-specified temporal queries. An experimental temporal query engine has been implemented in a NoSQL database using our new ELII strategy. Near-real-time performance was achieved on a large Covid-19 EHR dataset, with 1.3 million unique patients and 3.76 billion records. We evaluated the performance of ELII on several types of queries: classical (non-temporal), absolute temporal, and relative temporal. Our experimental results indicate that ELII accomplished these queries in seconds, achieving average speed accelerations of 26.8 times on relative temporal query, 88.6 times on absolute temporal query, and 1037.6 times on classical query compared to a baseline approach without using ELII. Our study suggests that ELII is a promising approach supporting fast temporal query, an important mode of cohort development for Covid-19 studies.
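The abstract does not describe ELII's internal layout, so the following is only a generic Python sketch of the underlying idea: pay a one-time preprocessing cost to build an inverted index from event codes to per-patient sorted timestamps, then answer absolute and relative temporal queries with binary search. All names (`build_index`, `absolute_query`, `relative_query`), the in-memory dictionaries, and the sample events are assumptions for illustration; the published engine runs in a NoSQL database.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def build_index(events):
    """One-time batch step: map code -> patient -> sorted list of event days."""
    index = defaultdict(lambda: defaultdict(list))
    for patient_id, code, day in events:
        index[code][patient_id].append(day)
    for postings in index.values():
        for days in postings.values():
            days.sort()
    return index


def absolute_query(index, code, first_day, last_day):
    """Patients with at least one `code` event between first_day and last_day."""
    hits = set()
    for patient_id, days in index.get(code, {}).items():
        i = bisect_left(days, first_day)  # first event on or after first_day
        if i < len(days) and days[i] <= last_day:
            hits.add(patient_id)
    return hits


def relative_query(index, code_a, code_b, max_gap_days):
    """Patients with a code_b event strictly after, and within max_gap_days of, a code_a event."""
    hits = set()
    postings_b = index.get(code_b, {})
    for patient_id, days_a in index.get(code_a, {}).items():
        days_b = postings_b.get(patient_id)
        if not days_b:
            continue
        for day_a in days_a:
            i = bisect_right(days_b, day_a)  # first code_b event after day_a
            if i < len(days_b) and days_b[i] - day_a <= max_gap_days:
                hits.add(patient_id)
                break
    return hits


# Hypothetical events: (patient, event code, day number).
events = [
    ("p1", "dx", 10), ("p1", "test", 12),
    ("p2", "dx", 5), ("p2", "test", 40),
    ("p3", "test", 11),
]
index = build_index(events)
```

The design trade-off mirrors the one named in the abstract: sorting happens once during indexing, so each later query touches only the postings for the codes it mentions.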
Affiliation(s)
- Yan Huang
- University of Texas Health Science Center at Houston, Houston, TX, USA
- Xiaojin Li
- University of Texas Health Science Center at Houston, Houston, TX, USA
- Guo-Qiang Zhang
- University of Texas Health Science Center at Houston, Houston, TX, USA
7
Kapsner LA, Kampf MO, Seuchter SA, Gruendner J, Gulden C, Mate S, Mang JM, Schüttler C, Deppenwiese N, Krause L, Zöller D, Balig J, Fuchs T, Fischer P, Haverkamp C, Holderried M, Mayer G, Stenzhorn H, Stolnicu A, Storck M, Storf H, Zohner J, Kohlbacher O, Strzelczyk A, Schüttler J, Acker T, Boeker M, Kaisers UX, Kestler HA, Prokosch HU. Reduced Rate of Inpatient Hospital Admissions in 18 German University Hospitals During the COVID-19 Lockdown. Front Public Health 2021; 8:594117. [PMID: 33520914 PMCID: PMC7838458 DOI: 10.3389/fpubh.2020.594117] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 08/12/2020] [Accepted: 12/11/2020] [Indexed: 12/23/2022]
Abstract
The COVID-19 pandemic has caused strains on health systems worldwide disrupting routine hospital services for all non-COVID patients. Within this retrospective study, we analyzed inpatient hospital admissions across 18 German university hospitals during the 2020 lockdown period compared to 2018. Patients admitted to hospital between January 1 and May 31, 2020 and the corresponding periods in 2018 and 2019 were included in this study. Data derived from electronic health records were collected and analyzed using the data integration center infrastructure implemented in the university hospitals that are part of the four consortia funded by the German Medical Informatics Initiative. Admissions were grouped and counted by ICD 10 chapters and specific reasons for treatment at each site. Pooled aggregated data were centrally analyzed with descriptive statistics to compare absolute and relative differences between time periods of different years. The results illustrate how care process adoptions depended on the COVID-19 epidemiological situation and the criticality of the disease. Overall inpatient hospital admissions decreased by 35% in weeks 1 to 4 and by 30.3% in weeks 5 to 8 after the lockdown announcement compared to 2018. Even hospital admissions for critical care conditions such as malignant cancer treatments were reduced. We also noted a high reduction of emergency admissions such as myocardial infarction (38.7%), whereas the reduction in stroke admissions was smaller (19.6%). In contrast, we observed a considerable reduction in admissions for non-critical clinical situations, such as hysterectomies for benign tumors (78.8%) and hip replacements due to arthrosis (82.4%). In summary, our study shows that the university hospital admission rates in Germany were substantially reduced following the national COVID-19 lockdown. These included critical care or emergency conditions in which deferral is expected to impair clinical outcomes. Future studies are needed to delineate how appropriate medical care of critically ill patients can be maintained during a pandemic.
Affiliation(s)
- Lorenz A. Kapsner
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany
- Department of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Marvin O. Kampf
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany
- Susanne A. Seuchter
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany
- Julian Gruendner
- Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Christian Gulden
- Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Sebastian Mate
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany
- Jonathan M. Mang
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany
- Christina Schüttler
- Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Noemi Deppenwiese
- Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Linda Krause
- Institute of Medical Biometry and Epidemiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Daniela Zöller
- Institute of Medical Biometry and Statistics, Medical Faculty and Medical Center, University of Freiburg, Freiburg, Germany
- Julien Balig
- Institute of Medical Systems Biology, Ulm University, Ulm, Germany
- Timo Fuchs
- Department of Nuclear Medicine, University Hospital Regensburg, Regensburg, Germany
- Patrick Fischer
- Institute of Medical Informatics, Faculty of Medicine, Justus-Liebig-University, Gießen, Germany
- Christian Haverkamp
- Institute of Digitalisation in Medicine, Medical Faculty and Medical Center, University of Freiburg, Freiburg, Germany
- Martin Holderried
- Department of Medical Development and Quality Management, University Hospital Tübingen, Tübingen, Germany
- Gerhard Mayer
- Institute of Medical Systems Biology, Ulm University, Ulm, Germany
- Holger Stenzhorn
- Saarland University Medical Center, Institute for Medical Biometry, Epidemiology and Medical Informatics, Homburg, Germany
- Institute for Translational Bioinformatics, University Hospital Tübingen, Tübingen, Germany
- Ana Stolnicu
- Institute of Medical Systems Biology, Ulm University, Ulm, Germany
- Michael Storck
- Institute of Medical Informatics, University of Münster, Münster, Germany
- Holger Storf
- Medical Informatics Group, Universitätsklinikum Frankfurt, Frankfurt, Germany
- Jochen Zohner
- Institute of Medical Informatics, Faculty of Medicine, Justus-Liebig-University, Gießen, Germany
- Oliver Kohlbacher
- Institute for Translational Bioinformatics, University Hospital Tübingen, Tübingen, Germany
- Applied Bioinformatics, Department of Computer Science, University of Tübingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
- Biomolecular Interactions, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Adam Strzelczyk
- Epilepsy Center Frankfurt Rhine-Main, Center of Neurology and Neurosurgery, Goethe University Frankfurt, Frankfurt, Germany
- Jürgen Schüttler
- Department of Anesthesiology, University Hospital Erlangen, Erlangen, Germany
- Till Acker
- Institute of Neuropathology, Justus-Liebig-University, Gießen, Germany
- Martin Boeker
- Institute of Medical Biometry and Statistics, Medical Faculty and Medical Center, University of Freiburg, Freiburg, Germany
- Hans A. Kestler
- Institute of Medical Systems Biology, Ulm University, Ulm, Germany
- Hans-Ulrich Prokosch
- Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
8
Sholle ET, Cusick M, Davila MA, Kabariti J, Flores S, Campion TR. Characterizing Basic and Complex Usage of i2b2 at an Academic Medical Center. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2020; 2020:589-596. [PMID: 32477681 PMCID: PMC7233105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Developed to enable basic queries for cohort discovery, i2b2 has evolved to support complex queries. Little is known whether query sophistication - and the informatics resources required to support it - addresses researcher needs. In three years at our institution, 609 researchers ran 6,662 queries and requested re-identification of 80 patient cohorts to support specific studies. After characterizing all queries as "basic" or "complex" with respect to use of sophisticated query features, we found that the majority of all queries, and the majority of queries resulting in a request for cohort re-identification, did not use complex i2b2 features. Data domains that required extensive effort to implement saw relatively little use compared to common domains (e.g., diagnoses). These findings suggest that efforts to ensure the performance of basic queries using common data domains may better serve the needs of the research community than efforts to integrate novel domains or introduce complex new features.
Affiliation(s)
- Evan T Sholle
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Marika Cusick
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Marcos A Davila
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Joseph Kabariti
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Steven Flores
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Thomas R Campion
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
9
Experiences of Transforming a Complex Nephrologic Care and Research Database into i2b2 Using the IDRT Tools. JOURNAL OF HEALTHCARE ENGINEERING 2019; 2019:5640685. [PMID: 30800257 PMCID: PMC6360056 DOI: 10.1155/2019/5640685] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/07/2018] [Revised: 09/18/2018] [Accepted: 12/05/2018] [Indexed: 01/15/2023]
Abstract
The secondary use of data from electronic medical records has become an important factor to determine and to identify various causes of disease. For this reason, applications like informatics for integrating biology and the bedside (i2b2) offer a GUI-based front end to select patient cohorts. To make use of those tools, however, clinical data need to be extracted from the Electronic Health Record (EHR) system and integrated into the data schema of i2b2. We used TBase, a documentation system for nephrologic transplantations, as a source system and applied the Integrated Data Repository Toolkit (IDRT) for the Extract, Transform, and Load (ETL) process to load the data into i2b2. Since i2b2 uses an entity-attribute-value (EAV) schema, which is a fundamentally different way of modeling data in comparison to a standard relational schema in TBase, we evaluated if (a) the data relationship of the source system entities can still be represented in the i2b2 schema and if (b) the IDRT is a suitable solution for loading the data of a comprehensive data schema like TBase into i2b2. For that reason, we identified entities in the TBase data schema which were relevant for answering questions on cohort identification. By doing so, we found out that the entities had different structures that needed to be handled differently for the ETL process. Furthermore, the use of IDRT revealed shortcomings with regard to large input data and specific data structures that are part of most modern EHR systems. However, this project also showed that our way of modeling the TBase data in i2b2 has been proven to be successful in terms of answering the most common questions of clinicians on cohort identification.
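The contrast the abstract draws between a standard relational schema and an entity-attribute-value (EAV) schema can be illustrated with a small Python sketch. It is a toy model only: the `to_eav` and `select_cohort` functions, the column names, and the sample rows are hypothetical and do not reflect the actual TBase schema, the i2b2 fact table layout, or the IDRT tooling.

```python
def to_eav(rows, key="patient_id"):
    """Flatten wide relational rows into (entity, attribute, value) facts."""
    facts = []
    for row in rows:
        for attribute, value in row.items():
            # Skip the key column itself and missing values: EAV stores only known facts.
            if attribute != key and value is not None:
                facts.append((row[key], attribute, value))
    return facts


def select_cohort(facts, attribute, predicate):
    """Return the entities having at least one fact for `attribute` that satisfies `predicate`."""
    return {
        entity
        for entity, attr, value in facts
        if attr == attribute and predicate(value)
    }


# Hypothetical source rows, one per patient, in a wide relational layout.
rows = [
    {"patient_id": 1, "creatinine": 1.4, "transplant_year": 2015},
    {"patient_id": 2, "creatinine": 0.9, "transplant_year": None},
]
facts = to_eav(rows)
cohort = select_cohort(facts, "creatinine", lambda v: v > 1.2)
```

Each source column becomes an attribute name in the fact list, which is why relationships between source entities need explicit modeling decisions during the ETL step the abstract describes.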
10
Bruland P, Doods J, Brix T, Dugas M, Storck M. Connecting healthcare and clinical research: Workflow optimizations through seamless integration of EHR, pseudonymization services and EDC systems. Int J Med Inform 2018; 119:103-108. [PMID: 30342678 DOI: 10.1016/j.ijmedinf.2018.09.007] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 03/21/2018] [Revised: 07/02/2018] [Accepted: 09/06/2018] [Indexed: 11/30/2022]
Abstract
OBJECTIVE In the last years, several projects promote the secondary use of routine healthcare data based on electronic health record (EHR) data. In multicenter studies, dedicated pseudonymization services are applied for unified pseudonym handling. Healthcare, clinical research and pseudonymization systems are generally disconnected. Hence, the aim of this research work is to integrate these applications and to evaluate the workflow of clinical research. METHODS We analyzed and identified technical solutions for legislation compliant automatic pseudonym generation and for the integration into EHR as well as electronic data capture (EDC) systems. The Mainzelliste was used as pseudonymization service, which is available as open source solution and compliant with the data privacy concept in Germany. Subject of the integration was the local EHR and an in-house developed EDC system. A time and motion study was conducted to evaluate the effects on the workflow. RESULTS Integration of EHR, pseudonymization service and EDC systems is technically feasible and leads to a less fragmented usage of all applications. Generated pseudonyms are obtained from the service hosted at a trusted third party and can now be used in the EDC as well as in the EHR system for direct access and re-identification. The evaluation of 90 registration iterations shows that the time for documentation has been significantly reduced in average by 39.6 s (56.3%) from 71 ± 8 s to 31 ± 5 s per registered study patient. CONCLUSIONS By incorporating EHR, EDC and pseudonymization systems, it is now feasible to support multicenter studies and registers out of an integrated system landscape within a hospital. Optimizing the workflow of patient registration for clinical research allows reduction of double data entry and transcription errors as well as a seamless transition from clinical routine to research data collection.
Affiliation(s)
- Philipp Bruland
- Institute of Medical Informatics, University of Münster, Münster, Germany.
- Justin Doods
- Institute of Medical Informatics, University of Münster, Münster, Germany.
- Tobias Brix
- Institute of Medical Informatics, University of Münster, Münster, Germany.
- Martin Dugas
- Institute of Medical Informatics, University of Münster, Münster, Germany.
- Michael Storck
- Institute of Medical Informatics, University of Münster, Münster, Germany.
|
11
|
Wagholikar KB, Mendis M, Dessai P, Sanz J, Law S, Gilson M, Sanders S, Vangala M, Bell DS, Murphy SN. Automating Installation of the Integrating Biology and the Bedside (i2b2) Platform. BIOMEDICAL INFORMATICS INSIGHTS 2018; 10:1178222618777749. [PMID: 29887730 PMCID: PMC5989048 DOI: 10.1177/1178222618777749] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/05/2018] [Accepted: 04/25/2018] [Indexed: 11/17/2022]
Abstract
Informatics for Integrating Biology and the Bedside (i2b2) is an open source clinical data analytics platform used at more than 150 institutions for querying patient data. An i2b2 installation (called hive) comprises several i2b2 cells that provide different functionalities. Given the complex architecture of i2b2 installation, creating a working installation of the platform is challenging for new users. This is despite the availability of extensive documentation for i2b2 and access to a large and active mailing list community of i2b2 users. To address this problem, we have created an automated installation package, called i2b2-quickstart, which automatically downloads the latest i2b2 source code and dependencies, and compiles and configures the i2b2 cells to create a functional i2b2 hive installation. This package will serve as a convenient starting point and reference implementation that will facilitate researchers in the installation and exploration of the i2b2 platform.
Affiliation(s)
- Kavishwar B Wagholikar
- Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Partners HealthCare, Boston, MA, USA
- Pralav Dessai
- University of California, Los Angeles, Los Angeles, CA, USA
- Javier Sanz
- University of California, Los Angeles, Los Angeles, CA, USA
- Sindy Law
- University of California, San Francisco, San Francisco, CA, USA
- Micheal Gilson
- University of California, San Francisco, San Francisco, CA, USA
- Stephan Sanders
- University of California, San Francisco, San Francisco, CA, USA
- Douglas S Bell
- University of California, Los Angeles, Los Angeles, CA, USA
- Shawn N Murphy
- Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Partners HealthCare, Boston, MA, USA
|
12
|
Baum B, Christoph J, Engel I, Löbe M, Mate S, Stäubert S, Drepper J, Prokosch HU, Winter A, Sax U, Bauer CRKD, Ganslandt T. Integrated Data Repository Toolkit (IDRT). Methods Inf Med 2016; 55:125-35. [DOI: 10.3414/me15-01-0082] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 07/03/2015] [Accepted: 09/15/2015] [Indexed: 12/17/2022]
Abstract
Background: In recent years, research data warehouses moved increasingly into the focus of interest of medical research. Nevertheless, there are only a few center-independent infrastructure solutions available. They aim to provide a consolidated view on medical data from various sources such as clinical trials, electronic health records, epidemiological registries or longitudinal cohorts. The i2b2 framework is a well-established solution for such repositories, but it lacks support for importing and integrating clinical data and metadata. Objectives: The goal of this project was to develop a platform for easy integration and administration of data from heterogeneous sources, to provide capabilities for linking them to medical terminologies and to allow for transforming and mapping of data streams for user-specific views. Methods: A suite of three tools has been developed: the i2b2 Wizard for simplifying administration of i2b2, the IDRT Import and Mapping Tool for loading clinical data from various formats like CSV, SQL, CDISC ODM or biobanks and the IDRT i2b2 Web Client Plugin for advanced export options. The Import and Mapping Tool also includes an ontology editor for rearranging and mapping patient data and structures as well as annotating clinical data with medical terminologies, primarily those used in Germany (ICD-10-GM, OPS, ICD-O, etc.). Results: With the three tools functional, new i2b2-based research projects can be created, populated and customized to researcher’s needs in a few hours. Amalgamating data and metadata from different databases can be managed easily. With regards to data privacy a pseudonymization service can be plugged in. Using common ontologies and reference terminologies rather than project-specific ones leads to a consistent understanding of the data semantics. Conclusions: i2b2’s promise is to enable clinical researchers to devise and test new hypotheses even without a deep knowledge in statistical programming.
The approach presented here has been tested in a number of scenarios with millions of observations and tens of thousands of patients. Initially mostly observant, trained researchers were able to construct new analyses on their own. Early feedback indicates that timely and extensive access to their “own” data is appreciated most, but it is also lowering the barrier for other tasks, for instance checking data quality and completeness (missing data, wrong coding).
|
13
|
Winter A, Takabayashi K, Jahn F, Kimura E, Engelbrecht R, Haux R, Honda M, Hübner UH, Inoue S, Kohl CD, Matsumoto T, Matsumura Y, Miyo K, Nakashima N, Prokosch HU, Staemmler M. Quality Requirements for Electronic Health Record Systems*. A Japanese-German Information Management Perspective. Methods Inf Med 2017; 56:e92-e104. [PMID: 28925415 PMCID: PMC6291988 DOI: 10.3414/me17-05-0002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 01/23/2017] [Accepted: 06/13/2017] [Indexed: 12/16/2022]
Abstract
BACKGROUND For more than 30 years, there has been close cooperation between Japanese and German scientists with regard to information systems in health care. Collaboration has been formalized by an agreement between the respective scientific associations. Following this agreement, two joint workshops took place to explore the similarities and differences of electronic health record systems (EHRS) against the background of the two national healthcare systems that share many commonalities. OBJECTIVES To establish a framework and requirements for the quality of EHRS that may also serve as a basis for comparing different EHRS. METHODS Donabedian's three dimensions of quality of medical care were adapted to the outcome, process, and structural quality of EHRS and their management. These quality dimensions were proposed before the first workshop of EHRS experts and enriched during the discussions. RESULTS The Quality Requirements Framework of EHRS (QRF-EHRS) was defined and complemented by requirements for high quality EHRS. The framework integrates three quality dimensions (outcome, process, and structural quality), three layers of information systems (processes and data, applications, and physical tools) and three dimensions of information management (strategic, tactical, and operational information management). CONCLUSIONS Describing and comparing the quality of EHRS is in fact a multidimensional problem as given by the QRF-EHRS framework. This framework will be utilized to compare Japanese and German EHRS, notably those that were presented at the second workshop.
Affiliation(s)
- Alfred Winter
- Prof. Alfred Winter, University of Leipzig, Institute for Medical Informatics, Statistics and Epidemiology, Haertelstr. 16-18, 04107 Leipzig, Germany
|
14
|
Meystre SM, Lovis C, Bürkle T, Tognola G, Budrionis A, Lehmann CU. Clinical Data Reuse or Secondary Use: Current Status and Potential Future Progress. Yearb Med Inform 2017; 26:38-52. [PMID: 28480475 PMCID: PMC6239225 DOI: 10.15265/iy-2017-007] [Citation(s) in RCA: 84] [Impact Index Per Article: 12.0] [Received: 05/08/2017] [Indexed: 12/30/2022]
Abstract
Objective: To perform a review of recent research in clinical data reuse or secondary use, and envision future advances in this field. Methods: The review is based on a large literature search in MEDLINE (through PubMed), conference proceedings, and the ACM Digital Library, focusing only on research published between 2005 and early 2016. Each selected publication was reviewed by the authors, and a structured analysis and summarization of its content was developed. Results: The initial search produced 359 publications, reduced after a manual examination of abstracts and full publications. The following aspects of clinical data reuse are discussed: motivations and challenges, privacy and ethical concerns, data integration and interoperability, data models and terminologies, unstructured data reuse, structured data mining, clinical practice and research integration, and examples of clinical data reuse (quality measurement and learning healthcare systems). Conclusion: Reuse of clinical data is a fast-growing field recognized as essential to realize the potentials for high quality healthcare, improved healthcare management, reduced healthcare costs, population health management, and effective clinical research.
Affiliation(s)
- S. M. Meystre
- Medical University of South Carolina, Charleston, SC, USA
- C. Lovis
- Division of Medical Information Sciences, University Hospitals of Geneva, Switzerland
- T. Bürkle
- University of Applied Sciences, Bern, Switzerland
- G. Tognola
- Institute of Electronics, Computer and Telecommunication Engineering, Italian Natl. Research Council IEIIT-CNR, Milan, Italy
- A. Budrionis
- Norwegian Centre for E-health Research, University Hospital of North Norway, Tromsø, Norway
- C. U. Lehmann
- Departments of Biomedical Informatics and Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA
|
15
|
Paton C, Karopka T. The Role of Free/Libre and Open Source Software in Learning Health Systems. Yearb Med Inform 2017; 26:53-58. [PMID: 28480476 PMCID: PMC6239249 DOI: 10.15265/iy-2017-006] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/24/2022]
Abstract
Objective: To give an overview of the role of Free/Libre and Open Source Software (FLOSS) in the context of secondary use of patient data to enable Learning Health Systems (LHSs). Methods: We conducted an environmental scan of the academic and grey literature utilising the MedFLOSS database of open source systems in healthcare to inform a discussion of the role of open source in developing LHSs that reuse patient data for research and quality improvement. Results: A wide range of FLOSS is identified that contributes to the information technology (IT) infrastructure of LHSs including operating systems, databases, frameworks, interoperability software, and mobile and web apps. The recent literature around the development and use of key clinical data management tools is also reviewed. Conclusions: FLOSS already plays a critical role in modern health IT infrastructure for the collection, storage, and analysis of patient data. The nature of FLOSS systems to be collaborative, modular, and modifiable may make open source approaches appropriate for building the digital infrastructure for a LHS.
Affiliation(s)
- C. Paton
- Group Head for Global Health Informatics, Centre for Tropical Medicine and Global Health, University of Oxford, UK
- T. Karopka
- Chair of IMIA OS WG, Chair of EFMI LIFOSS WG, Project Manager, BioCon Valley GmbH, Greifswald, Germany
|
16
|
Jannot AS, Zapletal E, Avillach P, Mamzer MF, Burgun A, Degoulet P. The Georges Pompidou University Hospital Clinical Data Warehouse: A 8-years follow-up experience. Int J Med Inform 2017; 102:21-28. [PMID: 28495345 DOI: 10.1016/j.ijmedinf.2017.02.006] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 01/10/2017] [Accepted: 02/11/2017] [Indexed: 12/25/2022]
Abstract
BACKGROUND When developed jointly with clinical information systems, clinical data warehouses (CDWs) facilitate the reuse of healthcare data and leverage clinical research. OBJECTIVE To describe both data access and use for clinical research, epidemiology and health service research of the "Hôpital Européen Georges Pompidou" (HEGP) CDW. METHODS The CDW has been developed since 2008 using an i2b2 platform. It was made available to health professionals and researchers in October 2010. Procedures to access data have been implemented and different access levels have been distinguished according to the nature of queries. RESULTS As of July 2016, the CDW contained the consolidated data of over 860,000 patients followed since the opening of the HEGP hospital in July 2000. These data correspond to more than 122 million clinical item values, 124 million biological item values, and 3.7 million free text reports. The ethics committee of the hospital evaluates all CDW projects that generate secondary data marts. Characteristics of the 74 research projects validated between January 2011 and December 2015 are described. CONCLUSION The use of HEGP CDWs is a key facilitator for clinical research studies. It required however important methodological and organizational support efforts from a biomedical informatics department.
Affiliation(s)
- Anne-Sophie Jannot
- Paris Descartes Faculty of Medicine, Paris, France; INSERM UMR 1138-E22: Information Sciences to Support Personalized Medicine, Paris, France; Medical Informatics, Biostatistics and Public Health Department, Georges Pompidou University Hospital, Paris, France.
- Eric Zapletal
- Medical Informatics, Biostatistics and Public Health Department, Georges Pompidou University Hospital, Paris, France
- Paul Avillach
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Marie-France Mamzer
- Paris Descartes Faculty of Medicine, Paris, France; INSERM EA 4569 Medical Ethics Department
- Anita Burgun
- Paris Descartes Faculty of Medicine, Paris, France; INSERM UMR 1138-E22: Information Sciences to Support Personalized Medicine, Paris, France; Medical Informatics, Biostatistics and Public Health Department, Georges Pompidou University Hospital, Paris, France
- Patrice Degoulet
- Paris Descartes Faculty of Medicine, Paris, France; INSERM UMR 1138-E22: Information Sciences to Support Personalized Medicine, Paris, France; Medical Informatics, Biostatistics and Public Health Department, Georges Pompidou University Hospital, Paris, France
|
17
|
Bruland P, Dugas M. S2O - A software tool for integrating research data from general purpose statistic software into electronic data capture systems. BMC Med Inform Decis Mak 2017; 17:3. [PMID: 28061771 PMCID: PMC5219713 DOI: 10.1186/s12911-016-0402-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/22/2016] [Accepted: 11/22/2016] [Indexed: 11/28/2022]
Abstract
Background Data capture for clinical registries or pilot studies is often performed in spreadsheet-based applications like Microsoft Excel or IBM SPSS. Usually, data is transferred into statistic software, such as SAS, R or IBM SPSS Statistics, for analyses afterwards. Spreadsheet-based solutions suffer from several drawbacks: It is generally not possible to ensure a sufficient right and role management; it is not traced who has changed data when and why. Therefore, such systems are not able to comply with regulatory requirements for electronic data capture in clinical trials. In contrast, Electronic Data Capture (EDC) software enables a reliable, secure and auditable collection of data. In this regard, most EDC vendors support the CDISC ODM standard to define, communicate and archive clinical trial meta- and patient data. Advantages of EDC systems are support for multi-user and multicenter clinical trials as well as auditable data. Migration from spreadsheet based data collection to EDC systems is labor-intensive and time-consuming at present. Hence, the objectives of this research work are to develop a mapping model and implement a converter between the IBM SPSS and CDISC ODM standard and to evaluate this approach regarding syntactic and semantic correctness. Results A mapping model between IBM SPSS and CDISC ODM data structures was developed. SPSS variables and patient values can be mapped and converted into ODM. Statistical and display attributes from SPSS are not corresponding to any ODM elements; study related ODM elements are not available in SPSS. The S2O converting tool was implemented as command-line-tool using the SPSS internal Java plugin. Syntactic and semantic correctness was validated with different ODM tools and reverse transformation from ODM into SPSS format. Clinical data values were also successfully transformed into the ODM structure. 
Conclusion Transformation between the spreadsheet format IBM SPSS and the ODM standard for definition and exchange of trial data is feasible. S2O facilitates migration from Excel- or SPSS-based data collections towards reliable EDC systems. Thereby, advantages of EDC systems like reliable software architecture for secure and traceable data collection and particularly compliance with regulatory requirements are achievable. Electronic supplementary material The online version of this article (doi:10.1186/s12911-016-0402-4) contains supplementary material, which is available to authorized users.
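The mapping idea in this abstract (variables become ODM item definitions, patient values become ODM item data) can be illustrated with a minimal Python sketch. This is not the S2O tool, which the abstract describes as a command-line tool built on the SPSS internal Java plugin. The variable dictionary, the `TYPE_MAP` type mapping, the OID naming scheme and the function name `to_odm` are all hypothetical, and the output is deliberately simplified: it omits the StudyEventData and FormData levels and other required parts of ODM, so it would not validate against the ODM schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical SPSS-like variable dictionary: name -> (label, storage type).
VARIABLES = {"age": ("Age in years", "numeric"), "sex": ("Sex", "string")}
# Hypothetical patient records keyed by the variable names above.
RECORDS = [{"subject": "S01", "age": 54, "sex": "f"}]
# Assumed, simplified mapping from storage types to ODM DataType values.
TYPE_MAP = {"numeric": "float", "string": "text"}


def to_odm(variables, records, study_oid="ST.DEMO", group_oid="IG.ALL"):
    """Build a simplified ODM-like tree: metadata first, then clinical data."""
    odm = ET.Element("ODM")
    study = ET.SubElement(odm, "Study", OID=study_oid)
    mdv = ET.SubElement(study, "MetaDataVersion", OID="MDV.1", Name="Demo")
    group = ET.SubElement(mdv, "ItemGroupDef", OID=group_oid,
                          Name="All variables", Repeating="No")
    # One ItemDef per source variable, referenced from the item group.
    for name, (label, source_type) in variables.items():
        oid = f"IT.{name.upper()}"
        ET.SubElement(group, "ItemRef", ItemOID=oid, Mandatory="No")
        ET.SubElement(mdv, "ItemDef", OID=oid, Name=label,
                      DataType=TYPE_MAP[source_type])
    clinical = ET.SubElement(odm, "ClinicalData", StudyOID=study_oid,
                             MetaDataVersionOID="MDV.1")
    # One SubjectData per record; each value becomes an ItemData element.
    for rec in records:
        subject = ET.SubElement(clinical, "SubjectData",
                                SubjectKey=rec["subject"])
        item_group = ET.SubElement(subject, "ItemGroupData",
                                   ItemGroupOID=group_oid)
        for name in variables:
            ET.SubElement(item_group, "ItemData",
                          ItemOID=f"IT.{name.upper()}", Value=str(rec[name]))
    return odm


if __name__ == "__main__":
    print(ET.tostring(to_odm(VARIABLES, RECORDS), encoding="unicode"))
```

The sketch also makes the abstract's caveat concrete: display attributes of the source variables have no place in the tree, and study-level ODM elements have to be supplied from outside the source file.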
Affiliation(s)
- Philipp Bruland
- Institute of Medical Informatics, University of Münster, 48149, Münster, Germany.
- Martin Dugas
- Institute of Medical Informatics, University of Münster, 48149, Münster, Germany
|
18
|
Ferdynus C, Huiart L. [Technical improvement of cohort constitution in administrative health databases: Providing a tool for integration and standardization of data applicable in the French National Health Insurance Database (SNIIRAM)]. Rev Epidemiol Sante Publique 2016; 64:263-9. [PMID: 27592033 DOI: 10.1016/j.respe.2016.02.011] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/09/2015] [Accepted: 02/04/2016] [Indexed: 11/26/2022]
Abstract
AIM Administrative health databases such as the French National Health Insurance Database - SNIIRAM - are a major tool to answer numerous public health research questions. However, the use of such data requires complex and time-consuming data management. Our objective was to develop and make available a tool to optimize cohort constitution within administrative health databases. METHODS We developed a process to extract, transform and load (ETL) data from various heterogeneous sources in a standardized data warehouse. This data warehouse is architected as a star schema corresponding to an i2b2 star schema model. We then evaluated the performance of this ETL using data from a pharmacoepidemiology research project conducted in the SNIIRAM database. RESULTS The ETL we developed comprises a set of functionalities for creating SAS scripts. Data can be integrated into a standardized data warehouse. As part of the performance assessment of this ETL, we achieved integration of a dataset from the SNIIRAM comprising more than 900 million lines in less than three hours using a desktop computer. This enables patient selection from the standardized data warehouse within seconds of the request. CONCLUSION The ETL described in this paper provides a tool which is effective and compatible with all administrative health databases, without requiring complex database servers. This tool should simplify cohort constitution in health databases; the standardization of warehouse data facilitates collaborative work between research teams.
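As an illustration of loading source rows into a star schema and then selecting a cohort from it, here is a minimal Python sketch. It is not the authors' tool, which generates SAS scripts. It uses an in-memory SQLite database, and the table names only loosely follow the general shape of an i2b2-style star schema (one central fact table surrounded by dimension tables). The column set is heavily reduced, and the source rows, codes and function names are hypothetical.

```python
import sqlite3

# Hypothetical source rows: (source patient id, ISO date, local code, value).
SOURCE_ROWS = [
    ("P001", "2015-01-10", "drug:A10BA02", 1),
    ("P001", "2015-02-12", "drug:A10BA02", 1),
    ("P002", "2015-01-20", "dx:E11", 1),
]

SCHEMA = """
CREATE TABLE patient_dimension (
    patient_num INTEGER PRIMARY KEY,
    source_id   TEXT UNIQUE
);
CREATE TABLE concept_dimension (
    concept_cd TEXT PRIMARY KEY,
    name_char  TEXT
);
CREATE TABLE observation_fact (
    patient_num INTEGER REFERENCES patient_dimension(patient_num),
    concept_cd  TEXT REFERENCES concept_dimension(concept_cd),
    start_date  TEXT,
    nval_num    REAL
);
"""


def build_warehouse(rows):
    """Extract rows, normalise the codes, and load them into the star schema."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    for source_id, date, code, value in rows:
        # Dimension rows are created once; repeated keys are ignored.
        con.execute(
            "INSERT OR IGNORE INTO patient_dimension (source_id) VALUES (?)",
            (source_id,),
        )
        patient_num = con.execute(
            "SELECT patient_num FROM patient_dimension WHERE source_id = ?",
            (source_id,),
        ).fetchone()[0]
        concept_cd = code.upper()  # transform step: one canonical code format
        con.execute(
            "INSERT OR IGNORE INTO concept_dimension VALUES (?, ?)",
            (concept_cd, code),
        )
        con.execute(
            "INSERT INTO observation_fact VALUES (?, ?, ?, ?)",
            (patient_num, concept_cd, date, value),
        )
    con.commit()
    return con


def select_cohort(con, concept_cd):
    """Return source ids of patients with at least one fact for the concept."""
    rows = con.execute(
        "SELECT DISTINCT p.source_id FROM observation_fact f "
        "JOIN patient_dimension p ON p.patient_num = f.patient_num "
        "WHERE f.concept_cd = ? ORDER BY p.source_id",
        (concept_cd,),
    )
    return [r[0] for r in rows]
```

Because every observation lands in one fact table keyed by patient and concept, cohort selection reduces to a single filtered join, which is the property the abstract relies on for fast patient selection.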
Affiliation(s)
- C Ferdynus
- Unité de soutien méthodologique, département d'informatique médicale, CHU La-Réunion, allée des Topazes, CS11021, 97400 Saint-Denis, France; Inserm, CIC 1410, 97410 Saint-Pierre, France.
- L Huiart
- Unité de soutien méthodologique, département d'informatique médicale, CHU La-Réunion, allée des Topazes, CS11021, 97400 Saint-Denis, France; Inserm, CIC 1410, 97410 Saint-Pierre, France.
|
19
|
Haarbrandt B, Tute E, Marschollek M. Automated population of an i2b2 clinical data warehouse from an openEHR-based data repository. J Biomed Inform 2016; 63:277-294. [PMID: 27507090 DOI: 10.1016/j.jbi.2016.08.007] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 01/21/2016] [Revised: 07/24/2016] [Accepted: 08/05/2016] [Indexed: 10/21/2022]
Abstract
BACKGROUND Detailed Clinical Model (DCM) approaches have recently seen wider adoption. More specifically, openEHR-based application systems are now used in production in several countries, serving diverse fields of application such as health information exchange, clinical registries and electronic medical record systems. However, approaches to efficiently provide openEHR data to researchers for secondary use have not yet been investigated or established. METHODS We developed an approach to automatically load openEHR data instances into the open source clinical data warehouse i2b2. We evaluated query capabilities and the performance of this approach in the context of the Hanover Medical School Translational Research Framework (HaMSTR), an openEHR-based data repository. RESULTS Automated creation of i2b2 ontologies from archetypes and templates and the integration of openEHR data instances from 903 patients of a paediatric intensive care unit have been achieved. In total, it took an average of ∼2527 s to create 2,311,624 facts from 141,917 XML documents. Using the imported data, we conducted sample queries to compare the performance with two openEHR systems and to investigate if this representation of data is feasible to support cohort identification and record level data extraction. DISCUSSION We found the automated population of an i2b2 clinical data warehouse to be a feasible approach to make openEHR data instances available for secondary use. Such an approach can facilitate timely provision of clinical data to researchers. It complements analytics based on the Archetype Query Language by allowing querying on both legacy clinical data sources and openEHR data instances at the same time and by providing an easy-to-use query interface. However, due to different levels of expressiveness in the data models, not all semantics could be preserved during the ETL process.
Affiliation(s)
- Birger Haarbrandt
- Peter L. Reichertz Institute for Medical Informatics, University of Braunschweig - Institute of Technology and Hanover Medical School, Hanover, Germany.
- Erik Tute
- Peter L. Reichertz Institute for Medical Informatics, University of Braunschweig - Institute of Technology and Hanover Medical School, Hanover, Germany
- Michael Marschollek
- Peter L. Reichertz Institute for Medical Informatics, University of Braunschweig - Institute of Technology and Hanover Medical School, Hanover, Germany
|
20
|
Kaspar M, Ertl M, Fette G, Dietrich G, Toepfer M, Angermann C, Störk S, Puppe F. Data Linkage from Clinical to Study Databases via an R Data Warehouse User Interface. Experiences from a Large Clinical Follow-up Study. Methods Inf Med 2016; 55:381-6. [PMID: 27405886 DOI: 10.3414/me15-02-0015] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/15/2015] [Accepted: 06/15/2016] [Indexed: 11/09/2022]
Abstract
BACKGROUND Data that needs to be documented for clinical studies has often been acquired and documented in clinical routine. Usually this data is manually transferred to Case Report Forms (CRF) and/or directly into an electronic data capture (EDC) system. OBJECTIVES To enhance the documentation process of a large clinical follow-up study targeting patients admitted for acutely decompensated heart failure by accessing the data created during routine and study visits from a hospital information system (HIS) and by transferring it via a data warehouse (DWH) into the study's EDC system. METHODS This project is based on the clinical DWH developed at the University of Würzburg. The DWH was extended by several new data domains including data created by the study team itself. An R user interface was developed for the DWH that allows to access its source data in all its detail, to transform data as comprehensively as possible by R into study-specific variables and to support the creation of data and catalog tables. RESULTS A data flow was established that starts with labeling patients as study patients within the HIS and proceeds with updating the DWH with this label and further data domains at a daily rate. Several study-specific variables were defined using the implemented R user interface of the DWH. This system was then used to export these variables as data tables ready for import into our EDC system. The data tables were then used to initialize the first 296 patients within the EDC system by pseudonym, visit and data values. Afterwards, these records were filled with clinical data on heart failure, vital parameters and time spent on selected wards. CONCLUSIONS This solution focuses on the comprehensive access and transformation of data for a DWH-EDC system linkage. Using this system in a large clinical study has demonstrated the feasibility of this approach for a study with a complex visit schedule.
Affiliation(s)
- Mathias Kaspar
- Dr. Mathias Kaspar, Comprehensive Heart Failure Center / DZHI, University Hospital of Würzburg, Straubmühlweg 2a, Haus A9, 97078 Würzburg, Germany
|
21
|
Rance B, Canuel V, Countouris H, Laurent-Puig P, Burgun A. Integrating Heterogeneous Biomedical Data for Cancer Research: the CARPEM infrastructure. Appl Clin Inform 2016; 7:260-74. [PMID: 27437039 PMCID: PMC4941838 DOI: 10.4338/aci-2015-09-ra-0125] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 10/02/2015] [Accepted: 02/07/2016] [Indexed: 01/19/2023]
Abstract
Cancer research involves numerous disciplines. The multiplicity of data sources and their heterogeneous nature render the integration and the exploration of the data more and more complex. Translational research platforms are a promising way to assist scientists in these tasks. In this article, we identify a set of scientific and technical principles needed to build a translational research platform compatible with ethical requirements, data protection and data-integration problems. We describe the solution adopted by the CARPEM cancer research program to design and deploy a platform able to integrate retrospective, prospective, and day-to-day care data. We designed a three-layer architecture composed of a data collection layer, a data integration layer and a data access layer. We leverage a set of open-source resources including i2b2 and tranSMART.
Affiliation(s)
- Bastien Rance
- University Hospital Georges Pompidou, Paris, France; INSERM UMR_S 1138, CRC, Paris, France
- Hector Countouris
- University Hospital Georges Pompidou, Paris, France; INSERM UMR_S 1138, CRC, Paris, France
- Pierre Laurent-Puig
- University Hospital Georges Pompidou, Paris, France; Université Paris Sorbonne Cité, Inserm UMR-S 1147, Paris, France
- Anita Burgun
- University Hospital Georges Pompidou, Paris, France; INSERM UMR_S 1138, CRC, Paris, France
|
22
|
Hume S, Aerts J, Sarnikar S, Huser V. Current applications and future directions for the CDISC Operational Data Model standard: A methodological review. J Biomed Inform 2016; 60:352-62. [PMID: 26944737 PMCID: PMC4837012 DOI: 10.1016/j.jbi.2016.02.016] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 05/01/2015] [Revised: 02/21/2016] [Accepted: 02/22/2016] [Indexed: 11/25/2022]
Abstract
INTRODUCTION In order to further advance research and development on the Clinical Data Interchange Standards Consortium (CDISC) Operational Data Model (ODM) standard, the existing research must be well understood. This paper presents a methodological review of the ODM literature. Specifically, it develops a classification schema to categorize the ODM literature according to how the standard has been applied within the clinical research data lifecycle. This paper suggests areas for future research and development that address ODM's limitations and capitalize on its strengths to support new trends in clinical research informatics. METHODS A systematic scan of the following databases was performed: (1) ABI/Inform, (2) ACM Digital, (3) AIS eLibrary, (4) Europe PubMed Central, (5) Google Scholar, (6) IEEE Xplore, (7) PubMed, and (8) ScienceDirect. A Web of Science citation analysis was also performed. The search term used on all databases was "CDISC ODM." The two primary inclusion criteria were: (1) the research must examine the use of ODM as an information system solution component, or (2) the research must critically evaluate ODM against a stated solution usage scenario. Out of 2686 articles identified, 266 were included in a title level review, resulting in 183 articles. An abstract review followed, resulting in 121 remaining articles; and after a full text scan 69 articles met the inclusion criteria. RESULTS As the demand for interoperability has increased, ODM has shown remarkable flexibility and has been extended to cover a broad range of data and metadata requirements that reach well beyond ODM's original use cases. This flexibility has yielded research literature that covers a diverse array of topic areas. A classification schema reflecting the use of ODM within the clinical research data lifecycle was created to provide a categorized and consolidated view of the ODM literature.
The elements of the framework include: (1) EDC (Electronic Data Capture) and EHR (Electronic Health Record) infrastructure; (2) planning; (3) data collection; (4) data tabulations and analysis; and (5) study archival. The analysis reviews the strengths and limitations of ODM as a solution component within each section of the classification schema. This paper also identifies opportunities for future ODM research and development, including improved mechanisms for semantic alignment with external terminologies, better representation of the CDISC standards used end-to-end across the clinical research data lifecycle, improved support for real-time data exchange, the use of EHRs for research, and the inclusion of a complete study design. CONCLUSIONS ODM is being used in ways not originally anticipated, and covers a diverse array of use cases across the clinical research data lifecycle. ODM has been used as much as a study metadata standard as it has for data exchange. A significant portion of the literature addresses integrating EHR and clinical research data. The simplicity and readability of ODM has likely contributed to its success and broad implementation as a data and metadata standard. Keeping the core ODM model focused on the most fundamental use cases, while using extensions to handle edge cases, has kept the standard easy for developers to learn and use.
Affiliation(s)
- Sam Hume
- Dakota State University, College of Business and Information Systems, 820 N Washington Ave, Madison, SD 57042, United States.
- Jozef Aerts
- FH Joanneum University of Applied Sciences, Eggenberger Allee 11, 8020 Graz, Austria.
- Surendra Sarnikar
- Dakota State University, College of Business and Information Systems, 820 N Washington Ave, Madison, SD 57042, United States.
- Vojtech Huser
- Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bld 38a, Rm 9N919, Bethesda, MD 20894, United States.
|
23
|
Hackl WO, Ammenwerth E. SPIRIT: Systematic Planning of Intelligent Reuse of Integrated Clinical Routine Data. A Conceptual Best-practice Framework and Procedure Model. Methods Inf Med 2016; 55:114-24. [PMID: 26769124 DOI: 10.3414/me15-01-0045] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/18/2015] [Accepted: 11/11/2015] [Indexed: 12/28/2022]
Abstract
BACKGROUND Secondary use of clinical routine data is receiving an increasing amount of attention in biomedicine and healthcare. However, building and analysing integrated clinical routine data repositories are nontrivial, challenging tasks. As in most evolving fields, recognized standards, well-proven methodological frameworks, or accurately described best-practice approaches for the systematic planning of solutions for secondary use of routine medical record data are missing. OBJECTIVE We propose a conceptual best-practice framework and procedure model for the systematic planning of intelligent reuse of integrated clinical routine data (SPIRIT). METHODS SPIRIT was developed based on a broad literature overview and further refined in two case studies with different kinds of clinical routine data, including process-oriented nursing data from a large hospital group and high-volume multimodal clinical data from a neurologic intensive care unit. RESULTS SPIRIT aims at tailoring secondary use solutions to specific needs of single departments without losing sight of the institution as a whole. It provides a general conceptual best-practice framework consisting of three parts: First, a secondary use strategy for the whole organization is determined. Second, comprehensive analyses are conducted from two different viewpoints to define the requirements regarding a clinical routine data reuse solution at the system level from the data perspective (BOTTOM UP) and at the strategic level from the future users perspective (TOP DOWN). An obligatory clinical context analysis (IN BETWEEN) facilitates refinement, combination, and integration of the different requirements. The third part of SPIRIT is dedicated to implementation, which comprises design and realization of clinical data integration and management as well as data analysis solutions. 
CONCLUSIONS The SPIRIT framework is intended to be used to systematically plan the intelligent reuse of clinical routine data for multiple purposes, which often was not intended when the primary clinical documentation systems were implemented. SPIRIT helps to overcome this gap. It can be applied in healthcare institutions of any size or specialization and allows a stepwise setup and evolution of holistic clinical routine data reuse solutions.
Affiliation(s)
- W O Hackl
- Dr. Werner O. Hackl, Institute of Biomedical Informatics, UMIT - University for Health Sciences, Medical Informatics and Technology, Eduard Wallnöfer Zentrum 1, 6060 Hall in Tirol, Austria
|
24
|
Firnkorn D, Ganzinger M, Muley T, Thomas M, Knaup P. A Generic Data Harmonization Process for Cross-linked Research and Network Interaction. Construction and Application for the Lung Cancer Phenotype Database of the German Center for Lung Research. Methods Inf Med 2015; 54:455-60. [PMID: 26394900 DOI: 10.3414/me14-02-0030] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 12/16/2014] [Accepted: 09/01/2015] [Indexed: 11/09/2022]
Abstract
OBJECTIVE Joint data analysis is a key requirement in medical research networks. Data are available in heterogeneous formats at each network partner and their harmonization is often rather complex. The objective of our paper is to provide a generic approach for the harmonization process in research networks. We applied the process when harmonizing data from three sites for the Lung Cancer Phenotype Database within the German Center for Lung Research. METHODS We developed a spreadsheet-based solution as tool to support the harmonization process for lung cancer data and a data integration procedure based on Talend Open Studio. RESULTS The harmonization process consists of eight steps describing a systematic approach for defining and reviewing source data elements and standardizing common data elements. The steps for defining common data elements and harmonizing them with local data definitions are repeated until consensus is reached. Application of this process for building the phenotype database led to a common basic data set on lung cancer with 285 structured parameters. The Lung Cancer Phenotype Database was realized as an i2b2 research data warehouse. CONCLUSION Data harmonization is a challenging task requiring informatics skills as well as domain knowledge. Our approach facilitates data harmonization by providing guidance through a uniform process that can be applied in a wide range of projects.
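The abstract describes mapping local data definitions onto common data elements and repeating the review until consensus is reached. The sketch below illustrates that idea only; it is not the authors' spreadsheet tool or their Talend Open Studio procedure. The site names, local field names, value codes, and the `harmonize` function are all hypothetical.

```python
# Hypothetical mapping table standing in for a harmonization spreadsheet:
# (site, local element) -> (common data element, optional value translation)
MAPPING = {
    ("site_a", "geschlecht"): ("sex", {"m": "male", "w": "female"}),
    ("site_b", "sex_code"): ("sex", {"1": "male", "2": "female"}),
    ("site_a", "alter"): ("age_years", None),
    ("site_b", "age"): ("age_years", None),
}


def harmonize(site, record):
    """Map one local record to common data elements.

    Returns the harmonized record plus the local elements that have no
    mapping yet; those are the candidates for the next review round.
    """
    common, unmapped = {}, []
    for local_name, value in record.items():
        entry = MAPPING.get((site, local_name))
        if entry is None:
            unmapped.append(local_name)
            continue
        common_name, value_map = entry
        # Translate coded values where a translation table exists;
        # unknown codes become None so they are visible downstream.
        common[common_name] = value_map.get(str(value)) if value_map else value
    return common, unmapped


if __name__ == "__main__":
    print(harmonize("site_a", {"geschlecht": "w", "alter": 63, "raucher": "ja"}))
```

Returning the unmapped elements alongside the harmonized record mirrors the iterative character of the process: each pass surfaces the definitions that still need agreement between sites.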
Affiliation(s)
- D Firnkorn
- Daniel Firnkorn, Heidelberg University, Institute of Medical Biometry and Informatics, Im Neuenheimer Feld 305, 69120 Heidelberg, Germany
|
25
|
Stäubert S, Schaaf M, Jahn F, Brandner R, Winter A. Modeling Interoperable Information Systems with 3LGM² and IHE. Methods Inf Med 2015; 54:398-405. [PMID: 26394817 DOI: 10.3414/me14-02-0027] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 02/03/2015] [Accepted: 09/04/2015] [Indexed: 11/09/2022]
Abstract
BACKGROUND Strategic planning of information systems (IS) in healthcare requires descriptions of the current and the future IS state. Enterprise architecture planning (EAP) tools like the 3LGM² tool help to build up and to analyze IS models. A model of the planned architecture can be derived from an analysis of current state IS models. Building an interoperable IS, i.e. an IS consisting of interoperable components, can be considered a relevant strategic information management goal for many IS in healthcare. Integrating the Healthcare Enterprise (IHE) is an initiative which targets interoperability by using established standards. OBJECTIVES To link IHE concepts to 3LGM² concepts within the 3LGM² tool. To describe how an information manager can be supported in handling the complex IHE world and planning interoperable IS using 3LGM² models. To describe how developers or maintainers of IHE profiles can be supported by the representation of IHE concepts in 3LGM². METHODS Conceptualization and concept mapping methods are used to assign IHE concepts such as domains, integration profiles, actors, and transactions to the concepts of the three-layer graph-based meta-model (3LGM²). RESULTS IHE concepts were successfully linked to 3LGM² concepts. An IHE-master-model, i.e. an abstract model for IHE concepts, was modeled with the help of the 3LGM² tool. Two IHE domains were modeled in detail (ITI, QRPH). We describe two use cases for the representation of IHE concepts and IHE domains as 3LGM² models. Information managers can use the IHE-master-model as a reference model for modeling interoperable IS based on IHE profiles during EAP activities. IHE developers are supported in analyzing consistency of IHE concepts with the help of the IHE-master-model and functions of the 3LGM² tool. CONCLUSION The complex relations between IHE concepts can be modeled by using the EAP method 3LGM². The 3LGM² tool offers visualization and analysis features which are now available for the IHE-master-model.
Thus information managers and IHE developers can use or develop IHE profiles systematically. In order to improve the usability and handling of the IHE-master-model and its usage as a reference model, some further refinements have to be done. Evaluating the use of the IHE-master-model by information managers and IHE developers is subject to further research.
Affiliation(s)
- S Stäubert
- Sebastian Stäubert, Institute for Medical Informatics, Statistics and Epidemiology (IMISE), University of Leipzig, Härtelstr. 16, 04107 Leipzig, Germany
|
26
|
Rossi E, Rosa M, Rossi L, Priori A, Marceglia S. WebBioBank: A new platform for integrating clinical forms and shared neurosignal analyses to support multi-centre studies in Parkinson’s Disease. J Biomed Inform 2014; 52:92-104. [DOI: 10.1016/j.jbi.2014.08.014] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/03/2013] [Revised: 07/07/2014] [Accepted: 08/28/2014] [Indexed: 11/27/2022]
|
27
|
Harris DR, Henderson DW, Kavuluru R, Stromberg AJ, Johnson TR. Using common table expressions to build a scalable Boolean query generator for clinical data warehouses. IEEE J Biomed Health Inform 2014; 18:1607-13. [PMID: 25192572 DOI: 10.1109/jbhi.2013.2292591] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Abstract
We present a custom, Boolean query generator utilizing common-table expressions (CTEs) that is capable of scaling with big datasets. The generator maps user-defined Boolean queries, such as those interactively created in clinical-research and general-purpose healthcare tools, into SQL. We demonstrate the effectiveness of this generator by integrating our study into the Informatics for Integrating Biology and the Bedside (i2b2) query tool and show that it is capable of scaling. Our custom generator replaces and outperforms the default query generator found within the Clinical Research Chart cell of i2b2. In our experiments, 16 different types of i2b2 queries were identified by varying four constraints: date, frequency, exclusion criteria, and whether selected concepts occurred in the same encounter. We generated nontrivial, random Boolean queries based on these 16 types; the corresponding SQL queries produced by both generators were compared by execution times. The CTE-based solution significantly outperformed the default query generator and provided a much more consistent response time across all query types (M = 2.03, SD = 6.64 versus M = 75.82, SD = 238.88 s). Without costly hardware upgrades, we provide a scalable solution based on CTEs with very promising empirical results centered on performance gains. The evaluation methodology used for this provides a means of profiling clinical data warehouse performance.
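To make the idea of mapping a Boolean query onto common table expressions concrete, here is a minimal sketch in Python with SQLite. It is not the authors' generator and does not reproduce the i2b2 schema: the single table `observation_fact` with columns `patient_num` and `concept_cd` only loosely follows i2b2 naming, and the concept codes, `build_cte_query`, and `run_demo` are invented for illustration. Each panel of ORed concepts becomes one CTE; panels are combined with `INTERSECT` (AND) and `EXCEPT` (exclusion). The date, frequency, and same-encounter constraints mentioned in the abstract are left out.

```python
import sqlite3


def build_cte_query(panels):
    """Translate a Boolean query into one SQL statement built from CTEs.

    `panels` is a list of (concept_codes, exclude) pairs. Codes inside a
    panel are ORed, panels are ANDed, and exclude=True negates a panel.
    """
    ctes, params, included, excluded = [], [], [], []
    for i, (codes, exclude) in enumerate(panels):
        name = f"panel_{i}"
        placeholders = ", ".join("?" for _ in codes)
        ctes.append(
            f"{name} AS (SELECT DISTINCT patient_num FROM observation_fact "
            f"WHERE concept_cd IN ({placeholders}))"
        )
        params.extend(codes)
        (excluded if exclude else included).append(name)
    if not included:
        raise ValueError("at least one non-excluded panel is required")
    # AND becomes INTERSECT; exclusion becomes EXCEPT.
    body = " INTERSECT ".join(f"SELECT patient_num FROM {n}" for n in included)
    for n in excluded:
        body += f" EXCEPT SELECT patient_num FROM {n}"
    sql = "WITH " + ", ".join(ctes) + " " + body + " ORDER BY patient_num"
    return sql, params


def run_demo():
    """Run one example query against a tiny in-memory table (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT)")
    conn.executemany(
        "INSERT INTO observation_fact VALUES (?, ?)",
        [
            (1, "DX:E11"), (1, "RX:METFORMIN"),
            (2, "DX:E11"), (2, "DX:I10"),
            (3, "DX:E10"), (3, "RX:METFORMIN"), (3, "DX:C18"),
        ],
    )
    # (E11 OR E10) AND metformin AND NOT C18
    sql, params = build_cte_query([
        (["DX:E11", "DX:E10"], False),
        (["RX:METFORMIN"], False),
        (["DX:C18"], True),
    ])
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(run_demo())
```

Concept codes are passed as bound parameters rather than interpolated into the SQL string, so user-defined queries cannot alter the statement's structure.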
|
28
|
Schreiweis B, Trinczek B, Köpcke F, Leusch T, Majeed RW, Wenk J, Bergh B, Ohmann C, Röhrig R, Dugas M, Prokosch HU. Comparison of electronic health record system functionalities to support the patient recruitment process in clinical trials. Int J Med Inform 2014; 83:860-8. [PMID: 25189709 DOI: 10.1016/j.ijmedinf.2014.08.005] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 06/17/2013] [Revised: 08/13/2014] [Accepted: 08/14/2014] [Indexed: 10/24/2022]
Abstract
OBJECTIVES Reusing data from electronic health records for clinical and translational research, and especially for patient recruitment, has been tackled more broadly for about a decade. Most projects found in the literature, however, focus on standalone systems and proprietary implementations at one particular institution, often for only a single trial, and no generic evaluation of EHR systems for their applicability to support the patient recruitment process exists yet. Thus we sought to assess whether the current generation of EHR systems in Germany provides modules or tools that can readily be applied to IT-supported patient recruitment scenarios. METHODS We first analysed the EHR portfolio implemented at German University Hospitals and then selected five sites with five different EHR implementations covering all major commercial systems applied in German University Hospitals. Further, the major functionalities required for patient recruitment support were defined, and the five sample EHRs and their standard tools were compared against these functionalities. RESULTS In our analysis of the sites' hospital information system environments (with four commercial EHR systems and one self-developed system) we found that, even though no dedicated module for patient recruitment has been provided, most EHR products comprise generic tools such as workflow engines, querying capabilities, report generators and direct SQL-based database access, which can be applied as query modules, screening lists and notification components for patient recruitment support. A major limitation of all current EHR products, however, is that they provide no dedicated data structures and functionalities for implementing and maintaining a local trial registry. CONCLUSIONS At the five sites with standard EHR tools the typical functionalities of the patient recruitment process could be mostly implemented.
However, no EHR component is yet directly dedicated to support research requirements such as patient recruitment. We recommend for future developments that EHR customers and vendors focus much more on the provision of dedicated patient recruitment modules.
Affiliation(s)
- Björn Schreiweis
- Center for Information Technology and Medical Engineering, Heidelberg University Hospital, Speyerer Straße 4, 69115 Heidelberg, Germany.
- Benjamin Trinczek
- Institute of Medical Informatics, University of Münster, Albert-Schweitzer-Campus 1, A11, 48149 Münster, Germany
- Felix Köpcke
- Chair of Medical Informatics, Friedrich-Alexander-University Erlangen-Nuremberg, Krankenhausstraße 12, 91054 Erlangen, Germany
- Thomas Leusch
- Department of Information- and Communication-Technology, Düsseldorf University Hospital, Moorenstraße 5, 40225 Düsseldorf, Germany
- Raphael W Majeed
- Department of Medical Informatics in Anesthesiology and Intensive Care Medicine, Justus-Liebig University Gießen, Rudolf-Buchheimstraße 7, 35385 Gießen, Germany
- Joachim Wenk
- Coordination Centre for Clinical Trials, Faculty of Medicine, Heinrich Heine University Düsseldorf, Moorenstraße 5, 40225 Düsseldorf, Germany
- Björn Bergh
- Center for Information Technology and Medical Engineering, Heidelberg University Hospital, Speyerer Straße 4, 69115 Heidelberg, Germany
- Christian Ohmann
- Coordination Centre for Clinical Trials, Faculty of Medicine, Heinrich Heine University Düsseldorf, Moorenstraße 5, 40225 Düsseldorf, Germany
- Rainer Röhrig
- Department of Medical Informatics in Anesthesiology and Intensive Care Medicine, Justus-Liebig University Gießen, Rudolf-Buchheimstraße 7, 35385 Gießen, Germany
- Martin Dugas
- Institute of Medical Informatics, University of Münster, Albert-Schweitzer-Campus 1, A11, 48149 Münster, Germany
- Hans-Ulrich Prokosch
- Chair of Medical Informatics, Friedrich-Alexander-University Erlangen-Nuremberg, Krankenhausstraße 12, 91054 Erlangen, Germany
|
29
|
Trinczek B, Köpcke F, Leusch T, Majeed RW, Schreiweis B, Wenk J, Bergh B, Ohmann C, Röhrig R, Prokosch HU, Dugas M. Design and multicentric implementation of a generic software architecture for patient recruitment systems re-using existing HIS tools and routine patient data. Appl Clin Inform 2014; 5:264-83. [PMID: 24734138 DOI: 10.4338/aci-2013-07-ra-0047] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 07/08/2013] [Accepted: 01/26/2014] [Indexed: 11/23/2022] Open
Abstract
OBJECTIVE (1) To define features and data items of a Patient Recruitment System (PRS); (2) to design a generic software architecture of such a system covering the requirements; (3) to identify implementation options available within different Hospital Information System (HIS) environments; (4) to implement five PRS following the architecture and utilizing the implementation options as proof of concept. METHODS Existing PRS were reviewed and interviews with users and developers conducted. All reported PRS features were collected and prioritized according to their published success and user's request. Common feature sets were combined into software modules of a generic software architecture. Data items to process and transfer were identified for each of the modules. Each site collected implementation options available within their respective HIS environment for each module, provided a prototypical implementation based on available implementation possibilities and supported the patient recruitment of a clinical trial as a proof of concept. RESULTS 24 commonly reported and requested features of a PRS were identified, 13 of them prioritized as being mandatory. A UML version 2 based software architecture containing 5 software modules covering these features was developed. 13 data item groups processed by the modules, thus required to be available electronically, have been identified. Several implementation options could be identified for each module, most of them being available at multiple sites. Utilizing available tools, a PRS could be implemented in each of the five participating German university hospitals. CONCLUSION A set of required features and data items of a PRS has been described for the first time. The software architecture covers all features in a clear, well-defined way. 
The variety of implementation options and the prototypes show that it is possible to implement the given architecture in different HIS environments, thus enabling more sites to successfully support patient recruitment in clinical trials.
Affiliation(s)
- B Trinczek
- Institute of Medical Informatics, University of Münster, Germany
- F Köpcke
- Chair of Medical Informatics, Friedrich-Alexander-University Erlangen-Nuremberg, Germany
- T Leusch
- Department of Information- and Communication-Technology, Düsseldorf University Hospital, Germany
- R W Majeed
- Department of Anesthesia and Intensive Care Medicine, Justus-Liebig University Gießen, Germany
- B Schreiweis
- Center for Information Technology and Medical Engineering, Heidelberg University Hospital, Germany
- J Wenk
- Coordination Centre for Clinical Trials, Faculty of Medicine, Heinrich Heine University Düsseldorf, Germany
- B Bergh
- Center for Information Technology and Medical Engineering, Heidelberg University Hospital, Germany
- C Ohmann
- Coordination Centre for Clinical Trials, Faculty of Medicine, Heinrich Heine University Düsseldorf, Germany
- R Röhrig
- Department of Anesthesia and Intensive Care Medicine, Justus-Liebig University Gießen, Germany
- H U Prokosch
- Chair of Medical Informatics, Friedrich-Alexander-University Erlangen-Nuremberg, Germany
- M Dugas
- Institute of Medical Informatics, University of Münster, Germany
|
30
|
SHRINE: enabling nationally scalable multi-site disease studies. PLoS One 2013; 8:e55811. [PMID: 23533569 PMCID: PMC3591385 DOI: 10.1371/journal.pone.0055811] [Citation(s) in RCA: 91] [Impact Index Per Article: 8.3] [Received: 05/02/2012] [Accepted: 01/04/2013] [Indexed: 11/19/2022] Open
Abstract
Results of medical research studies are often contradictory or cannot be reproduced. One reason is that there may not be enough patient subjects available for observation for a long enough time period. Another reason is that patient populations may vary considerably with respect to geographic and demographic boundaries thus limiting how broadly the results apply. Even when similar patient populations are pooled together from multiple locations, differences in medical treatment and record systems can limit which outcome measures can be commonly analyzed. In total, these differences in medical research settings can lead to differing conclusions or can even prevent some studies from starting. We thus sought to create a patient research system that could aggregate as many patient observations as possible from a large number of hospitals in a uniform way. We call this system the ‘Shared Health Research Information Network’, with the following properties: (1) reuse electronic health data from everyday clinical care for research purposes, (2) respect patient privacy and hospital autonomy, (3) aggregate patient populations across many hospitals to achieve statistically significant sample sizes that can be validated independently of a single research setting, (4) harmonize the observation facts recorded at each institution such that queries can be made across many hospitals in parallel, (5) scale to regional and national collaborations. The purpose of this report is to provide open source software for multi-site clinical studies and to report on early uses of this application. At this time SHRINE implementations have been used for multi-site studies of autism co-morbidity, juvenile idiopathic arthritis, peripartum cardiomyopathy, colorectal cancer, diabetes, and others. The wide range of study objectives and growing adoption suggest that SHRINE may be applicable beyond the research uses and participating hospitals named in this report.
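Properties (2) to (4) above describe querying many hospitals in parallel while each keeps control of its own records. The following sketch shows one way such a federated count could look; it is not SHRINE's protocol or API. The site names, record layout, and the functions `local_count` and `federated_count` are hypothetical, and returning only aggregate counts from each site is an assumption made here to illustrate the privacy property.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-site data: patient-level records never leave their site.
SITES = {
    "hospital_a": [
        {"id": 1, "codes": {"autism", "epilepsy"}},
        {"id": 2, "codes": {"autism"}},
    ],
    "hospital_b": [
        {"id": 7, "codes": {"autism", "epilepsy", "asthma"}},
        {"id": 8, "codes": {"epilepsy", "autism"}},
        {"id": 9, "codes": {"diabetes"}},
    ],
}


def local_count(records, required_codes):
    """Runs inside one site: count patients carrying all required codes."""
    return sum(1 for record in records if required_codes <= record["codes"])


def federated_count(sites, required_codes):
    """Send the same query to every site in parallel and sum the counts.

    Only the aggregate count per site is returned to the caller.
    """
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(local_count, records, required_codes)
            for name, records in sites.items()
        }
        per_site = {name: future.result() for name, future in futures.items()}
    return per_site, sum(per_site.values())


if __name__ == "__main__":
    print(federated_count(SITES, {"autism", "epilepsy"}))
```

The sketch assumes the sites already share a harmonized vocabulary, which corresponds to property (4); without that step the same code set would not mean the same thing at each hospital.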
|
31
|
Ries M, Prokosch HU, Beckmann MW, Bürkle T. Single-Source Tumor Documentation - Reusing Oncology Data for Different Purposes. Onkologie 2013; 36:136-41. [DOI: 10.1159/000348528] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]
|