Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Pennington JW, Ruth B, Italia MJ, Miller J, Wrazien S, Loutrel JG, Crenshaw EB, White PS. Harvest: an open platform for developing web-based biomedical data discovery and reporting applications. J Am Med Inform Assoc 2013;21:379-83. [PMID: 24131510 PMCID: PMC3932456 DOI: 10.1136/amiajnl-2013-001825] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/04/2022] Open

Cited by Other Article(s)

Rashid R, Copelli S, Silverstein JC, Becich MJ. REDCap and the National Mesothelioma Virtual Bank-a scalable and sustainable model for rare disease biorepositories. J Am Med Inform Assoc 2023;30:1634-1644. [PMID: 37487555 PMCID: PMC10531116 DOI: 10.1093/jamia/ocad132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 05/16/2023] [Accepted: 07/10/2023] [Indexed: 07/26/2023] Open

Abstract

OBJECTIVE

Rare disease research requires data sharing networks to power translational studies. We describe a novel use of Research Electronic Data Capture (REDCap), a web application for managing clinical data, by the National Mesothelioma Virtual Bank, a federated biospecimen and data sharing network.

MATERIALS AND METHODS

The National Mesothelioma Virtual Bank (NMVB) uses REDCap to integrate honest broker activities, enabling biospecimen and associated clinical data provisioning to investigators. A Web Portal Query tool was developed to source and visualize REDCap data in an interactive, faceted search, enabling cohort discovery by public users. An AWS Lambda function behind an API calculates the counts visually presented, while protecting record-level data. The user-friendly interface, quick responsiveness, automatic generation from REDCap, and flexibility to new data were engineered to sustain the NMVB research community.

RESULTS

NMVB implementations enabled a network of 8 research institutions with over 2000 mesothelioma cases, including clinical annotations and biospecimens, and enabled public users' cohort discovery and summary statistics. NMVB usage and impact are demonstrated by high website visits (>150 unique queries per month), resource use requests (>50 letters of interest), and citations (>900) to papers published using NMVB resources.

DISCUSSION

NMVB's REDCap implementation and query tool provide a framework for implementing federated and integrated rare disease biobanks and registries. Advantages of this framework include being low-cost, modular, scalable, and efficient. Future advances to NMVB's implementations will include incorporation of -omics data and development of downstream analysis tools to advance mesothelioma and rare disease research.

CONCLUSION

NMVB presents a framework for integrating biobanks and patient registries to enable translational research for rare diseases.
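
The count-only query pattern described in the Materials and Methods above can be sketched as follows. This is a minimal illustration, not the NMVB implementation: the record fields, the `facet_counts` helper, and the shape of the `event` dictionary are all assumptions made for the example, and the in-memory `RECORDS` list stands in for data that would be sourced from REDCap.

```python
from collections import Counter

# Hypothetical records; field names are illustrative, not the NMVB schema.
RECORDS = [
    {"sex": "F", "histology": "epithelioid"},
    {"sex": "M", "histology": "epithelioid"},
    {"sex": "M", "histology": "sarcomatoid"},
    {"sex": "M", "histology": "epithelioid"},
]


def facet_counts(records, filters, facet):
    """Count the values of `facet` among records that match every filter."""
    matching = (
        r for r in records
        if all(r.get(key) == value for key, value in filters.items())
    )
    return dict(Counter(r[facet] for r in matching))


def handler(event, context=None):
    """Lambda-style entry point (assumed event shape).

    Only aggregated counts are returned; individual records never
    leave the function.
    """
    counts = facet_counts(RECORDS, event.get("filters", {}), event["facet"])
    return {"facet": event["facet"], "counts": counts}
```

For example, `handler({"filters": {"sex": "M"}, "facet": "histology"})` returns the histology counts for the male subset without exposing any single record.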

Pennington JW, Ruth B, Miller JM, Peterson J, Xu B, Masino A, Krantz I, Manganella J, Gomes T, Stiles D, Kenna M, Hood LJ, Germiller J, Crenshaw EB. Perspective on the Development of a Large-Scale Clinical Data Repository for Pediatric Hearing Research. Ear Hear 2021;41:231-238. [PMID: 31408044 PMCID: PMC7007829 DOI: 10.1097/aud.0000000000000779] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/26/2022]

Affiliation(s)

Jeffrey W. Pennington Department of Biomedical and Health Informatics, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA
Byron Ruth Department of Biomedical and Health Informatics, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA
Jeffrey M. Miller Department of Biomedical and Health Informatics, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA
Joy Peterson Center for Childhood Communication, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
Baichen Xu Center for Childhood Communication, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
Aaron Masino Department of Biomedical and Health Informatics, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA
Ian Krantz Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA The Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
Juliana Manganella Department of Otolaryngology and Communication Enhancement, Boston Children's Hospital, Boston, MA, USA
Tamar Gomes Department of Otolaryngology and Communication Enhancement, Boston Children's Hospital, Boston, MA, USA
Derek Stiles Department of Otolaryngology and Communication Enhancement, Boston Children's Hospital, Boston, MA, USA
Margaret Kenna Department of Otolaryngology and Communication Enhancement, Boston Children's Hospital, Boston, MA, USA
Linda J. Hood Department of Hearing and Speech Sciences, Vanderbilt Bill Wilkerson Center, Vanderbilt University, Nashville, TN, USA
John Germiller Division of Otolaryngology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA Department of Otorhinolaryngology: Head and Neck Surgery, The Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
E. Bryan Crenshaw Center for Childhood Communication, The Children's Hospital of Philadelphia, Philadelphia, PA, USA Department of Otorhinolaryngology: Head and Neck Surgery, The Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA

Gagalova KK, Leon Elizalde MA, Portales-Casamar E, Görges M. What You Need to Know Before Implementing a Clinical Research Data Warehouse: Comparative Review of Integrated Data Repositories in Health Care Institutions. JMIR Form Res 2020;4:e17687. [PMID: 32852280 PMCID: PMC7484778 DOI: 10.2196/17687] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/03/2020] [Revised: 06/09/2020] [Accepted: 07/17/2020] [Indexed: 12/23/2022] Open

Abstract

Background

Integrated data repositories (IDRs), also referred to as clinical data warehouses, are platforms used for the integration of several data sources through specialized analytical tools that facilitate data processing and analysis. IDRs offer several opportunities for clinical data reuse, and the number of institutions implementing an IDR has grown steadily in the past decade.

Objective

The architectural choices of major IDRs are highly diverse and determining their differences can be overwhelming. This review aims to explore the underlying models and common features of IDRs, provide a high-level overview for those entering the field, and propose a set of guiding principles for small- to medium-sized health institutions embarking on IDR implementation.

Methods

We reviewed manuscripts published in peer-reviewed scientific literature between 2008 and 2020, and selected those that specifically describe IDR architectures. Of 255 shortlisted articles, we found 34 articles describing 29 different architectures. The different IDRs were analyzed for common features and classified according to their data processing and integration solution choices.

Results

Despite common trends in the selection of standard terminologies and data models, the IDRs examined showed heterogeneity in the underlying architecture design. We identified 4 common architecture models that use different approaches for data processing and integration. These different approaches were driven by a variety of features such as data sources, whether the IDR was for a single institution or a collaborative project, the intended primary data user, and purpose (research-only or including clinical or operational decision making).

Conclusions

IDR implementations are diverse and complex undertakings, which benefit from being preceded by an evaluation of requirements and definition of scope in the early planning stage. Factors such as data source diversity and intended users of the IDR influence data flow and synchronization, both of which are crucial factors in IDR architecture planning.

Trifan A, Oliveira JL. Patient data discovery platforms as enablers of biomedical and translational research: A systematic review. J Biomed Inform 2019;93:103154. [PMID: 30922867 DOI: 10.1016/j.jbi.2019.103154] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/07/2018] [Revised: 03/15/2019] [Accepted: 03/18/2019] [Indexed: 11/28/2022]

Dietrich G, Krebs J, Fette G, Ertl M, Kaspar M, Störk S, Puppe F. Ad Hoc Information Extraction for Clinical Data Warehouses. Methods Inf Med 2018;57:e22-e29. [PMID: 29801178 PMCID: PMC6193399 DOI: 10.3414/me17-02-0010] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/27/2022]

Abstract

Background: Clinical Data Warehouses (CDW) reuse Electronic health records (EHR) to make their data retrievable for research purposes or patient recruitment for clinical trials. However, much information is hidden in unstructured data like discharge letters. These texts can be preprocessed and converted to structured data via information extraction (IE), which is unfortunately a laborious task and therefore usually not available for most of the text data in a CDW.

Objectives: The goal of our work is to provide an ad hoc IE service that allows users to query text data ad hoc in a manner similar to querying structured data in a CDW. While search engines just return text snippets, our system also returns frequencies (e.g. how many patients exist with “heart failure” including textual synonyms or how many patients have an LVEF < 45) based on the content of discharge letters or textual reports for special investigations like heart echo. Three subtasks are addressed: (1) to recognize and exclude negations and their scopes, (2) to extract concepts, i.e. Boolean values, and (3) to extract numerical values.

Methods: We implemented an extended version of the NegEx algorithm for German texts that detects negations and determines their scope. Furthermore, our document-oriented CDW PaDaWaN was extended with query functions, e.g. context-sensitive queries and regex queries, and an extraction mode for computing the frequencies for Boolean and numerical values.

Results: Evaluations in chest X-ray reports and in discharge letters showed high F1-scores for the three subtasks: Detection of negated concepts in chest X-ray reports with an F1-score of 0.99 and in discharge letters with 0.97; of Boolean values in chest X-ray reports about 0.99, and of numerical values in chest X-ray reports and discharge letters also around 0.99 with the exception of the concept age.

Discussion: The advantages of an ad hoc IE over a standard IE are the low development effort (just entering the concept with its variants), the promptness of the results and the adaptability by the user to his or her particular question. Disadvantages are usually lower accuracy and confidence.

This ad hoc information extraction approach is novel and goes beyond existing systems: Roogle [1] extracts predefined concepts from texts at preprocessing and makes them retrievable at runtime. Dr. Warehouse [2] applies negation detection and indexes the produced subtexts which include affirmed findings. Our approach combines negation detection and the extraction of concepts. However, the extraction takes place at runtime rather than during preprocessing. That provides an ad hoc, dynamic, interactive and adjustable information extraction of arbitrary concepts and even their values on the fly.

Conclusions: We developed an ad hoc information extraction query feature for Boolean and numerical values within a CDW with high recall and precision based on a pipeline that detects and removes negations and their scope in clinical texts.
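
The negation-scope idea behind NegEx, as summarized in the abstract above, can be illustrated with a heavily simplified sketch. It is not the authors' extended German implementation: the English trigger and terminator lists are assumptions chosen for the example, only forward-looking negation is handled, and the `is_negated` helper is hypothetical.

```python
import re

# Illustrative English cue lists (assumptions); the system described above
# targets German clinical text and uses a far richer rule set.
NEGATION_TRIGGERS = ("no", "without", "denies")
SCOPE_TERMINATORS = ("but", "however")


def is_negated(sentence, concept):
    """Return True if `concept` occurs inside a forward negation scope.

    A scope opens at a negation trigger and closes at a terminator.
    Returns False when the concept is affirmed or does not occur.
    """
    tokens = re.findall(r"\w+", sentence.lower())
    concept_tokens = concept.lower().split()
    width = len(concept_tokens)
    in_scope = False
    for i, token in enumerate(tokens):
        if token in SCOPE_TERMINATORS:
            in_scope = False
        elif token in NEGATION_TRIGGERS:
            in_scope = True
        # Check whether the concept starts at this position.
        if tokens[i:i + width] == concept_tokens:
            return in_scope
    return False
```

In the sentence "No pleural effusion, but cardiomegaly is present.", the trigger "no" opens a scope that covers "pleural effusion", and "but" closes it before "cardiomegaly". Counting the documents in which a concept occurs and is not negated would then give the kind of frequency the abstract describes.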

Tao S, Cui L, Wu X, Zhang GQ. Facilitating Cohort Discovery by Enhancing Ontology Exploration, Query Management and Query Sharing for Large Clinical Data Repositories. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018;2017:1685-1694. [PMID: 29854239 PMCID: PMC5977665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]

Williams R, Kontopantelis E, Buchan I, Peek N. Clinical code set engineering for reusing EHR data for research: A review. J Biomed Inform 2017;70:1-13. [PMID: 28442434 DOI: 10.1016/j.jbi.2017.04.010] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 01/23/2017] [Revised: 03/21/2017] [Accepted: 04/13/2017] [Indexed: 01/26/2023]

Abstract

INTRODUCTION

The construction of reliable, reusable clinical code sets is essential when re-using Electronic Health Record (EHR) data for research. Yet code set definitions are rarely transparent and their sharing is almost non-existent. There is a lack of methodological standards for the management (construction, sharing, revision and reuse) of clinical code sets which needs to be addressed to ensure the reliability and credibility of studies which use code sets.

OBJECTIVE

To review methodological literature on the management of sets of clinical codes used in research on clinical databases and to provide a list of best practice recommendations for future studies and software tools.

METHODS

We performed an exhaustive search for methodological papers about clinical code set engineering for re-using EHR data in research. This was supplemented with papers identified by snowball sampling. In addition, a list of e-phenotyping systems was constructed by merging references from several systematic reviews on this topic, and the processes adopted by those systems for code set management were reviewed.

RESULTS

Thirty methodological papers were reviewed. Common approaches included: creating an initial list of synonyms for the condition of interest (n=20); making use of the hierarchical nature of coding terminologies during searching (n=23); reviewing sets with clinician input (n=20); and reusing and updating an existing code set (n=20). Several open source software tools (n=3) were discovered.

DISCUSSION

There is a need for software tools that enable users to easily and quickly create, revise, extend, review and share code sets and we provide a list of recommendations for their design and implementation.

CONCLUSION

Research re-using EHR data could be improved through the further development, more widespread use and routine reporting of the methods by which clinical codes were selected.

Felmeister AS, Masino AJ, Rivera TJ, Resnick AC, Pennington JW. The biorepository portal toolkit: an honest brokered, modular service oriented software tool set for biospecimen-driven translational research. BMC Genomics 2016;17 Suppl 4:434. [PMID: 27535360 PMCID: PMC5001241 DOI: 10.1186/s12864-016-2797-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/19/2022] Open

Abstract

BACKGROUND

High throughput molecular sequencing and increased biospecimen variety have introduced significant informatics challenges for research biorepository infrastructures. We applied a modular system integration approach to develop an operational biorepository management system. This method enables aggregation of the clinical, specimen and genomic data collected for biorepository resources.

METHODS

We introduce an electronic Honest Broker (eHB) and Biorepository Portal (BRP) open source project that, in tandem, allow for data integration while protecting patient privacy. This modular approach allows data and specimens to be associated with a biorepository subject at any time point asynchronously. This lowers the bar to develop new research projects based on scientific merit without institutional review for a proposal.

RESULTS

By facilitating the automated de-identification of specimens and associated clinical and genomic data, we create a future-proofed specimen set that can withstand new workflows and be connected to new associated information over time, thus facilitating collaborative advanced genomic and tissue research.

CONCLUSIONS

As of January 2016, there are 23 unique protocols/patient cohorts being managed in the Biorepository Portal (BRP). There are over 4000 unique subject records in the electronic honest broker (eHB), over 30,000 specimens accessioned and 8 institutions participating in various biobanking activities using this tool kit. We specifically set out to build rich annotation of biospecimens with longitudinal clinical data; BRP/REDCap integration for multi-institutional repositories; EMR integration; further annotation of specimens with genomic data specific to a domain; and application hooks for experiments at the specimen level integrated with analytic software, all while protecting privacy per the Office of Civil Rights (OCR) and HIPAA.

Badgeley MA, Shameer K, Glicksberg BS, Tomlinson MS, Levin MA, McCormick PJ, Kasarskis A, Reich DL, Dudley JT. EHDViz: clinical dashboard development using open-source technologies. BMJ Open 2016;6:e010579. [PMID: 27013597 PMCID: PMC4809078 DOI: 10.1136/bmjopen-2015-010579] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Indexed: 11/06/2022] Open

Abstract

OBJECTIVE

To design, develop and prototype clinical dashboards to integrate high-frequency health and wellness data streams using interactive and real-time data visualisation and analytics modalities.

MATERIALS AND METHODS

We developed a clinical dashboard development framework called the electronic healthcare data visualization (EHDViz) toolkit for generating web-based, real-time clinical dashboards for visualising heterogeneous biomedical, healthcare and wellness data. EHDViz is an extensible toolkit that uses R packages for data management, normalisation and producing high-quality visualisations over the web using the R/Shiny web server architecture. We have developed use cases to illustrate the utility of EHDViz in different clinical and wellness settings as a visualisation aid for improving healthcare delivery.

RESULTS

Using EHDViz, we prototyped clinical dashboards to demonstrate the contextual versatility of the EHDViz toolkit. An outpatient cohort was used to visualise population health management tasks (n=14,221), an inpatient cohort was used to visualise real-time acuity risk in a clinical unit (n=445), and a quantified-self example using wellness data from a fitness activity monitor worn by a single individual was also discussed (n-of-1). The back-end system retrieves relevant data from the data source, populates the main panel of the application, integrates user-defined data features in real time and renders output using modern web browsers. The visualisation elements can be customised using health features, disease names, procedure names or medical codes to populate the visualisations. The source code of EHDViz and various prototypes developed using EHDViz are available in the public domain at http://ehdviz.dudleylab.org.

CONCLUSIONS

Collaborative data visualisations, wellness trend predictions, risk estimation, proactive acuity status monitoring and knowledge of complex disease indicators are essential components of implementing data-driven precision medicine. As an open-source visualisation framework capable of integrating health assessment, EHDViz aims to be a valuable toolkit for rapid design, development and implementation of scalable clinical data visualisation dashboards.

Affiliation(s)

Marcus A Badgeley Harris Center for Precision Wellness, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA
Khader Shameer Harris Center for Precision Wellness, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA
Benjamin S Glicksberg Harris Center for Precision Wellness, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA
Max S Tomlinson Harris Center for Precision Wellness, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA
Matthew A Levin Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA Department of Anesthesiology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA
Patrick J McCormick Department of Anesthesiology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA
Andrew Kasarskis Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA
David L Reich Department of Anesthesiology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA
Joel T Dudley Harris Center for Precision Wellness, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA

Xu J, Rasmussen LV, Shaw PL, Jiang G, Kiefer RC, Mo H, Pacheco JA, Speltz P, Zhu Q, Denny JC, Pathak J, Thompson WK, Montague E. Review and evaluation of electronic health records-driven phenotype algorithm authoring tools for clinical and translational research. J Am Med Inform Assoc 2015. [PMID: 26224336 DOI: 10.1093/jamia/ocv070] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Indexed: 12/30/2022] Open

McArt DG, Blayney JK, Boyle DP, Irwin GW, Moran M, Hutchinson RA, Bankhead P, Kieran D, Wang Y, Dunne PD, Kennedy RD, Mullan PB, Harkin DP, Catherwood MA, James JA, Salto-Tellez M, Hamilton PW. PICan: An integromics framework for dynamic cancer biomarker discovery. Mol Oncol 2015;9:1234-40. [PMID: 25814194 DOI: 10.1016/j.molonc.2015.02.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 09/24/2014] [Revised: 12/23/2014] [Accepted: 02/05/2015] [Indexed: 02/05/2023] Open

Affiliation(s)

Darragh G McArt Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Jaine K Blayney Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
David P Boyle Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Gareth W Irwin Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Michael Moran Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Ryan A Hutchinson Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Peter Bankhead Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Declan Kieran Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Yinhai Wang Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Philip D Dunne Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Richard D Kennedy Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Paul B Mullan Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
D Paul Harkin Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Mark A Catherwood Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Jacqueline A James Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom
Manuel Salto-Tellez Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom.
Peter W Hamilton Centre for Cancer Research and Cell Biology (CCRCB), Queen's University Belfast, Belfast, United Kingdom.

Dixit A, Dobson RJB. CohortExplorer: A Generic Application Programming Interface for Entity Attribute Value Database Schemas. JMIR Med Inform 2014;2:e32. [PMID: 25601296 PMCID: PMC4288104 DOI: 10.2196/medinform.3339] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/03/2014] [Revised: 08/03/2014] [Accepted: 09/19/2014] [Indexed: 12/03/2022] Open

Abstract

BACKGROUND

Most electronic data capture (EDC) and electronic data management (EDM) systems developed to collect and store clinical data from participants recruited into studies are based on generic entity-attribute-value (EAV) database schemas which enable rapid and flexible deployment in a range of study designs. The drawback to such schemas is that they are cumbersome to query with structured query language (SQL). The problem increases when researchers involved in multiple studies use multiple electronic data capture and management systems each with variation on the EAV schema.

OBJECTIVE

The aim of this study is to develop a generic application which allows easy and rapid exploration of data and metadata stored under EAV schemas that are organized into a survey format (questionnaires/events, questions, values), in other words, the Clinical Data Interchange Standards Consortium (CDISC) Observational Data Model (ODM).

METHODS

CohortExplorer is written in the Perl programming language and uses the concept of SQL abstract, which allows the SQL query to be treated like a hash (key-value pairs).

RESULTS

We have developed a tool, CohortExplorer, which once configured for an EAV system will "plug-n-play" with EAV schemas, enabling the easy construction of complex queries through an abstracted interface. To demonstrate the utility of the CohortExplorer system, we show how it can be used with two popular EAV-based frameworks: Opal (OBiBa) and REDCap.

CONCLUSIONS

The application is available under a GPL-3+ license at the CPAN website. Currently, the application only provides datasource application programming interfaces (APIs) for Opal and REDCap. In the future, the application will be available with datasource APIs for all major electronic data capture and management systems such as OpenClinica and LabKey. At present, the application is only compatible with EAV systems where the metadata is organized into surveys, questionnaires and events. Further work is needed to make the application compatible with EAV schemas where the metadata is organized into hierarchies such as Informatics for Integrating Biology & the Bedside (i2b2). A video tutorial demonstrating the application setup, datasource configuration, and search features is available on YouTube. The application source code is available at the GitHub website and users are encouraged to suggest new features and contribute to the development of APIs for new EAV systems.
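
The "query as a hash" idea mentioned in the Methods above can be illustrated with a short sketch. CohortExplorer itself is written in Perl. The Python version below is only an analogy under stated assumptions: the `build_select` helper, the `eav` table, and its three columns are hypothetical names invented for the example, and only equality conditions joined by AND are supported.

```python
import sqlite3


def build_select(table, columns, where):
    """Build a parameterized SELECT from a dict of column -> value.

    Values are passed as bound parameters. Table and column names are
    interpolated directly, so they must come from trusted code.
    """
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in where)
    return sql, list(where.values())


# A toy entity-attribute-value table: one row per (entity, attribute).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [("p1", "sex", "F"), ("p1", "age", "34"),
     ("p2", "sex", "M"), ("p2", "age", "51")],
)

# The query is described as key-value pairs rather than as an SQL string.
sql, params = build_select("eav", ["entity"], {"attribute": "sex", "value": "F"})
rows = conn.execute(sql, params).fetchall()
```

Because each attribute lives in its own row, even a simple condition such as "sex is F" needs two predicates (`attribute` and `value`), which hints at why EAV schemas are cumbersome to query by hand and why an abstraction layer helps.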

Hanauer DA, Hruby GW, Fort DG, Rasmussen LV, Mendonça EA, Weng C. What Is Asked in Clinical Data Request Forms? A Multi-site Thematic Analysis of Forms Towards Better Data Access Support. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2014;2014:616-25. [PMID: 25954367 PMCID: PMC4419980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]

Secondary use of clinical data: the Vanderbilt approach. J Biomed Inform 2014;52:28-35. [PMID: 24534443 DOI: 10.1016/j.jbi.2014.02.003] [Citation(s) in RCA: 168] [Impact Index Per Article: 16.8] [Received: 07/21/2013] [Revised: 12/21/2013] [Accepted: 02/04/2014] [Indexed: 01/04/2023]