1
Lim YMF, Asselbergs FW, Bagheri A, Denaxas S, Tay WT, Voors A, Lam CSP, Koudstaal S, Grobbee DE, Vaartjes I. Eligibility of Asian and European registry patients for phase III trials in heart failure with reduced ejection fraction. ESC Heart Fail 2024. PMID: 38984466. DOI: 10.1002/ehf2.14751.
Abstract
AIMS Traditional approaches to designing clinical trials for heart failure (HF) have historically relied on expertise and past practices. However, the evolving landscape of healthcare, marked by the advent of novel data science applications and increased data availability, offers a compelling opportunity to transition towards a data-driven paradigm in trial design. This research aims to evaluate the scope and determinants of disparities between clinical trials and registries by leveraging natural language processing for the analysis of trial eligibility criteria. The findings contribute to the establishment of a robust design framework for guiding future HF trials. METHODS AND RESULTS Interventional phase III trials registered for HF on ClinicalTrials.gov as of the end of 2021 were identified. Natural language processing was used to extract and structure the eligibility criteria for quantitative analysis. The most common criteria for HF with reduced ejection fraction (HFrEF) were applied to estimate patient eligibility as a proportion of registry patients in the ASIAN-HF (N = 4868) and BIOSTAT-CHF registries (N = 2545). Of the 375 phase III trials for HF, 163 HFrEF trials were identified. In these trials, the most frequently encountered inclusion criteria were New York Heart Association (NYHA) functional class (69%), worsening HF (23%), and natriuretic peptides (18%), whereas the most frequent comorbidity-based exclusion criteria were acute coronary syndrome (64%), renal disease (55%), and valvular heart disease (47%). On average, 20% of registry patients were eligible for HFrEF trials. Eligibility distributions did not differ (P = 0.18) between Asian [median eligibility 0.20, interquartile range (IQR) 0.08-0.43] and European registry populations (median 0.17, IQR 0.06-0.39). Over time, HFrEF trials became more restrictive, with patient eligibility declining from 0.40 in 1985-2005 to 0.19 in 2016-2022 (P = 0.03).
When frequency among trials is taken into consideration, the eligibility criteria that were most restrictive were prior myocardial infarction, NYHA class, age, and prior HF hospitalization. CONCLUSIONS Based on 14 trial criteria, only one-fifth of registry patients were eligible for phase III HFrEF trials. Overall eligibility rates did not differ between the Asian and European patient cohorts.
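The eligibility estimate described above, applying a trial's most common structured criteria to registry patients and reporting the eligible proportion, can be sketched as follows. This is an illustrative reconstruction, not the study's code: the patient fields and the NYHA/ACS rules are hypothetical stand-ins for the 14 trial criteria actually used.

```python
def eligible_fraction(patients, inclusion, exclusion):
    """Share of registry patients meeting every inclusion criterion and
    no exclusion criterion."""
    def ok(p):
        return all(rule(p) for rule in inclusion) and not any(rule(p) for rule in exclusion)
    if not patients:
        return 0.0
    return sum(ok(p) for p in patients) / len(patients)

# Hypothetical HFrEF-style rules: NYHA class II-IV in, recent ACS out.
inclusion = [lambda p: p["nyha"] >= 2]
exclusion = [lambda p: p["recent_acs"]]

registry = [
    {"nyha": 3, "recent_acs": False},
    {"nyha": 2, "recent_acs": True},
    {"nyha": 1, "recent_acs": False},
    {"nyha": 4, "recent_acs": False},
]
print(eligible_fraction(registry, inclusion, exclusion))  # 0.5
```

Running the same criteria set against each trial in turn yields the per-trial eligibility distribution the abstract summarizes with medians and IQRs.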
Affiliation(s)
- Yvonne Mei Fong Lim
- Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Institute for Clinical Research, National Institutes of Health, Ministry of Health Malaysia, Shah Alam, Malaysia
- Folkert W Asselbergs
- Institute of Health Informatics, University College London, London, UK
- The National Institute for Health Research University College London Hospitals Biomedical Research Centre, University College London, London, UK
- Department of Cardiology, Amsterdam Cardiovascular Sciences, Amsterdam University Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
- Ayoub Bagheri
- Department of Methodology and Statistics, Utrecht University, Utrecht, The Netherlands
- Spiros Denaxas
- Institute of Health Informatics, UCL BHF Research Accelerator and Health Data Research UK, University College London, London, UK
- British Heart Foundation Data Science Center, London, UK
- Wan Ting Tay
- National Heart Centre Singapore, Singapore, Singapore
- Adriaan Voors
- Department of Cardiology, University Medical Center Groningen, Groningen, The Netherlands
- Stefan Koudstaal
- Department of Cardiology, Groene Hart Ziekenhuis, Gouda, The Netherlands
- Diederick E Grobbee
- Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Julius Clinical, Zeist, The Netherlands
- Ilonca Vaartjes
- Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
2
Thangaraj PM, Oikonomou EK, Dhingra LS, Aminorroaya A, Jayaram R, Suchard MA, Khera R. Computational Phenomapping of Randomized Clinical Trials to Enable Assessment of their Real-world Representativeness and Personalized Inference. medRxiv 2024:2024.05.15.24306285. PMID: 38798457. PMCID: PMC11118629. DOI: 10.1101/2024.05.15.24306285.
Abstract
Importance Randomized clinical trials (RCTs) are the standard for defining an evidence-based approach to managing disease, but their generalizability to real-world patients remains challenging to quantify. Objective To develop a multidimensional patient variable mapping algorithm to quantify the similarity and representation of electronic health record (EHR) patients corresponding to an RCT and estimate the putative treatment effects in real-world settings based on individual treatment effects observed in an RCT. Design A retrospective analysis of the Treatment of Preserved Cardiac Function Heart Failure with an Aldosterone Antagonist Trial (TOPCAT; 2006-2012) and a multi-hospital patient cohort from the electronic health record (EHR) in the Yale New Haven Hospital System (YNHHS; 2015-2023). Setting A multicenter international RCT (TOPCAT) and multi-hospital patient cohort (YNHHS). Participants All TOPCAT participants and patients with heart failure with preserved ejection fraction (HFpEF) and ≥1 hospitalization within YNHHS. Exposures 63 pre-randomization characteristics measured across the TOPCAT and YNHHS cohorts. Main Outcomes and Measures Real-world generalizability of the RCT TOPCAT using a multidimensional phenotypic distance metric between TOPCAT and YNHHS cohorts. Estimation of the individualized treatment effect of spironolactone use on all-cause mortality within the YNHHS cohort based on phenotypic distance from the TOPCAT cohort. Results There were 3,445 patients in TOPCAT and 11,712 HFpEF patients across five hospital sites. Across the 63 TOPCAT variables mapped by clinicians to the EHR, there were larger differences between TOPCAT and each of the 5 EHR sites (median SMD 0.200, IQR 0.037-0.410) than between the 5 EHR sites (median SMD 0.062, IQR 0.010-0.130). The synthesis of these differences across covariates using our multidimensional similarity score also suggested substantial phenotypic dissimilarity between the TOPCAT and EHR cohorts.
By phenotypic distance, a majority (55%) of TOPCAT participants were closer to each other than to any individual EHR patient. Using a TOPCAT-derived model of individualized treatment benefit from spironolactone, those predicted to derive benefit and receiving spironolactone in the EHR cohorts had substantially better outcomes than those predicted to derive benefit but not receiving the medication (HR 0.74, 95% CI 0.62-0.89). Conclusions and Relevance We propose a novel approach to evaluating the real-world representativeness of RCT participants against corresponding patients in the EHR across the full multidimensional spectrum of the represented phenotypes. This enables the evaluation of the implications of RCTs for real-world patients. Key Points Question: How can we examine the multidimensional generalizability of randomized clinical trials (RCTs) to real-world patient populations? Findings: We demonstrate a novel phenotypic distance metric comparing an RCT to real-world populations in a large multicenter RCT of heart failure patients and the corresponding patients in multisite electronic health records (EHRs). Across 63 pre-randomization characteristics, pairwise assessments of members of the RCT and EHR cohorts were more discordant from each other than were members of the EHR cohort (median standardized mean difference 0.200 [0.037-0.410] vs 0.062 [0.010-0.130]), with a majority (55%) of RCT participants closer to each other than to any individual EHR patient. The approach also enabled the quantification of expected real-world outcomes based on effects observed in the RCT. Meaning: A multidimensional phenotypic distance metric quantifies the generalizability of RCTs to a given population while also offering an avenue to examine expected real-world patient outcomes based on treatment effects observed in the RCT.
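The per-covariate comparisons reported above rest on the standardized mean difference (SMD). A minimal sketch with toy data follows; it illustrates the SMD and a median summary across covariates, not the paper's full multidimensional distance metric, and the covariate names and values are invented.

```python
import math
from statistics import median

def smd(xs, ys):
    """Absolute standardized mean difference between two samples of one
    covariate, using the pooled standard deviation."""
    def mean(v):
        return sum(v) / len(v)
    def var(v):
        m = mean(v)
        return sum((x - m) ** 2 for x in v) / (len(v) - 1)
    pooled = math.sqrt((var(xs) + var(ys)) / 2)
    return abs(mean(xs) - mean(ys)) / pooled if pooled else 0.0

# Summarize cohort dissimilarity across several covariates by the
# median SMD, as in the comparisons above (illustrative numbers).
trial = {"age": [70, 72, 68], "egfr": [55, 60, 50]}
ehr = {"age": [78, 80, 76], "egfr": [54, 61, 50]}
print(median(smd(trial[k], ehr[k]) for k in trial))
```

A larger median SMD between two cohorts than within-cohort comparisons, as in the abstract, indicates systematic phenotypic distance.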
3
Su Q, Cheng G, Huang J. A review of research on eligibility criteria for clinical trials. Clin Exp Med 2023; 23:1867-1879. PMID: 36602707. PMCID: PMC9815064. DOI: 10.1007/s10238-022-00975-1.
Abstract
The purpose of this paper is to systematically survey and analyze cutting-edge research on the eligibility criteria of clinical trials. Eligibility criteria are important prerequisites for the success of clinical trials and directly affect their final results. Inappropriate eligibility criteria lead to insufficient recruitment, an important reason why many clinical trials eventually fail. We investigated the research status of eligibility criteria for clinical trials on academic platforms such as arXiv and NIH, and classified and organized all the papers we found so that readers can understand the frontier research in this field. Eligibility criteria are the most important part of a clinical trial study, and the ultimate goal of research in this field is to formulate more scientific and reasonable eligibility criteria and speed up the clinical trial process. Global research on the eligibility criteria of clinical trials falls into four main areas: natural language processing, patient pre-screening, criteria evaluation, and clinical trial query. Compared with the past, researchers are now using new technologies to study eligibility criteria from a new perspective (big data). In the research process, complex disease concepts, the choice of suitable datasets, and demonstrating the validity and scientific rigor of results are challenges faced by researchers (especially computer-related researchers). Future research will focus on the selection and improvement of artificial intelligence algorithms for clinical trials and on related practical applications such as databases, knowledge graphs, and dictionaries.
Affiliation(s)
- Qianmin Su
- Department of Computer Science, School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, No. 333 Longteng Road, Shanghai, 201620, China.
- Gaoyi Cheng
- Department of Computer Science, School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, No. 333 Longteng Road, Shanghai, 201620, China
- Jihan Huang
- Center for Drug Clinical Research, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
4
Butala NM, Secemsky E, Kazi DS, Song Y, Strom JB, Faridi KF, Brennan JM, Elmariah S, Shen C, Yeh RW. Applicability of Transcatheter Aortic Valve Replacement Trials to Real-World Clinical Practice: Findings From EXTEND-CoreValve. JACC Cardiovasc Interv 2021; 14:2112-2123. PMID: 34620389. DOI: 10.1016/j.jcin.2021.08.006.
Abstract
OBJECTIVES The aim of this study was to examine the applicability of pivotal transcatheter aortic valve replacement (TAVR) trials to the real-world population of Medicare patients undergoing TAVR. BACKGROUND It is unclear whether randomized controlled trial results of novel cardiovascular devices apply to patients encountered in clinical practice. METHODS Characteristics of patients enrolled in the U.S. CoreValve pivotal trials were compared with those of the population of Medicare beneficiaries who underwent TAVR in U.S. clinical practice between November 2, 2011, and December 31, 2017. Inverse probability weighting was used to reweight the trial cohort on the basis of Medicare patient characteristics, and a "real-world" treatment effect was estimated. RESULTS A total of 2,026 patients underwent TAVR in the U.S. CoreValve pivotal trials, and 135,112 patients underwent TAVR in the Medicare cohort. Trial patients were mostly similar to real-world patients at baseline, though trial patients were more likely to have hypertension (50% vs 39%) and coagulopathy (25% vs 17%), whereas real-world patients were more likely to have congestive heart failure (75% vs 68%) and frailty. The estimated real-world treatment effect of TAVR was an 11.4% absolute reduction in death or stroke (95% CI: 7.50%-14.92%) and an 8.7% absolute reduction in death (95% CI: 5.20%-12.32%) at 1 year with TAVR compared with conventional therapy (surgical aortic valve replacement for intermediate- and high-risk patients and medical therapy for extreme-risk patients). CONCLUSIONS The trial and real-world populations were mostly similar, with some notable differences. Nevertheless, the extrapolated real-world treatment effect was at least as high as the observed trial treatment effect, suggesting that the absolute benefit of TAVR in clinical trials is similar to the benefit of TAVR in the U.S. real-world setting.
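The reweighting step described above can be illustrated with a small sketch: trial participants are weighted by the ratio of the target population's covariate-stratum shares to the trial's own, and the treatment effect is recomputed under those weights. The strata, outcomes, and shares below are invented for illustration; the actual analysis balanced many patient characteristics jointly.

```python
from collections import Counter

def transported_effect(trial, target_shares):
    """Reweight trial participants so covariate strata match the target
    population, then return the weighted risk difference (control minus
    treated). Each trial record: (stratum, treated, event)."""
    trial_shares = Counter(s for s, _, _ in trial)
    n = len(trial)
    for k in trial_shares:
        trial_shares[k] /= n

    def weighted_risk(treated_flag):
        num = den = 0.0
        for stratum, treated, event in trial:
            if treated != treated_flag:
                continue
            w = target_shares[stratum] / trial_shares[stratum]
            num += w * event
            den += w
        return num / den

    return weighted_risk(False) - weighted_risk(True)

# Toy trial: "frail" patients are half the trial but three-quarters of
# the real-world target population.
effect = transported_effect(
    [("fit", True, 0), ("fit", False, 1), ("frail", True, 1), ("frail", False, 1)],
    {"fit": 0.25, "frail": 0.75},
)
print(effect)  # 0.25
```

A positive value here corresponds to an absolute risk reduction with treatment in the reweighted ("real-world") population.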
Affiliation(s)
- Neel M Butala
- Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology, Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA; Cardiology Division, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
- Eric Secemsky
- Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology, Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
- Dhruv S Kazi
- Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology, Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
- Yang Song
- Baim Institute for Clinical Research, Boston, Massachusetts, USA
- Jordan B Strom
- Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology, Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
- Kamil F Faridi
- Section of Cardiology, Department of Medicine, Yale School of Medicine, New Haven, Connecticut, USA
- J Matthew Brennan
- Duke Clinical Research Institute, Duke University School of Medicine, Durham, North Carolina, USA
- Sammy Elmariah
- Cardiology Division, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
- Changyu Shen
- Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology, Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
- Robert W Yeh
- Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology, Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
5
Sun Y, Butler A, Diallo I, Kim JH, Ta C, Rogers JR, Liu H, Weng C. A Framework for Systematic Assessment of Clinical Trial Population Representativeness Using Electronic Health Records Data. Appl Clin Inform 2021; 12:816-825. PMID: 34496418. DOI: 10.1055/s-0041-1733846.
Abstract
BACKGROUND Clinical trials are the gold standard for generating robust medical evidence, but trial results often raise generalizability concerns, which can be attributed to a lack of population representativeness. Electronic health record (EHR) data are useful for estimating the population representativeness of a clinical trial's study population. OBJECTIVES This research aims to estimate the population representativeness of clinical trials systematically using EHR data during the early design stage. METHODS We present an end-to-end analytical framework for transforming free-text clinical trial eligibility criteria into executable database queries conformant with the Observational Medical Outcomes Partnership Common Data Model and for systematically quantifying the population representativeness of each clinical trial. RESULTS Using this framework, we calculated the population representativeness of 782 novel coronavirus disease 2019 (COVID-19) trials and 3,827 type 2 diabetes mellitus (T2DM) trials in the United States. Owing to overly restrictive eligibility criteria, 85.7% of the COVID-19 trials and 30.1% of the T2DM trials had poor population representativeness. CONCLUSION This research demonstrates the potential of using EHR data to assess clinical trial population representativeness, providing data-driven metrics to inform the selection and optimization of eligibility criteria.
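The criteria-to-query step can be sketched with a toy schema: structured criteria are rendered into a SQL predicate, and representativeness is the eligible share of the EHR population. The table, columns, and thresholds below are illustrative stand-ins, not the OMOP CDM or the framework's actual output, and a production system would use parameterized, vocabulary-mapped queries rather than string formatting.

```python
import sqlite3

# Build a toy patient table, render structured criteria as SQL, and
# report the eligible share of patients.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (age INTEGER, hba1c REAL, on_insulin INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", [
    (54, 7.9, 0), (71, 8.4, 1), (80, 9.1, 0), (45, 6.8, 0),
])

# Hypothetical T2DM-style criteria as (column, operator, value) triples.
criteria = [("age", ">=", 18), ("age", "<=", 75), ("hba1c", ">=", 7.0)]
where = " AND ".join(f"{col} {op} {val}" for col, op, val in criteria)

eligible = conn.execute(f"SELECT COUNT(*) FROM person WHERE {where}").fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(eligible / total)  # 0.5
```

Tightening any threshold and re-running the query shows directly how a criterion change moves the representativeness metric.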
Affiliation(s)
- Yingcheng Sun
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- Alex Butler
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- Department of Medicine, Columbia University, New York, New York, United States
- Ibrahim Diallo
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- Jae Hyun Kim
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- Casey Ta
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- James R Rogers
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- Hao Liu
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- Chunhua Weng
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
6
Bhanot K, Qi M, Erickson JS, Guyon I, Bennett KP. The Problem of Fairness in Synthetic Healthcare Data. Entropy (Basel) 2021; 23:1165. PMID: 34573790. PMCID: PMC8468495. DOI: 10.3390/e23091165.
Abstract
Access to healthcare data such as electronic health records (EHRs) is often restricted by laws established to protect patient privacy. These restrictions hinder the reproducibility of existing results based on private healthcare data and also limit new research. Synthetically generated healthcare data solve this problem by preserving privacy and enabling researchers and policymakers to drive decisions and methods based on realistic data. Healthcare data can include information about multiple in- and outpatient visits, making it a time-series dataset that is often influenced by protected attributes such as age, gender, and race. The COVID-19 pandemic has exacerbated health inequities, with certain subgroups experiencing poorer outcomes and less access to healthcare. To combat these inequities, synthetic data must "fairly" represent diverse minority subgroups such that conclusions drawn on synthetic data are correct and the results generalize to real data. In this article, we develop two fairness metrics for synthetic data and analyze all subgroups defined by protected attributes to assess the bias in three published synthetic research datasets. These covariate-level disparity metrics revealed that synthetic data may not be representative at the univariate and multivariate subgroup levels; thus, fairness should be addressed when developing data generation methods. We discuss the need for measuring fairness in synthetic healthcare data to enable the development of robust machine learning models and the creation of more equitable synthetic healthcare datasets.
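A covariate-level representativeness check of the kind described can be sketched as the gap between each subgroup's frequency in the synthetic data and in the real data, with zero meaning faithful representation. This simple frequency gap is an illustration of the idea, not the article's two published metrics, and the subgroup labels are invented.

```python
def subgroup_disparity(real, synthetic):
    """Per-subgroup difference between synthetic and real frequencies.
    `real` and `synthetic` are lists of subgroup labels, one per record."""
    def freqs(rows):
        n = len(rows)
        counts = {}
        for r in rows:
            counts[r] = counts.get(r, 0) + 1
        return {k: v / n for k, v in counts.items()}
    f_real, f_syn = freqs(real), freqs(synthetic)
    groups = set(f_real) | set(f_syn)
    return {g: f_syn.get(g, 0.0) - f_real.get(g, 0.0) for g in groups}

# Toy example: subgroup "B" is underrepresented in the synthetic data.
real = ["A"] * 6 + ["B"] * 4
synthetic = ["A"] * 8 + ["B"] * 2
print(subgroup_disparity(real, synthetic))
```

Extending the labels to tuples of protected attributes (e.g., age band plus race) gives the multivariate subgroup check the abstract refers to.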
Affiliation(s)
- Karan Bhanot
- Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- OptumLabs, Eden Prairie, MN 55344, USA
- Miao Qi
- Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- John S. Erickson
- Rensselaer Institute for Data Exploration and Applications, Troy, NY 12180, USA
- Isabelle Guyon
- LISN, CNRS/INRIA, Université Paris-Saclay, 91190 Gif-sur-Yvette, France
- ChaLearn, San Francisco, CA 94115, USA
- Kristin P. Bennett
- Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Rensselaer Institute for Data Exploration and Applications, Troy, NY 12180, USA
- Department of Mathematics, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
7
Qi M, Cahan O, Foreman MA, Gruen DM, Das AK, Bennett KP. Quantifying representativeness in randomized clinical trials using machine learning fairness metrics. JAMIA Open 2021; 4:ooab077. PMID: 34568771. PMCID: PMC8460438. DOI: 10.1093/jamiaopen/ooab077.
Abstract
OBJECTIVE We help identify subpopulations underrepresented in randomized clinical trial (RCT) cohorts with respect to national, community-based, or health system target populations by formulating population representativeness of RCTs as a machine learning (ML) fairness problem, deriving new representation metrics, and deploying them in easy-to-understand interactive visualization tools. MATERIALS AND METHODS We represent RCT cohort enrollment as random binary classification fairness problems, and then show how ML fairness metrics based on enrollment fraction can be efficiently calculated using easily computed rates of subpopulations in RCT cohorts and target populations. We propose standardized versions of these metrics and deploy them in an interactive tool to analyze 3 RCTs with respect to type 2 diabetes and hypertension target populations in the National Health and Nutrition Examination Survey. RESULTS We demonstrate how the proposed metrics and associated statistics enable users to rapidly examine the representativeness of all subpopulations in the RCT defined by a set of categorical traits (eg, gender, race, ethnicity, smoking status, and blood pressure) with respect to target populations. DISCUSSION The normalized metrics provide an intuitive standardized scale for evaluating representation across subgroups, which may have vastly different enrollment fractions and rates in RCT study cohorts. The metrics are beneficial complements to other approaches (eg, enrollment fractions) used to assess the generalizability and health equity of RCTs. CONCLUSION By quantifying the gaps between RCT and target populations, the proposed methods can support generalizability evaluation of existing RCT cohorts. The interactive visualization tool can be readily applied to identify underrepresented subgroups with respect to any desired source or target populations.
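One representation comparison in this spirit is the log ratio of a subgroup's share in the trial cohort to its share in the target population: zero means proportional representation, negative means underrepresentation. This is a sketch of the general idea; the paper's own enrollment-fraction metrics and their standardization are not reproduced here, and the counts are hypothetical.

```python
import math

def log_disparity(cohort_counts, target_counts):
    """Signed log ratio of each subgroup's share in the trial cohort to
    its share in the target population."""
    n_cohort = sum(cohort_counts.values())
    n_target = sum(target_counts.values())
    out = {}
    for group, t_count in target_counts.items():
        r_cohort = cohort_counts.get(group, 0) / n_cohort
        r_target = t_count / n_target
        out[group] = math.log(r_cohort / r_target) if r_cohort > 0 else float("-inf")
    return out

# Hypothetical counts: women are 50% of the target population but only
# 20% of the trial cohort.
print(log_disparity({"F": 20, "M": 80}, {"F": 50, "M": 50}))
```

Because the metric is a ratio of rates, it stays comparable across subgroups whose absolute enrollment counts differ by orders of magnitude.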
Affiliation(s)
- Miao Qi
- Department of Computer Science, Rensselaer Polytechnic Institute, Troy, New York, USA
- Owen Cahan
- Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, New York, USA
- Morgan A Foreman
- Center for Computational Health, IBM Research, Cambridge, Massachusetts, USA
- Daniel M Gruen
- Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, New York, USA
- Amar K Das
- Center for Computational Health, IBM Research, Cambridge, Massachusetts, USA
- Kristin P Bennett
- Department of Computer Science, Rensselaer Polytechnic Institute, Troy, New York, USA
- Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, New York, USA
8
Rogers JR, Hripcsak G, Cheung YK, Weng C. Clinical comparison between trial participants and potentially eligible patients using electronic health record data: A generalizability assessment method. J Biomed Inform 2021; 119:103822. PMID: 34044156. DOI: 10.1016/j.jbi.2021.103822.
Abstract
OBJECTIVE To present a generalizability assessment method that compares baseline clinical characteristics of trial participants (TP) to potentially eligible (PE) patients as presented in their electronic health record (EHR) data while controlling for clinical setting and recruitment period. METHODS For each clinical trial, a clinical event was defined to identify patients of interest using available EHR data from one clinical setting during the trial's recruitment timeframe. The trial's eligibility criteria were then applied and patients were separated into two mutually exclusive groups: (1) TP, which were patients that participated in the trial per trial enrollment data; (2) PE, the remaining patients. The primary outcome was standardized differences in clinical characteristics between TP and PE per trial. A standardized difference was considered prominent if its absolute value was greater than or equal to 0.1. The secondary outcome was the difference in mean propensity scores (PS) between TP and PE per trial, in which the PS represented prediction for a patient to be in the trial. Three diverse trials were selected for illustration: one focused on hepatitis C virus (HCV) patients receiving a liver transplantation; one focused on leukemia patients and lymphoma patients; and one focused on appendicitis patients. RESULTS For the HCV trial, 43 TP and 83 PE were found, with 61 characteristics evaluated. Prominent differences were found among 69% of characteristics, with a mean PS difference of 0.13. For the leukemia/lymphoma trial, 23 TP and 23 PE were found, with 39 characteristics evaluated. Prominent differences were found among 82% of characteristics, with a mean PS difference of 0.76. For the appendicitis trial, 123 TP and 242 PE were found, with 52 characteristics evaluated. Prominent differences were found among 52% of characteristics, with a mean PS difference of 0.15. 
CONCLUSIONS Differences in clinical characteristics were observed between TP and PE in all three trials. In two of the three trials, not all of the differences necessarily compromised trial generalizability, and subsets of PE could be considered similar to their corresponding TP. In the remaining trial, a lack of generalizability appeared present but may have resulted from other factors such as small sample size or site recruitment strategy. These inconsistent findings suggest that eligibility criteria alone are sometimes insufficient for defining a target group to generalize to. Despite caveats of limited scalability, EHR data quality, and the lack of patient perspective on trial participation, this generalizability assessment method, which controls for temporality and clinical setting, promises to better pinpoint clinical patterns and trial considerations.
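The primary outcome above can be sketched for binary characteristics: the standardized difference uses the pooled binomial variance, and a characteristic is flagged as prominent when the absolute value reaches 0.1. The prevalences below are illustrative, not the study's data.

```python
import math

def std_diff_binary(p_tp, p_pe):
    """Standardized difference for a binary characteristic, comparing its
    prevalence among trial participants (TP) and potentially eligible
    patients (PE), using the pooled binomial variance."""
    pooled = math.sqrt((p_tp * (1 - p_tp) + p_pe * (1 - p_pe)) / 2)
    return abs(p_tp - p_pe) / pooled if pooled else 0.0

def prominent_share(diffs, threshold=0.1):
    """Fraction of characteristics whose standardized difference meets
    the prominence threshold."""
    return sum(d >= threshold for d in diffs) / len(diffs)

# Illustrative prevalences for one characteristic in TP vs PE, then the
# prominent share over several characteristics.
print(round(std_diff_binary(0.60, 0.40), 3))
print(prominent_share([0.05, 0.20, 0.15, 0.02]))
```

The per-trial percentages quoted above (69%, 82%, 52%) correspond to this prominent share computed over each trial's full characteristic list.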
Affiliation(s)
- James R Rogers
- Department of Biomedical Informatics, Columbia University, New York, NY, United States
- George Hripcsak
- Department of Biomedical Informatics, Columbia University, New York, NY, United States
- Medical Informatics Services, New York-Presbyterian Hospital, New York, NY, United States
- Ying Kuen Cheung
- Department of Biostatistics, Columbia University, New York, NY, United States
- Chunhua Weng
- Department of Biomedical Informatics, Columbia University, New York, NY, United States
9
A knowledge base of clinical trial eligibility criteria. J Biomed Inform 2021; 117:103771. PMID: 33813032. DOI: 10.1016/j.jbi.2021.103771.
Abstract
OBJECTIVE We present the Clinical Trial Knowledge Base (CTKB), a regularly updated knowledge base of discrete clinical trial eligibility criteria equipped with a web-based user interface for querying and aggregate analysis of common eligibility criteria. MATERIALS AND METHODS We used a natural language processing (NLP) tool named Criteria2Query (Yuan et al., 2019) to transform free-text clinical trial eligibility criteria from ClinicalTrials.gov into discrete criteria concepts and attributes encoded using the widely adopted Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) and stored in a relational SQL database. A web application accessible via RESTful APIs was implemented to enable queries and visual aggregate analyses. We demonstrate CTKB's potential role in EHR phenotype knowledge engineering using ten validated phenotyping algorithms. RESULTS At the time of writing, CTKB contained 87,504 distinctive OMOP CDM standard concepts, including Condition (47.82%), Drug (23.01%), Procedure (13.73%), Measurement (24.70%), and Observation (5.28%), with 34.78% for inclusion criteria and 65.22% for exclusion criteria, extracted from 352,110 clinical trials. The average hit rate of criteria concepts in eMERGE phenotype algorithms is 77.56%. CONCLUSION CTKB is a novel comprehensive knowledge base of discrete eligibility criteria concepts with the potential to enable knowledge engineering for clinical trial cohort definition, clinical trial population representativeness assessment, electronic phenotyping, and data gap analyses for using electronic health records to support clinical trial recruitment.
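The aggregate analysis such a knowledge base enables can be sketched over toy discrete criteria rows. The concept names and trial IDs below are invented, and the real CTKB stores OMOP-encoded concepts in a SQL database rather than Python tuples; the counting logic is the same in spirit.

```python
from collections import Counter

# Toy criteria records: (trial_id, concept, is_exclusion).
rows = [
    (1, "Condition: renal disease", True),
    (1, "Measurement: NYHA class", False),
    (2, "Condition: renal disease", True),
    (2, "Drug: warfarin", True),
    (3, "Measurement: NYHA class", False),
    (3, "Condition: renal disease", True),
]

def top_criteria(rows, exclusion, k=2):
    """Most common criterion concepts among inclusion or exclusion rows."""
    return Counter(c for _, c, excl in rows if excl == exclusion).most_common(k)

print(top_criteria(rows, exclusion=True))
# [('Condition: renal disease', 3), ('Drug: warfarin', 1)]
```

Queries like this, run over hundreds of thousands of trials, produce the frequency rankings of common inclusion and exclusion criteria that the interface exposes.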
10
Sivesind TE, Runion T, Branda M, Schilling LM, Dellavalle RP. Dermatologic Research Potential of the Observational Health Data Sciences and Informatics (OHDSI) Network. Dermatology 2021; 238:44-52. PMID: 33735862. DOI: 10.1159/000514536.
Abstract
BACKGROUND The Observational Health Data Sciences and Informatics (OHDSI) network enables access to billions of deidentified, standardized health records and built-in analytics software for observational health research, with numerous potential applications to dermatology. While the use of OHDSI has increased steadily over the past several years, a review of the literature reveals few studies utilizing OHDSI in dermatology. To our knowledge, the University of Colorado School of Medicine is unique in its use of OHDSI for dermatology big data research. SUMMARY A PubMed search was conducted in August 2020, followed by a literature review, with 24 of the 72 screened articles selected for inclusion. In this review, we discuss the ways OHDSI has been used to compile and analyze data, improve prediction and estimation capabilities, and inform treatment guidelines across specialties. We also discuss the potential for OHDSI in dermatology - specifically, ways that it could reveal adherence to available guidelines, establish standardized protocols, and ensure health equity. Key Messages: OHDSI has demonstrated broad utility in medicine. Adoption of OHDSI by the field of dermatology would facilitate big data research, allow for examination of current prescribing and treatment patterns in areas without clear best practice guidelines, improve the dermatologic knowledge base and, by extension, improve patient outcomes.
Affiliation(s)
- Torunn Elise Sivesind
- Department of Dermatology, University of Colorado School of Medicine, Aurora, Colorado, USA
- Taylor Runion
- Rocky Vista University College of Osteopathic Medicine, Parker, Colorado, USA
- Megan Branda
- Department of Biostatistics and Informatics, University of Colorado School of Medicine, Aurora, Colorado, USA
- Lisa M Schilling
- Department of Medicine, Data Science to Patient Value Program, University of Colorado School of Medicine, Aurora, Colorado, USA
- Robert P Dellavalle
- Department of Dermatology, University of Colorado School of Medicine, Aurora, Colorado, USA

11
Li Q, Guo Y, He Z, Zhang H, George TJ, Bian J. Using Real-World Data to Rationalize Clinical Trials Eligibility Criteria Design: A Case Study of Alzheimer's Disease Trials. AMIA Annu Symp Proc 2021; 2020:717-726. [PMID: 33936446] [PMCID: PMC8075542]
Abstract
Low trial generalizability is a concern. The Food and Drug Administration has issued guidance on broadening trial eligibility criteria to enroll underrepresented populations. However, investigators are hesitant to do so because of concerns over patient safety, and there is a lack of methods to rationalize criteria design. In this study, we used data from a large research network to assess how adjustments of eligibility criteria can jointly affect generalizability and patient safety (i.e., the number of serious adverse events [SAEs]). We first built a model to predict the number of SAEs. Then, leveraging an a priori generalizability assessment algorithm, we assessed the changes in the number of predicted SAEs and the generalizability score, simulating the process of dropping exclusion criteria and increasing the upper limit of continuous eligibility criteria. Using donepezil trials as a case study, we argue that broadening of eligibility criteria should balance the potential increase in SAEs against gains in generalizability.
Affiliation(s)
- Qian Li
- University of Florida, Gainesville, Florida, USA
- Yi Guo
- University of Florida, Gainesville, Florida, USA
- Zhe He
- Florida State University, Tallahassee, Florida, USA
- Hansi Zhang
- University of Florida, Gainesville, Florida, USA
- Jiang Bian
- University of Florida, Gainesville, Florida, USA

12
He Z, Barrett LA, Rizvi R, Tang X, Payrovnaziri SN, Zhang R. Assessing the Use and Perception of Dietary Supplements Among Obese Patients with National Health and Nutrition Examination Survey. AMIA Jt Summits Transl Sci Proc 2020; 2020:231-240. [PMID: 32477642] [PMCID: PMC7233063]
Abstract
Complementary and alternative medicine, especially dietary supplements (DS), has gained increasing popularity for weight loss due to availability without prescription, price, and ease of use. Besides weight loss, there are various perceived potential benefits linked to DS use. However, health consumers with limited health literacy may not adequately understand the benefits of DS or the risks of overdose. In this project, we aim to gain a better understanding of the use of DS products among obese people, as well as the perceived benefits of these products. We identified obese adults by combining National Health and Nutrition Examination Survey data collected from 2003 to 2014. We found that there is a knowledge gap between the benefits of major DS reported by obese adults and the existing DS knowledge base and label database. This gap may inform the design of patient education materials on DS usage in the future.
Affiliation(s)
- Zhe He
- School of Information, Florida State University, Tallahassee, Florida, USA
- Laura A Barrett
- School of Information, Florida State University, Tallahassee, Florida, USA
- Rubina Rizvi
- Institute for Health Informatics and Department of Pharmaceutical Care & Health Systems, University of Minnesota, Minneapolis, Minnesota, USA
- Xiang Tang
- Department of Statistics, Florida State University, Tallahassee, Florida, USA
- Rui Zhang
- Institute for Health Informatics and Department of Pharmaceutical Care & Health Systems, University of Minnesota, Minneapolis, Minnesota, USA

13
He Z, Tang X, Yang X, Guo Y, George TJ, Charness N, Quan Hem KB, Hogan W, Bian J. Clinical Trial Generalizability Assessment in the Big Data Era: A Review. Clin Transl Sci 2020; 13:675-684. [PMID: 32058639] [PMCID: PMC7359942] [DOI: 10.1111/cts.12764]
Abstract
Clinical studies, especially randomized controlled trials, are essential for generating evidence for clinical practice. However, generalizability is a long-standing concern when applying trial results to real-world patients. Generalizability assessment is thus important, but it is not consistently practiced. We performed a systematic review to understand the practice of generalizability assessment. We identified 187 relevant articles and systematically organized these studies in a taxonomy with three dimensions: (i) data availability (i.e., before or after the trial: a priori vs. a posteriori generalizability); (ii) result outputs (i.e., score vs. nonscore); and (iii) populations of interest. We further reported disease areas, underrepresented subgroups, and types of data used to profile target populations. We observed an increasing trend of generalizability assessments, but fewer than 30% of studies reported positive generalizability results. As a priori generalizability can be assessed using only study design information (primarily eligibility criteria), it gives investigators a golden opportunity to adjust the study design before the trial starts. Nevertheless, fewer than 40% of the studies in our review assessed a priori generalizability. With the wide adoption of electronic health record systems, rich real-world patient databases are increasingly available for generalizability assessment; however, informatics tools are lacking to support the adoption of generalizability assessment practice.
Affiliation(s)
- Zhe He
- School of Information, Florida State University, Tallahassee, Florida, USA
- Xiang Tang
- Department of Statistics, Florida State University, Tallahassee, Florida, USA
- Xi Yang
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Yi Guo
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Thomas J George
- Hematology & Oncology, Department of Medicine, College of Medicine, University of Florida, Gainesville, Florida, USA
- Neil Charness
- Department of Psychology, Florida State University, Tallahassee, Florida, USA
- Kelsa Bartley Quan Hem
- Calder Memorial Library, Miller School of Medicine, University of Miami, Miami, Florida, USA
- William Hogan
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA

14
Li Q, He Z, Guo Y, Zhang H, George TJ, Hogan W, Charness N, Bian J. Assessing the Validity of an A Priori Patient-Trial Generalizability Score Using Real-World Data from a Large Clinical Data Research Network: A Colorectal Cancer Clinical Trial Case Study. AMIA Annu Symp Proc 2020; 2019:1101-1110. [PMID: 32308907] [PMCID: PMC7153072]
Abstract
Existing trials have often not given enough consideration to population representativeness, which can reduce effectiveness when a treatment is applied in real-world clinical practice. We analyzed the eligibility criteria of Bevacizumab colorectal cancer treatment trials, assessed their a priori generalizability, and examined how it affects patient outcomes when the treatment is applied in real-world clinical settings. To do so, we extracted patient-level data from a large collection of electronic health records (EHRs) from the OneFlorida consortium. We built a zero-inflated negative binomial model using a composite patient-trial generalizability (cPTG) score to predict patients' clinical outcomes (i.e., number of serious adverse events [SAEs]). Our study results provide a body of evidence that (1) the cPTG score can predict patient outcomes; and (2) patients who are more similar to the study population of the trials used to develop the treatment have a significantly lower likelihood of experiencing serious adverse events.
Affiliation(s)
- Qian Li
- University of Florida, Gainesville, FL, USA
- Zhe He
- Florida State University, Tallahassee, FL, USA
- Yi Guo
- University of Florida, Gainesville, FL, USA
- Jiang Bian
- University of Florida, Gainesville, FL, USA

15
Lehmann HP, Downs SM. Desiderata for sharable computable biomedical knowledge for learning health systems. Learn Health Syst 2018; 2:e10065. [PMID: 31245589] [PMCID: PMC6508769] [DOI: 10.1002/lrh2.10065]
Abstract
In this commentary, we work out the specific desired functions required for sharing knowledge objects (based on statistical models), presumably to be used for clinical decision support derived from a learning health system, and, in so doing, discuss the implications for novel knowledge architectures. We demonstrate how decision models, implemented as influence diagrams, satisfy the desiderata. The desiderata include: locally validate discrimination; locally validate calibration; locally recalculate thresholds by incorporating local preferences; provide explanation; enable monitoring; enable debiasing; account for generalizability; account for semantic uncertainty; shall be findable; and others as necessary and proper. We demonstrate how formal decision models, especially when implemented as influence diagrams based on Bayesian networks, support both the knowledge artifact itself (the "primary decision") and the "meta-decision" of whether to deploy the knowledge artifact. We close with a research and development agenda to put this framework into place.
16
Sen A, Goldstein A, Chakrabarti S, Shang N, Kang T, Yaman A, Ryan PB, Weng C. The representativeness of eligible patients in type 2 diabetes trials: a case study using GIST 2.0. J Am Med Inform Assoc 2017; 25:239-247. [PMID: 29025047] [PMCID: PMC7378875] [DOI: 10.1093/jamia/ocx091]
Abstract
Objective The population representativeness of a clinical study is influenced by how real-world patients qualify for the study. We analyze the representativeness of eligible patients for multiple type 2 diabetes trials and the relationship between representativeness and other trial characteristics. Methods Sixty-nine study traits available in the electronic health record data for 2034 patients with type 2 diabetes were used to profile the target patients for type 2 diabetes trials. A set of 1691 type 2 diabetes trials was identified from ClinicalTrials.gov, and their population representativeness was calculated using the published Generalizability Index of Study Traits 2.0 metric. The relationships between population representativeness and the number of traits, trial duration, and trial metadata were statistically analyzed. A focused analysis with only phase 2 and 3 interventional trials was also conducted. Results A total of 869 of 1691 trials (51.4%) and 412 of 776 phase 2 and 3 interventional trials (53.1%) had a population representativeness of <5%. The overall representativeness was significantly correlated with the representativeness of the HbA1c criterion. The greater the number of criteria or the shorter the trial, the lower the representativeness. Among the trial metadata, phase, recruitment status, and start year were found to have a statistically significant effect on population representativeness. For phase 2 and 3 interventional trials, only start year was significantly associated with representativeness. Conclusions Our study quantified the representativeness of multiple type 2 diabetes trials. The common low representativeness of type 2 diabetes trials could be attributed to specific study design requirements of trials or safety concerns. Rather than criticizing the low representativeness, we contribute a method for increasing the transparency of the representativeness of clinical trials.
Affiliation(s)
- Anando Sen
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Andrew Goldstein
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Shreya Chakrabarti
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Ning Shang
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Tian Kang
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Anil Yaman
- Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, Netherlands
- Patrick B Ryan
- Department of Biomedical Informatics, Columbia University, New York, NY, USA; Janssen Research and Development, Titusville, NJ, USA
- Chunhua Weng
- Department of Biomedical Informatics, Columbia University, New York, NY, USA

17
Chakrabarti S, Sen A, Huser V, Hruby GW, Rusanov A, Albers DJ, Weng C. An Interoperable Similarity-based Cohort Identification Method Using the OMOP Common Data Model version 5.0. J Healthc Inform Res 2017; 1:1-18. [PMID: 28776047] [DOI: 10.1007/s41666-017-0005-6]
Abstract
Cohort identification for clinical studies tends to be laborious, time-consuming, and expensive. Developing automated or semi-automated methods for cohort identification is one of the "holy grails" in the field of biomedical informatics. We propose a high-throughput similarity-based cohort identification algorithm by applying numerical abstractions on electronic health record (EHR) data. We implement this algorithm using the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), which enables sites using this standardized EHR data representation to adopt the algorithm with minimal effort for local implementation. We validate its performance for a retrospective cohort identification task on six clinical trials conducted at the Columbia University Medical Center. Our algorithm achieves an average Area Under the Curve (AUC) of 0.966 and an average Precision at 5 of 0.983. This interoperable method promises to achieve efficient cohort identification in EHR databases. We discuss suitable applications of our method and its limitations, and propose warranted future work.
Affiliation(s)
- Shreya Chakrabarti
- Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Anando Sen
- Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Vojtech Huser
- National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA
- Gregory W Hruby
- Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Alexander Rusanov
- Department of Anesthesiology, Columbia University, New York, NY 10032, USA
- David J Albers
- Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Chunhua Weng
- Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA

18
He Z, Gonzalez-Izquierdo A, Denaxas S, Sura A, Guo Y, Hogan WR, Shenkman E, Bian J. Comparing and Contrasting A Priori and A Posteriori Generalizability Assessment of Clinical Trials on Type 2 Diabetes Mellitus. AMIA Annu Symp Proc 2017; 2017:849-858. [PMID: 29854151] [PMCID: PMC5977671]
Abstract
Clinical trials are indispensable tools for evidence-based medicine. However, they are often criticized for poor generalizability. Traditional trial generalizability assessment can only be done after the trial results are published, comparing the enrolled patients with a convenience sample of real-world patients. However, the proliferation of electronic data in clinical trial registries and clinical data warehouses offers a great opportunity to assess generalizability during the design phase of a new trial. In this work, we compared and contrasted the a priori (based on eligibility criteria) and a posteriori (based on enrolled patients) generalizability of Type 2 diabetes clinical trials. Further, we showed that comparing the study population selected by the clinical trial eligibility criteria to the real-world patient population is a good indicator of the generalizability of trials. Our findings demonstrate that the a priori generalizability of a trial is comparable to its a posteriori generalizability in identifying restrictive quantitative eligibility criteria.
Affiliation(s)
- Zhe He
- Florida State University, Tallahassee, FL, USA
- Yi Guo
- University of Florida, Gainesville, FL, USA
- Jiang Bian
- University of Florida, Gainesville, FL, USA