1
Littell JH. The Logic of Generalization From Systematic Reviews and Meta-Analyses of Impact Evaluations. Evaluation Review 2024; 48:427-460. [PMID: 38261473 DOI: 10.1177/0193841x241227481] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/25/2024]
Abstract
Systematic reviews and meta-analyses are viewed as potent tools for generalized causal inference. These reviews are routinely used to inform decision makers about expected effects of interventions. However, the logic of generalization from research reviews to diverse policy and practice contexts is not well developed. Building on sampling theory, concerns about epistemic uncertainty, and principles of generalized causal inference, this article presents a pragmatic approach to generalizability assessment for use with systematic reviews and meta-analyses. This approach is applied to two systematic reviews and meta-analyses of effects of "evidence-based" psychosocial interventions for youth and families. Evaluations included in systematic reviews are not necessarily representative of populations and treatments of interest. Generalizability of results is limited by high risks of bias, uncertain estimates, and insufficient descriptive data from impact evaluations. Systematic reviews and meta-analyses can be used to test generalizability claims, explore heterogeneity, and identify potential moderators of effects. These reviews can also produce pooled estimates that are not representative of any larger sets of studies, programs, or people. Further work is needed to improve the conduct and reporting of impact evaluations and systematic reviews, and to develop practical approaches to generalizability assessment and guide applications of interventions in diverse policy and practice contexts.
Affiliation(s)
- Julia H Littell
- Graduate School of Social Work and Social Research, Bryn Mawr College, Bryn Mawr, PA, USA
2
Santner V, Riepl HS, Posch F, Wallner M, Rainer PP, Ablasser K, Kolesnik E, Hoeller V, Zach D, Schwegel N, Kreuzer P, Lueger A, Petutschnigg J, Pieske B, Zirlik A, Edelmann F, Verheyen N. Non-eligibility for pivotal HFpEF/HFmrEF outcome trials and mortality in a contemporary heart failure cohort. Eur J Intern Med 2023; 118:73-81. [PMID: 37517939 DOI: 10.1016/j.ejim.2023.07.027] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/13/2023] [Revised: 06/27/2023] [Accepted: 07/23/2023] [Indexed: 08/01/2023]
Abstract
Pivotal outcome trials targeting heart failure with preserved (HFpEF) and mildly reduced ejection fraction (HFmrEF) may have excluded patients at highest risk of poor outcomes. We aimed to assess eligibility for HFpEF/HFmrEF outcome trials in an unselected heart failure cohort and its association with all-cause mortality. Among 32,028 patients presenting to a tertiary care center emergency unit for any reason between August 2018 and July 2019, we identified 407 admissions with evident HFpEF and HFmrEF. Eligibility criteria for pivotal trials CHARM-Preserved, I-PRESERVE, TOPCAT, PARAGON-HF, EMPEROR-Preserved and DELIVER were assessed by chart review. The proportions of admissions fulfilling HFpEF/HFmrEF trial eligibility criteria were 88% for CHARM-Preserved, 40% for I-PRESERVE, 35% for TOPCAT, 28% for PARAGON-HF, 51% for EMPEROR-Preserved, and 49% for DELIVER. During a median follow-up of 1.9 years, death from any cause occurred in 121 cases (30%). Twenty-four-month overall survival estimates for non-eligible and eligible admissions were 53% vs. 76% for CHARM-Preserved (HR=2.32, 95% CI: 1.47-3.67, p<0.001), 62% vs. 87% for I-PRESERVE (HR=2.97, 1.85-4.77, p<0.001), 67% vs. 84% for TOPCAT (HR=2.04, 1.29-3.24, p = 0.002), 68% vs. 85% for PARAGON-HF (HR=2.28, 1.33-3.90, p = 0.003), 64% vs. 81% for EMPEROR-Preserved (HR=1.90, 1.27-2.84, p = 0.002), and 65% vs. 80% for DELIVER (HR=1.71, 1.14-2.57, p = 0.010). Exclusion criteria independently predicting death were eGFR <20 ml/min/1.73 m², COPD with home oxygen therapy, and severe valvular heart disease. In conclusion, in a contemporary HFpEF/HFmrEF cohort, non-eligibility for outcome trials predicted strongly increased mortality. HFpEF/HFmrEF patients at highest mortality risk were likely underrepresented in previous outcome trials and their treatment remains an unmet medical need.
Affiliation(s)
- Viktoria Santner
- Division of Cardiology, University Heart Center and Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Hermann S Riepl
- Division of Cardiology, University Heart Center and Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Florian Posch
- Division of Hematology, Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Markus Wallner
- Division of Cardiology, University Heart Center and Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Peter P Rainer
- Division of Cardiology, University Heart Center and Department of Internal Medicine, Medical University of Graz, Graz, Austria; Department of Medicine, St. Johann in Tirol General Hospital, St. Johann in Tirol, Austria; BioTechMed Graz, Graz, Austria
- Klemens Ablasser
- Division of Cardiology, University Heart Center and Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Ewald Kolesnik
- Division of Cardiology, University Heart Center and Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Viktoria Hoeller
- Division of Cardiology, University Heart Center and Department of Internal Medicine, Medical University of Graz, Graz, Austria
- David Zach
- Division of Cardiology, University Heart Center and Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Nora Schwegel
- Division of Cardiology, University Heart Center and Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Philipp Kreuzer
- Emergency Medicine Unit, Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Andreas Lueger
- Emergency Medicine Unit, Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Johannes Petutschnigg
- Department of Internal Medicine and Cardiology, Charité-Universitätsmedizin Berlin, Campus Virchow Klinikum, Berlin, Germany; German Center for Cardiovascular Research, Partner Site Berlin, Germany
- Andreas Zirlik
- Division of Cardiology, University Heart Center and Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Frank Edelmann
- Department of Internal Medicine and Cardiology, Charité-Universitätsmedizin Berlin, Campus Virchow Klinikum, Berlin, Germany; German Center for Cardiovascular Research, Partner Site Berlin, Germany
- Nicolas Verheyen
- Division of Cardiology, University Heart Center and Department of Internal Medicine, Medical University of Graz, Graz, Austria
3
Sun Y, Butler A, Diallo I, Kim JH, Ta C, Rogers JR, Liu H, Weng C. A Framework for Systematic Assessment of Clinical Trial Population Representativeness Using Electronic Health Records Data. Appl Clin Inform 2021; 12:816-825. [PMID: 34496418 DOI: 10.1055/s-0041-1733846] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/24/2022]
Abstract
BACKGROUND Clinical trials are the gold standard for generating robust medical evidence, but clinical trial results often raise generalizability concerns, which can be attributed to a lack of population representativeness. Electronic health record (EHR) data are useful for estimating the population representativeness of clinical trial study populations. OBJECTIVES This research aims to estimate the population representativeness of clinical trials systematically using EHR data during the early design stage. METHODS We present an end-to-end analytical framework for transforming free-text clinical trial eligibility criteria into executable database queries conformant with the Observational Medical Outcomes Partnership Common Data Model and for systematically quantifying the population representativeness of each clinical trial. RESULTS Using this framework, we calculated the population representativeness of 782 novel coronavirus disease 2019 (COVID-19) trials and 3,827 type 2 diabetes mellitus (T2DM) trials in the United States. Owing to overly restrictive eligibility criteria, 85.7% of the COVID-19 trials and 30.1% of the T2DM trials had poor population representativeness. CONCLUSION This research demonstrates the potential of using EHR data to assess the population representativeness of clinical trials, providing data-driven metrics to inform the selection and optimization of eligibility criteria.
Affiliation(s)
- Yingcheng Sun
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- Alex Butler
- Department of Biomedical Informatics, Columbia University, New York, New York, United States; Department of Medicine, Columbia University, New York, New York, United States
- Ibrahim Diallo
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- Jae Hyun Kim
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- Casey Ta
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- James R Rogers
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- Hao Liu
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
- Chunhua Weng
- Department of Biomedical Informatics, Columbia University, New York, New York, United States
4
He Z, Tang X, Yang X, Guo Y, George TJ, Charness N, Quan Hem KB, Hogan W, Bian J. Clinical Trial Generalizability Assessment in the Big Data Era: A Review. Clin Transl Sci 2020; 13:675-684. [PMID: 32058639 PMCID: PMC7359942 DOI: 10.1111/cts.12764] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 11/20/2019] [Accepted: 01/25/2020] [Indexed: 01/04/2023]
Abstract
Clinical studies, especially randomized controlled trials, are essential for generating evidence for clinical practice. However, generalizability is a long-standing concern when applying trial results to real-world patients. Generalizability assessment is thus important but is not consistently practiced. We performed a systematic review to understand the practice of generalizability assessment. We identified 187 relevant articles and systematically organized these studies in a taxonomy with three dimensions: (i) data availability (i.e., before or after the trial: a priori vs. a posteriori generalizability); (ii) result outputs (i.e., score vs. nonscore); and (iii) populations of interest. We further reported disease areas, underrepresented subgroups, and types of data used to profile target populations. We observed an increasing trend of generalizability assessments, but < 30% of studies reported positive generalizability results. As a priori generalizability can be assessed using only study design information (primarily eligibility criteria), it gives investigators a golden opportunity to adjust the study design before the trial starts. Nevertheless, < 40% of the studies in our review assessed a priori generalizability. With the wide adoption of electronic health records systems, rich real-world patient databases are increasingly available for generalizability assessment; however, informatics tools are lacking to support the adoption of generalizability assessment practice.
Affiliation(s)
- Zhe He
- School of Information, Florida State University, Tallahassee, Florida, USA
- Xiang Tang
- Department of Statistics, Florida State University, Tallahassee, Florida, USA
- Xi Yang
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Yi Guo
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Thomas J George
- Hematology & Oncology, Department of Medicine, College of Medicine, University of Florida, Gainesville, Florida, USA
- Neil Charness
- Department of Psychology, Florida State University, Tallahassee, Florida, USA
- Kelsa Bartley Quan Hem
- Calder Memorial Library, Miller School of Medicine, University of Miami, Miami, Florida, USA
- William Hogan
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
5
Boye KS, Riddle MC, Gerstein HC, Mody R, Garcia-Perez L, Karanikas CA, Lage MJ, Riesmeyer JS, Lakshmanan MC. Generalizability of glucagon-like peptide-1 receptor agonist cardiovascular outcome trials to the overall type 2 diabetes population in the United States. Diabetes Obes Metab 2019; 21:1299-1304. [PMID: 30714309 PMCID: PMC6593714 DOI: 10.1111/dom.13649] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 12/01/2018] [Revised: 01/25/2019] [Accepted: 02/01/2019] [Indexed: 12/29/2022]
Abstract
AIM To examine the generalizability of results from glucagon-like peptide-1 receptor agonist (GLP-1 RA) cardiovascular outcome trials (CVOTs) in the US type 2 diabetes (T2D) population. MATERIALS AND METHODS Patients enrolled or eligible for inclusion in four CVOTs (EXSCEL, LEADER, REWIND, and SUSTAIN-6) were examined in reference to a retrospective clinical database weighted to match the age and sex distribution of the US adult T2D population. We descriptively compared key baseline characteristics of the populations enrolled in each trial to those of the reference population and estimated the proportions of individuals in the reference population represented by those in the trials for each characteristic. We also estimated the proportions of individuals in the reference population that might have been enrolled in each trial based upon meeting the trial inclusion and exclusion (I/E) criteria. RESULTS No trial's enrolled population perfectly matched the reference population in key characteristics. The EXSCEL population most closely matched in mean age (62.7 vs. 60.5 years) and percentage with estimated glomerular filtration rate <60 (18.6 vs. 17.3%), while REWIND most closely matched in HbA1c, sex distribution, and proportion with a prior myocardial infarction. Based on I/E criteria, 42.6% of the reference population were eligible for enrolment in REWIND, versus 15.9% in EXSCEL, 13.0% in SUSTAIN-6, and 12.9% in LEADER. CONCLUSIONS Although none of the trials are fully representative of the general population, among the four trials examined, results from baseline REWIND were found to be more generalizable to the US adult T2D population than those of other GLP-1 RA CVOTs.
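The weighting and eligibility steps described in this abstract can be illustrated with a minimal sketch. It assumes simple post-stratification on age group and sex, which may differ from the weighting the authors actually used. The records, the population shares, and the `is_eligible` rule are all hypothetical and are not the inclusion or exclusion criteria of any of the four trials.

```python
from collections import Counter

def poststratification_weights(records, population_shares):
    """Weight records so weighted age-sex cell shares match population_shares."""
    cells = [(r["age_group"], r["sex"]) for r in records]
    counts = Counter(cells)
    n = len(records)
    # weight = population share of the cell / sample share of the cell
    return [population_shares[c] / (counts[c] / n) for c in cells]

def weighted_eligible_share(records, weights, is_eligible):
    """Weighted proportion of records that satisfy an eligibility rule."""
    eligible = sum(w for r, w in zip(records, weights) if is_eligible(r))
    return eligible / sum(weights)

# Hypothetical database extract (illustrative values only).
records = [
    {"age_group": "<65", "sex": "F", "age": 52, "prior_cvd": False},
    {"age_group": "<65", "sex": "M", "age": 58, "prior_cvd": True},
    {"age_group": "65+", "sex": "F", "age": 70, "prior_cvd": True},
    {"age_group": "65+", "sex": "M", "age": 74, "prior_cvd": False},
]
# Hypothetical age-sex shares of the reference population (sum to 1).
population_shares = {
    ("<65", "F"): 0.3, ("<65", "M"): 0.3,
    ("65+", "F"): 0.2, ("65+", "M"): 0.2,
}

def is_eligible(record):
    # Hypothetical rule, not taken from any real trial protocol.
    return record["age"] >= 65 or record["prior_cvd"]

weights = poststratification_weights(records, population_shares)
share = weighted_eligible_share(records, weights, is_eligible)
print(round(share, 3))
```

In this toy extract the unweighted eligible share is 3 of 4 records, while the weighted share is lower because older patients are over-represented in the sample relative to the assumed population shares.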
Affiliation(s)
- Matthew C. Riddle
- Division of Endocrinology, Diabetes & Clinical Nutrition, Oregon Health & Science University, Portland, Oregon
- Hertzel C. Gerstein
- McMaster University and Hamilton Health Sciences Center, Population Health Research Institute, Hamilton, Ontario, Canada
- Reema Mody
- Eli Lilly and Company, Indianapolis, Indiana
6
Webster-Clark MA, Sanoff HK, Stürmer T, Peacock Hinton S, Lund JL. Diagnostic Assessment of Assumptions for External Validity: An Example Using Data in Metastatic Colorectal Cancer. Epidemiology 2019; 30:103-111. [PMID: 30252687 PMCID: PMC6269648 DOI: 10.1097/ede.0000000000000926] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 01/10/2023]
Abstract
BACKGROUND Methods developed to estimate intervention effects in external target populations assume that all important effect measure modifiers have been identified and appropriately modeled. Propensity score-based diagnostics can be used to assess the plausibility of these assumptions for weighting methods. METHODS We demonstrate the use of these diagnostics when assessing the transportability of treatment effects from the standard-of-care control arm of a phase III trial in metastatic colorectal cancer (HORIZON III) to a target population of 1,942 Medicare beneficiaries age 65+ years. RESULTS In an unadjusted comparison, control arm participants had lower mortality compared with target population patients treated with the standard of care therapy (trial vs. target hazard ratio [HR] = 0.72, 95% confidence interval [CI], 0.58, 0.89). Applying inverse odds of sampling weights attenuated the trial versus target HR (weighted HR = 0.96, 95% CI = 0.73, 1.26). However, whether unadjusted or weighted, hazards did not appear proportional. At 6 months of follow-up, mortality was lower in the weighted trial population than the target population (weighted trial vs. target risk difference [RD] = -0.07, 95% CI = -0.13, -0.01), but not at 12 months (weighted RD = 0.00, 95% CI = -0.09, 0.09). CONCLUSION These diagnostics suggest that direct transport of treatment effects from HORIZON III to the Medicare population is not valid. However, the proposed sampling model might allow valid transport of the treatment effects on longer-term mortality from HORIZON III to the Medicare population treated in clinical practice. See video abstract at http://links.lww.com/EDE/B435.
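The idea behind inverse odds of sampling weights can be shown with a small sketch. It is not the authors' model: where the abstract refers to a propensity score-based sampling model, this example swaps in plain stratum proportions for a single categorical covariate, and the age strata and counts are invented for illustration.

```python
from collections import Counter

def inverse_odds_weights(trial_strata, target_strata):
    """Inverse odds of sampling weights for trial participants.

    P(in trial | stratum) is estimated from stratum counts in the stacked
    trial + target data; each trial participant then gets (1 - p) / p.
    """
    n_trial = Counter(trial_strata)
    n_target = Counter(target_strata)
    weights = []
    for s in trial_strata:
        p = n_trial[s] / (n_trial[s] + n_target[s])
        weights.append((1 - p) / p)
    return weights

# Hypothetical age strata: the trial is younger than the target population.
trial_strata = ["65-74"] * 6 + ["75+"] * 2
target_strata = ["65-74"] * 4 + ["75+"] * 6

weights = inverse_odds_weights(trial_strata, target_strata)

# After weighting, the trial's share of "75+" should equal the target's share.
weighted_old = sum(w for s, w in zip(trial_strata, weights) if s == "75+")
weighted_share_old = weighted_old / sum(weights)
target_share_old = target_strata.count("75+") / len(target_strata)
print(round(weighted_share_old, 3), round(target_share_old, 3))
```

With a saturated model like this one, the weighted trial reproduces the target's covariate distribution exactly. With a fitted parametric model the match is only approximate, which is one reason diagnostics of the kind the abstract describes are useful.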
Affiliation(s)
- Hanna K Sanoff
- Department of Medicine, University of North Carolina, Chapel Hill, NC
- Til Stürmer
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC
- Jennifer L Lund
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC
7
Laffin LJ, Besser SA, Alenghat FJ. A data-zone scoring system to assess the generalizability of clinical trial results to individual patients. Eur J Prev Cardiol 2018; 26:569-575. [PMID: 30477321 DOI: 10.1177/2047487318815967] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/16/2022]
Abstract
INTRODUCTION Evaluating the applicability of a clinical trial to a specific patient is difficult. A novel framework, the Trial Score, was created to quantify the generalizability of a trial's result based on participants' baseline characteristics and not on the trial's inclusion and exclusion criteria. METHODS For each Systolic Blood Pressure Intervention Trial (SPRINT) participant, the Euclidean distance in six-dimensional space from the theoretical "average" participant was calculated to produce an individual Trial Score that incorporates multiple distinct continuous-variable baseline characteristics. We prospectively defined the "data-rich," "data-limited," and "data-free" zones as Trial Scores < 90th percentile, the 90th-97.5th percentile, and >97.5th percentile, respectively. Trial Scores were then calculated for National Health and Nutrition Examination Survey participants to map data zones of the general population. Individual participant data from the Action to Control Cardiovascular Risk in Diabetes blood pressure trial (ACCORD-BP) were used to test whether participants further from the average SPRINT participant behave differently than the overall SPRINT results. RESULTS The National Health and Nutrition Examination Survey cohort and the ACCORD-BP trial demonstrate large percentages of participants in SPRINT's data-free and data-limited zones. Time-to-event rates seen with intensive and standard blood pressure control in SPRINT were the same as ACCORD-BP participants within SPRINT's data-rich zone (hazard ratio 0.97, p = 0.84 and hazard ratio 0.95, p = 0.70). However, these rates were significantly different than those of ACCORD-BP participants outside SPRINT's data-rich zone (hazard ratio 0.64, p < 0.01 and hazard ratio 0.77, p < 0.01). CONCLUSIONS ACCORD-BP participants with SPRINT Trial Scores in the 90th percentile or below have similar event rates to SPRINT participants in both the intensive and standard blood pressure groups. Quantifying the difference between an individual patient and the average clinical trial participant holds promise as a tool to more precisely determine applicability of a specific trial to individual patients.
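A rough sketch of the Trial Score idea follows, using NumPy and synthetic data in place of SPRINT. The abstract does not say how the six characteristics were scaled, so standardizing each one with the trial's own mean and standard deviation is an assumption made here. The function names `trial_scores` and `data_zones` are illustrative, not the authors'.

```python
import numpy as np

def trial_scores(trial, others=None):
    """Distance of each participant from the trial's "average" participant.

    Assumption (not stated in the abstract): each characteristic is
    standardized with the trial's own mean and standard deviation before
    the Euclidean distance is taken.
    """
    trial = np.asarray(trial, dtype=float)
    mean = trial.mean(axis=0)
    sd = trial.std(axis=0, ddof=1)
    target = trial if others is None else np.asarray(others, dtype=float)
    z = (target - mean) / sd
    return np.sqrt((z ** 2).sum(axis=1))

def data_zones(scores, trial_reference_scores):
    """Label scores using the trial's 90th and 97.5th percentile cut-offs."""
    p90, p975 = np.percentile(trial_reference_scores, [90, 97.5])
    scores = np.asarray(scores, dtype=float)
    return np.where(scores < p90, "data-rich",
                    np.where(scores <= p975, "data-limited", "data-free"))

# Synthetic stand-in for six continuous baseline characteristics.
rng = np.random.default_rng(0)
trial = rng.normal(size=(1000, 6))
scores = trial_scores(trial)
zones = data_zones(scores, scores)

# Hypothetical outside patient, far from the trial average on every axis.
outside = np.full((1, 6), 10.0)
outside_zone = data_zones(trial_scores(trial, outside), scores)[0]
print(outside_zone)
```

The same cut-offs, derived once from the trial, are reused to place external cohorts or individual patients in a zone, which mirrors how the abstract describes mapping other populations onto SPRINT's data zones.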
Affiliation(s)
- Luke J Laffin
- Department of Cardiovascular Medicine, Cleveland Clinic Foundation, USA
8
Mønsted T. Achieving veracity: A study of the development and use of an information system for data analysis in preventive healthcare. Health Informatics J 2018; 25:491-499. [PMID: 30198372 DOI: 10.1177/1460458218796665] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
Within healthcare, information systems are increasingly developed to enable automatic analysis of the large amounts of data that are accumulated. A prerequisite for the practical use of such data analysis is the veracity of the output, that is, that the analysis is clinically valid. Whereas most research focuses on the technical configuration and clinical precision of data analysis systems, the purpose of this article is to investigate how veracity is achieved in practice. Based on a study of a project in Denmark aimed at developing an algorithm for stratification of citizens in preventive healthcare, this article confirms that achieving veracity requires close attention to the clinical validity of the algorithm. It also concludes, however, that the veracity in practice hinges critically on the citizens' ability to report high-quality data and the ability of the health professionals to interpret the outcome in the context of existing care practices.