1
Ahn SJ. Real-World Research on Retinal Diseases Using Health Claims Database: A Narrative Review. Diagnostics (Basel) 2024; 14:1568. [PMID: 39061705 PMCID: PMC11276298 DOI: 10.3390/diagnostics14141568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Revised: 07/05/2024] [Accepted: 07/17/2024] [Indexed: 07/28/2024] Open
Abstract
Real-world data (RWD) has emerged as a crucial component in understanding and improving patient outcomes across various medical conditions, including retinal diseases. Health claims databases, generated from healthcare reimbursement claims, offer a comprehensive source of RWD, providing insights into patient outcomes, healthcare utilization, and treatment effectiveness. However, the use of these databases for research also presents unique challenges. This narrative review explores the role of real-world research on retinal diseases using health claims databases, highlighting their advantages, limitations, and potential contributions to advancing our understanding and management of the diseases. The review examines the applications of health claims databases in retinal disease research, including epidemiological studies, comparative effectiveness and safety analyses, economic burden assessments, and evaluations of patient outcomes and quality of care. Previous findings demonstrate the value of these databases in generating prevalence and incidence estimates, identifying risk factors and predictors, evaluating treatment effectiveness and safety, and understanding healthcare utilization patterns and costs associated with retinal diseases. Despite their strengths, health claims databases face challenges related to data limitations, biases, privacy concerns, and methodological issues. Accordingly, the review also explores future directions and opportunities, including advancements in data collection and analysis, integration with electronic health records, collaborative research networks and consortia, and the evolving regulatory landscape. These developments are expected to enhance the utility of health claims databases for retinal disease research, resulting in more comprehensive and impactful findings across diverse retinal disorders and robust real-world insights from a large population.
Affiliation(s)
- Seong Joon Ahn
- Department of Ophthalmology, Hanyang University Hospital, Hanyang University College of Medicine, Seoul 04763, Republic of Korea
2
Wittner R, Holub P, Mascia C, Frexia F, Müller H, Plass M, Allocca C, Betsou F, Burdett T, Cancio I, Chapman A, Chapman M, Courtot M, Curcin V, Eder J, Elliot M, Exter K, Goble C, Golebiewski M, Kisler B, Kremer A, Leo S, Lin‐Gibson S, Marsano A, Mattavelli M, Moore J, Nakae H, Perseil I, Salman A, Sluka J, Soiland‐Reyes S, Strambio‐De‐Castillia C, Sussman M, Swedlow JR, Zatloukal K, Geiger J. Toward a common standard for data and specimen provenance in life sciences. Learn Health Syst 2024; 8:e10365. [PMID: 38249839 PMCID: PMC10797572 DOI: 10.1002/lrh2.10365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Revised: 03/17/2023] [Accepted: 03/24/2023] [Indexed: 01/23/2024] Open
Abstract
Open and practical exchange, dissemination, and reuse of specimens and data have become a fundamental requirement for life sciences research. The quality of the data obtained, and thus of the findings and knowledge derived, is significantly influenced by the quality of the samples, the experimental methods, and the data analysis. Therefore, comprehensive and precise documentation of the pre-analytical conditions, the analytical procedures, and the data processing is essential to be able to assess the validity of the research results. With the increasing importance of the exchange, reuse, and sharing of data and samples, procedures are required that enable cross-organizational documentation, traceability, and non-repudiation. At present, this information on the provenance of samples and data is mostly either sparse, incomplete, or incoherent. Since there is no uniform framework, this information is usually only provided within the organization and not interoperably. At the same time, the collection and sharing of biological and environmental specimens increasingly require definition and documentation of benefit sharing and compliance with regulatory requirements rather than consideration of pure scientific needs. In this publication, we present an ongoing standardization effort to provide trustworthy machine-actionable documentation of the data lineage and specimens. We would like to invite experts from the biotechnology and biomedical fields to further contribute to the standard.
Affiliation(s)
- Rudolf Wittner
- BBMRI‐ERIC, Graz, Austria
- Institute of Computer Science & Faculty of Informatics, Masaryk University, Brno, Czechia
- Petr Holub
- BBMRI‐ERIC, Graz, Austria
- Institute of Computer Science & Faculty of Informatics, Masaryk University, Brno, Czechia
- Cecilia Mascia
- CRS4, Center for Advanced Studies, Research and Development in Sardinia, Pula, Italy
- Francesca Frexia
- CRS4, Center for Advanced Studies, Research and Development in Sardinia, Pula, Italy
- Clare Allocca
- National Institute of Standards and Technology, Gaithersburg, Maryland, USA
- Fay Betsou
- Biological Resource Center of Institut Pasteur (CRBIP), Paris, France
- Tony Burdett
- EMBL's European Bioinformatics Institute (EMBL‐EBI), Cambridge, UK
- Ibon Cancio
- Plentzia Marine Station (PiE‐UPV/EHU), University of the Basque Country, EMBRC‐Spain, Bilbao, Spain
- Mark Elliot
- Department of Social Statistics, School of Social Sciences, University of Manchester, Manchester, UK
- Katrina Exter
- Flanders Marine Institute (VLIZ), EMBRC‐Belgium, Ostend, Belgium
- Carole Goble
- Department of Computer Science, University of Manchester, Manchester, UK
- Martin Golebiewski
- Heidelberg Institute for Theoretical Studies (HITS gGmbH), Heidelberg, Germany
- Simone Leo
- CRS4, Center for Advanced Studies, Research and Development in Sardinia, Pula, Italy
- Anna Marsano
- Department of Biomedicine, University of Basel, Basel, Switzerland
- Marco Mattavelli
- SCI‐STI‐MM, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Josh Moore
- Centre for Gene Regulation and Expression and Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, UK
- German BioImaging–Gesellschaft für Mikroskopie und Bildanalyse e.V., Konstanz, Germany
- Hiroki Nakae
- Japan bio‐Measurement and Analysis Consortium, Tokyo, Japan
- Isabelle Perseil
- INSERM–Institut National de la Sante et de la Recherche Medicale, Paris, France
- Ayat Salman
- Standards Council of Canada, Ottawa, Ontario, Canada
- Canadian Primary Care Sentinel Surveillance Network (CPCSSN), Department of Family Medicine, Queen's University, Kingston, Ontario, Canada
- James Sluka
- Biocomplexity Institute, Indiana University, Bloomington, Indiana, USA
- Stian Soiland‐Reyes
- Department of Computer Science, University of Manchester, Manchester, UK
- Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands
- Michael Sussman
- US Department of Agriculture, Washington, District of Columbia, USA
- Jason R. Swedlow
- Centre for Gene Regulation and Expression and Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, UK
- Jörg Geiger
- Interdisciplinary Bank of Biomaterials and Data Würzburg (ibdw), Würzburg, Germany
3
Shu D, Li X, Her Q, Wong J, Li D, Wang R, Toh S. Combining meta-analysis with multiple imputation for one-step, privacy-protecting estimation of causal treatment effects in multi-site studies. Res Synth Methods 2023; 14:742-763. [PMID: 37527843 DOI: 10.1002/jrsm.1660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Revised: 03/10/2023] [Accepted: 06/28/2023] [Indexed: 08/03/2023]
Abstract
Missing data complicates statistical analyses in multi-site studies, especially when it is not feasible to centrally pool individual-level data across sites. We combined meta-analysis with within-site multiple imputation for one-step estimation of the average causal effect (ACE) of a target population comprised of all individuals from all data-contributing sites within a multi-site distributed data network, without the need for sharing individual-level data to handle missing data. We considered two orders of combination and three choices of weights for meta-analysis, resulting in six approaches. The first three approaches, denoted as RR + metaF, RR + metaR and RR + std, first combined results from imputed data sets within each site using Rubin's rules and then meta-analyzed the combined results across sites using fixed-effect, random-effects and sample-standardization weights, respectively. The last three approaches, denoted as metaF + RR, metaR + RR and std + RR, first meta-analyzed results across sites separately for each imputation and then combined the meta-analysis results using Rubin's rules. Simulation results confirmed very good performance of RR + std and std + RR under various missing completely at random and missing at random settings. A direct application of the inverse-variance weighted meta-analysis based on site-specific ACEs can lead to biased results for the targeted network-wide ACE in the presence of treatment effect heterogeneity by site, demonstrating the need to clearly specify the target population and estimand and properly account for potential site heterogeneity in meta-analyses seeking to draw causal interpretations. An illustration using a large administrative claims database is presented.
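As a rough illustration of the "RR + metaF" and "RR + std" orderings described in this abstract, the Python sketch below first pools per-imputation estimates within each site using Rubin's rules and then combines the site-level results across sites. The function names, the toy numbers, the assumption that sites are independent, and the choice of sample-size-proportional weights for the standardization step are all assumptions made for illustration; this is not the authors' implementation.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Pool m per-imputation estimates from one site with Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)  # average within-imputation variance
    between = variance(estimates) if m > 1 else 0.0  # between-imputation variance
    total = within + (1 + 1 / m) * between
    return pooled, total


def fixed_effect_meta(site_estimates, site_variances):
    """Inverse-variance (fixed-effect) combination across sites."""
    weights = [1 / v for v in site_variances]
    pooled = sum(w * e for w, e in zip(weights, site_estimates)) / sum(weights)
    return pooled, 1 / sum(weights)


def sample_standardized(site_estimates, site_variances, site_sizes):
    """Weight each site by its share of the total sample (assumed weighting)."""
    total_n = sum(site_sizes)
    shares = [n / total_n for n in site_sizes]
    pooled = sum(s * e for s, e in zip(shares, site_estimates))
    # Variance formula assumes the site-level estimates are independent.
    var = sum(s ** 2 * v for s, v in zip(shares, site_variances))
    return pooled, var


# Hypothetical input: (estimates over 3 imputations, their variances, site size).
sites = [
    ([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], 100),
    ([4.0, 4.0, 4.0], [0.25, 0.25, 0.25], 300),
]
site_results = [rubin_combine(est, var) for est, var, _ in sites]
site_est = [r[0] for r in site_results]
site_var = [r[1] for r in site_results]
sizes = [n for _, _, n in sites]

rr_meta_f = fixed_effect_meta(site_est, site_var)        # "RR + metaF" ordering
rr_std = sample_standardized(site_est, site_var, sizes)  # "RR + std" ordering
print(rr_meta_f, rr_std)
```

With these toy inputs the two weightings give different pooled values because the site-level estimates differ, which mirrors the abstract's point that inverse-variance weights and sample-based weights target different quantities when effects vary by site.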
Affiliation(s)
- Di Shu
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Clinical Futures, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA
- Xiaojuan Li
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA
- Qoua Her
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA
- Jenna Wong
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA
- Dongdong Li
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA
- Rui Wang
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Sengwee Toh
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA
4
Li D, Lu W, Shu D, Toh S, Wang R. Distributed Cox proportional hazards regression using summary-level information. Biostatistics 2023; 24:776-794. [PMID: 35195675 PMCID: PMC10345997 DOI: 10.1093/biostatistics/kxac006] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/12/2020] [Revised: 11/22/2021] [Accepted: 01/30/2022] [Indexed: 07/20/2023] Open
Abstract
Individual-level data sharing across multiple sites can be infeasible due to privacy and logistical concerns. This article proposes a general distributed methodology to fit Cox proportional hazards models without sharing individual-level data in multi-site studies. We make inferences on the log hazard ratios based on an approximated partial likelihood score function that uses only summary-level statistics. This approach can be applied to both stratified and unstratified models, accommodate both discrete and continuous exposure variables, and permit the adjustment of multiple covariates. In particular, the fitting of stratified Cox models can be carried out with only one file transfer of summary-level information. We derive the asymptotic properties of the proposed estimators and compare the proposed estimators with the maximum partial likelihood estimators using pooled individual-level data and meta-analysis methods through simulation studies. We apply the proposed method to a real-world data set to examine the effect of sleeve gastrectomy versus Roux-en-Y gastric bypass on the time to first postoperative readmission.
Affiliation(s)
- Wenbin Lu
- Department of Statistics, North Carolina State University, Raleigh, NC, 27695, USA
- Di Shu
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA, Department of Pediatrics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA, and Center for Pediatric Clinical Effectiveness, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Sengwee Toh
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, 02215, USA
- Rui Wang
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, 02215, USA and Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, 02215, USA
5
Li Z, Shen Y, Ning J. Accommodating time-varying heterogeneity in risk estimation under the Cox model: a transfer learning approach. J Am Stat Assoc 2023; 118:2276-2287. [PMID: 38505403 PMCID: PMC10950074 DOI: 10.1080/01621459.2023.2210336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 04/26/2023] [Indexed: 03/21/2024]
Abstract
Transfer learning has attracted increasing attention in recent years for adaptively borrowing information across different data cohorts in various settings. Cancer registries have been widely used in clinical research because of their easy accessibility and large sample size. Our method is motivated by the question of how to utilize cancer registry data as a complement to improve the estimation precision of individual risks of death for inflammatory breast cancer (IBC) patients at The University of Texas MD Anderson Cancer Center. When transferring information for risk estimation based on the cancer registries (i.e., source cohort) to a single cancer center (i.e., target cohort), time-varying population heterogeneity needs to be appropriately acknowledged. However, there is no literature on how to adaptively transfer knowledge on risk estimation with time-to-event data from the source cohort to the target cohort while adjusting for time-varying differences in event risks between the two sources. Our goal is to address this statistical challenge by developing a transfer learning approach under the Cox proportional hazards model. To allow data-adaptive levels of information borrowing, we impose Lasso penalties on the discrepancies in regression coefficients and baseline hazard functions between the two cohorts, which are jointly solved in the proposed transfer learning algorithm. As shown in the extensive simulation studies, the proposed method yields more precise individualized risk estimation than using the target cohort alone. Meanwhile, our method demonstrates satisfactory robustness against cohort differences compared with the method that directly combines the target and source data in the Cox model. We develop a more accurate risk estimation model for the MD Anderson IBC cohort given various treatment and baseline covariates, while adaptively borrowing information from the National Cancer Database to improve risk assessment.
Affiliation(s)
- Ziyi Li
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Yu Shen
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jing Ning
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
6
Li D, Wong J, Li X, Toh S, Wang R. Imputing missing covariates in time-to-event analysis within distributed research networks: A simulation study. Pharmacoepidemiol Drug Saf 2023; 32:330-340. [PMID: 36380400 DOI: 10.1002/pds.5563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 09/13/2022] [Accepted: 10/26/2022] [Indexed: 11/18/2022]
Abstract
PURPOSE In distributed research network (DRN) settings, multiple imputation cannot be directly implemented because pooling individual-level data is often not feasible. The performance of multiple imputation in combination with meta-analysis is not well understood within DRNs. METHODS To evaluate the performance of imputation for missing baseline covariate data in combination with meta-analysis for time-to-event analysis within DRNs, we compared two parametric algorithms including one approximated linear imputation model (Approx), and one nonlinear substantive model compatible imputation model (SMC), as well as two non-parametric machine learning algorithms including random forest (RF), and classification and regression trees (CART), through simulation studies motivated by a real-world data set. RESULTS Under the setting with small effect sizes (i.e., log-Hazard ratios [logHR]) and homogeneous missingness mechanisms across sites, all imputation methods produced unbiased and more efficient estimates while the complete-case analysis could be biased and inefficient; and under heterogeneous missingness mechanisms, estimates with the RF method could have higher efficiency. Estimates from the distributed imputation combined by meta-analysis were similar to those from the imputation using pooled data. When logHRs were large, the SMC imputation algorithm generally performed better than others. CONCLUSIONS These findings suggest the validity and feasibility of imputation within DRNs in the presence of missing covariate data in time-to-event analysis under various settings. The performance of the four imputation algorithms varies with the effect sizes and level of missingness.
Affiliation(s)
- Dongdong Li
- Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, Massachusetts, USA
- Jenna Wong
- Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, Massachusetts, USA
- Xiaojuan Li
- Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, Massachusetts, USA
- Sengwee Toh
- Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, Massachusetts, USA
- Rui Wang
- Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, Massachusetts, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
7
Shi X, Pan Z, Miao W. Data Integration in Causal Inference. Wiley Interdisciplinary Reviews. Computational Statistics 2023; 15:e1581. [PMID: 36713955 PMCID: PMC9880960 DOI: 10.1002/wics.1581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2021] [Revised: 02/24/2022] [Accepted: 03/01/2022] [Indexed: 04/12/2023]
Abstract
Integrating data from multiple heterogeneous sources has become increasingly popular to achieve a large sample size and diverse study population. This paper reviews developments in causal inference methods that combine multiple datasets collected by potentially different designs from potentially heterogeneous populations. We summarize recent advances in combining randomized clinical trials with external information from observational studies or historical controls, combining samples when no single sample has all relevant variables with application to two-sample Mendelian randomization, distributed data setting under privacy concerns for comparative effectiveness and safety research using real-world data, Bayesian causal inference, and causal discovery methods.
Affiliation(s)
- Xu Shi
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Ziyang Pan
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Wang Miao
- Department of Probability and Statistics, Peking University, Beijing, China
8
Maro JC, Toh S. Invited Commentary: Go BIG and Go Global-Executing Large-Scale, Multisite Pharmacoepidemiologic Studies Using Real-World Data. Am J Epidemiol 2022; 191:1368-1371. [PMID: 35597819 PMCID: PMC9989341 DOI: 10.1093/aje/kwac096] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/02/2022] [Revised: 03/02/2022] [Accepted: 04/12/2022] [Indexed: 01/28/2023] Open
Abstract
At the time medical products are approved, we rarely know enough about their comparative safety and effectiveness vis-à-vis alternative therapies to advise patients and providers. Postmarket generation of evidence on rare adverse events following medical product exposure increasingly requires analysis of millions of longitudinal patient records that can provide complete capture of data on patient experiences. In the accompanying article by Pradhan et al. (Am J Epidemiology. 2022;191(8):1352-1367), the authors demonstrate how observational database studies are often the most practical approach, provided these databases are carefully chosen to be "fit for purpose." Distributed data networks with common data models have proliferated in the last 2 decades in pharmacoepidemiology, allowing efficient capture of patient data in a standardized and structured format across disparate real-world data sources. Use of common data models facilitates transparency by allowing standardized programming approaches that can be easily reproduced. The distributed data network architecture, combined with a common data approach, supports not only multisite observational studies but also pragmatic clinical trials. It also helps bridge international boundaries and further increases the sample size and diversity of study populations.
Affiliation(s)
- Judith C Maro
- Correspondence to Dr. Judith C. Maro, Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, 401 Park Drive, Suite 401 East, Boston, MA 02215 (e-mail: )
9
Moodie EEM, Coulombe J, Danieli C, Renoux C, Shortreed SM. Privacy-preserving estimation of an optimal individualized treatment rule: a case study in maximizing time to severe depression-related outcomes. Lifetime Data Analysis 2022; 28:512-542. [PMID: 35499604 PMCID: PMC10805063 DOI: 10.1007/s10985-022-09554-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 03/08/2022] [Indexed: 06/14/2023]
Abstract
Estimating individualized treatment rules, particularly in the context of right-censored outcomes, is challenging because the treatment effect heterogeneity of interest is often small, and thus difficult to detect. While this motivates the use of very large datasets such as those from multiple health systems or centres, data privacy may be of concern with participating data centres reluctant to share individual-level data. In this case study on the treatment of depression, we demonstrate an application of distributed regression for privacy protection used in combination with dynamic weighted survival modelling (DWSurv) to estimate an optimal individualized treatment rule whilst obscuring individual-level data. In simulations, we demonstrate the flexibility of this approach to address local treatment practices that may affect confounding, and show that DWSurv retains its double robustness even when performed through a (weighted) distributed regression approach. The work is motivated by, and illustrated with, an analysis of treatment for unipolar depression using the United Kingdom's Clinical Practice Research Datalink.
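One common way to realize distributed regression is for each site to share only aggregate sums from which the pooled least-squares fit can be recovered exactly. The Python sketch below assumes this sufficient-statistics formulation for a single covariate with an intercept; the function names and toy data are hypothetical, and it leaves out the weighting and censoring handling that DWSurv would require.

```python
def site_summary(x, y):
    """Computed locally at a site; only these aggregates leave the site."""
    return {
        "n": len(x),
        "sx": sum(x),
        "sy": sum(y),
        "sxx": sum(xi * xi for xi in x),
        "sxy": sum(xi * yi for xi, yi in zip(x, y)),
    }


def pooled_ols(summaries):
    """Recover the pooled simple-regression fit from per-site aggregates."""
    keys = ("n", "sx", "sy", "sxx", "sxy")
    total = {k: sum(s[k] for s in summaries) for k in keys}
    n, sx, sy, sxx, sxy = (total[k] for k in keys)
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Hypothetical data held at two sites; the true relation is y = 1 + 2x.
site_a = site_summary([0.0, 1.0], [1.0, 3.0])
site_b = site_summary([2.0, 3.0], [5.0, 7.0])
intercept, slope = pooled_ols([site_a, site_b])
print(intercept, slope)
```

Because the sums are additive across sites, the coordinating centre obtains the same coefficients it would get from pooled individual-level data, without ever seeing a patient-level row.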
Affiliation(s)
- Erica E M Moodie
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, QC, Canada
- Janie Coulombe
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, QC, Canada
- Coraline Danieli
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, QC, Canada
- Christel Renoux
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, QC, Canada
- Centre for Clinical Epidemiology, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, QC, Canada
- Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada
- Susan M Shortreed
- Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, USA
- Biostatistics Department, University of Washington, Seattle, USA
10
Orenstein EW, Kandaswamy S, Muthu N, Chaparro JD, Hagedorn PA, Dziorny AC, Moses A, Hernandez S, Khan A, Huth HB, Beus JM, Kirkendall ES. Alert burden in pediatric hospitals: a cross-sectional analysis of six academic pediatric health systems using novel metrics. J Am Med Inform Assoc 2021; 28:2654-2660. [PMID: 34664664 DOI: 10.1093/jamia/ocab179] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 04/26/2021] [Revised: 07/02/2021] [Accepted: 09/10/2021] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND Excessive electronic health record (EHR) alerts reduce the salience of actionable alerts. Little is known about the frequency of interruptive alerts across health systems and how the choice of metric affects which users appear to have the highest alert burden. OBJECTIVE (1) Analyze alert burden by alert type, care setting, provider type, and individual provider across 6 pediatric health systems. (2) Compare alert burden using different metrics. MATERIALS AND METHODS We analyzed interruptive alert firings logged in EHR databases at 6 pediatric health systems from 2016-2019 using 4 metrics: (1) alerts per patient encounter, (2) alerts per inpatient-day, (3) alerts per 100 orders, and (4) alerts per unique clinician days (calendar days with at least 1 EHR log in the system). We assessed intra- and interinstitutional variation and how alert burden rankings differed based on the chosen metric. RESULTS Alert burden varied widely across institutions, ranging from 0.06 to 0.76 firings per encounter, 0.22 to 1.06 firings per inpatient-day, 0.98 to 17.42 per 100 orders, and 0.08 to 3.34 firings per clinician day logged in the EHR. Custom alerts accounted for the greatest burden at all 6 sites. The rank order of institutions by alert burden was similar regardless of which alert burden metric was chosen. Within institutions, the alert burden metric choice substantially affected which provider types and care settings appeared to experience the highest alert burden. CONCLUSION Estimates of the clinical areas with highest alert burden varied substantially by institution and based on the metric used.
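The four alert burden metrics named in this abstract are simple ratios, which the Python sketch below makes concrete. The `SiteCounts` fields, the function name, and the counts are hypothetical; using a single alert total as the numerator for every metric is a simplifying assumption, since a real analysis might restrict the numerator to match each denominator (for example, inpatient alerts only for the per-inpatient-day metric).

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Aggregate counts for one institution over the study period."""
    alerts: int           # interruptive alert firings
    encounters: int       # patient encounters
    inpatient_days: int   # inpatient-days
    orders: int           # orders placed
    clinician_days: int   # unique clinician calendar days with an EHR login


def alert_burden(c: SiteCounts) -> dict:
    """Compute the four alert burden metrics as plain ratios."""
    return {
        "per_encounter": c.alerts / c.encounters,
        "per_inpatient_day": c.alerts / c.inpatient_days,
        "per_100_orders": 100 * c.alerts / c.orders,
        "per_clinician_day": c.alerts / c.clinician_days,
    }


# Made-up counts for illustration only.
example = SiteCounts(
    alerts=500,
    encounters=2000,
    inpatient_days=1000,
    orders=10000,
    clinician_days=250,
)
metrics = alert_burden(example)
print(metrics)
```

Because each metric divides by a different exposure measure, a unit with few orders but many logged-in clinicians can rank high on one metric and low on another, which is the sensitivity to metric choice that the abstract reports within institutions.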
Affiliation(s)
- Evan W Orenstein
- Department of Pediatrics, Emory University School of Medicine, Atlanta, Georgia, USA; Division of Hospital Medicine, Children's Healthcare of Atlanta, Atlanta, Georgia, USA
- Naveen Muthu
- Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA; Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Juan D Chaparro
- Division of Clinical Informatics, Nationwide Children's Hospital, Columbus, Ohio, USA; Department of Pediatrics, The Ohio State University, Columbus, Ohio, USA
- Philip A Hagedorn
- Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio, USA; Division of Hospital Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Adam C Dziorny
- Department of Pediatrics, University of Rochester School of Medicine, Rochester, New York, USA; Division of Critical Care Medicine, Golisano Children's Hospital at Strong, Rochester, New York, USA
- Adam Moses
- Center for Healthcare Innovation, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
- Sean Hernandez
- Center for Healthcare Innovation, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA; Department of General Internal Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
- Amina Khan
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Hannah B Huth
- Center for Healthcare Innovation, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
- Jonathan M Beus
- Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA; Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Eric S Kirkendall
- Center for Healthcare Innovation, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA; Department of Pediatrics, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
11
Wirth FN, Meurers T, Johns M, Prasser F. Privacy-preserving data sharing infrastructures for medical research: systematization and comparison. BMC Med Inform Decis Mak 2021; 21:242. [PMID: 34384406 PMCID: PMC8359765 DOI: 10.1186/s12911-021-01602-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/07/2021] [Accepted: 07/31/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Data sharing is considered a crucial part of modern medical research. Unfortunately, despite its advantages, it often faces obstacles, especially data privacy challenges. As a result, various approaches and infrastructures have been developed that aim to ensure that patients and research participants remain anonymous when data is shared. However, privacy protection typically comes at a cost, e.g. restrictions regarding the types of analyses that can be performed on shared data. What is lacking is a systematization making the trade-offs taken by different approaches transparent. The aim of the work described in this paper was to develop a systematization for the degree of privacy protection provided and the trade-offs taken by different data sharing methods. Based on this contribution, we categorized popular data sharing approaches and identified research gaps by analyzing combinations of promising properties and features that are not yet supported by existing approaches. METHODS The systematization consists of different axes. Three axes relate to privacy protection aspects and were adopted from the popular Five Safes Framework: (1) safe data, addressing privacy at the input level, (2) safe settings, addressing privacy during shared processing, and (3) safe outputs, addressing privacy protection of analysis results. Three additional axes address the usefulness of approaches: (4) support for de-duplication, to enable the reconciliation of data belonging to the same individuals, (5) flexibility, to be able to adapt to different data analysis requirements, and (6) scalability, to maintain performance with increasing complexity of shared data or common analysis processes. 
RESULTS Using the systematization, we identified three different categories of approaches: distributed data analyses, which exchange anonymous aggregated data, secure multi-party computation protocols, which exchange encrypted data, and data enclaves, which store pooled individual-level data in secure environments for access for analysis purposes. We identified important research gaps, including a lack of approaches enabling the de-duplication of horizontally distributed data or providing a high degree of flexibility. CONCLUSIONS There are fundamental differences between different data sharing approaches and several gaps in their functionality that may be interesting to investigate in future work. Our systematization can make the properties of privacy-preserving data sharing infrastructures more transparent and support decision makers and regulatory authorities with a better understanding of the trade-offs taken.
Affiliation(s)
- Felix Nikolaus Wirth
  - Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
- Thierry Meurers
  - Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
- Marco Johns
  - Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
- Fabian Prasser
  - Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
|
12
|
Hunt NB, Gardarsdottir H, Bazelier MT, Klungel OH, Pajouheshnia R. A systematic review of how missing data are handled and reported in multi-database pharmacoepidemiologic studies. Pharmacoepidemiol Drug Saf 2021; 30:819-826. [PMID: 33834576 PMCID: PMC8252545 DOI: 10.1002/pds.5245] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/17/2020] [Revised: 03/24/2021] [Accepted: 04/05/2021] [Indexed: 01/24/2023]
Abstract
Purpose Pharmacoepidemiologic multi‐database studies (MDBS) provide opportunities to better evaluate the safety and effectiveness of medicines. However, the issue of missing data is often exacerbated in MDBS, potentially resulting in bias and precision loss. We sought to measure how missing data are being recorded and addressed in pharmacoepidemiologic MDBS. Methods We conducted a systematic literature search in PubMed for pharmacoepidemiologic MDBS published between 1st January 2018 and 31st December 2019. Included studies were those that used ≥2 distinct databases to assess the same safety/effectiveness outcome associated with a drug exposure. Outcome variables extracted from the studies included strategies to execute a MDBS, reporting of missing data (type, bias evaluation) and the methods used to account for missing data. Results Two thousand seven hundred and twenty‐six articles were identified, and 62 studies were included: using data from either North America (56%), Europe (31%), multiple regions (11%) or East‐Asia (2%). Thirty‐five (56%) articles reported missing data: 11 of these studies reported that this could have introduced bias and 19 studies reported a method to address missing data. Thirteen (68%) carried out a complete case analysis, 2 (11%) applied multiple imputation, 2 (11%) used both methods, 1 (5%) used mean imputation and 1 (5%) substituted information from a similar variable. Conclusions Just over half of the recent pharmacoepidemiologic MDBS reported missing data and two‐thirds of these studies reported how they accounted for it. We should increase our vigilance for database completeness in MDBS by reporting and addressing the missing data that could introduce bias.
Affiliation(s)
- Nicholas B Hunt
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Utrecht, The Netherlands
- Helga Gardarsdottir
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Utrecht, The Netherlands
  - Department of Clinical Pharmacy, University Medical Centre Utrecht, Utrecht, The Netherlands
  - Department of Pharmaceutical Sciences, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Marloes T Bazelier
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Utrecht, The Netherlands
- Olaf H Klungel
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Utrecht, The Netherlands
- Romin Pajouheshnia
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Utrecht, The Netherlands
|
13
|
A Narrative Review of Methods for Causal Inference and Associated Educational Resources. Qual Manag Health Care 2020; 29:260-269. [PMID: 32991545 DOI: 10.1097/qmh.0000000000000276] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022]
Abstract
BACKGROUND AND OBJECTIVES Root cause analysis involves evaluation of causal relationships between exposures (or interventions) and adverse outcomes, such as identification of direct (eg, medication orders missed) and root causes (eg, clinician's fatigue and workload) of adverse rare events. To assess causality requires either randomization or sophisticated methods applied to carefully designed observational studies. In most cases, randomized trials are not feasible in the context of root cause analysis. Using observational data for causal inference, however, presents many challenges in both the design and analysis stages. Methods for observational causal inference often fall outside the toolbox of even well-trained statisticians, thus necessitating workforce training. METHODS This article synthesizes the key concepts and statistical perspectives for causal inference, and describes available educational resources, with a focus on observational clinical data. The target audience for this review is clinical researchers with training in fundamental statistics or epidemiology, and statisticians collaborating with those researchers. RESULTS The available literature includes a number of textbooks and thousands of review articles. However, using this literature for independent study or clinical training programs is extremely challenging for numerous reasons. First, the published articles often assume an advanced technical background with different notations and terminology. Second, they may be written from any number of perspectives across statistics, epidemiology, computer science, or philosophy. Third, the methods are rapidly expanding and thus difficult to capture within traditional publications. Fourth, even the most fundamental aspects of causal inference (eg, framing the causal question as a target trial) often receive little or no coverage. 
This review presents an overview of (1) key concepts and frameworks for causal inference and (2) online documents that are publicly available for better assisting researchers to gain the necessary perspectives for functioning effectively within a multidisciplinary team. CONCLUSION A familiarity with causal inference methods can help risk managers empirically verify, from observed events, the true causes of adverse sentinel events.
|