Reference Citation Analysis

For: Beesley LJ, Mukherjee B. Case studies in bias reduction and inference for electronic health record data with selection bias and phenotype misclassification. Stat Med 2022;41:5501-5516. [PMID: 36131394 PMCID: PMC9826451 DOI: 10.1002/sim.9579] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/27/2021] [Revised: 08/12/2022] [Accepted: 08/13/2022] [Indexed: 01/11/2023]

Cited by Other Article(s)

Salvatore M, Kundu R, Shi X, Friese CR, Lee S, Fritsche LG, Mondul AM, Hanauer D, Pearce CL, Mukherjee B. To weight or not to weight? The effect of selection bias in 3 large electronic health record-linked biobanks and recommendations for practice. J Am Med Inform Assoc 2024;31:1479-1492. [PMID: 38742457 PMCID: PMC11187425 DOI: 10.1093/jamia/ocae098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 04/14/2024] [Accepted: 04/18/2024] [Indexed: 05/16/2024]

Abstract

OBJECTIVES

To develop recommendations regarding the use of weights to reduce selection bias for commonly performed analyses using electronic health record (EHR)-linked biobank data.

MATERIALS AND METHODS

We mapped diagnosis (ICD code) data to standardized phecodes from 3 EHR-linked biobanks with varying recruitment strategies: All of Us (AOU; n = 244 071), Michigan Genomics Initiative (MGI; n = 81 243), and UK Biobank (UKB; n = 401 167). Using 2019 National Health Interview Survey data, we constructed selection weights for AOU and MGI to better represent the US adult population. We used weights previously developed for UKB to represent the UKB-eligible population. We conducted 4 common analyses comparing unweighted and weighted results.

RESULTS

For AOU and MGI, estimated phecode prevalences decreased after weighting (weighted-unweighted median phecode prevalence ratio [MPR]: 0.82 and 0.61), while UKB estimates increased (MPR: 1.06). Weighting minimally impacted latent phenome dimensionality estimation. Comparing weighted versus unweighted phenome-wide association study for colorectal cancer, the strongest associations remained unaltered, with considerable overlap in significant hits. Weighting affected the estimated log-odds ratio for sex and colorectal cancer to align more closely with national registry-based estimates.

DISCUSSION

Weighting had a limited impact on dimensionality estimation and large-scale hypothesis testing but impacted prevalence and association estimation. When interested in estimating effect size, specific signals from untargeted association analyses should be followed up by weighted analysis.

CONCLUSION

EHR-linked biobanks should report recruitment and selection mechanisms and provide selection weights with defined target populations. Researchers should consider their intended estimands, specify source and target populations, and weight EHR-linked biobank analyses accordingly.
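The weighted-to-unweighted median phecode prevalence ratio (MPR) reported in the Results above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy case matrix, and the toy selection weights are all hypothetical, and the sketch assumes a simple weighted proportion per phecode.

```python
import numpy as np

def prevalence(cases, weights=None):
    """Return the (optionally weighted) proportion of cases per phecode column."""
    cases = np.asarray(cases, dtype=float)
    if weights is None:
        return cases.mean(axis=0)
    weights = np.asarray(weights, dtype=float)
    # Weighted proportion: sum of weights among cases / sum of all weights.
    return (weights[:, None] * cases).sum(axis=0) / weights.sum()

def median_prevalence_ratio(cases, weights):
    """Median across phecodes of weighted prevalence / unweighted prevalence."""
    unweighted = prevalence(cases)
    weighted = prevalence(cases, weights)
    observed = unweighted > 0  # skip phecodes with no cases (ratio undefined)
    return float(np.median(weighted[observed] / unweighted[observed]))

# Hypothetical toy data: 4 participants (rows) x 3 phecodes (columns).
cases = np.array([
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
])
# Hypothetical selection weights: the first two participants are
# over-represented in the sample, so they are down-weighted.
weights = np.array([0.5, 0.5, 1.5, 1.5])

mpr = median_prevalence_ratio(cases, weights)
print(mpr)  # 0.5
```

An MPR below 1, as in this toy example, means that prevalences typically shrink after weighting, which is the direction the abstract reports for AOU and MGI.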

Affiliation(s)

Maxwell Salvatore Department of Epidemiology, University of Michigan, Ann Arbor, MI 48109-2029, United States Center for Precision Health Data Science, Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-2029, United States
Ritoban Kundu Center for Precision Health Data Science, Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-2029, United States Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-2029, United States
Xu Shi Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-2029, United States
Christopher R Friese Rogel Cancer Center, Michigan Medicine, University of Michigan, Ann Arbor, MI 48109-2029, United States Center for Improving Patient and Population Health, School of Nursing, University of Michigan, Ann Arbor, MI 48109-2029, United States Department of Health Management and Policy, University of Michigan, Ann Arbor, MI 48109-2029, United States
Seunggeun Lee Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-2029, United States Graduate School of Data Science, Seoul National University, Gwanak-gu, Seoul, Republic of Korea
Lars G Fritsche Center for Precision Health Data Science, Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-2029, United States Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-2029, United States Rogel Cancer Center, Michigan Medicine, University of Michigan, Ann Arbor, MI 48109-2029, United States
Alison M Mondul Department of Epidemiology, University of Michigan, Ann Arbor, MI 48109-2029, United States Rogel Cancer Center, Michigan Medicine, University of Michigan, Ann Arbor, MI 48109-2029, United States
David Hanauer Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI 48109-2054, United States
Celeste Leigh Pearce Department of Epidemiology, University of Michigan, Ann Arbor, MI 48109-2029, United States Rogel Cancer Center, Michigan Medicine, University of Michigan, Ann Arbor, MI 48109-2029, United States
Bhramar Mukherjee Department of Epidemiology, University of Michigan, Ann Arbor, MI 48109-2029, United States Center for Precision Health Data Science, Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-2029, United States Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-2029, United States

Razzaghi H, Goodwin Davies A, Boss S, Bunnell HT, Chen Y, Chrischilles EA, Dickinson K, Hanauer D, Huang Y, Ilunga KTS, Katsoufis C, Lehmann H, Lemas DJ, Matthews K, Mendonca EA, Morse K, Ranade D, Rosenman M, Taylor B, Walters K, Denburg MR, Forrest CB, Bailey LC. Systematic data quality assessment of electronic health record data to evaluate study-specific fitness: Report from the PRESERVE research study. PLOS Digital Health 2024;3:e0000527. [PMID: 38935590 PMCID: PMC11210795 DOI: 10.1371/journal.pdig.0000527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Accepted: 05/07/2024] [Indexed: 06/29/2024]

Abstract

Study-specific data quality testing is an essential part of minimizing analytic errors, particularly for studies making secondary use of clinical data. We applied a systematic and reproducible approach for study-specific data quality testing to the analysis plan for PRESERVE, a 15-site, EHR-based observational study of chronic kidney disease in children. This approach integrated widely adopted data quality concepts with healthcare-specific evaluation methods. We implemented two rounds of data quality assessment. The first produced high-level evaluation using aggregate results from a distributed query, focused on cohort identification and main analytic requirements. The second focused on extended testing of row-level data centralized for analysis. We systematized reporting and cataloguing of data quality issues, providing institutional teams with prioritized issues for resolution. We tracked improvements and documented anomalous data for consideration during analyses. The checks we developed identified 115 and 157 data quality issues in the two rounds, involving completeness, data model conformance, cross-variable concordance, consistency, and plausibility, extending traditional data quality approaches to address more complex stratification and temporal patterns. Resolution efforts focused on higher priority issues, given finite study resources. In many cases, institutional teams were able to correct data extraction errors or obtain additional data, avoiding exclusion of 2 institutions entirely and resolving 123 other gaps. Other results identified complexities in measures of kidney function, bearing on the study's outcome definition. Where limitations such as these are intrinsic to clinical data, the study team must account for them in conducting analyses. This study rigorously evaluated fitness of data for intended use. The framework is reusable and built on a strong theoretical underpinning. Significant data quality issues that would have otherwise delayed analyses or made data unusable were addressed. This study highlights the need for teams combining subject-matter and informatics expertise to address data quality when working with real world data.

Affiliation(s)

Hanieh Razzaghi Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
Amy Goodwin Davies Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
Samuel Boss Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
H. Timothy Bunnell Biomedical Research Informatics Center, Nemours Children’s Hospital, Wilmington, Delaware, United States of America
Yong Chen Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Elizabeth A. Chrischilles Department of Epidemiology, College of Public Health, University of Iowa, Iowa City, Iowa, United States of America
Kimberley Dickinson Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
David Hanauer Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, Michigan, United States of America
Yungui Huang IT Research and Innovation, Nationwide Children’s Hospital, Columbus, Ohio, United States of America
K. T. Sandra Ilunga Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
Chryso Katsoufis Division of Pediatric Nephrology, University of Miami Miller School of Medicine, Miami, Florida, United States of America
Harold Lehmann Biomedical Informatics & Data Science Section, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
Dominick J. Lemas Department of Health Outcomes & Biomedical Informatics, University of Florida, Gainesville, Florida, United States of America
Kevin Matthews Analytics Research Center, Children’s Hospital of Colorado, Aurora, Colorado, United States of America
Eneida A. Mendonca Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
Keith Morse Division of Pediatric Hospital Medicine, Stanford University School of Medicine, Stanford, California, United States of America
Daksha Ranade Biostatistics, Epidemiology, and Analytics in Research (BEAR), Seattle Children’s Hospital, Seattle, Washington, United States of America
Marc Rosenman Department of Pediatrics, Ann & Robert H. Lurie Children’s Hospital, Chicago, Illinois, United States of America
Bradley Taylor Clinical and Translational Science Institute, Medical College of Wisconsin, Milwaukee, Wisconsin, United States of America
Kellie Walters Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Michelle R. Denburg Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America Division of Nephrology, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
Christopher B. Forrest Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
L. Charles Bailey Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America

Garg E, Arguello-Pascualli P, Vishnyakova O, Halevy AR, Yoo S, Brooks JD, Bull SB, Gagnon F, Greenwood CMT, Hung RJ, Lawless JF, Lerner-Ellis J, Dennis JK, Abraham RJS, Garant JM, Thiruvahindrapuram B, Jones SJM, Strug LJ, Paterson AD, Sun L, Elliott LT. Canadian COVID-19 host genetics cohort replicates known severity associations. PLoS Genet 2024;20:e1011192. [PMID: 38517939 PMCID: PMC10990181 DOI: 10.1371/journal.pgen.1011192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 04/03/2024] [Accepted: 02/22/2024] [Indexed: 03/24/2024]

Affiliation(s)

Elika Garg Department of Statistics and Actuarial Science, Simon Fraser University, Vancouver, British Columbia, Canada Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, Ontario, Canada
Paola Arguello-Pascualli BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
Olga Vishnyakova Department of Statistics and Actuarial Science, Simon Fraser University, Vancouver, British Columbia, Canada Canada’s Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
Anat R. Halevy Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, Ontario, Canada
Samantha Yoo Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, Ontario, Canada School of Epidemiology and Public Health, University of Ottawa, Ottawa, Ontario, Canada
Jennifer D. Brooks Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
Shelley B. Bull Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
France Gagnon Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
Celia M. T. Greenwood Gerald Bronfman Department of Oncology, Department of Epidemiology, Biostatistics and Occupational Health, Department of Human Genetics, McGill University, Montreal, Quebec, Canada Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
Rayjean J. Hung Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
Jerald F. Lawless Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
Jordan Lerner-Ellis Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada Mount Sinai Hospital, Toronto, Ontario, Canada Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
Jessica K. Dennis BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
Rohan J. S. Abraham Canada’s Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
Jean-Michel Garant Canada’s Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
Bhooma Thiruvahindrapuram Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, Ontario, Canada
Steven J. M. Jones Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada Canada’s Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
CGEn HostSeq Initiative
Lisa J. Strug Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, Ontario, Canada Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
Andrew D. Paterson Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, Ontario, Canada Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
Lei Sun Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
Lloyd T. Elliott Department of Statistics and Actuarial Science, Simon Fraser University, Vancouver, British Columbia, Canada

Salvatore M, Kundu R, Shi X, Friese CR, Lee S, Fritsche LG, Mondul AM, Hanauer D, Pearce CL, Mukherjee B. To weight or not to weight? Studying the effect of selection bias in three large EHR-linked biobanks. medRxiv 2024:2024.02.12.24302710. [PMID: 38405832 PMCID: PMC10888982 DOI: 10.1101/2024.02.12.24302710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]

Abstract

Objective

To explore the role of selection bias adjustment by weighting electronic health record (EHR)-linked biobank data for commonly performed analyses.

Materials and methods

We mapped diagnosis (ICD code) data to standardized phecodes from three EHR-linked biobanks with varying recruitment strategies: All of Us (AOU; n=244,071), Michigan Genomics Initiative (MGI; n=81,243), and UK Biobank (UKB; n=401,167). Using 2019 National Health Interview Survey data, we constructed selection weights for AOU and MGI to be more representative of the US adult population. We used weights previously developed for UKB to represent the UKB-eligible population. We conducted four common descriptive and analytic tasks comparing unweighted and weighted results.

Results

For AOU and MGI, estimated phecode prevalences decreased after weighting (weighted-unweighted median phecode prevalence ratio [MPR]: 0.82 and 0.61), while UKB's estimates increased (MPR: 1.06). Weighting minimally impacted latent phenome dimensionality estimation. Comparing weighted versus unweighted PheWAS for colorectal cancer, the strongest associations remained unaltered and there was large overlap in significant hits. Weighting affected the estimated log-odds ratio for sex and colorectal cancer to align more closely with national registry-based estimates.

Discussion

Weighting had limited impact on dimensionality estimation and large-scale hypothesis testing but impacted prevalence and association estimation more. Results from untargeted association analyses should be followed by weighted analysis when effect size estimation is of interest for specific signals.

Conclusion

Kim J, Anthopolos R, Zhong J. Bias correction models for electronic health records data in the presence of non-random sampling. Biometrics 2024;80:ujae014. [PMID: 38488466 PMCID: PMC10941326 DOI: 10.1093/biomtc/ujae014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 01/12/2024] [Accepted: 02/20/2024] [Indexed: 03/18/2024]

Fritsche LG, Nam K, Du J, Kundu R, Salvatore M, Shi X, Lee S, Burgess S, Mukherjee B. Uncovering associations between pre-existing conditions and COVID-19 Severity: A polygenic risk score approach across three large biobanks. PLoS Genet 2023;19:e1010907. [PMID: 38113267 PMCID: PMC10763941 DOI: 10.1371/journal.pgen.1010907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 01/03/2024] [Accepted: 12/05/2023] [Indexed: 12/21/2023]

Affiliation(s)

Lars G. Fritsche Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
Kisung Nam Graduate School of Data Science, Seoul National University, Seoul, South Korea
Jiacong Du Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
Ritoban Kundu Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
Maxwell Salvatore Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
Xu Shi Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
Seunggeun Lee Graduate School of Data Science, Seoul National University, Seoul, South Korea
Stephen Burgess MRC Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom Cardiovascular Epidemiology Unit, University of Cambridge, Cambridge, United Kingdom
Bhramar Mukherjee Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America Michigan Institute for Data Science, University of Michigan, Ann Arbor, Michigan, United States of America

Yin J, Zhao M, Yang L. Comment on: Decreased psoas muscle area is a prognosticator for 90-day and 1-year survival in patients undergoing surgical treatment for spinal metastasis. Clin Nutr 2023;42:2082-2083. [PMID: 37316332 DOI: 10.1016/j.clnu.2023.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Accepted: 06/01/2023] [Indexed: 06/16/2023]

Ng DQ, Jia S, Wisseh C, Cadiz C, Nguyen M, Lee J, McBane S, Nguyen L, Chan A, Hurley-Kim K. Sociodemographic characteristics differ across routine adult vaccine cohorts: An All of Us descriptive study. J Am Pharm Assoc (2003) 2022;63:582-591.e20. [PMID: 36549934 DOI: 10.1016/j.japh.2022.11.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 11/07/2022] [Accepted: 11/07/2022] [Indexed: 11/15/2022]

Abstract

BACKGROUND

The National Institutes of Health All of Us (AoU) Research Program is currently building a database of 1 million+ adult subjects. With it, we describe the characteristics of those with documented vaccinations.

OBJECTIVES

To describe the sociodemographic, health status, and lifestyle factors associated with vaccinations.

METHODS

This is a retrospective study involving data from the AoU program (R2020Q4R2, N = 315,297). Five vaccine cohorts [influenza, hepatitis B (HBV), pneumococcal <65 years old, pneumococcal ≥65 years old, and human papillomavirus (HPV)] were generated based on vaccination history. The influenza cohort comprised participants with documented influenza vaccinations in electronic health records (EHRs) from September 2017 to May 2018. Other vaccine cohorts comprised participants with ≥1 lifetime record(s) of vaccination documented in the EHR by December 2018. The vaccine cohorts were compared to the overall AoU cohort. Descriptive statistics were generated using EHR- and survey-based sociodemographic, health, and lifestyle information. The SAMBA (0.9.0) R package was utilized to adjust for EHR selection and outcome misclassification biases to infer sources of disparity for pneumococcal vaccinations in older adults.

RESULTS

Cohort counts were as follows: influenza (n = 15,346), HBV (n = 6323), pneumococcal <65 (n = 15,217), pneumococcal ≥65 (n = 15,100), and HPV (n = 2125). All vaccine cohorts had higher proportions of White and non-Hispanic/Latino participants compared to the overall AoU cohort. The largest differences were found in pneumococcal age ≥65, with 80.2% White participants compared to 52.9% in the overall study population. Multivariable analysis revealed that race/ethnic disparities in pneumococcal vaccination among older adults were explained by biological sex, income, health insurance, and education-related variables.

CONCLUSION

Racial, ethnic, education, and income characteristics differ across the vaccine cohorts among AoU participants. These findings inform future utilization of large health databases in vaccine epidemiology research and emphasize the need for more targeted interventions that address differences in vaccine uptake.
