1
Potter LN, Yap J, Dempsey W, Wetter DW, Nahum-Shani I. Integrating Intensive Longitudinal Data (ILD) to Inform the Development of Dynamic Theories of Behavior Change and Intervention Design: a Case Study of Scientific and Practical Considerations. Prevention Science: The Official Journal of the Society for Prevention Research 2023; 24:1659-1671. [PMID: 37060480 PMCID: PMC10576833 DOI: 10.1007/s11121-023-01495-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/16/2023] [Indexed: 04/16/2023]
Abstract
The increasing sophistication of mobile and sensing technology has enabled the collection of intensive longitudinal data (ILD) concerning dynamic changes in an individual's state and context. ILD can be used to develop dynamic theories of behavior change which, in turn, can be used to provide a conceptual framework for the development of just-in-time adaptive interventions (JITAIs) that leverage advances in mobile and sensing technology to determine when and how to intervene. As such, JITAIs hold tremendous potential in addressing major public health concerns such as cigarette smoking, which can recur and arise unexpectedly. In tandem, a growing number of studies have utilized multiple methods to collect data on a particular dynamic construct of interest from the same individual. This approach holds promise in providing investigators with a significantly more detailed view than ever before of how behavior change processes unfold within the same individual. However, nuanced challenges relating to coarse data, noisy data, and incoherence among data sources are introduced. In this manuscript, we use a mobile health (mHealth) study on smokers motivated to quit (Break Free; R01MD010362) to illustrate these challenges. Practical approaches to integrate multiple data sources are discussed within the greater scientific context of developing dynamic theories of behavior change and JITAIs.
Affiliation(s)
- Lindsey N Potter
- Center for Health Outcomes and Population Equity (Center for HOPE), Huntsman Cancer Institute, Salt Lake City, UT, USA.
- Department of Population Health Sciences, University of Utah, Salt Lake City, UT, USA.
- Jamie Yap
- Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
- Walter Dempsey
- Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
- Center for Methodologies for Adapting and Personalizing Prevention, Treatment, and Recovery Services for SUD and HIV (MAPS Center), University of Michigan, Ann Arbor, MI, USA
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- David W Wetter
- Center for Health Outcomes and Population Equity (Center for HOPE), Huntsman Cancer Institute, Salt Lake City, UT, USA
- Department of Population Health Sciences, University of Utah, Salt Lake City, UT, USA
- Inbal Nahum-Shani
- Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
2
Mojtabai R. Estimating the Prevalence of Substance Use Disorders in the US Using the Benchmark Multiplier Method. JAMA Psychiatry 2022; 79:1074-1080. [PMID: 36129721 PMCID: PMC9494265 DOI: 10.1001/jamapsychiatry.2022.2756] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/30/2022] [Accepted: 07/25/2022] [Indexed: 11/14/2022]
Abstract
Importance: Prevalence estimates of substance use disorders in the US rely on general population surveys. However, major population groups, such as homeless individuals and institutionalized individuals, are not captured by these surveys, and participants may underreport substance use.
Objective: To estimate the prevalence of substance use disorders in the US.
Design, Setting, and Participants: The benchmark multiplier method was used to estimate the prevalence of alcohol, cannabis, opioid, and stimulant use disorders based on data from the Transformed Medicaid Statistical Information System (T-MSIS) (the benchmark) and the National Survey on Drug Use and Health (NSDUH) (the multiplier) for 2018 and 2019. T-MSIS collects administrative data on Medicaid beneficiaries 12 years and older with full or comprehensive benefits. NSDUH is a nationally representative annual cross-sectional survey of people 12 years and older. Data were analyzed from February to June 2022.
Main Outcomes and Measures: Prevalence of substance use disorders was estimated using the benchmark multiplier method based on T-MSIS and NSDUH data. Confidence intervals for the multiplier method estimates were computed using Monte Carlo simulations. Sensitivity of prevalence estimates to variations in multiplier values was assessed.
Results: This study included Medicaid beneficiaries 12 years and older accessing treatment services in the past year with diagnoses of alcohol (n = 1 017 308 in 2018; n = 1 041 357 in 2019), cannabis (n = 643 737; n = 644 780), opioid (n = 1 406 455; n = 1 575 219), and stimulant (n = 610 858; n = 657 305) use disorders and NSDUH participants with 12-month DSM-IV alcohol (n = 3390 in 2018; n = 3363 in 2019), cannabis (n = 1426; n = 1604), opioid (n = 448; n = 369), and stimulant (n = 545; n = 559) use disorders. The benchmark multiplier prevalence estimates were higher than NSDUH estimates for every type of substance use disorder in both years and in the combined 2018 to 2019 sample: 20.27% (95% CI, 17.04-24.71) vs 5.34% (95% CI, 5.10-5.58), respectively, for alcohol; 7.57% (95% CI, 5.96-9.93) vs 1.68% (95% CI, 1.59-1.79) for cannabis; 3.46% (95% CI, 2.97-4.12) vs 0.68% (0.60-0.78) for opioid; and 1.91% (95% CI, 1.63-2.30) vs 0.85% (95% CI, 0.75-0.96) for stimulant use disorders. In sensitivity analyses, the differences between the benchmark multiplier method and NSDUH estimates persisted over a wide range of potential multiplier values.
Conclusions and Relevance: The findings in this study reflect a higher national prevalence of substance use disorders than that represented by NSDUH estimates, suggesting a greater burden of these conditions in the US.
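As a rough illustration of the benchmark multiplier idea named in the abstract above, the sketch below divides a benchmark count by a multiplier (the assumed proportion of all cases that appear in the benchmark source) and then divides by the population size. The function names, the Beta distribution used for the Monte Carlo interval, and every number are hypothetical choices made for illustration; they are not taken from the study, whose actual simulation procedure is not described in the abstract.

```python
import random


def benchmark_multiplier_prevalence(benchmark_count, multiplier, population):
    """Prevalence estimate: (benchmark count / multiplier) / population."""
    if not 0 < multiplier <= 1:
        raise ValueError("multiplier must be a proportion in (0, 1]")
    estimated_cases = benchmark_count / multiplier
    return estimated_cases / population


def monte_carlo_interval(benchmark_count, alpha, beta, population,
                         draws=2000, seed=0):
    """Percentile interval obtained by redrawing the multiplier.

    Assumption: multiplier uncertainty is modelled as Beta(alpha, beta).
    """
    rng = random.Random(seed)
    sims = sorted(
        benchmark_multiplier_prevalence(
            benchmark_count, rng.betavariate(alpha, beta), population
        )
        for _ in range(draws)
    )
    # 2.5th and 97.5th percentiles of the simulated estimates
    return sims[int(0.025 * draws)], sims[int(0.975 * draws) - 1]


# Hypothetical inputs: 1,000 treated cases in the benchmark source, 25% of
# all cases assumed to appear there, and a population of 100,000.
point = benchmark_multiplier_prevalence(1000, 0.25, 100_000)
low, high = monte_carlo_interval(1000, 25, 75, 100_000)
print(f"prevalence {point:.2%}, interval {low:.2%} to {high:.2%}")
```

A smaller multiplier implies more cases outside the benchmark source and therefore a larger prevalence estimate, which is why the abstract reports a sensitivity analysis over multiplier values.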
Affiliation(s)
- Ramin Mojtabai
- Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University, Baltimore, Maryland
3
Chen S, Haziza D. General purpose multiply robust data integration procedures for handling non‐probability samples. Scand Stat Theory Appl 2022. [DOI: 10.1111/sjos.12605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Sixia Chen
- University of Oklahoma Health Sciences Center, Biostatistics and Epidemiology, 801 NE 13th St, Oklahoma City, Oklahoma, United States
- David Haziza
- University of Ottawa, Mathematics and Statistics, Ottawa, Ontario, Canada
4
Bayesian Bootstrap in Multiple Frames. Stats 2022. [DOI: 10.3390/stats5020034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
Multiple frames are becoming increasingly relevant due to the spread of surveys conducted via registers. In this regard, estimators of population quantities have been proposed, including the multiplicity estimator. In all cases, variance estimation still remains a matter of debate. This paper explores the potential of Bayesian bootstrap techniques for computing such estimators. The suitability of the method, which is compared to the existing frequentist bootstrap, is shown by conducting a small-scale simulation study and a case study.
5
Chen S, Yang S, Kim JK. Nonparametric Mass Imputation for Data Integration. Journal of Survey Statistics and Methodology 2022; 10:1-24. [PMID: 35083356 PMCID: PMC8784012 DOI: 10.1093/jssam/smaa036] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Data integration combining a probability sample with another nonprobability sample is an emerging area of research in survey sampling. We consider the case when the study variable of interest is measured only in the nonprobability sample, but comparable auxiliary information is available for both data sources. We consider mass imputation for the probability sample using the nonprobability data as the training set for imputation. The parametric mass imputation is sensitive to parametric model assumptions. To develop improved and robust methods, we consider nonparametric mass imputation for data integration. In particular, we consider kernel smoothing for a low-dimensional covariate and generalized additive models for a relatively high-dimensional covariate for imputation. Asymptotic theories and variance estimation are developed. Simulation studies and real applications show the benefits of our proposed methods over parametric counterparts.
Affiliation(s)
- Sixia Chen
- Address correspondence to Sixia Chen, Department of Biostatistics and Epidemiology, The University of Oklahoma Health Sciences Center, Oklahoma City, OK 73126-0901, USA
6
Blumberg SJ, Parker JD, Moyer BC. National Health Interview Survey, COVID-19, and Online Data Collection Platforms: Adaptations, Tradeoffs, and New Directions. Am J Public Health 2021; 111:2167-2175. [PMID: 34878857 PMCID: PMC8667832 DOI: 10.2105/ajph.2021.306516] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Accepted: 06/28/2021] [Indexed: 11/04/2022]
Abstract
High-quality data are accurate, relevant, and timely. Large national health surveys have always balanced the implementation of these quality dimensions to meet the needs of diverse users. The COVID-19 pandemic shifted these balances, with both disrupted survey operations and a critical need for relevant and timely health data for decision-making. The National Health Interview Survey (NHIS) responded to these challenges with several operational changes to continue production in 2020. However, data files from the 2020 NHIS were not expected to be publicly available until fall 2021. To fill the gap, the National Center for Health Statistics (NCHS) turned to 2 online data collection platforms, the Census Bureau's Household Pulse Survey (HPS) and the NCHS Research and Development Survey (RANDS), to collect COVID-19-related data more quickly. This article describes the adaptations of NHIS and the use of HPS and RANDS during the pandemic in the context of the recently released Framework for Data Quality from the Federal Committee on Statistical Methodology. (Am J Public Health. 2021;111(12):2167-2175. https://doi.org/10.2105/AJPH.2021.306516).
Affiliation(s)
- Stephen J Blumberg
- All of the authors are with the National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD
- Jennifer D Parker
- All of the authors are with the National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD
- Brian C Moyer
- All of the authors are with the National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD
7
van Hasselt M. Data triangulation for substance abuse research. Addiction 2021; 116:2613-2615. [PMID: 34155713 DOI: 10.1111/add.15596] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/19/2021] [Accepted: 05/27/2021] [Indexed: 01/24/2023]
Affiliation(s)
- Martijn van Hasselt
- Department of Economics, The University of North Carolina at Greensboro, Greensboro, NC, USA
8
Frederiksen KS, Hesse M, Grittner U, Pedersen MU. Estimating perceived parental substance use disorder: Using register data to adjust for non-participation in survey research. Addict Behav 2021; 119:106897. [PMID: 33878599 DOI: 10.1016/j.addbeh.2021.106897] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2020] [Revised: 02/22/2021] [Accepted: 02/24/2021] [Indexed: 11/18/2022]
Abstract
Aims: To estimate the prevalence of parental substance use disorder (PSUD) in the general population based on young adults' reports adjusted for non-participation using register-based indicators of PSUD.
Design: A national sample survey study combined with a retrospective register-based study.
Setting: Denmark.
Participants: 10,414 young people (aged 15-25 years) invited to two national sample surveys in 2014 and 2015 (5,755 participants and 4,659 non-participants).
Measurements: A crude prevalence of PSUD was calculated based on participants' reports. Parental data from medical, mortality, prescription, and treatment registers (from the young adults' birth until the time of the surveys) were used to estimate a register-based prevalence of PSUD for both participants and non-participants. Differences between participants and non-participants were analysed using bivariate comparisons. Inverse probability weighting was used to adjust for bias due to non-participation. The crude prevalence of PSUD based on survey data was adjusted using the ratio of incidence proportion of the register-based PSUD compared with the survey-based PSUD.
Findings: A total of 731 (12.7%) of the 5,755 survey participants reported PSUD. Register-based PSUD was more common among non-participants (856/4,659; 18.4%) compared with participants (738/5,755; 12.8%, OR = 1.53, 95% CI 1.38-1.70). The adjusted estimate of the survey-based PSUD increased by 2.5 percentage points, from 12.7% to 15.2%.
Conclusions: In the absence of register data, youth-reported PSUD is likely to underestimate the number of young people experiencing PSUD.
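The counts reported in the Frederiksen et al. abstract above are enough to retrace its headline figures. The sketch below is one possible reading of the ratio adjustment: it scales the survey-based prevalence by the ratio of register-based prevalence among all invitees to register-based prevalence among participants. That reading is an assumption, and the paper's actual procedure (which also involves inverse probability weighting) may differ. All counts come from the abstract; the variable names are illustrative.

```python
# Counts taken from the abstract
participants = 5755
non_participants = 4659
survey_cases = 731                # participants reporting PSUD
register_cases_part = 738         # register-based PSUD, participants
register_cases_nonpart = 856      # register-based PSUD, non-participants

# Crude survey-based prevalence among participants
survey_prev = survey_cases / participants

# Register-based prevalence among participants and among all invitees
register_prev_part = register_cases_part / participants
register_prev_all = (register_cases_part + register_cases_nonpart) / (
    participants + non_participants
)

# Assumed adjustment: scale the survey estimate by the register-based ratio
ratio = register_prev_all / register_prev_part
adjusted_prev = survey_prev * ratio

# Odds ratio of register-based PSUD, non-participants vs participants
odds_nonpart = register_cases_nonpart / (non_participants - register_cases_nonpart)
odds_part = register_cases_part / (participants - register_cases_part)
odds_ratio = odds_nonpart / odds_part

print(f"crude {survey_prev:.1%}, adjusted {adjusted_prev:.1%}, OR {odds_ratio:.2f}")
```

Under this reading the adjusted value rounds to the 15.2% and the odds ratio to the 1.53 reported in the abstract, which suggests the interpretation is at least consistent with the published figures.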
Affiliation(s)
- Morten Hesse
- Centre for Alcohol and Drug Research, Aarhus University, Denmark
- Ulrike Grittner
- Institute of Biometry and Clinical Epidemiology, Charité - Universitätsmedizin Berlin, Germany; Berlin Institute of Health, Berlin, Germany
9
Dever JA, Amaya A, Srivastav A, Lu PJ, Roycroft J, Stanley M, Stringer MC, Bostwick MG, Greby SM, Santibanez TA, Williams WW. Fit for Purpose in Action: Design, Implementation, and Evaluation of the National Internet Flu Survey. Journal of Survey Statistics and Methodology 2021; 9:449-476. [PMID: 36060551 PMCID: PMC9434706 DOI: 10.1093/jssam/smz050] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
Researchers strive to design and implement high-quality surveys to maximize the utility of the data collected. The definitions of quality and usefulness, however, vary from survey to survey and depend on the analytic needs. Survey teams must evaluate the trade-offs of various decisions, such as when results are needed and their required level of precision, in addition to practical constraints like budget, before finalizing the design. Characteristics within the concept of fit for purpose (FfP) can provide the framework for considering the trade-offs. Furthermore, this tool can enable an evaluation of quality for the resulting estimates. Implementation of a FfP framework in this context, however, is not straightforward. In this article, we provide the reader with a glimpse of a FfP framework in action for obtaining estimates of early season influenza vaccination coverage and of knowledge, attitudes, behaviors, and barriers related to influenza and influenza prevention among civilian noninstitutionalized adults aged 18 years and older in the United States. The result is the National Internet Flu Survey (NIFS), an annual, two-week internet survey sponsored by the US Centers for Disease Control and Prevention. In addition to critical design decisions, we use the established NIFS FfP framework to discuss the quality of the NIFS in meeting the intended objectives. We highlight aspects that work well and other survey traits requiring further evaluation. Differences found in comparing the NIFS to the National Flu Survey, the National Health Interview Survey, and Behavioral Risk Factor Surveillance System are discussed via their respective FfP characteristics. The findings presented here highlight the importance of the FfP framework for designing surveys, defining data quality, and providing a set of metrics used to advertise the intended use of the survey data and results.
Affiliation(s)
- Jill A. Dever
- Address correspondence to Jill A. Dever, RTI International, 701 13th St. NW, Suite 750, Washington, DC 20005-3967, USA
- Ashley Amaya
- RTI International, 701 13th St NW, Suite 750, Washington, DC 20005-3967, USA
- Anup Srivastav
- National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, 1600 Clifton Road, Atlanta, GA 30329, USA and Leidos Inc., 11951 Freedom Drive, Reston, VA 20190, USA
- Peng-Jun Lu
- National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, 1600 Clifton Road, Atlanta, GA 30329, USA
- Jessica Roycroft
- RTI International, 3040 East Cornwallis Road, Research Triangle Park, NC, 27709-2194, USA
- Marshica Stanley
- RTI International, 3040 East Cornwallis Road, Research Triangle Park, NC, 27709-2194, USA
- M. Christopher Stringer
- formerly at RTI International, is with the U.S. Census Bureau, 4600 Silver Hill Road, Hillcrest Heights, MD 20746, USA
- Michael G. Bostwick
- formerly at RTI International, is with Squarespace, 8 Clarkson St, New York, NY 10014, USA
- Stacie M. Greby
- National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, 1600 Clifton Road, Atlanta, GA 30329, USA
- Tammy A. Santibanez
- National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, 1600 Clifton Road, Atlanta, GA 30329, USA
- Walter W. Williams
- National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, 1600 Clifton Road, Atlanta, GA 30329, USA
10
King JH, Hall MAK, Goodman RA, Posner SF. Life in Data Sets: Locating and Accessing Data on the Health of Americans Across the Life Span. Journal of Public Health Management and Practice 2021; 27:E126-E142. [PMID: 31688741 PMCID: PMC7190403 DOI: 10.1097/phh.0000000000001079] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/25/2022]
Abstract
Context: The US government manages a large number of data sets, including federally funded data collection activities that examine infectious and chronic conditions, as well as risk and protective factors for adverse health outcomes. Although there currently is no mature, comprehensive metadata repository of existing data sets, US federal agencies are working to develop and make metadata repositories available that will improve discoverability. However, because these repositories are not yet operating at full capacity, researchers must rely on their own knowledge of the field to identify available data sets.
Program or Policy: We sought to identify and consolidate a practical and annotated listing of those data sets.
Implementation and/or Dissemination: Creative use of data resources to address novel questions is an important research skill in a wide range of fields including public health. This report identifies, promotes, and encourages the use of a range of data sources for health, behavior, economic, and policy research efforts across the life span.
Evaluation: We identified and organized 28 federal data sets by the age-group of primary focus; not all groups are mutually exclusive. These data sets collectively represent a rich source of information that can be used to conduct descriptive epidemiologic studies.
Discussion: The data sets identified in this article are not intended to represent an exhaustive list of all available data sets. Rather, we present an introduction/overview of the current federal data collection landscape and some of its largest and most frequently utilized data sets.
Affiliation(s)
- Jaron Hoani King
- Department of Public Health, Brigham Young University, Provo, Utah (Mr King); Cherokee Nation Assurance (Ms Hall), National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia (Ms Hall and Dr Posner); and Department of Family and Preventive Medicine, Emory University School of Medicine, Atlanta, Georgia (Dr Goodman)
11
Kislaya I, Leite A, Perelman J, Machado A, Torres AR, Tolonen H, Nunes B. Combining self-reported and objectively measured survey data to improve hypertension prevalence estimates: Portuguese experience. Arch Public Health 2021; 79:45. [PMID: 33827693 PMCID: PMC8028082 DOI: 10.1186/s13690-021-00562-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2020] [Accepted: 03/15/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Accurate data on hypertension are essential to inform decision-making. Hypertension prevalence may be underestimated by population-based surveys due to misclassification of health status by participants. Therefore, adjustment for misclassification bias is required when relying on self-reports. This study aims to quantify misclassification bias in self-reported hypertension prevalence and prevalence ratios in the Portuguese component of the European Health Interview Survey (INS2014), and to illustrate the application of multiple imputation (MIME) for bias correction using measured high blood pressure data from the first Portuguese health examination survey (INSEF).
Methods: We assumed that objectively measured hypertension status was missing for INS2014 participants (n = 13,937) and imputed it using INSEF (n = 4910) as auxiliary data. Self-reported, objectively measured and MIME-corrected hypertension prevalence and prevalence ratios (PR) by sex, age group and education were estimated. Bias in self-reported and MIME-corrected estimates was computed using objectively measured INSEF data as a gold standard.
Results: Self-reported INS2014 data underestimated hypertension prevalence in all population subgroups, with misclassification bias ranging from 5.2 to 18.6 percentage points (pp). After MIME-correction, prevalence estimates increased and became closer to objectively measured ones, with bias reduced to between 0 and 5.7 pp. Compared to objectively measured INSEF, self-reported INS2014 data considerably underestimated prevalence ratio by sex (PR = 0.8, 95% CI = [0.7, 0.9] vs. PR = 1.2, 95% CI = [1.1, 1.4]). MIME successfully corrected direction of association with sex in bivariate (PR = 1.1, 95% CI = [1.0, 1.3]) and multivariate analyses (PR = 1.2, 95% CI = [1.0, 1.3]). Misclassification bias in hypertension prevalence ratios by education and age group was less pronounced and did not require correction in multivariate analyses.
Conclusions: Our results highlight the importance of misclassification bias analysis in self-reported hypertension. Multiple imputation is a feasible approach to adjust for misclassification bias in prevalence estimates and exposure-outcomes associations in survey data.
Affiliation(s)
- Irina Kislaya
- Department of Epidemiology, National Health Institute Doutor Ricardo Jorge, Lisbon, Portugal.
- NOVA National School of Public Health, Public Health Research Centre, Universidade NOVA de Lisboa, Lisbon, Portugal.
- Comprehensive Health Research Center (CHRC), Universidade NOVA de Lisboa, Lisbon, Portugal.
- Andreia Leite
- NOVA National School of Public Health, Public Health Research Centre, Universidade NOVA de Lisboa, Lisbon, Portugal
- Comprehensive Health Research Center (CHRC), Universidade NOVA de Lisboa, Lisbon, Portugal
- Julian Perelman
- NOVA National School of Public Health, Public Health Research Centre, Universidade NOVA de Lisboa, Lisbon, Portugal
- Comprehensive Health Research Center (CHRC), Universidade NOVA de Lisboa, Lisbon, Portugal
- Ausenda Machado
- Department of Epidemiology, National Health Institute Doutor Ricardo Jorge, Lisbon, Portugal
- NOVA National School of Public Health, Public Health Research Centre, Universidade NOVA de Lisboa, Lisbon, Portugal
- Comprehensive Health Research Center (CHRC), Universidade NOVA de Lisboa, Lisbon, Portugal
- Ana Rita Torres
- Department of Epidemiology, National Health Institute Doutor Ricardo Jorge, Lisbon, Portugal
- Hanna Tolonen
- Department of Public Health and Welfare, Finnish Institute for Health and Welfare (THL), Helsinki, Finland
- Baltazar Nunes
- Department of Epidemiology, National Health Institute Doutor Ricardo Jorge, Lisbon, Portugal
- NOVA National School of Public Health, Public Health Research Centre, Universidade NOVA de Lisboa, Lisbon, Portugal
- Comprehensive Health Research Center (CHRC), Universidade NOVA de Lisboa, Lisbon, Portugal
12
Kim J, Tam S. Data Integration by Combining Big Data and Survey Sample Data for Finite Population Inference. Int Stat Rev 2020. [DOI: 10.1111/insr.12434] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/27/2022]
Affiliation(s)
- Jae‐Kwang Kim
- Department of Statistics, Iowa State University, Ames, Iowa, USA
- Siu‐Ming Tam
- Methodology Division, Australian Bureau of Statistics, Canberra and School of Mathematics and Statistics, University of Wollongong, Wollongong, New South Wales, Australia
13
Parker J, Miller K, He Y, Scanlon P, Cai B, Shin HC, Parsons V, Irimata K. Overview and Initial Results of the National Center for Health Statistics' Research and Development Survey. Statistical Journal of the IAOS 2020; 36:1199-1211. [PMID: 35923778 PMCID: PMC9345606 DOI: 10.3233/sji-200678] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/15/2023]
Abstract
The National Center for Health Statistics is assessing the usefulness of recruited web panels in multiple research areas. One research area examines the use of close-ended probe questions and split-panel experiments for evaluating question-response patterns. Another research area is the development of statistical methodology to leverage the strength of national survey data to evaluate, and possibly improve, health estimates from recruited panels. Recruited web panels, with their lower cost and faster production cycle, in combination with established population health surveys, may be useful for some purposes for statistical agencies. Our initial results indicate that web survey data from a recruited panel can be used for question evaluation studies without affecting other survey content. However, the success of these data to provide estimates that align with those from large national surveys will depend on many factors, including further understanding of design features of the recruited panel (e.g. coverage and mode effects), the statistical methods and covariates used to obtain the original and adjusted weights, and the health outcomes of interest.
Affiliation(s)
- Jennifer Parker
- Division of Research and Methodology, National Center for Health Statistics, 3311 Toledo Road, #4650, Hyattsville, MD 20782 USA
- Kristen Miller
- Division of Research and Methodology, National Center for Health Statistics, 3311 Toledo Road, #4650, Hyattsville, MD 20782 USA
- Yulei He
- Division of Research and Methodology, National Center for Health Statistics, 3311 Toledo Road, #4650, Hyattsville, MD 20782 USA
- Paul Scanlon
- Division of Research and Methodology, National Center for Health Statistics, 3311 Toledo Road, #4650, Hyattsville, MD 20782 USA
- Bill Cai
- Division of Research and Methodology, National Center for Health Statistics, 3311 Toledo Road, #4650, Hyattsville, MD 20782 USA
- Hee-Choon Shin
- Division of Research and Methodology, National Center for Health Statistics, 3311 Toledo Road, #4650, Hyattsville, MD 20782 USA
- Van Parsons
- Division of Research and Methodology, National Center for Health Statistics, 3311 Toledo Road, #4650, Hyattsville, MD 20782 USA
- Katherine Irimata
- Division of Research and Methodology, National Center for Health Statistics, 3311 Toledo Road, #4650, Hyattsville, MD 20782 USA
14
Raghunathan T, Ghosh K, Rosen A, Imbriano P, Stewart S, Bondarenko I, Messer K, Berglund P, Shaffer J, Cutler D. Combining Information from Multiple Data Sources to Assess Population Health. Journal of Survey Statistics and Methodology 2020; 9:598-625. [PMID: 34337089 PMCID: PMC8324014 DOI: 10.1093/jssam/smz047] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/01/2023]
Abstract
Information about an extensive set of health conditions on a well-defined sample of subjects is essential for assessing population health, gauging the impact of various policies, modeling costs, and studying health disparities. Unfortunately, there is no single data source that provides accurate information about health conditions. We combine information from several administrative and survey data sets to obtain model-based dummy variables for 107 health conditions (diseases, preventive measures, and screening for diseases) for elderly (age 65 and older) subjects in the Medicare Current Beneficiary Survey (MCBS) over the fourteen-year period, 1999-2012. The MCBS has prevalence of diseases assessed based on Medicare claims and provides detailed information on all health conditions but is prone to underestimation bias. The National Health and Nutrition Examination Survey (NHANES), on the other hand, collects self-reports and physical/laboratory measures only for a subset of the 107 health conditions. Neither source provides complete information, but we use them together to derive model-based corrected dummy variables in MCBS for the full range of existing health conditions using a missing data and measurement error model framework. We create multiply imputed dummy variables and use them to construct the prevalence rate and trend estimates. The broader goal, however, is to use these corrected or modeled dummy variables for a multitude of policy analysis, cost modeling, and analysis of other relationships either using them as predictors or as outcome variables.
Affiliation(s)
- Trivellore Raghunathan
  - Address correspondence to Trivellore Raghunathan, Department of Biostatistics, 1415 Washington Heights, University of Michigan, Ann Arbor, MI 48109, USA;
- Kaushik Ghosh
  - National Bureau of Economic Research (NBER), 1050 Massachusetts Ave, Cambridge, MA 02138
- Allison Rosen
  - Department of Quantitative Health Sciences, University of Massachusetts Medical School, 368 Plantation Street, AS9-1083, Worcester, MA 01655; NBER
- Paul Imbriano
  - Department of Biostatistics, 1415 Washington Heights, University of Michigan, Ann Arbor, MI 48109
- Susan Stewart
  - National Bureau of Economic Research (NBER), 1050 Massachusetts Ave, Cambridge, MA 02138
- Irina Bondarenko
  - Department of Biostatistics, 1415 Washington Heights, University of Michigan, Ann Arbor, MI 48109
- Kassandra Messer
  - Survey Research Center, Institute for Social Research, 426 Thompson Street, University of Michigan, Ann Arbor, MI 48106
- Patricia Berglund
  - Survey Research Center, Institute for Social Research, 426 Thompson Street, University of Michigan, Ann Arbor, MI 48106
- David Cutler
  - Department of Economics, Harvard University, 1805 Cambridge St, Cambridge, MA 02138; NBER
|
15
|
de Waal T, van Delden A, Scholtus S. Multi‐source Statistics: Basic Situations and Methods. Int Stat Rev 2020. [DOI: 10.1111/insr.12352] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/27/2022]
Affiliation(s)
- Ton de Waal
  - Statistics Netherlands, PO Box 24500, The Hague 2490 HA, The Netherlands
  - Department of Methods and Statistics, Tilburg University, PO Box 90153, Tilburg 5000 LE, The Netherlands
- Arnout van Delden
  - Statistics Netherlands, PO Box 24500, The Hague 2490 HA, The Netherlands
- Sander Scholtus
  - Statistics Netherlands, PO Box 24500, The Hague 2490 HA, The Netherlands
|
16
|
Kalton G. Developments in Survey Research over the Past 60 Years: A Personal Perspective. Int Stat Rev 2018. [DOI: 10.1111/insr.12287] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/27/2022]
|
17
|
Affiliation(s)
- Mary E. Thompson
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada
|
18
|
Marker DA, Mardon R, Jenkins F, Campione J, Nooney J, Li J, Saydeh S, Zhang X, Shrestha S, Rolka D. State-level estimation of diabetes and prediabetes prevalence: Combining national and local survey data and clinical data. Stat Med 2018; 37:3975-3990. [PMID: 29931829 DOI: 10.1002/sim.7848] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/04/2017] [Revised: 02/22/2018] [Accepted: 05/18/2018] [Indexed: 11/11/2022]
Abstract
Many statisticians and policy researchers are interested in using data generated through the normal delivery of health care services, rather than carefully designed and implemented population-representative surveys, to estimate disease prevalence. These larger databases allow for the estimation of smaller geographies, for example, states, at potentially lower expense. However, these health care records frequently do not cover all of the population of interest and may not collect some covariates that are important for accurate estimation. In a recent paper, the authors have described how to adjust for the incomplete coverage of administrative claims data and electronic health records at the state or local level. This article illustrates how to adjust and combine multiple data sets, namely, national surveys, state-level surveys, claims data, and electronic health record data, to improve estimates of diabetes and prediabetes prevalence, along with the estimates of the method's accuracy. We demonstrate and validate the method using data from three jurisdictions (Alabama, California, and New York City). This method can be applied more generally to other areas and other data sources.
Affiliation(s)
- Sharon Saydeh
  - Centers for Disease Control and Prevention, Atlanta, GA, USA
- Xuanping Zhang
  - Centers for Disease Control and Prevention, Atlanta, GA, USA
- Sundar Shrestha
  - Centers for Disease Control and Prevention, Atlanta, GA, USA
- Deborah Rolka
  - Centers for Disease Control and Prevention, Atlanta, GA, USA
|
19
|
|