1
Sun J, Van Baelen L, Plettinckx E, Crawford FW. Dependence-Robust Confidence Intervals for Capture-Recapture Surveys. Journal of Survey Statistics and Methodology 2023; 11:1133-1154. [PMID: 37975066 PMCID: PMC10646701 DOI: 10.1093/jssam/smac031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2023]
Abstract
Capture-recapture (CRC) surveys are used to estimate the size of a population whose members cannot be enumerated directly. CRC surveys have been used to estimate the number of Coronavirus Disease 2019 (COVID-19) infections, people who use drugs, sex workers, conflict casualties, and trafficking victims. When k capture samples are obtained, counts of unit captures in subsets of samples are represented naturally by a 2^k contingency table in which one element (the number of individuals appearing in none of the samples) remains unobserved. In the absence of additional assumptions, the population size is not identifiable (i.e., not point identified). Stringent assumptions about the dependence between samples are often used to achieve point identification. However, real-world CRC surveys often use convenience samples in which the assumed dependence cannot be guaranteed, and population size estimates under these assumptions may lack empirical credibility. In this work, we apply the theory of partial identification to show that weak assumptions or qualitative knowledge about the nature of dependence between samples can be used to characterize a nontrivial confidence set for the true population size. We construct confidence sets under bounds on pairwise capture probabilities using two methods: test inversion bootstrap confidence intervals and profile likelihood confidence intervals. Simulation results demonstrate well-calibrated confidence sets for each method. In an extensive real-world study, we apply the new methodology to the problem of using heterogeneous survey data to estimate the number of people who inject drugs in Brussels, Belgium.
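The 2^k table mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `capture_table` and the toy capture histories are hypothetical, chosen only to show that every capture pattern has a count except the all-zero pattern, which no survey can observe.

```python
from collections import Counter
from itertools import product


def capture_table(histories, k):
    """Tabulate capture histories into the 2^k cells of a CRC table.

    histories: iterable of length-k tuples of 0/1, one per observed person.
    Returns a dict keyed by capture pattern. The all-zero pattern (captured
    in none of the k samples) is set to None because it is never observed.
    """
    counts = Counter(tuple(h) for h in histories)
    never_seen = (0,) * k
    for pattern in counts:
        if len(pattern) != k or any(bit not in (0, 1) for bit in pattern):
            raise ValueError(f"invalid capture history: {pattern}")
    if never_seen in counts:
        raise ValueError("an observed person must appear in at least one sample")
    # One cell per subset of samples: 2^k cells in total.
    table = {cell: counts.get(cell, 0) for cell in product((0, 1), repeat=k)}
    table[never_seen] = None  # the unobserved cell
    return table


# Hypothetical toy data: 7 observed people across k = 3 samples.
toy_histories = [
    (1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 0, 1),
    (1, 0, 1), (1, 1, 1), (0, 1, 0),
]
table = capture_table(toy_histories, k=3)
observed_total = sum(c for c in table.values() if c is not None)
```

The observed total is only a lower bound on the population size; how much larger the population is depends entirely on the unobserved cell, which is why assumptions about dependence between samples are needed.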
Affiliation(s)
- Jinghao Sun is a PhD Candidate in Biostatistics at the Yale School of Public Health, New Haven, CT, USA
- Luk Van Baelen is a Senior Scientist in the Department of Epidemiology and Public Health at Sciensano, Rue Juliette Wytsmanstraat, 14, Brussels 1050, Belgium
- Els Plettinckx is a Principal Research Scientist in the Department of Epidemiology and Public Health at Sciensano, Rue Juliette Wytsmanstraat, 14, Brussels 1050, Belgium
- Forrest W Crawford is an Associate Professor of Biostatistics, Statistics & Data Science, Operations, and Ecology & Evolutionary Biology at Yale University, New Haven, CT, USA
2
Thompson K, Barocas JA, Delcher C, Bae J, Hammerslag L, Wang J, Chandler R, Villani J, Walsh S, Talbert J. The prevalence of opioid use disorder in Kentucky's counties: A two-year multi-sample capture-recapture analysis. Drug Alcohol Depend 2023; 242:109710. [PMID: 36469995 PMCID: PMC9772240 DOI: 10.1016/j.drugalcdep.2022.109710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Revised: 10/23/2022] [Accepted: 11/18/2022] [Indexed: 11/23/2022]
Abstract
BACKGROUND: Kentucky has one of the highest opioid overdose mortality rates in the United States. Accurate estimates of the number of people with opioid use disorder (OUD) are critical for planning the scope of interventions required to reduce overdose and opioid misuse. Commonly used household surveys are known to underestimate OUD at the state level and do not provide county-level estimates.
METHODS: We performed a multi-sample capture-recapture analysis to estimate OUD prevalence in Kentucky in 2018 and 2019. We utilized four statewide datasets that were linked at the individual level: 1) Registry of Vital Statistics, 2) Emergency Medical Services (EMS), 3) Kentucky's Prescription Drug Monitoring Program (PDMP), and 4) Kentucky Medicaid. We included persons aged 18-64 years who resided in Kentucky between 2018 and 2019. We identified individuals with administrative data consistent with OUD in each of the datasets, including a fatal opioid-involved overdose (Vital Statistics), EMS runs for suspected opioid overdose, receipt of buprenorphine for OUD treatment (PDMP), or Medicaid claims for OUD. The outcome measures were observed and estimated counts of OUD cases and the prevalence of OUD among the adult population in Kentucky.
RESULTS: The estimated statewide OUD prevalence was 5.5% and 5.9% for 2018 and 2019, respectively, ranging from 1.3% to 17.7% across Kentucky counties. As expected, the counties with the highest OUD rates were the Appalachian counties in the eastern area of the state.
CONCLUSIONS: Our analysis reveals that a substantially larger proportion of Kentucky residents have OUD than previously estimated. Our approach offers a model for states needing county-level estimates of OUD.
Affiliation(s)
- Katherine Thompson: Dr. Bing Zhang Department of Statistics, College of Arts and Sciences, University of Kentucky, Lexington, KY, United States
- Joshua A Barocas: Sections of General Internal Medicine and Infectious Diseases, University of Colorado School of Medicine, Aurora, CO, United States
- Chris Delcher: Institute for Pharmaceutical Outcomes and Policy, College of Pharmacy, University of Kentucky, Lexington, KY, United States; Department of Pharmacy Practice & Science, College of Pharmacy, University of Kentucky, Lexington, KY, United States
- Jungjun Bae: Institute for Pharmaceutical Outcomes and Policy, College of Pharmacy, University of Kentucky, Lexington, KY, United States; Department of Pharmacy Practice & Science, College of Pharmacy, University of Kentucky, Lexington, KY, United States
- Lindsey Hammerslag: Institute for Biomedical Informatics, University of Kentucky, Lexington, KY, United States
- Jianing Wang: Boston University School of Public Health, Boston, MA, United States
- Sharon Walsh: Center on Drug and Alcohol Research, College of Medicine, University of Kentucky, Lexington, KY, United States; Department of Behavioral Science, College of Medicine, University of Kentucky, Lexington, KY, United States
- Jeffery Talbert: Institute for Biomedical Informatics, University of Kentucky, Lexington, KY, United States; Division of Biomedical Informatics, Department of Internal Medicine, College of Medicine, University of Kentucky, Lexington, KY, United States
3
Farcomeni A. How many refugees and migrants died trying to reach Europe? Joint population size and total estimation. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Alessio Farcomeni: Department of Economics and Finance, University of Rome “Tor Vergata”
4
Altieri L, Farcomeni A, Fegatelli DA. Continuous time-interaction processes for population size estimation, with an application to drug dealing in Italy. Biometrics 2022. [PMID: 35289395 DOI: 10.1111/biom.13662] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/12/2021] [Accepted: 03/07/2022] [Indexed: 11/30/2022]
Abstract
We introduce a time-interaction point process where the occurrence of an event can increase (self-excitement) or reduce (self-correction) the probability of future events. Self-excitement and self-correction are allowed to be triggered by the same event, at different time scales; other effects such as those of covariates, unobserved heterogeneity, and temporal dependence are also allowed in the model. We focus on capture-recapture data, as our work is motivated by an original example about estimation of the total number of drug dealers in Italy. To do so, we derive a conditional likelihood formulation where only subjects with at least one capture are involved in the inference process. The result is a novel and flexible continuous-time population size estimator. A simulation study and the analysis of our motivating example illustrate the validity of our approach in several scenarios.
Affiliation(s)
- Linda Altieri: Dpt. of Statistical Sciences, University of Bologna, via Belle Arti 41, Bologna, 40126, Italy
- Alessio Farcomeni: Dpt. of Economics and Finance, University of Rome "Tor Vergata", Via Columbia 2, Rome, 00133, Italy
- Danilo Alunni Fegatelli: Dpt. of Public Health and Infectious Diseases, Sapienza - University of Rome, P.le Aldo Moro 5, Rome, 00185, Italy
5
Gutreuter S. Comparative performance of multiple-list estimators of key population size. PLOS Global Public Health 2022; 2:e0000155. [PMID: 35928219 PMCID: PMC9345571 DOI: 10.1371/journal.pgph.0000155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2021] [Accepted: 12/16/2021] [Indexed: 06/15/2023]
Abstract
Estimates of the sizes of key populations (KPs) affected by HIV, including men who have sex with men, female sex workers and people who inject drugs, are required for targeting epidemic control efforts where they are most needed. Unfortunately, different estimators often produce discrepant results, and an objective basis for choice is lacking. This simulation study provides the first comparison of information-theoretic selection of loglinear models (LLM-AIC), Bayesian model averaging of loglinear models (LLM-BMA) and Bayesian nonparametric latent-class modeling (BLCM) for estimation of population size from multiple lists. Four hundred random samples from populations of size 1,000, 10,000 and 20,000, each including five encounter opportunities, were independently simulated using each of 30 data-generating models obtained from combinations of six patterns of variation in encounter probabilities and five expected per-list encounter probabilities, producing a total of 36,000 samples. Population size was estimated for each combination of sample and sequentially cumulative sets of 2-5 lists using LLM-AIC, LLM-BMA and BLCM. LLM-BMA and BLCM were quite robust, performed comparably in terms of root mean-squared error and bias, and outperformed LLM-AIC. All estimation methods produced uncertainty intervals that failed to achieve the nominal coverage, but LLM-BMA, as implemented in the dga R package, produced the best balance of accuracy and interval coverage. The results also indicate that two-list estimation is unnecessarily vulnerable, and that it is better to estimate the sizes of KPs based on at least three lists.
Affiliation(s)
- Steve Gutreuter: Division of Global HIV and TB, U.S. Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America
6
Plettinckx E, Crawford FW, Antoine J, Gremeaux L, Van Baelen L. Estimates of people who injected drugs within the last 12 months in Belgium based on a capture-recapture and multiplier method. Drug Alcohol Depend 2021; 219:108436. [PMID: 33310486 PMCID: PMC7856246 DOI: 10.1016/j.drugalcdep.2020.108436] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/16/2020] [Revised: 11/03/2020] [Accepted: 11/04/2020] [Indexed: 10/22/2022]
Abstract
BACKGROUND: For Belgium, available estimates of the number of people who inject drugs (PWID) are based on data from more than fifteen years ago and apply only to those who report ever injecting drugs. As a result, no reliable baseline data exist to determine the scale of services for PWID.
METHODS: We obtained pseudo-anonymized identifier information from treatment and harm reduction service providers and a fieldwork study between February and April 2019 in Brussels. We estimated the number of PWID, defined as people who injected within the last 12 months, in Brussels using capture-recapture (CRC) methodology. To obtain national estimates, we scaled the proportion of PWID in Brussels to the total number of this population in Belgium based on two existing drug treatment registers, which were then multiplied by the result of the CRC.
RESULTS: The total population of PWID is estimated to be 703 (95% CI 538-935) for Brussels and between 6620 (95% CI 4711-8576) and 7018 (95% CI 4794-9527) for Belgium.
CONCLUSIONS: These estimates provide crucial information to ensure that services to PWID are adequately maintained. They clearly indicate the need to maximize efforts to achieve the targets set by WHO for 2030 on the provision of 300 sterile needles and syringes per PWID per year, a 90% reduction of new HCV infections, and a 65% reduction of liver-related mortality.
Affiliation(s)
- Els Plettinckx: Directorate of Epidemiology and Public Health, Sciensano, Rue Juliette Wytsmanstraat, 14, 1050 Brussels, Belgium
- Forrest W. Crawford: Department of Biostatistics, Yale School of Public Health, 60 College Street, New Haven, CT 06520-0834, United States
- Jérôme Antoine: Directorate of Epidemiology and Public Health, Sciensano, Rue Juliette Wytsmanstraat, 14, 1050 Brussels, Belgium
- Lies Gremeaux: Directorate of Epidemiology and Public Health, Sciensano, Rue Juliette Wytsmanstraat, 14, 1050 Brussels, Belgium
- Luk Van Baelen: Directorate of Epidemiology and Public Health, Sciensano, Rue Juliette Wytsmanstraat, 14, 1050 Brussels, Belgium
7
Jones HE, Harris RJ, Downing BC, Pierce M, Millar T, Ades AE, Welton NJ, Presanis AM, De Angelis D, Hickman M. Estimating the prevalence of problem drug use from drug-related mortality data. Addiction 2020; 115:2393-2404. [PMID: 32392631 PMCID: PMC7613965 DOI: 10.1111/add.15111] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 08/21/2019] [Revised: 11/05/2019] [Accepted: 05/04/2020] [Indexed: 01/07/2023]
Abstract
BACKGROUND AND AIMS: Indirect estimation methods are required for estimating the size of populations where only a proportion of individuals are observed directly, such as problem drug users (PDUs). Capture-recapture and multiplier methods are widely used, but have been criticized as subject to bias. We propose a new approach to estimating prevalence of PDU from numbers of fatal drug-related poisonings (fDRPs) using linked databases, addressing the key limitations of simplistic 'mortality multipliers'.
METHODS: Our approach requires linkage of data on a large cohort of known PDUs to mortality registers and summary information concerning additional fDRPs observed outside this cohort. We model fDRP rates among the cohort and assume that rates in unobserved PDUs are equal to rates in the cohort during periods out of treatment. Prevalence is estimated in a Bayesian statistical framework, in which we simultaneously fit regression models to fDRP rates and prevalence, allowing both to vary by demographic factors and the former also by treatment status.
RESULTS: We report a case study analysis, estimating the prevalence of opioid dependence in England in 2008/09, by gender, age group and geographical region. Overall prevalence was estimated as 0.82% (95% credible interval = 0.74-0.94%) of 15-64-year-olds, which is similar to a published estimate based on capture-recapture analysis.
CONCLUSIONS: Our modelling approach estimates prevalence from drug-related mortality data, while addressing the main limitations of simplistic multipliers. This offers an alternative approach for the common situation where available data sources do not meet the strong assumptions required for valid capture-recapture estimation. In a case study analysis, prevalence estimates based on our approach were surprisingly similar to existing capture-recapture estimates but, we argue, are based on a much more objective and justifiable modelling approach.
Affiliation(s)
- Hayley E. Jones: Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Ross J. Harris: Centre for Infectious Disease Surveillance and Control, Public Health England, London, UK
- Beatrice C. Downing: Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Matthias Pierce: Division of Psychology and Mental Health, School of Health Sciences, University of Manchester, Manchester, UK
- Tim Millar: Division of Psychology and Mental Health, School of Health Sciences, University of Manchester, Manchester, UK
- A. E. Ades: Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Nicky J. Welton: Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Daniela De Angelis: Centre for Infectious Disease Surveillance and Control, Public Health England, London, UK; MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
- Matthew Hickman: Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK