1
Arima S, Polettini S, Pasculli G, Gesualdo L, Pesce F, Procaccini DA. A Bayesian nonparametric approach to correct for underreporting in count data. Biostatistics 2024; 25:904-918. [PMID: 37811675 DOI: 10.1093/biostatistics/kxad027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Revised: 06/06/2023] [Accepted: 08/21/2023] [Indexed: 10/10/2023] Open
Abstract
We propose a nonparametric compound Poisson model for underreported count data that introduces a latent clustering structure for the reporting probabilities. The latter are estimated with the model's parameters based on experts' opinion and exploiting a proxy for the reporting process. The proposed model is used to estimate the prevalence of chronic kidney disease in Apulia, Italy, based on a unique statistical database covering information on m = 258 municipalities obtained by integrating multisource register information. Accurate prevalence estimates are needed for monitoring, surveillance, and management purposes; yet, counts are deemed to be considerably underreported, especially in some areas of Apulia, one of the most deprived and heterogeneous regions in Italy. Our results agree with previous findings and highlight interesting geographical patterns of the disease. We compare our model to existing approaches in the literature using simulated as well as real data on early neonatal mortality risk in Brazil, described in previous research: the proposed approach proves to be accurate and particularly suitable when partial information about data quality is available.
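As a rough illustration of the reporting mechanism such models build on, the sketch below simulates true Poisson counts and keeps each case independently with a fixed reporting probability (binomial thinning). It is not the authors' nonparametric model: the rate, the reporting probability, the number of areas, and all function names are assumptions chosen only for illustration.

```python
import math
import random

def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def thin(true_count, reporting_prob, rng):
    """Keep each true case independently with probability reporting_prob."""
    return sum(1 for _ in range(true_count) if rng.random() < reporting_prob)

def simulate_mean_reported(rate, reporting_prob, n_areas, seed=0):
    """Average reported count over n_areas simulated areas (hypothetical setup)."""
    rng = random.Random(seed)
    reported = [
        thin(sample_poisson(rate, rng), reporting_prob, rng)
        for _ in range(n_areas)
    ]
    return sum(reported) / n_areas

# Assumed values: true rate 20 cases per area, 70% of cases reported.
mean_reported = simulate_mean_reported(rate=20.0, reporting_prob=0.7, n_areas=20000)
# Dividing by the reporting probability recovers the true rate on average,
# but only when that probability is known, which is the hard part in practice.
naive_corrected = mean_reported / 0.7
print(mean_reported, naive_corrected)
```

The simulated mean of the reported counts sits near rate times reporting probability, which is why observed counts alone cannot separate the two quantities and why extra information (expert opinion, a proxy for the reporting process) is needed.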
Affiliation(s)
- Serena Arima
  - Department of Human and Social Sciences, University of Salento, Via di Valesio, 73100, Lecce, Italy
- Silvia Polettini
  - Department of Social and Economic Sciences, Sapienza University, P.le Aldo Moro, 5, 00185 Roma, Italy
- Giuseppe Pasculli
  - Department of Computer, Control, and Management Engineering "Antonio Ruberti", Sapienza University, Via Ariosto, 25, 00185 Roma RM, Italy
- Loreto Gesualdo
  - Section of Nephrology, Department of Precision and Regenerative Medicine and Ionian Area (DiMePre-J), Azienda Ospedaliero Universitaria Consorziale Policlinico di Bari, Piazza Giulio Cesare, 11 - 70124 Bari, Italy
- Francesco Pesce
  - Division of Renal Medicine, "Fatebenefratelli Isola Tiberina-Gemelli Isola", 00186 Rome, Italy
- Deni-Aldo Procaccini
  - Section of Nephrology, Department of Precision and Regenerative Medicine and Ionian Area (DiMePre-J), Azienda Ospedaliero Universitaria Consorziale Policlinico di Bari, Piazza Giulio Cesare, 11 - 70124 Bari, Italy
2
Lei J, MacNab Y. Bayesian hierarchical spatiotemporal models for prediction of (under)reporting rates and cases: COVID-19 infection among the older people in the United States during the 2020-2022 pandemic. Spat Spatiotemporal Epidemiol 2024; 49:100658. [PMID: 38876569 DOI: 10.1016/j.sste.2024.100658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 03/25/2024] [Accepted: 05/08/2024] [Indexed: 06/16/2024]
Abstract
The gap between the reported and actual COVID-19 infection cases has been an issue of concern. Here, we present Bayesian hierarchical spatiotemporal disease mapping models for state-level predictions of COVID-19 infection risks and (under)reporting rates among people aged 65 and above during the first two years of the pandemic in the United States. With prior elicitation based on recent prevalence studies, the study suggests that the median state-level reporting rate of COVID-19 infection was 90% (interquartile range: [78%, 96%]). Our study uncovers spatiotemporal variations and dynamics in state-level infection risks and (under)reporting rates, suggesting time-varying associations between higher population density, higher percentage of minorities, and higher percentage of vaccination and increased risks of COVID-19 infection, as well as an association between more easily accessible tests and higher reporting rates. With sensitivity analyses, we highlight the impact and importance of incorporating covariates information and objective prior references for evaluating the issue of underreporting.
Affiliation(s)
- Jingxin Lei
  - School of Public Health, University of British Columbia, 2206 East Mall, Vancouver, V6T 1Z3, BC, Canada
- Ying MacNab
  - School of Public Health, University of British Columbia, 2206 East Mall, Vancouver, V6T 1Z3, BC, Canada
3
Lopes de Oliveira G, Ferreira AJ, Teles CADS, Paixao ES, Fiaccone R, Lana R, Aquino R, Cardoso AM, Soares MA, Oliveira dos Santos I, Pereira M, Barreto ML, Ichihara MY. Estimating the real burden of gestational syphilis in Brazil, 2007-2018: a Bayesian modeling study. Lancet Regional Health. Americas 2023; 25:100564. [PMID: 37575963 PMCID: PMC10415804 DOI: 10.1016/j.lana.2023.100564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 07/16/2023] [Accepted: 07/17/2023] [Indexed: 08/15/2023]
Abstract
Background: Although several studies have estimated gestational syphilis (GS) incidence in several countries, underreporting correction is rarely considered. This study aimed to estimate the level of under-registration and correct the GS incidence rates in the 557 Brazilian microregions. Methods: Brazilian GS notifications between 2007 and 2018 were obtained from the SINAN-Syphilis system. A cluster analysis was performed to group microregions according to the quality of GS notification. A Bayesian hierarchical Poisson regression model was applied to estimate the reporting probabilities among the clusters and to correct the associated incidence rates. Findings: We estimate that 45,196 (90%-HPD: 13,299; 79,310) GS cases were underreported in Brazil from 2007 to 2018, representing a coverage of 87.12% (90%-HPD: 79.40%; 95.83%) of registered cases, where HPD stands for the Bayesian highest posterior density credible interval. Underreporting levels differ across the country, with microregions in the North and Northeast regions presenting the highest percentage of missed cases. After underreporting correction, Brazil's estimated GS incidence rate increased from 8.74 to 10.02 per 1000 live births in the same period. Interpretation: Our findings highlight disparities in the registration level and incidence rate of GS in Brazil, reflecting regional heterogeneity in the quality of syphilis surveillance, access to prenatal care, and childbirth assistance services. This study provides robust evidence to enhance national surveillance systems, guide specific policies for GS detection and disease control, and potentially mitigate the harmful consequences of mother-to-child transmission. The methodology might be applied in other regions to correct disease underreporting. Funding: National Council for Scientific and Technological Development; the Bill & Melinda Gates Foundation and Wellcome Trust.
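The two incidence rates quoted in the abstract can be related to the reported coverage with simple arithmetic. The sketch below is only a consistency check on the published point estimates, not a reproduction of the authors' Bayesian model; the function and variable names are hypothetical.

```python
# Point estimates quoted in the abstract (per 1000 live births).
REPORTED_RATE = 8.74
CORRECTED_RATE = 10.02

def implied_coverage(reported_rate, corrected_rate):
    """Share of the corrected incidence that was actually registered."""
    return reported_rate / corrected_rate

def relative_increase(reported_rate, corrected_rate):
    """Proportional rise in incidence after underreporting correction."""
    return (corrected_rate - reported_rate) / reported_rate

coverage = implied_coverage(REPORTED_RATE, CORRECTED_RATE)
increase = relative_increase(REPORTED_RATE, CORRECTED_RATE)
print(f"implied coverage: {coverage:.1%}")
print(f"relative increase: {increase:.1%}")
```

The ratio of the two rates gives a coverage of roughly 87%, close to the 87.12% posterior estimate in the abstract; a small gap is expected because the published figure summarises a posterior distribution rather than a ratio of rounded rates.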
Affiliation(s)
- Guilherme Lopes de Oliveira
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
  - Department of Computing, Federal Centre of Technological Education of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Andrêa J.F. Ferreira
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
  - The Ubuntu Center on Racism, Global Movement, Population and Equity, School of Public Health, Drexel University, Pennsylvania, USA
- Carlos Antônio de S.S. Teles
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
- Enny S. Paixao
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
  - London School of Hygiene and Tropical Medicine, London, UK
- Rosemeire Fiaccone
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
  - Statistics Department, Institute of Mathematics, Federal University of Bahia, Salvador, Bahia, Brazil
- Raquel Lana
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
  - Barcelona Supercomputing Center, Catalonia, Spain
- Rosana Aquino
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
  - Institute of Collective Health, Federal University of Bahia, Salvador, Bahia, Brazil
- Maria Auxiliadora Soares
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
  - Institute of Collective Health, Federal University of Bahia, Salvador, Bahia, Brazil
- Idália Oliveira dos Santos
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
  - Institute of Collective Health, Federal University of Bahia, Salvador, Bahia, Brazil
- Marcos Pereira
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
  - Institute of Collective Health, Federal University of Bahia, Salvador, Bahia, Brazil
- Maurício L. Barreto
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
  - Institute of Collective Health, Federal University of Bahia, Salvador, Bahia, Brazil
- Maria Yury Ichihara
  - Centre of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fiocruz, Salvador, Bahia, Brazil
4
Efficiently exploring for human robot interaction: partially observable Poisson processes. Auton Robots 2022. [DOI: 10.1007/s10514-022-10070-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
Consider a mobile robot exploring an office building with the aim of observing as much human activity as possible over several days. It must learn where and when people are to be found, count the observed activities, and revisit popular places at the right time. In this paper we present a series of Bayesian estimators for the levels of human activity that improve on simple counting. We then show how these estimators can be used to drive efficient exploration for human activities. The estimators arise from modelling the human activity counts as a partially observable Poisson process (POPP). This paper presents novel extensions to POPP for the following cases: (i) the robot’s sensors are correlated, (ii) the robot’s sensor model, itself built from data, is also unreliable, (iii) both are combined. It also combines the resulting Bayesian estimators with a simple but effective solution to the exploration-exploitation trade-off faced by the robot in a real deployment. A series of 15-day robot deployments shows how our approach boosts the number of human activities observed by 70% relative to a baseline and produces more accurate estimates of the level of human activity in each place and time.
5
Prunas O, Weinberger DM, Medini D, Tizzoni M, Argante L. Evaluating the Impact of Meningococcal Vaccines With Synthetic Controls. Am J Epidemiol 2022; 191:724-734. [PMID: 34753175 PMCID: PMC8971084 DOI: 10.1093/aje/kwab266] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/27/2020] [Revised: 09/14/2021] [Accepted: 10/29/2021] [Indexed: 11/19/2022] Open
Abstract
Invasive meningococcal disease (IMD) has a low and unpredictable incidence, presenting challenges for real-world evaluations of meningococcal vaccines. Traditionally, meningococcal vaccine impact is evaluated by predicting counterfactuals from pre-immunization IMD incidences, possibly controlling for IMD in unvaccinated age groups, but the selection of controls can influence results. We retrospectively applied a synthetic control (SC) method, previously used for pneumococcal disease, to data from 2 programs for immunization of infants against serogroups B and C IMD in England and Brazil. Time series of infectious/noninfectious diseases in infants and IMD cases in older unvaccinated age groups were used as candidate controls, automatically combined in a SC through Bayesian variable selection. SC closely predicted IMD in absence of vaccination, adjusting for nontrivial changes in IMD incidence. Vaccine impact estimates were in line with previous assessments. IMD cases in unvaccinated age groups were the most frequent SC-selected controls. Similar results were obtained when excluding IMD from control sets and using other diseases only, particularly respiratory diseases and measles. Using non-IMD controls may be important where there are herd immunity effects. SC is a robust and flexible method that addresses uncertainty introduced when equally plausible controls exhibit different post-immunization behaviors, allowing objective comparisons of IMD programs between countries.
Affiliation(s)
- Duccio Medini
  - Correspondence to Dr. Duccio Medini, Via Fiorentina 1, Siena, 53100, Italy
6
Chen J, Song JJ, Stamey JD. A Bayesian Hierarchical Spatial Model to Correct for Misreporting in Count Data: Application to State-Level COVID-19 Data in the United States. International Journal of Environmental Research and Public Health 2022; 19:3327. [PMID: 35329019 PMCID: PMC8950980 DOI: 10.3390/ijerph19063327] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/17/2021] [Revised: 02/27/2022] [Accepted: 03/05/2022] [Indexed: 02/01/2023]
Abstract
The COVID-19 pandemic that began at the end of 2019 has caused hundreds of millions of infections and millions of deaths worldwide. COVID-19 posed a threat to human health and profoundly impacted the global economy and people's lifestyles. The United States is one of the countries severely affected by the disease. Evidence shows that the spread of COVID-19 was significantly underestimated in the early stages, which prevented governments from adopting effective interventions promptly to curb the spread of the disease. This paper adopts a Bayesian hierarchical model to study the under-reporting of COVID-19 at the state level in the United States as of the end of April 2020. The model examines the effects of different covariates on the under-reporting and accurate incidence rates and considers spatial dependency. In addition to under-reporting (false negatives), we also explore the impact of over-reporting (false positives). Adjusting for misclassification requires adding additional parameters that are not directly identified by the observed data. Informative priors are required. We discuss prior elicitation and include R functions that convert expert information into the appropriate prior distribution.
Affiliation(s)
- James D. Stamey
  - Department of Statistical Science, Baylor University, Waco, TX 76798-7140, USA; (J.C.); (J.J.S.)
7
Estimating underreporting of leprosy in Brazil using a Bayesian approach. PLoS Negl Trop Dis 2021; 15:e0009700. [PMID: 34432805 PMCID: PMC8423270 DOI: 10.1371/journal.pntd.0009700] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/12/2021] [Revised: 09/07/2021] [Accepted: 08/03/2021] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND: Leprosy remains concentrated among the poorest communities in low- and middle-income countries and it is one of the primary infectious causes of disability. Although there have been increasing advances in leprosy surveillance worldwide, leprosy underreporting is still common and can hinder decision-making regarding the distribution of financial and health resources and thereby limit the effectiveness of interventions. In this study, we estimated the proportion of unreported cases of leprosy in Brazilian microregions. METHODOLOGY/PRINCIPAL FINDINGS: Using data collected between 2007 and 2015 from each of the 557 Brazilian microregions, we applied a Bayesian hierarchical model that used the presence of grade 2 leprosy-related physical disabilities as a direct indicator of delayed diagnosis and a proxy for the effectiveness of the local leprosy surveillance program. We also analyzed some relevant factors that influence spatial variability in the observed mean incidence rate in the Brazilian microregions, highlighting the importance of socioeconomic factors and how they affect the levels of underreporting. We corrected leprosy incidence rates for each Brazilian microregion and estimated that, on average, 33,252 (9.6%) new leprosy cases went unreported in the country between 2007 and 2015, with this proportion varying from 8.4% to 14.1% across the Brazilian states. CONCLUSIONS/SIGNIFICANCE: The magnitude and distribution of leprosy underreporting were adequately explained by a model using grade 2 disability as a marker for the ability of the system to detect new missing cases. The percentage of missed cases was significant, and efforts are warranted to improve leprosy case detection. Our estimates in Brazilian microregions can be used to guide effective interventions, efficient resource allocation, and target actions to mitigate transmission.
8
Sengupta D, Roy S, Banerjee T. Testing of Poisson mean with under-reported counts. Braz J Probab Stat 2021. [DOI: 10.1214/20-bjps493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Debjit Sengupta
  - Department of Statistics, St. Xavier’s College, 30, Mother Teresa Sarani, Kolkata-700016, India
- Surupa Roy
  - Department of Statistics, St. Xavier’s College, 30, Mother Teresa Sarani, Kolkata-700016, India
9
Alexopoulos A, Bottolo L. Bayesian Variable Selection for Gaussian Copula Regression Models. J Comput Graph Stat 2020; 30:578-593. [PMID: 37051045 PMCID: PMC7614421 DOI: 10.1080/10618600.2020.1840997] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
Abstract
We develop a novel Bayesian method to select important predictors in regression models with multiple responses of diverse types. A sparse Gaussian copula regression model is used to account for the multivariate dependencies between any combination of discrete and/or continuous responses and their association with a set of predictors. We utilize the parameter expansion for data augmentation strategy to construct a Markov chain Monte Carlo algorithm for the estimation of the parameters and the latent variables of the model. Based on a centered parametrization of the Gaussian latent variables, we design a fixed-dimensional proposal distribution to update jointly the latent binary vectors of important predictors and the corresponding non-zero regression coefficients. For Gaussian responses and for outcomes that can be modeled as a dependent version of a Gaussian response, this proposal leads to a Metropolis-Hastings step that allows an efficient exploration of the predictors' model space. The proposed strategy is tested on simulated data and applied to real data sets in which the responses consist of low-intensity counts, binary, ordinal and continuous variables.
Affiliation(s)
- Leonardo Bottolo
  - Department of Medical Genetics, University of Cambridge, UK
  - The Alan Turing Institute, London, UK
  - MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
10
Bracher J, Held L. A marginal moment matching approach for fitting endemic-epidemic models to underreported disease surveillance counts. Biometrics 2020; 77:1202-1214. [PMID: 32920842 DOI: 10.1111/biom.13371] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/15/2019] [Accepted: 09/01/2020] [Indexed: 11/30/2022]
Abstract
Count data are often subject to underreporting, especially in infectious disease surveillance. We propose an approximate maximum likelihood method to fit count time series models from the endemic-epidemic class to underreported data. The approach is based on marginal moment matching where underreported processes are approximated through completely observed processes from the same class. Moreover, the form of the bias when underreporting is ignored or taken into account via multiplication factors is analyzed. Notably, we show that this leads to a downward bias in model-based estimates of the effective reproductive number. A marginal moment matching approach can also be used to account for reporting intervals which are longer than the mean serial interval of a disease. The good performance of the proposed methodology is demonstrated in simulation studies. An extension to time-varying parameters and reporting probabilities is discussed and applied in a case study on weekly rotavirus gastroenteritis counts in Berlin, Germany.
Affiliation(s)
- Johannes Bracher
  - Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- Leonhard Held
  - Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
11
Stoner O, Economou T, Drummond Marques da Silva G. A Hierarchical Framework for Correcting Under-Reporting in Count Data. J Am Stat Assoc 2019. [DOI: 10.1080/01621459.2019.1573732] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/31/2023]
Affiliation(s)
- Oliver Stoner
  - Department of Mathematics, University of Exeter, Exeter, UK
- Theo Economou
  - Department of Mathematics, University of Exeter, Exeter, UK
12
de Oliveira GL, Loschi RH, Assunção RM. A random-censoring Poisson model for underreported data. Stat Med 2017; 36:4873-4892. [DOI: 10.1002/sim.7456] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 03/23/2017] [Revised: 06/27/2017] [Accepted: 08/11/2017] [Indexed: 11/09/2022]
Affiliation(s)
- Guilherme Lopes de Oliveira
  - Departamento de Estatística; Universidade Federal de Minas Gerais; Av. Antônio Carlos, 6.627 Belo Horizonte Minas Gerais 31270-901 Brazil
- Rosangela Helena Loschi
  - Departamento de Estatística; Universidade Federal de Minas Gerais; Av. Antônio Carlos, 6.627 Belo Horizonte Minas Gerais 31270-901 Brazil
- Renato Martins Assunção
  - Departamento de Ciência da Computação; Universidade Federal de Minas Gerais; Av. Antônio Carlos, 6.627 Belo Horizonte Minas Gerais 31270-901 Brazil
13
Counting unreported abortions: A binomial-thinned zero-inflated Poisson model. Demographic Research 2017. [DOI: 10.4054/demres.2017.36.2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/05/2022] Open