1
Xu Y, Yang X, Zhang J, Zhou X, Luo L, Zhang Q. Visual analysis of sea buckthorn fruit moisture content based on deep image processing technology. Food Chem 2024; 453:139558. [PMID: 38781892 DOI: 10.1016/j.foodchem.2024.139558] [Received: 02/06/2024] [Revised: 04/11/2024] [Accepted: 05/02/2024] [Indexed: 05/25/2024]
Abstract
The effect of moisture content changes during drying on the appearance of sea buckthorn fruit was studied. Computer vision and several image processing methods were used to collect and analyze images during the drying process. The fruit was dried in a drying oven at 65 °C under Level 1 wind speed conditions, and images of the entire drying process were collected at 30-min intervals. Image information was then mined and transformed through the image processing methods. By calibrating and modeling the color components, real-time online detection of the moisture content of sea buckthorn fruit can be achieved. After modeling, the authors attempted to use LSTM (long short-term memory) to predict the appearance of sea buckthorn fruit with supercritical moisture content. Different agricultural products suit different color spaces, but after standard modeling with a certain amount of data, applying color components to detect moisture content is an effective method.
Affiliation(s)
- Yu Xu
  - College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832000, China
- Xuhai Yang
  - College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832000, China; Engineering Research Center for Production Mechanization of Oasis Characteristic Cash Crop, Ministry of Education, Shihezi 832000, China; Xinjiang Production and Construction Corps Key Laboratory of Modern Agricultural Machinery, Shihezi 832000, China
- Junyi Zhang
  - College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832000, China
- Xiang Zhou
  - College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832000, China
- Liwei Luo
  - College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832000, China
- Qian Zhang
  - College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832000, China; Engineering Research Center for Production Mechanization of Oasis Characteristic Cash Crop, Ministry of Education, Shihezi 832000, China; Xinjiang Production and Construction Corps Key Laboratory of Modern Agricultural Machinery, Shihezi 832000, China
2
Sammy A, Medeiros A, Batomen B, Rothman L, Harris MA, Harrington DW, Macarthur C, Richmond SA. Motor vehicle collision (MVC) emergency department (ED) visits and hospitalisations in Ontario during the COVID-19 pandemic. Inj Prev 2024:ip-2024-045269. [PMID: 38871438 DOI: 10.1136/ip-2024-045269] [Received: 01/31/2024] [Accepted: 04/29/2024] [Indexed: 06/15/2024]
Abstract
BACKGROUND The COVID-19 pandemic policy response dramatically changed local transportation patterns. This project investigated the impact of COVID-19 policies on motor vehicle collision (MVC)-related emergency department (ED) visits and hospitalisations in Ontario. METHODS Data were collected on MVC-related ED visits and hospitalisations in Ontario between March 2016 and December 2022. Using an interrupted time series design, negative binomial regression models were fitted to the pre-pandemic data, including monthly indicator variables for seasonality and accounting for autocorrelation. Extrapolations simulated expected outcome trajectories during the pandemic, which were compared with actual observed outcome counts using the overall per cent change and mean monthly difference. Data were modelled separately for vehicle occupants, pedestrians and cyclists (MVC and non-MVC injuries). RESULTS There was a 31.5% decrease in observed ED visits (95% CI -35.4 to -27.3) and a 6.0% decrease in hospitalisations (95% CI -13.2 to 1.6) among vehicle occupants, relative to expected counts during the pandemic. Results were similar for pedestrians. Among cyclist MVCs, there was an increase in ED visits (12.8%, 95% CI -8.2 to 39.4) and hospitalisations (46.0%, 95% CI 11.6 to 93.6). Among non-MVC cyclists, there was also an increase in ED visits (47.0%, 95% CI 12.5 to 86.8) and hospitalisations (50.1%, 95% CI 8.2 to 101.2). CONCLUSIONS We observed fewer vehicle occupant and pedestrian collision injuries than expected during the pandemic. By contrast, we observed more cycling injuries than expected, especially in cycling injuries not involving motor vehicles. These observations may be attributable to changes in transportation patterns during the pandemic and increased uptake of recreational cycling.
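The comparison between observed and expected counts described in the methods can be illustrated with a small sketch. It is not the authors' code: the function names and the monthly counts below are hypothetical, and computing the overall per cent change from summed counts is an assumption about how that summary is defined.

```python
def percent_change(observed, expected):
    """Overall per cent change of observed vs expected totals (assumed definition)."""
    total_obs = sum(observed)
    total_exp = sum(expected)
    return 100.0 * (total_obs - total_exp) / total_exp


def mean_monthly_difference(observed, expected):
    """Average of the month-by-month differences (observed minus expected)."""
    diffs = [o - e for o, e in zip(observed, expected)]
    return sum(diffs) / len(diffs)


# Hypothetical monthly counts, invented for illustration only.
expected = [100, 120, 110]  # stand-in for counts extrapolated from a pre-pandemic model
observed = [70, 90, 71]     # stand-in for counts actually recorded

print(f"per cent change: {percent_change(observed, expected):.1f}%")
print(f"mean monthly difference: {mean_monthly_difference(observed, expected):.1f}")
```

A negative value means fewer events were observed than the pre-pandemic trend predicted. In the study the expected series came from the fitted negative binomial models rather than from fixed numbers like these.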
Affiliation(s)
- Adrian Sammy
  - Department of Health Promotion, Chronic Disease and Injury Prevention, Public Health Ontario, Toronto, Ontario, Canada
- Alexia Medeiros
  - Department of Health Promotion, Chronic Disease and Injury Prevention, Public Health Ontario, Toronto, Ontario, Canada
- Brice Batomen
  - Dalla Lana School of Public Health, Division of Epidemiology, University of Toronto, Toronto, Ontario, Canada
- Linda Rothman
  - Dalla Lana School of Public Health, Division of Epidemiology, University of Toronto, Toronto, Ontario, Canada
  - School of Occupational and Public Health, Toronto Metropolitan University, Toronto, Ontario, Canada
- M Anne Harris
  - Dalla Lana School of Public Health, Division of Epidemiology, University of Toronto, Toronto, Ontario, Canada
  - School of Occupational and Public Health, Toronto Metropolitan University, Toronto, Ontario, Canada
- Daniel W Harrington
  - Department of Health Promotion, Chronic Disease and Injury Prevention, Public Health Ontario, Toronto, Ontario, Canada
- Colin Macarthur
  - Child Health Evaluative Sciences, Hospital for Sick Children Research Institute, Toronto, Ontario, Canada
- Sarah A Richmond
  - Department of Health Promotion, Chronic Disease and Injury Prevention, Public Health Ontario, Toronto, Ontario, Canada
  - Dalla Lana School of Public Health, Division of Epidemiology, University of Toronto, Toronto, Ontario, Canada
3
Batomen B, Macpherson A, Lewis J, Howard A, Ruth Saunders N, Richmond S, Anne Harris M, Saskin R, Zagorski B, Macarthur C, Fuselli P, Rothman L. Vulnerable road user injury trends following the COVID-19 pandemic in Toronto, Canada: An interrupted time series analysis. Journal of Safety Research 2024; 89:152-159. [PMID: 38858038 DOI: 10.1016/j.jsr.2024.02.007] [Received: 09/27/2023] [Revised: 11/16/2023] [Accepted: 02/14/2024] [Indexed: 06/12/2024]
Abstract
BACKGROUND The COVID-19 pandemic altered traffic patterns worldwide, potentially impacting pedestrian and bicyclist safety in urban areas. In Toronto, Canada, work from home policies, bicycle network expansion, and quiet streets were implemented to support walking and cycling. We examined pedestrian and bicyclist injury trends from 2012 to 2022, utilizing police-reported killed or severely injured (KSI), emergency department (ED) visits and hospitalization data. METHODS We used an interrupted time series design, with injury counts aggregated quarterly. We fit a negative binomial regression using a Bayesian modeling approach to data prior to the pandemic that included a secular time trend, quarterly seasonal indicator variables, and autoregressive terms. The differences between observed and expected injury counts based on pre-pandemic trends with 95% credible intervals (CIs) were computed. RESULTS There were 38% fewer pedestrian KSI (95% CI: 19%, 52%), 35% fewer ED visits (95% CI: 28%, 42%), and 19% fewer hospitalizations (95% CI: 2%, 32%) since the beginning of the COVID-19 pandemic. A reduction of 35% (95% CI: 7%, 54%) in KSI bicyclist injuries was observed; however, ED visits and hospitalizations from bicycle-motor vehicle collisions were compatible with pre-pandemic trends. In contrast, for bicycle injuries not involving motor vehicles, large increases were observed for both ED visits (73%; 95% CI: 49%, 103%) and hospitalizations (108%; 95% CI: 38%, 208%). CONCLUSION New road safety interventions during the pandemic may have improved road safety for vulnerable road users with respect to collisions with motor vehicles; however, further investigation into the risk factors for bicycle injuries not involving motor vehicles is required.
Affiliation(s)
- Brice Batomen
  - Dalla Lana School of Public Health, University of Toronto, Ontario, Canada
- Alison Macpherson
  - School of Kinesiology and Health Science, Faculty of Health, York University, Ontario, Canada
- Jeremy Lewis
  - School of Occupational and Public Health, Toronto Metropolitan University, Toronto, Ontario, Canada
- Andrew Howard
  - The Hospital for Sick Children, Toronto, Ontario, Canada
- Sarah Richmond
  - Dalla Lana School of Public Health, University of Toronto, Ontario, Canada; Public Health Ontario, Toronto, Ontario, Canada
- M Anne Harris
  - School of Occupational and Public Health, Toronto Metropolitan University, Toronto, Ontario, Canada
- Linda Rothman
  - Dalla Lana School of Public Health, University of Toronto, Ontario, Canada; School of Occupational and Public Health, Toronto Metropolitan University, Toronto, Ontario, Canada
4
Duren JV, Puttgen HA, Martinez J, Murray NM. Poisson Modeling Predicts Acute Telestroke Patient Call Volume. Telemed J E Health 2024. [PMID: 38603583 DOI: 10.1089/tmj.2023.0614] [Indexed: 04/13/2024] Open
Abstract
Background: Predicting the frequency of calls for telestroke and emergency teleneurology consultation is essential to prepare staffing for the immediate management of time-sensitive strokes. In this study, we evaluate Poisson distribution count data using a generalized linear model that predicts the volume of hourly telestroke calls over a 24-h period. Methods: We performed an Institutional Review Board approved retrospective cohort review of patients (January 2019-December 2022) from an institutional telestroke database at a large nonprofit multihospital system in the United States. All patients ≥18 years with a telestroke activation were included. Telestroke calls were quantified in frequency per day and analyzed by multiple time and date intervals. Poisson probability mass function (PMF) and cumulative distribution function (CDF) were used to predict call probabilities. A univariable Poisson regression model was fit to predict call volumes. Results: A total of 8,499 patients at 21 hospitals met inclusion criteria, the mean calls/day were 5.82 ± 2.54, and mean calls/day within each hour increment ranged from a minimum of 0.07 from 5 a.m. to 6 a.m. to a maximum of 0.45 from 7 p.m. to 8 p.m. The Poisson distribution was the most appropriate parametric probability model for these data, confirmed by the fit of the data to the expected distributions corresponding to the calculated means. The predicted probabilities of call frequencies by hour were calculated using the Poisson PMF and CDF; the probability of two or fewer calls/day by hour ranged from 98.9% to 99.9%. Univariable Poisson regression modeled an increase of future calls/day from 6.7 calls/day in July 2023 to 7.6 calls/day in October 2025. Conclusion: Poisson modeling closely fits telestroke call volumes, predicts the future volumes, and can be applied to any health system in which the mean call volume is known, which may inform the number of physicians needed to cover calls in real-time.
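The Poisson probabilities reported in the results can be approximated directly from the hourly means quoted in the abstract. The following is a minimal sketch, not the authors' code; the function names `poisson_pmf` and `poisson_cdf` are illustrative, and only the two hourly means (0.07 and 0.45) are taken from the abstract.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k), obtained by summing the PMF from 0 to k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


# Hourly mean call rates quoted in the abstract (quietest and busiest hours).
for lam in (0.07, 0.45):
    print(f"mean={lam}: P(calls <= 2) = {poisson_cdf(2, lam):.4f}")
```

With a mean of 0.45 the probability of two or fewer calls comes out at about 0.989, in line with the lower end of the range given in the abstract; with a mean of 0.07 it is above 0.999.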
Affiliation(s)
- Joe Van Duren
  - Department of Neurology, Intermountain Healthcare, Murray, Utah, USA
- H Adrian Puttgen
  - Department of Neurology, Intermountain Healthcare, Murray, Utah, USA
- Julie Martinez
  - Department of Neurology, Intermountain Healthcare, Murray, Utah, USA
- Nick M Murray
  - Department of Neurology, Intermountain Healthcare, Murray, Utah, USA
5
Mbwambo SH, Mbago MC, Rao GS. Socio-environmental predictors of diabetes incidence disparities in Tanzania mainland: a comparison of regression models for count data. BMC Med Res Methodol 2024; 24:75. [PMID: 38532325 DOI: 10.1186/s12874-024-02166-w] [Received: 10/02/2023] [Accepted: 01/30/2024] [Indexed: 03/28/2024] Open
Abstract
BACKGROUND Diabetes is one of the top four non-communicable diseases that cause death and illness to many people around the world. This study aims to use an efficient count data model to estimate socio-environmental factors associated with diabetes incidences in Tanzania mainland, addressing lack of evidence on the efficient count data model for estimating factors associated with disease incidences disparities. METHODS This study analyzed diabetes counts in 184 Tanzania mainland councils collected in 2020. The study applied generalized Poisson, negative binomial, and Poisson count data models and evaluated their adequacy using information criteria and Pearson chi-square values. RESULTS The data were over-dispersed, as evidenced by the mean and variance values and the positively skewed histograms. The results revealed uneven distribution of diabetes incidence across geographical locations, with northern and urban councils having more cases. Factors like population, GDP, and hospital numbers were associated with diabetes counts. The GP model performed better than NB and Poisson models. CONCLUSION The occurrence of diabetes can be attributed to geographical locations. To address this public health issue, environmental interventions can be implemented. Additionally, the generalized Poisson model is an effective tool for analyzing health information system count data across different population subgroups.
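The over-dispersion mentioned in the results is usually judged by comparing the variance of the counts with their mean, since a Poisson model assumes the two are equal. The sketch below illustrates that check only; the counts are invented for illustration and are not the study's data, and `dispersion_index` is a hypothetical helper name.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by the mean; close to 1 under a Poisson model."""
    return variance(counts) / mean(counts)


# Hypothetical council-level case counts, invented for illustration only.
counts = [2, 0, 5, 1, 40, 3, 0, 7, 120, 4]

ratio = dispersion_index(counts)
overdispersed = ratio > 1
print(f"variance/mean = {ratio:.1f}, over-dispersed: {overdispersed}")
```

A ratio well above 1, as with these made-up counts, is the situation in which negative binomial or generalized Poisson models are typically preferred over a plain Poisson model.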
Affiliation(s)
- Sauda Hatibu Mbwambo
  - Department of Statistics, Dar es Salaam, University of Dar es Salaam, P.O. Box 35047, Dar es Salaam, Tanzania
  - Department of Mathematics and Statistics, The University of Dodoma, P.O. Box 338, Dodoma, Tanzania
- Maurice C Mbago
  - Department of Statistics, Dar es Salaam, University of Dar es Salaam, P.O. Box 35047, Dar es Salaam, Tanzania
- Gadde Srinivasa Rao
  - Department of Mathematics and Statistics, The University of Dodoma, P.O. Box 338, Dodoma, Tanzania
6
Eshetie SM. Exploring urban land surface temperature using spatial modelling techniques: a case study of Addis Ababa city, Ethiopia. Sci Rep 2024; 14:6323. [PMID: 38491059 PMCID: PMC10942972 DOI: 10.1038/s41598-024-55121-6] [Received: 06/02/2023] [Accepted: 02/20/2024] [Indexed: 03/18/2024] Open
Abstract
Urban areas worldwide are experiencing escalating temperatures due to the combined effects of climate change and urbanization, leading to a phenomenon known as urban overheating. Understanding the spatial distribution of land surface temperature (LST) and its driving factors is crucial for mitigating and adapting to urban overheating. So far, the spatiotemporal patterns and explanatory factors of LST in the city of Addis Ababa have not been investigated. The study aims to determine the spatial patterns of LST, analyze how the relationships between LST and its factors vary across space, and compare the effectiveness of ordinary least squares (OLS) and geographically weighted regression (GWR) in modelling these relationships. The spatial patterns of LST show statistically significant hot spot zones in the north-central parts of the study area (Moran's I = 0.172). The relationship between LST and its explanatory variables was modelled using OLS, and the model was then tested for spatial dependence using the Koenker (BP) statistic. The result revealed non-stationarity (p = 0.000), and consequently GWR was employed so that its performance could be compared with OLS. GWR (R² = 0.57, AIC = 1052.1) proved a more effective technique than OLS (R² = 0.42, AIC = 2162.0) for studying the relationship between LST and the selected explanatory variables. The use of GWR improved the accuracy of the model by capturing the spatial heterogeneity in the relationship between LST and its explanatory variables. Consequently, a localized understanding of the spatial patterns and driving factors of LST has been formulated.
Affiliation(s)
- Seyoum Melese Eshetie
  - Space Science and Geospatial Institute of Ethiopia, Remote Sensing Department, Addis Ababa, Ethiopia
7
Zhu Q, Conrad DN, Gartner ZJ. deMULTIplex2: robust sample demultiplexing for scRNA-seq. Genome Biol 2024; 25:37. [PMID: 38291503 PMCID: PMC10829271 DOI: 10.1186/s13059-024-03177-y] [Received: 04/27/2023] [Accepted: 01/18/2024] [Indexed: 02/01/2024] Open
Abstract
Sample multiplexing enables pooled analysis during single-cell RNA sequencing workflows, thereby increasing throughput and reducing batch effects. A challenge for all multiplexing techniques is to link sample-specific barcodes with cell-specific barcodes, then demultiplex sample identity post-sequencing. However, existing demultiplexing tools fail under many real-world conditions where barcode cross-contamination is an issue. We therefore developed deMULTIplex2, an algorithm inspired by a mechanistic model of barcode cross-contamination. deMULTIplex2 employs generalized linear models and expectation-maximization to probabilistically determine the sample identity of each cell. Benchmarking reveals superior performance across various experimental conditions, particularly on large or noisy datasets with unbalanced sample compositions.
Affiliation(s)
- Qin Zhu
  - Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA, 94158, USA
- Daniel N Conrad
  - Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA, 94158, USA
- Zev J Gartner
  - Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA, 94158, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, 94158, USA
  - Center for Cellular Construction, University of California, San Francisco, CA, 94158, USA
8
Baffour B, Aheto JMK, Das S, Godwin P, Richardson A. Geostatistical modelling of child undernutrition in developing countries using remote-sensed data: evidence from Bangladesh and Ghana demographic and health surveys. Sci Rep 2023; 13:21573. [PMID: 38062092 PMCID: PMC10703913 DOI: 10.1038/s41598-023-48980-y] [Received: 05/15/2023] [Accepted: 12/02/2023] [Indexed: 12/18/2023] Open
Abstract
Childhood chronic undernutrition, known as stunting, remains a critical public health problem globally. Unfortunately, while the global stunting prevalence has been declining over time as a result of concerted public health efforts, there are areas (notably in sub-Saharan Africa and South Asia) where progress has stagnated. These regions are also resource-poor, and monitoring progress in the fight against chronic undernutrition can be problematic. We propose geostatistical modelling using data from existing demographic surveys supplemented by remote-sensed information to provide improved estimates of childhood stunting, accounting for spatial and non-spatial differences across regions. We use two study areas, Bangladesh and Ghana, and our results, in the form of prevalence maps, identify communities for targeted intervention. For Bangladesh, the maps show that all districts in the south-eastern region have a greater risk of stunting, while in Ghana the greater northern region had the highest prevalence of stunting. In countries like Bangladesh and Ghana with limited resources, these maps can be useful diagnostic tools for health planning, decision making and implementation.
Affiliation(s)
- Bernard Baffour
  - School of Demography, Australian National University, 146 Ellery Crescent, Canberra, ACT, 2600, Australia
- Justice Moses K Aheto
  - Department of Biostatistics, University of Ghana, P.O. Box LG13, Accra, Ghana
  - WorldPop, University of Southampton, Southampton, SO17 1BJ, Hampshire, UK
- Sumonkanti Das
  - School of Demography, Australian National University, 146 Ellery Crescent, Canberra, ACT, 2600, Australia
- Penelope Godwin
  - School of Demography, Australian National University, 146 Ellery Crescent, Canberra, ACT, 2600, Australia
- Alice Richardson
  - Statistical Support Network, Australian National University, 110 Ellery Crescent, Canberra, ACT, 2600, Australia
9
Beaney T, Clarke J, Salman D, Woodcock T, Majeed A, Barahona M, Aylin P. Identifying potential biases in code sequences in primary care electronic healthcare records: a retrospective cohort study of the determinants of code frequency. BMJ Open 2023; 13:e072884. [PMID: 37758674 PMCID: PMC10537851 DOI: 10.1136/bmjopen-2023-072884] [Received: 02/16/2023] [Accepted: 09/11/2023] [Indexed: 09/29/2023] Open
Abstract
OBJECTIVES To determine whether the frequency of diagnostic codes for long-term conditions (LTCs) in primary care electronic healthcare records (EHRs) is associated with (1) disease coding incentives, (2) General Practice (GP), (3) patient sociodemographic characteristics and (4) calendar year of diagnosis. DESIGN Retrospective cohort study. SETTING GPs in England from 2015 to 2022 contributing to the Clinical Practice Research Datalink Aurum dataset. PARTICIPANTS All patients registered to a GP with at least one incident LTC diagnosed between 1 January 2015 and 31 December 2019. PRIMARY AND SECONDARY OUTCOME MEASURES The number of diagnostic codes for an LTC in (1) the first and (2) the second year following diagnosis, stratified by inclusion in the Quality and Outcomes Framework (QOF) financial incentive programme. RESULTS 3 113 724 patients were included, with 7 723 365 incident LTCs. Conditions included in QOF had higher rates of annual coding than conditions not included in QOF (1.03 vs 0.32 per year, p<0.0001). There was significant variation in code frequency by GP which was not explained by patient sociodemographics. We found significant associations with patient sociodemographics, with a trend towards higher coding rates in people living in areas of higher deprivation for both QOF and non-QOF conditions. Code frequency was lower for conditions with follow-up time in 2020, associated with the onset of the COVID-19 pandemic. CONCLUSIONS The frequency of diagnostic codes for newly diagnosed LTCs is influenced by factors including patient sociodemographics, disease inclusion in QOF, GP practice and the impact of the COVID-19 pandemic. Natural language processing or other methods using temporally ordered code sequences should account for these factors to minimise potential bias.
Affiliation(s)
- Thomas Beaney
  - Department of Primary Care and Public Health, Imperial College London, London, UK
  - Department of Mathematics, Imperial College London, London, UK
- Jonathan Clarke
  - Department of Mathematics, Imperial College London, London, UK
- David Salman
  - Department of Primary Care and Public Health, Imperial College London, London, UK
  - MSk Lab, Imperial College London, London, UK
- Thomas Woodcock
  - Department of Primary Care and Public Health, Imperial College London, London, UK
- Azeem Majeed
  - Department of Primary Care and Public Health, Imperial College London, London, UK
- Paul Aylin
  - Department of Primary Care and Public Health, Imperial College London, London, UK
10
Kang B, Goldlust S, Lee EC, Hughes J, Bansal S, Haran M. Spatial distribution and determinants of childhood vaccination refusal in the United States. Vaccine 2023; 41:3189-3195. [PMID: 37069031 DOI: 10.1016/j.vaccine.2023.04.019] [Received: 10/21/2022] [Revised: 04/04/2023] [Accepted: 04/05/2023] [Indexed: 04/19/2023]
Abstract
Parental refusal and delay of childhood vaccination have increased in recent years in the United States. This phenomenon challenges maintenance of herd immunity and increases the risk of outbreaks of vaccine-preventable diseases. We examine US county-level vaccine refusal for patients under five years of age collected during the period 2012-2015 from an administrative healthcare dataset. We model these data with a Bayesian zero-inflated negative binomial regression model to capture social and political processes that are associated with vaccine refusal, as well as factors that affect our measurement of vaccine refusal. Our work highlights fine-scale socio-demographic characteristics associated with vaccine refusal nationally, finds that spatial clustering in refusal can be explained by such factors, and has the potential to aid in the development of targeted public health strategies for optimizing vaccine uptake.
Affiliation(s)
- Bokgyeong Kang
  - Department of Statistics, Pennsylvania State University, University Park 16802, PA, USA
- Sandra Goldlust
  - New York University School of Medicine, New York 10016, NY, USA
- Elizabeth C Lee
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore 21205, MD, USA
- John Hughes
  - College of Health, Lehigh University, Bethlehem 18015, PA, USA
- Shweta Bansal
  - Department of Biology, Georgetown University, Washington 20007, DC, USA
- Murali Haran
  - Department of Statistics, Pennsylvania State University, University Park 16802, PA, USA
11
Nigussie TZ, Zewotir TT, Muluneh EK. Seasonal and spatial variations of malaria transmissions in northwest Ethiopia: Evaluating climate and environmental effects using generalized additive model. Heliyon 2023; 9:e15252. [PMID: 37089331 PMCID: PMC10114238 DOI: 10.1016/j.heliyon.2023.e15252] [Received: 04/14/2022] [Revised: 03/16/2023] [Accepted: 03/31/2023] [Indexed: 04/25/2023] Open
Abstract
The impacts of climate change and environmental predictors on malaria epidemiology remain unclear and have not been well investigated in the sub-Saharan African region. This study aimed to investigate the nonlinear effects of climate and environmental factors on monthly malaria cases in northwest Ethiopia, considering space-time interaction effects. The monthly malaria cases and population sizes of the 152 districts were obtained from the Amhara Public Health Institute and the Central Statistical Agency of Ethiopia. The climate and environmental data were retrieved from the US National Oceanic and Atmospheric Administration. The data were analyzed using a spatiotemporal generalized additive model. The spatial, temporal, and space-time interaction effects made the larger contributions to explaining the spatiotemporal distribution of malaria transmission. Malaria transmission was seasonal, with a higher number of cases occurring from September to November. The long-term trend of malaria incidence decreased between 2012 and 2018 and has been increasing since 2019. Areas neighboring the Abay gorge and the Benshangul-Gumuz, South Sudan, and Sudan borders had higher spatial effects. Climate and environmental predictors had significant nonlinear effects that were not stationary across the ranges of the variables, and they contributed less to explaining the variability of malaria incidence than seasonal, spatial, and temporal effects. The effects of climate and environmental predictors also varied across the areas, ecology, and landscape of the study sites and contributed little to explaining malaria transmission variability once space and time dimensions were accounted for.
Hence, exploring and developing an early warning system that predicts malaria outbreaks would play an essential role in controlling, preventing, and eliminating malaria in areas with lower and higher transmission levels, and would ultimately help achieve the malaria GTS milestones.
Affiliation(s)
- Teshager Zerihun Nigussie
  - Department of Statistics, College of Science, Bahir Dar University, Bahir Dar, Ethiopia
  - Department of Statistics, Faculty of Natural and Computational Sciences, Debre Tabor University, Debre Tabor, Ethiopia
  - Corresponding author. Department of Statistics, College of Science, Bahir Dar University, Bahir Dar, Ethiopia
- Temesgen T. Zewotir
  - School of Mathematics, Statistics and Computer Science, College of Agriculture Engineering and Science, University of KwaZulu-Natal, Durban, South Africa
- Essey Kebede Muluneh
  - School of Public Health, College of Medicine and Health Sciences, Bahir Dar University, Bahir Dar, Ethiopia
12
Egbon OA, Gayawan E. Modeling the spatial patterns of antenatal care utilization in Nigeria with inference based on Pólya-Gamma mixtures. J Appl Stat 2023; 51:866-890. [PMID: 38524798 PMCID: PMC10956928 DOI: 10.1080/02664763.2022.2164561] [Received: 02/24/2022] [Accepted: 12/20/2022] [Indexed: 02/25/2023]
Abstract
Despite the vast advantages of making antenatal care visits, the service utilization among pregnant women in Nigeria is suboptimal. A five-year monitoring estimate indicated that about 24% of the women who had live births made no visit. The non-utilization induced excessive zeroes in the outcome of interest. Thus, this study adopted a zero-inflated negative binomial model within a Bayesian framework to identify the spatial pattern and the key factors hindering antenatal care utilization in Nigeria. We overcome the intractability associated with posterior inference by adopting a Pólya-Gamma data-augmentation technique to facilitate inference. The Gibbs sampling algorithm was used to draw samples from the joint posterior distribution. Results revealed that type of place of residence, maternal level of education, access to mass media, household work index, and woman's working status have significant effects on the use of antenatal care services. Findings identified substantial state-level spatial disparity in antenatal care utilization across the country. Cost-effective techniques to achieve an acceptable frequency of utilization include the creation of a community-specific awareness to emphasize the importance and benefits of the appropriate utilization. Special consideration should be given to older pregnant women, women in poor antenatal utilization states, and women residing in poor road network regions.
Affiliation(s)
- Osafu Augustine Egbon
- Department of Statistics, Universidade Federal de São Carlos, São Carlos, Brazil
- Institute of Mathematical and Computer Sciences, University of São Paulo, São Carlos, Brazil
| | - Ezra Gayawan
- Department of Statistics, Federal University of Technology, Akure, Nigeria
|
13
|
Andrade AC, Pereira GH, Artes R. The circular quantile residual. Comput Stat Data Anal 2023. [DOI: 10.1016/j.csda.2022.107612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
|
14
|
Ishii S, Tanabe K, Ishimaru B, Kitahara K. Impact of COVID-19 on Long-Term Care Service Utilization of Older Home-Dwelling Adults in Japan. J Am Med Dir Assoc 2023; 24:156-163.e23. [PMID: 36592936 PMCID: PMC9742200 DOI: 10.1016/j.jamda.2022.12.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 11/03/2022] [Accepted: 12/04/2022] [Indexed: 12/14/2022]
Abstract
OBJECTIVES The COVID-19 outbreak severely affected long-term care (LTC) service provision. This study aimed to quantitatively evaluate its impact on the utilization of LTC services by older home-dwelling adults and identify its associated factors. DESIGN A retrospective repeated cross-sectional study. SETTING AND PARTICIPANTS Data from a nationwide LTC Insurance Comprehensive Database comprising monthly claims from January 2019 to September 2020. METHODS Interrupted time series analyses and segmented negative binomial regression were employed to examine changes in use for each of the 15 LTC services. Results of the analyses were synthesized using random effects meta-analysis in 3 service types (home visit, commuting, and short-stay services). RESULTS LTC service use declined in April 2020 when the state of emergency (SOE) was declared, followed by a gradual recovery in June after the SOE was lifted. There was a significant association between decline in LTC service use and SOE, whereas the association between LTC service use and the status of the infection spread was limited. Service type was associated with changes in service utilization, with a more precipitous decline in commuting and short-stay services than in home visiting services during the SOE. Service use by those with dementia was higher than that by those without dementia, particularly in commuting and short-stay services, partially canceling out the decline in service use that occurred during the SOE. CONCLUSIONS AND IMPLICATIONS There was a significant decline in LTC service utilization during the SOE. The decline varied depending on service types and the dementia severity of service users. These findings would help LTC professionals identify vulnerable groups and guide future plans geared toward effective infection prevention while alleviating unfavorable impacts by infection prevention measures. Future studies are required to examine the effects of the LTC service decline on older adults.
Affiliation(s)
- Shinya Ishii
- Division of the Health for the Elderly, Health and Welfare Bureau for the Elderly, Ministry of Health, Labour and Welfare, Tokyo, Japan.
|
15
|
Altinisik Y, Cankaya E. New zero-inflated regression models with a variant of censoring. Braz J Probab Stat 2022. [DOI: 10.1214/22-bjps544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Affiliation(s)
- Yasin Altinisik
- Department of Statistics, Faculty of Science and Literature, Sinop University, Sinop, Turkey
| | - Emel Cankaya
- Department of Statistics, Faculty of Science and Literature, Sinop University, Sinop, Turkey
|
16
|
Hughes J. A unified Gaussian copula methodology for spatial regression analysis. Sci Rep 2022; 12:15915. [PMID: 36151389 PMCID: PMC9508247 DOI: 10.1038/s41598-022-20171-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Accepted: 09/09/2022] [Indexed: 11/09/2022] Open
Abstract
Spatially referenced data arise in many fields, including imaging, ecology, public health, and marketing. Although principled smoothing or interpolation is paramount for many practitioners, regression, too, can be an important (or even the only or most important) goal of a spatial analysis. When doing spatial regression it is crucial to accommodate spatial variation in the response variable that cannot be explained by the spatially patterned explanatory variables included in the model. Failure to model both sources of spatial dependence (regression and extra-regression, if you will) can lead to erroneous inference for the regression coefficients. In this article I highlight an under-appreciated spatial regression model, namely, the spatial Gaussian copula regression model (SGCRM), and describe said model's advantages. Then I develop an intuitive, unified, and computationally efficient approach to inference for the SGCRM. I demonstrate the efficacy of the proposed methodology by way of an extensive simulation study along with analyses of a well-known dataset from disease mapping.
Affiliation(s)
- John Hughes
- Lehigh University, Bethlehem, PA, 18015, USA.
|
17
|
Saraiva EF, Vigas VP, Flesch MV, Gannon M, de Bragança Pereira CA. Modeling Overdispersed Dengue Data via Poisson Inverse Gaussian Regression Model: A Case Study in the City of Campo Grande, MS, Brazil. Entropy (Basel, Switzerland) 2022; 24:1256. [PMID: 36141142 PMCID: PMC9497985 DOI: 10.3390/e24091256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Revised: 08/26/2022] [Accepted: 09/04/2022] [Indexed: 06/16/2023]
Abstract
Dengue fever is a tropical disease transmitted mainly by the female Aedes aegypti mosquito that affects millions of people every year. As there is still no safe and effective vaccine, currently the best way to prevent the disease is to control the proliferation of the transmitting mosquito. Since the proliferation and life cycle of the mosquito depend on environmental variables such as temperature and water availability, among others, statistical models are needed to understand the existing relationships between environmental variables and the recorded number of dengue cases and predict the number of cases for some future time interval. This prediction is of paramount importance for the establishment of control policies. In general, dengue-fever datasets contain the number of cases recorded periodically (in days, weeks, months or years). Since many dengue-fever datasets tend to be of the overdispersed, long-tail type, some common models like the Poisson regression model or negative binomial regression model are not adequate to model it. For this reason, in this paper we propose modeling a dengue-fever dataset by using a Poisson-inverse-Gaussian regression model. The main advantage of this model is that it adequately models overdispersed long-tailed data because it has a wider skewness range than the negative binomial distribution. We illustrate the application of this model in a real dataset and compare its performance to that of a negative binomial regression model.
Affiliation(s)
| | - Valdemiro Piedade Vigas
- Institute of Mathematics, Federal University of Mato Grosso do Sul, Campo Grande 79070-900, MS, Brazil
| | - Mariana Villela Flesch
- Faculty of Engineering, Architecture and Urbanism and Geography, Federal University of Mato Grosso do Sul, Campo Grande 79070-900, MS, Brazil
| | - Mark Gannon
- Institute of Mathematics and Statistics, University of São Paulo, São Paulo 05508-090, SP, Brazil
|
18
|
Panić M, Radović M, Cvjetko Bubalo M, Radošević K, Rogošić M, Coutinho JAP, Radojčić Redovniković I, Jurinjak Tušek A. Prediction of pH Value of Aqueous Acidic and Basic Deep Eutectic Solvent Using COSMO-RS σ Profiles' Molecular Descriptors. Molecules 2022; 27:4489. [PMID: 35889358 PMCID: PMC9324476 DOI: 10.3390/molecules27144489] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/13/2022] [Revised: 07/05/2022] [Accepted: 07/11/2022] [Indexed: 12/10/2022] Open
Abstract
The aim of this work was to develop a simple and easy-to-apply model to predict the pH values of deep eutectic solvents (DESs) over a wide range of pH values that can be used in daily work. For this purpose, the pH values of 38 different DESs were measured (ranging from 0.36 to 9.31) and mathematically interpreted. To develop mathematical models, DESs were first numerically described using σ profiles generated with the COSMOtherm software. After the DESs’ description, the following models were used: (i) multiple linear regression (MLR), (ii) piecewise linear regression (PLR), and (iii) artificial neural networks (ANNs) to link the experimental values with the descriptors. Both PLR and ANN were found to be applicable to predict the pH values of DESs with a very high goodness of fit (R² for independent validation > 0.8600). Due to the good mathematical correlation of the experimental and predicted values, the σ profile generated with COSMOtherm could be used as a DES molecular descriptor for the prediction of their pH values.
Affiliation(s)
- Manuela Panić
- Faculty of Food Technology and Biotechnology, University of Zagreb, Pierottijeva Ulica 6, 10000 Zagreb, Croatia; (M.P.); (M.R.); (M.C.B.); (K.R.); (A.J.T.)
| | - Mia Radović
- Faculty of Food Technology and Biotechnology, University of Zagreb, Pierottijeva Ulica 6, 10000 Zagreb, Croatia; (M.P.); (M.R.); (M.C.B.); (K.R.); (A.J.T.)
| | - Marina Cvjetko Bubalo
- Faculty of Food Technology and Biotechnology, University of Zagreb, Pierottijeva Ulica 6, 10000 Zagreb, Croatia; (M.P.); (M.R.); (M.C.B.); (K.R.); (A.J.T.)
| | - Kristina Radošević
- Faculty of Food Technology and Biotechnology, University of Zagreb, Pierottijeva Ulica 6, 10000 Zagreb, Croatia; (M.P.); (M.R.); (M.C.B.); (K.R.); (A.J.T.)
| | - Marko Rogošić
- Faculty of Chemical Engineering and Technology, University of Zagreb, Marulićev Trg 19, 10000 Zagreb, Croatia;
| | - João A. P. Coutinho
- CICECO—Aveiro Institute of Materials, Department of Chemistry, University of Aveiro, 3810-193 Aveiro, Portugal;
| | - Ivana Radojčić Redovniković
- Faculty of Food Technology and Biotechnology, University of Zagreb, Pierottijeva Ulica 6, 10000 Zagreb, Croatia; (M.P.); (M.R.); (M.C.B.); (K.R.); (A.J.T.)
| | - Ana Jurinjak Tušek
- Faculty of Food Technology and Biotechnology, University of Zagreb, Pierottijeva Ulica 6, 10000 Zagreb, Croatia; (M.P.); (M.R.); (M.C.B.); (K.R.); (A.J.T.)
|
19
|
Feng C. Spatial-temporal generalized additive model for modeling COVID-19 mortality risk in Toronto, Canada. Spatial Statistics 2022; 49:100526. [PMID: 34249608 PMCID: PMC8257405 DOI: 10.1016/j.spasta.2021.100526] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/14/2021] [Revised: 06/03/2021] [Accepted: 06/25/2021] [Indexed: 06/13/2023]
Abstract
This article presents a spatial-temporal generalized additive model for modeling geo-referenced COVID-19 mortality data in Toronto, Canada. A range of factors and spatial-temporal terms are incorporated into the model. The non-linear and interactive effects of the neighborhood-level factors, i.e., population density and average of income, are modeled as a two-dimensional spline smoother. The change of spatial pattern over time is modeled as a three-dimensional tensor product smoother. By fitting this model, the space-time effect can uncover the underlying spatial-temporal pattern that is not explained by the covariates. The performance of the modeling method based on the individual data is also compared to the modeling methods based on the aggregated data in terms of in-sample and out-of-sample predictive checking. The results suggest that the individual-level based analysis provided a better overall model fit and higher predictive accuracy for detecting epidemic peaks in this application as compared to the analysis based on the aggregated data.
Affiliation(s)
- Cindy Feng
- Department of Community Health and Epidemiology, Faculty of Medicine, Dalhousie University, Halifax, Nova Scotia, Canada, B3H 1V7
|
20
|
Wongnak P, Bord S, Jacquot M, Agoulon A, Beugnet F, Bournez L, Cèbe N, Chevalier A, Cosson JF, Dambrine N, Hoch T, Huard F, Korboulewsky N, Lebert I, Madouasse A, Mårell A, Moutailler S, Plantard O, Pollet T, Poux V, René-Martellet M, Vayssier-Taussat M, Verheyden H, Vourc'h G, Chalvet-Monfray K. Meteorological and climatic variables predict the phenology of Ixodes ricinus nymph activity in France, accounting for habitat heterogeneity. Sci Rep 2022; 12:7833. [PMID: 35552424 PMCID: PMC9098447 DOI: 10.1038/s41598-022-11479-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/22/2021] [Accepted: 03/31/2022] [Indexed: 12/04/2022] Open
Abstract
Ixodes ricinus ticks (Acari: Ixodidae) are the most important vector for Lyme borreliosis in Europe. As climate change might affect their distributions and activities, this study aimed to determine the effects of environmental factors, i.e., meteorological, bioclimatic, and habitat characteristics on host-seeking (questing) activity of I. ricinus nymphs, an important stage in disease transmissions, across diverse climatic types in France over 8 years. Questing activity was observed using a repeated removal sampling with a cloth-dragging technique in 11 sampling sites from 7 tick observatories from 2014 to 2021 at approximately 1-month intervals, involving 631 sampling campaigns. Three phenological patterns were observed, potentially following a climatic gradient. The mixed-effects negative binomial regression revealed that observed nymph counts were driven by different interval-average meteorological variables, including 1-month moving average temperature, previous 3-to-6-month moving average temperature, and 6-month moving average minimum relative humidity. The interaction effects indicated that the phenology in colder climates peaked differently from that of warmer climates. Also, land cover characteristics that support the highest baseline abundance were moderate forest fragmentation with transition borders with agricultural areas. Finally, our model could potentially be used to predict seasonal human-tick exposure risks in France that could contribute to mitigating Lyme borreliosis risk.
Affiliation(s)
- Phrutsamon Wongnak
- Université de Lyon, INRAE, VetAgro Sup, UMR EPIA, 69280, Marcy l'Etoile, France
- Université Clermont Auvergne, INRAE, VetAgro Sup, UMR EPIA, 63122, Saint-Genès-Champanelle, France
| | - Séverine Bord
- Université Paris-Saclay, AgroParisTech, INRAE, UMR MIA-Paris, 75005, Paris, France
| | - Maude Jacquot
- Université de Lyon, INRAE, VetAgro Sup, UMR EPIA, 69280, Marcy l'Etoile, France
- Université Clermont Auvergne, INRAE, VetAgro Sup, UMR EPIA, 63122, Saint-Genès-Champanelle, France
- Ifremer, RBE-SGMM-LGPMM, 17390, La Tremblade, France
| | | | - Frédéric Beugnet
- Global Technical Services, Boehringer-Ingelheim Animal Health, 69007, Lyon, France
| | - Laure Bournez
- Nancy Laboratory for Rabies and Wildlife, The French Agency for Food, Environmental and Occupational Health and Safety (ANSES), 54220, Malzéville, France
| | - Nicolas Cèbe
- Université de Toulouse, INRAE, UR CEFS, 31326, Castanet-Tolosan, France
- LTSER ZA PYRénées GARonne, 31326, Auzeville-Tolosane, France
| | | | | | - Naïma Dambrine
- Université de Lyon, INRAE, VetAgro Sup, UMR EPIA, 69280, Marcy l'Etoile, France
- Université Clermont Auvergne, INRAE, VetAgro Sup, UMR EPIA, 63122, Saint-Genès-Champanelle, France
| | - Thierry Hoch
- INRAE, Oniris, UMR BIOEPAR, 44300, Nantes, France
| | | | | | - Isabelle Lebert
- Université de Lyon, INRAE, VetAgro Sup, UMR EPIA, 69280, Marcy l'Etoile, France
- Université Clermont Auvergne, INRAE, VetAgro Sup, UMR EPIA, 63122, Saint-Genès-Champanelle, France
| | | | | | - Sara Moutailler
- ANSES, ENVA, INRAE, UMR 956 BIPAR, 94701, Maisons-Alfort, France
| | | | - Thomas Pollet
- ANSES, ENVA, INRAE, UMR 956 BIPAR, 94701, Maisons-Alfort, France
- INRAE, CIRAD, UMR ASTRE, 34398, Montpellier, France
| | - Valérie Poux
- Université de Lyon, INRAE, VetAgro Sup, UMR EPIA, 69280, Marcy l'Etoile, France
- Université Clermont Auvergne, INRAE, VetAgro Sup, UMR EPIA, 63122, Saint-Genès-Champanelle, France
| | - Magalie René-Martellet
- Université de Lyon, INRAE, VetAgro Sup, UMR EPIA, 69280, Marcy l'Etoile, France
- Université Clermont Auvergne, INRAE, VetAgro Sup, UMR EPIA, 63122, Saint-Genès-Champanelle, France
| | | | - Hélène Verheyden
- Université de Toulouse, INRAE, UR CEFS, 31326, Castanet-Tolosan, France
- LTSER ZA PYRénées GARonne, 31326, Auzeville-Tolosane, France
| | - Gwenaël Vourc'h
- Université de Lyon, INRAE, VetAgro Sup, UMR EPIA, 69280, Marcy l'Etoile, France
- Université Clermont Auvergne, INRAE, VetAgro Sup, UMR EPIA, 63122, Saint-Genès-Champanelle, France
| | - Karine Chalvet-Monfray
- Université de Lyon, INRAE, VetAgro Sup, UMR EPIA, 69280, Marcy l'Etoile, France.
- Université Clermont Auvergne, INRAE, VetAgro Sup, UMR EPIA, 63122, Saint-Genès-Champanelle, France.
|
21
|
Choudhary S, Satija R. Comparison and evaluation of statistical error models for scRNA-seq. Genome Biol 2022; 23:27. [PMID: 35042561 PMCID: PMC8764781 DOI: 10.1186/s13059-021-02584-9] [Citation(s) in RCA: 120] [Impact Index Per Article: 60.0] [Received: 07/19/2021] [Accepted: 12/20/2021] [Indexed: 01/31/2023] Open
Abstract
BACKGROUND Heterogeneity in single-cell RNA-seq (scRNA-seq) data is driven by multiple sources, including biological variation in cellular state as well as technical variation introduced during experimental processing. Deconvolving these effects is a key challenge for preprocessing workflows. Recent work has demonstrated the importance and utility of count models for scRNA-seq analysis, but there is a lack of consensus on which statistical distributions and parameter settings are appropriate. RESULTS Here, we analyze 59 scRNA-seq datasets that span a wide range of technologies, systems, and sequencing depths in order to evaluate the performance of different error models. We find that while a Poisson error model appears appropriate for sparse datasets, we observe clear evidence of overdispersion for genes with sufficient sequencing depth in all biological systems, necessitating the use of a negative binomial model. Moreover, we find that the degree of overdispersion varies widely across datasets, systems, and gene abundances, and argues for a data-driven approach for parameter estimation. CONCLUSIONS Based on these analyses, we provide a set of recommendations for modeling variation in scRNA-seq data, particularly when using generalized linear models or likelihood-based approaches for preprocessing and downstream analysis.
Affiliation(s)
- Saket Choudhary
- New York Genome Center, 101 Avenue of the Americas, New York, 10013 USA
| | - Rahul Satija
- New York Genome Center, 101 Avenue of the Americas, New York, 10013 USA
- Center for Genomics and Systems Biology, New York University, 12 Waverly Pl, New York, 10003 USA
|
22
|
Botta-Dukát Z. Devil in the details: how can we avoid potential pitfalls of CATS regression when our data do not follow a Poisson distribution? PeerJ 2022; 10:e12763. [PMID: 35174013 PMCID: PMC8763042 DOI: 10.7717/peerj.12763] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2021] [Accepted: 12/17/2021] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Community assembly by trait selection (CATS) allows for the detection of environmental filtering and estimation of the relative role of local and regional (meta-community-level) effects on community composition from trait and abundance data without using environmental data. It has been shown that Poisson regression of abundances against trait data results in the same parameter estimates. Abundance data do not necessarily follow a Poisson distribution, and in these cases, other generalized linear models should be fitted to obtain unbiased parameter estimates. AIMS This paper discusses how the original algorithm for calculating the relative role of local and regional effects has to be modified if Poisson model is not appropriate. RESULTS It can be shown that the use of the logarithm of regional relative abundances as an offset is appropriate only if a log-link function is applied. Otherwise, the link function should be applied to the product of local total abundance and regional relative abundances. Since this product may be outside the domain of the link function, the use of log-link is recommended, even if it is not the canonical link. An algorithm is also suggested for calculating the offset when data are zero-inflated. The relative role of local and regional effects is measured by Kullback-Leibler R². The formula for this measure presented by Shipley (2014) is valid only if the abundances follow a Poisson distribution. Otherwise, slightly different formulas have to be applied. Beyond theoretical considerations, the proposed refinements are illustrated by numerical examples. CATS regression could be a useful tool for community ecologists, but it has to be slightly modified when abundance data do not follow a Poisson distribution. This paper gives detailed instructions on the necessary refinement.
|
23
|
Bai W, Dong M, Li L, Feng C, Xu W. Randomized quantile residuals for diagnosing zero-inflated generalized linear mixed models with applications to microbiome count data. BMC Bioinformatics 2021; 22:564. [PMID: 34823466 PMCID: PMC8620156 DOI: 10.1186/s12859-021-04371-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/13/2021] [Accepted: 09/11/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND For differential abundance analysis, zero-inflated generalized linear models, typically zero-inflated NB models, have been increasingly used to model microbiome and other sequencing count data. A common assumption in estimating the false discovery rate is that the p values are uniformly distributed under the null hypothesis, which demands that the postulated model fit the count data adequately. Mis-specification of the distribution of the count data may lead to excess false discoveries. Therefore, model checking is critical to control the FDR at a nominal level in differential abundance analysis. Increasing studies show that the method of randomized quantile residual (RQR) performs well in diagnosing count regression models. However, the performance of RQR in diagnosing zero-inflated GLMMs for sequencing count data has not been extensively investigated in the literature. RESULTS We conduct large-scale simulation studies to investigate the performance of the RQRs for zero-inflated GLMMs. The simulation studies show that the type I error rates of the GOF tests with RQRs are very close to the nominal level; in addition, the scatter-plots and Q-Q plots of RQRs are useful in discerning the good and bad models. We also apply the RQRs to diagnose six GLMMs to a real microbiome dataset. The results show that the OTU counts at the genus level of this dataset (after a truncation treatment) can be modelled well by zero-inflated and zero-modified NB models. CONCLUSION RQR is an excellent tool for diagnosing GLMMs for zero-inflated count data, particularly the sequencing count data arising in microbiome studies. In the supplementary materials, we provided two generic R functions, called rqr.glmmtmb and rqr.hurdle.glmmtmb, for calculating the RQRs given fitting outputs of the R package glmmTMB.
Affiliation(s)
- Wei Bai
- Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, CA Canada
| | - Mei Dong
- Dalla Lana School of Public Health, University of Toronto, Toronto, CA Canada
| | - Longhai Li
- Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, CA Canada
| | - Cindy Feng
- Department of Community Health and Epidemiology, Dalhousie University, Halifax, CA Canada
| | - Wei Xu
- Dalla Lana School of Public Health, University of Toronto, Toronto, CA Canada
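The randomized quantile residual (RQR) discussed in the Bai et al. abstract above can be illustrated with a minimal sketch. It assumes a plain Poisson model rather than the zero-inflated GLMMs studied in the paper, and it is not the authors' `rqr.glmmtmb` function; the names `poisson_cdf` and `randomized_quantile_residual` are hypothetical. For a count y with fitted CDF F, a uniform draw u is taken between F(y - 1) and F(y) and mapped through the standard normal quantile function, so residuals are approximately standard normal when the model fits.

```python
import math
import random
from statistics import NormalDist


def poisson_cdf(k, mu):
    """P(Y <= k) for Y ~ Poisson(mu); returns 0 for k < 0."""
    if k < 0:
        return 0.0
    return sum(math.exp(-mu) * mu**j / math.factorial(j) for j in range(k + 1))


def randomized_quantile_residual(y, mu, rng):
    """RQR for one count y under an assumed Poisson(mu) fit (illustrative only)."""
    lower = poisson_cdf(y - 1, mu)   # F(y - 1)
    upper = poisson_cdf(y, mu)       # F(y)
    u = rng.uniform(lower, upper)    # randomize within the jump of the discrete CDF
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return NormalDist().inv_cdf(u)   # map to the standard normal scale


rng = random.Random(0)               # fixed seed so the sketch is reproducible
observed = [0, 1, 3, 2, 7]           # made-up counts
fitted_means = [1.2, 1.0, 2.5, 2.0, 3.0]  # made-up fitted means
residuals = [randomized_quantile_residual(y, m, rng)
             for y, m in zip(observed, fitted_means)]
```

In practice the resulting residuals would be inspected with a Q-Q plot or a normality test, as the abstract describes.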
|
24
|
Feng CX. A comparison of zero-inflated and hurdle models for modeling zero-inflated count data. Journal of Statistical Distributions and Applications 2021; 8:8. [PMID: 34760432 PMCID: PMC8570364 DOI: 10.1186/s40488-021-00121-4] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 02/02/2021] [Accepted: 05/19/2021] [Indexed: 11/12/2022]
Abstract
Counts data with excessive zeros are frequently encountered in practice. For example, the number of health services visits often includes many zeros representing the patients with no utilization during a follow-up time. A common feature of this type of data is that the count measure tends to have excessive zero beyond a common count distribution can accommodate, such as Poisson or negative binomial. Zero-inflated or hurdle models are often used to fit such data. Despite the increasing popularity of ZI and hurdle models, there is still a lack of investigation of the fundamental differences between these two types of models. In this article, we reviewed the zero-inflated and hurdle models and highlighted their differences in terms of their data generating processes. We also conducted simulation studies to evaluate the performances of both types of models. The final choice of regression model should be made after a careful assessment of goodness of fit and should be tailored to a particular data in question.
Affiliation(s)
- Cindy Xin Feng
- Department of Community Health and Epidemiology, Faculty of Medicine, Dalhousie University, 5790 University Avenue, Halifax, B3H 4R2 Nova Scotia Canada
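The difference in data-generating processes between zero-inflated and hurdle models, highlighted in the Feng abstract above, can be sketched with Poisson versions of both. This is a hedged illustration, not code from the paper: the function names and the parameter values (`mu = 2.0`, mixing weight `0.3`) are assumptions chosen for the example. In the zero-inflated model, zeros come from two sources (a structural-zero component and the Poisson count itself); in the hurdle model, all zeros come from a single binary component and positive counts follow a zero-truncated Poisson.

```python
import math


def poisson_pmf(k, mu):
    """P(Y = k) for Y ~ Poisson(mu)."""
    return math.exp(-mu) * mu**k / math.factorial(k)


def zip_pmf(k, mu, pi):
    """Zero-inflated Poisson: structural zero with probability pi, else Poisson(mu)."""
    base = (1.0 - pi) * poisson_pmf(k, mu)
    return pi + base if k == 0 else base


def hurdle_pmf(k, mu, p0):
    """Hurdle Poisson: zero with probability p0, else zero-truncated Poisson(mu)."""
    if k == 0:
        return p0
    return (1.0 - p0) * poisson_pmf(k, mu) / (1.0 - math.exp(-mu))


mu = 2.0  # assumed count-component mean

# Same nominal weight 0.3, different meaning:
zip_zero = zip_pmf(0, mu, pi=0.3)        # 0.3 + 0.7 * exp(-2): structural plus sampling zeros
hurdle_zero = hurdle_pmf(0, mu, p0=0.3)  # exactly 0.3: the only source of zeros
```

One consequence of this design is that a hurdle model can represent fewer zeros than the base count distribution implies, whereas a zero-inflated model with a mixing probability in [0, 1] can only add zeros.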
|
25
|
Sharker S, Balbuena L, Marcoux G, Feng CX. Modeling socio-demographic and clinical factors influencing psychiatric inpatient service use: a comparison of models for zero-Inflated and overdispersed count data. BMC Med Res Methodol 2020; 20:232. [PMID: 32938381 PMCID: PMC7495888 DOI: 10.1186/s12874-020-01112-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/28/2020] [Accepted: 09/02/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Psychiatric disorders may occur as a single episode or be persistent and relapsing, sometimes leading to suicidal behaviours. The exact causes of psychiatric disorders are hard to determine but easy access to health care services can help to reduce their severity. The aim of this study was to investigate the factors associated with repeated hospitalizations among the patients with psychiatric illness, which may help the policy makers to target the high-risk groups in a more focused manner. METHODS A large linked administrative database consisting of 200,537 patients with psychiatric diagnosis in the years of 2008-2012 was used in this analysis. Various counts regression models including zero-inflated and hurdle models were considered for analyzing the hospitalization rate among patients with psychiatric disorders within three months follow-up since their index visit dates. The covariates for this study consisted of socio-demographic and clinical characteristics of the patients. RESULTS The results show that the odds of hospitalization are significantly higher among registered Indians, male patients and younger patients. Hospitalization rate depends on the patients' disease types. Having previously visited a general physician served a protective role for psychiatric hospitalization during the study period. Patients who had seen an outpatient psychiatrist were more likely to have a higher number of psychiatric hospitalizations. This may indicate that psychiatrists tend to see patients with more severe illnesses, who require hospital-based care for managing their illness. CONCLUSIONS Providing easier access to registered Indian people and youth may reduce the need for hospital-based care. Patients with mental health conditions may benefit from greater and more timely access to primary care.
Affiliation(s)
- Sharmin Sharker
- School of Public Health, University of Saskatchewan, 104 Clinic Place, Saskatoon, Canada
| | - Lloyd Balbuena
- Department of Psychiatry, College of Medicine, University of Saskatchewan, 103 Hospital Drive, Saskatoon, S7N 0W8, Canada
| | - Gene Marcoux
- Department of Psychiatry, College of Medicine, University of Saskatchewan, 103 Hospital Drive, Saskatoon, S7N 0W8, Canada
| | - Cindy Xin Feng
- School of Public Health, University of Saskatchewan, 104 Clinic Place, Saskatoon, Canada.
- Department of Community Health and Epidemiology, Faculty of Medicine, Dalhousie University, 5790 University Avenue, Halifax, B3H 1V7, Canada.
|