1
Fakieh B, Saleem F. COVID-19 from symptoms to prediction: A statistical and machine learning approach. Comput Biol Med 2024; 182:109211. [PMID: 39342677 DOI: 10.1016/j.compbiomed.2024.109211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Revised: 09/02/2024] [Accepted: 09/23/2024] [Indexed: 10/01/2024]
Abstract
During the COVID-19 pandemic, the analysis of patient data has become a cornerstone for developing effective public health strategies. This study leverages a dataset comprising over 10,000 anonymized patient records from various leading medical institutions to predict COVID-19 patient age groups using a suite of statistical and machine learning techniques. Initially, extensive statistical tests, including ANOVA and t-tests, were used to assess relationships among demographic and symptomatic variables. The study then employed machine learning models such as Decision Tree, Naïve Bayes, K-Nearest Neighbors (KNN), Gradient Boosted Trees, Support Vector Machine, and Random Forest, with rigorous data preprocessing to enhance model accuracy. Further improvements were sought through ensemble methods: bagging, boosting, and stacking. Our findings indicate strong associations between key symptoms and patient age groups, with ensemble methods significantly enhancing model accuracy. Specifically, stacking with Random Forest as the meta-learner exhibited the highest accuracy (0.7054). In addition, stacking notably improved the performance of KNN (from 0.529 to 0.63) and Naïve Bayes (from 0.554 to 0.622) and proved to be the most successful prediction method. The study aimed to understand the number of symptoms identified in COVID-19 patients and their association with different age groups. The results can assist doctors and higher authorities in improving treatment strategies. Additionally, several decision-making techniques can be applied during a pandemic, tailored to specific age groups, such as resource allocation, medicine availability, vaccine development, and treatment strategies. The integration of these predictive models into clinical settings could support real-time public health responses and targeted intervention strategies.
Affiliation(s)
- Bahjat Fakieh
  - Department of Information System, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Farrukh Saleem
  - School of Built Environment, Engineering, and Computing, Leeds Beckett University, Leeds, LS6 3QR, UK
2
Sergio AR, Schimit PHT. Optimizing Contact Network Topological Parameters of Urban Populations Using the Genetic Algorithm. Entropy (Basel, Switzerland) 2024; 26:661. [PMID: 39202131 PMCID: PMC11353388 DOI: 10.3390/e26080661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Revised: 07/11/2024] [Accepted: 07/26/2024] [Indexed: 09/03/2024]
Abstract
This paper explores the application of complex network models and genetic algorithms in epidemiological modeling. By considering the small-world and Barabási-Albert network models, we aim to replicate the dynamics of disease spread in urban environments. This study emphasizes the importance of accurately mapping individual contacts and social networks to forecast disease progression. Using a genetic algorithm, we estimate the input parameters for network construction, thereby simulating disease transmission within these networks. Our results demonstrate the networks' resemblance to real social interactions, highlighting their potential in predicting disease spread. This study underscores the significance of complex network models and genetic algorithms in understanding and managing public health crises.
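As an illustration only (not the authors' implementation), the idea of using a genetic algorithm to estimate a network-construction parameter can be sketched in plain Python. The sketch assumes a Watts-Strogatz small-world generator and, as a stand-in for the paper's fit to disease-transmission dynamics, scores each candidate rewiring probability by how closely the generated network's average clustering coefficient matches a target value. The function names, the fitness criterion, and all numeric settings are hypothetical.

```python
import random


def watts_strogatz(n, k, p, rng):
    """Ring lattice of n nodes, each joined to its k nearest neighbours
    (k even, k < n); every lattice edge is rewired with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    for i in range(n):
        for j in range(1, k // 2 + 1):
            old = (i + j) % n
            if rng.random() < p and old in adj[i]:
                # Candidate endpoints: no self-loops, no duplicate edges.
                candidates = [v for v in range(n) if v != i and v not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


def avg_clustering(adj):
    """Mean local clustering coefficient (nodes of degree < 2 count as 0)."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue
        nb = list(nbrs)
        links = sum(
            1 for a in range(d) for b in range(a + 1, d) if nb[b] in adj[nb[a]]
        )
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)


def fit_rewiring_probability(target_c, n=60, k=4, pop_size=12,
                             generations=15, seed=0):
    """Toy genetic algorithm over the rewiring probability p in [0, 1].
    Hypothetical fitness: distance between the clustering of one sampled
    network and target_c (noisy, because each evaluation samples a graph)."""
    rng = random.Random(seed)

    def error(p):
        return abs(avg_clustering(watts_strogatz(n, k, p, rng)) - target_c)

    population = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=error)
        parents = ranked[: pop_size // 2]          # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2 + rng.gauss(0, 0.05)  # crossover + mutation
            children.append(min(1.0, max(0.0, child)))
        population = parents + children
    return min(population, key=error)


if __name__ == "__main__":
    print(fit_rewiring_probability(target_c=0.3))
```

A fit to epidemic data would replace `error` with a comparison between simulated and observed case curves; the selection, crossover, and mutation loop would stay the same.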
3
Cortes-Ramirez J, Wilches-Vega J, Caicedo-Velasquez B, Paris-Pineda O, Sly P. Spatiotemporal hierarchical Bayesian analysis to identify factors associated with COVID-19 in suburban areas in Colombia. Heliyon 2024; 10:e30182. [PMID: 38707376 PMCID: PMC11068642 DOI: 10.1016/j.heliyon.2024.e30182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Revised: 04/21/2024] [Accepted: 04/22/2024] [Indexed: 05/07/2024] Open
Abstract
Introduction: The pandemic had a profound impact on the provision of health services in Cúcuta, Colombia, where the neighbourhood-level risk of Covid-19 has not been investigated. Identifying the sociodemographic and environmental risk factors of Covid-19 in large cities is key to better estimating its morbidity risk and supporting health strategies that target specific suburban areas. This study aims to identify the factors associated with the risk of Covid-19 in Cúcuta, considering inter-spatial and temporal variations of the disease in the city's neighbourhoods between 2020 and 2022.
Methods: Age-adjusted rates of Covid-19 were calculated for each Cúcuta neighbourhood and each quarter between 2020 and 2022. A hierarchical spatial Bayesian model was used to estimate the risk of Covid-19, adjusting for socioenvironmental factors per neighbourhood across the study period. Two spatiotemporal specifications were compared (a nonparametric temporal trend, with and without space-time interaction). The posterior means of the spatial and spatiotemporal effects were used to map the Covid-19 risk.
Results: There were 65,949 Covid-19 cases in the study period, with a varying standardized Covid-19 rate that peaked in October-December 2020 and April-June 2021. Both models identified an association of the poverty and stringency indexes, education level and PM10 with Covid-19, although the best-fitting model, which included a space-time interaction, estimated a strong association with the number of high-traffic roads only. The highest risk of Covid-19 was found in neighbourhoods in west, central, and east Cúcuta.
Conclusions: The number of high-traffic roads is the most important risk factor for Covid-19 infection in Cúcuta. This indicator of mobility and connectivity overrules other socioenvironmental factors when Bayesian models include a space-time interaction. Bayesian spatial models are important tools for identifying significant determinants of Covid-19 and at-risk neighbourhoods in large cities. Further research is needed to establish causal links between these factors and Covid-19.
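The abstract does not state how the age-adjusted rates were computed. A minimal sketch, assuming direct standardization against a reference population, shows the arithmetic; the function name, the age bands, and every number below are hypothetical and are not taken from the study.

```python
def age_adjusted_rate(cases, population, standard_population, per=100_000):
    """Directly standardized rate: age-specific rates weighted by the
    share of each age group in a reference (standard) population."""
    total_std = sum(standard_population.values())
    rate = 0.0
    for group, std_count in standard_population.items():
        age_specific_rate = cases[group] / population[group]
        rate += age_specific_rate * (std_count / total_std)
    return rate * per


if __name__ == "__main__":
    # Hypothetical neighbourhood-quarter with two age bands.
    cases = {"0-39": 10, "40+": 30}
    population = {"0-39": 1000, "40+": 1000}
    standard = {"0-39": 600, "40+": 400}
    print(age_adjusted_rate(cases, population, standard))
```

Here the age-specific rates are 0.01 and 0.03, the weights are 0.6 and 0.4, and the adjusted rate is (0.01 × 0.6 + 0.03 × 0.4) × 100,000 = 1,800 per 100,000.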
Affiliation(s)
- J. Cortes-Ramirez
  - Centre for Data Science, Queensland University of Technology, Australia
  - Faculty of Medical and Health Sciences, University of Santander, Colombia
  - Children's Health and Environment Program, Child Health Research Centre, The University of Queensland, Australia
- J.D. Wilches-Vega
  - Faculty of Medical and Health Sciences, University of Santander, Colombia
- B. Caicedo-Velasquez
  - Epidemiology Research Group, Faculty of Public Health, University of Antioquia, Colombia
- O.M. Paris-Pineda
  - Faculty of Medical and Health Sciences, University of Santander, Colombia
- P.D. Sly
  - Children's Health and Environment Program, Child Health Research Centre, The University of Queensland, Australia
4
Uddin S, Khan A, Lu H, Zhou F, Karim S, Hajati F, Moni MA. Road networks and socio-demographic factors to explore COVID-19 infection during its different waves. Sci Rep 2024; 14:1551. [PMID: 38233430 PMCID: PMC10794216 DOI: 10.1038/s41598-024-51610-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 01/07/2024] [Indexed: 01/19/2024] Open
Abstract
The COVID-19 pandemic triggered an unprecedented level of restrictive measures globally. Most countries resorted to lockdowns at some point to buy much-needed time for flattening the curve and scaling up vaccination and treatment capacity. Although lockdowns, social distancing and business closures generally slowed case growth, there is growing concern about the social, economic and psychological impact of these restrictions, especially on the disadvantaged and poorer segments of society. While we are all in this together, these segments often bear the heavier toll of the pandemic and face harsher restrictions or get blamed for community transmission. This study proposes a road-network-based approach to model mobility patterns between localities during lockdown stages. It uses a panel regression method to analyse the effects of mobility on COVID-19 transmission in an Australian context, together with a close look at suburban population characteristics such as age, income and education. First, we model how the local road networks between neighbouring suburbs (i.e., the neighbourhood measure) and the current infection count affect case growth, and how these effects differ between the delta and omicron variants. We use a geographic information system together with population and infection data to measure road connections, mobility and transmission probability across the suburbs. We then examine three socio-demographic variables (age, education and income) and explore how they moderate the relationship between the dependent and independent variables (infection rate and neighbourhood measure). The results show strong model performance in predicting infection rate from neighbourhood road connections. However, apart from age in the delta variant context, the other variables (income and education level) do not seem to moderate the relationship between infection rate and neighbourhood measure. The results indicate that suburbs with a more socio-economically disadvantaged population do not necessarily contribute more to community transmission. The study findings could help stakeholders tailor health decisions for future pandemics.
Affiliation(s)
- Shahadat Uddin
  - School of Project Management, Faculty of Engineering, The University of Sydney, Forest Lodge, NSW, 2037, Australia
- Arif Khan
  - School of Project Management, Faculty of Engineering, The University of Sydney, Forest Lodge, NSW, 2037, Australia
- Haohui Lu
  - School of Project Management, Faculty of Engineering, The University of Sydney, Forest Lodge, NSW, 2037, Australia
- Fangyu Zhou
  - School of Project Management, Faculty of Engineering, The University of Sydney, Forest Lodge, NSW, 2037, Australia
- Shakir Karim
  - School of Project Management, Faculty of Engineering, The University of Sydney, Forest Lodge, NSW, 2037, Australia
- Farshid Hajati
  - School of Science and Technology, University of New England, Armidale, NSW, 2350, Australia
- Mohammad Ali Moni
  - Artificial Intelligence and Cyber Futures Institute, Charles Sturt University, Bathurst, NSW, 2795, Australia
5
Relationships between COVID-19 and disaster risk in Costa Rican municipalities. Natural Hazards Research 2023; 3:336-343. [PMCID: PMC9922674 DOI: 10.1016/j.nhres.2023.02.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/22/2022] [Revised: 02/01/2023] [Accepted: 02/07/2023] [Indexed: 07/23/2024]
Abstract
The COVID-19 pandemic has had far-reaching impacts on every aspect of human life since the first confirmed case in December 2019. Costa Rica reported its first case of COVID-19 in March 2020, coinciding with a notable correlation between the occurrence of disaster events at the municipal scale over the past five decades. In Costa Rica, over 90% of disasters are hydrometeorological in nature, while geological disasters have caused significant economic and human losses throughout the country's history. To analyze the relationship between COVID-19 cases and disaster events in Costa Rica, two Generalized Linear Models (GLMs) were used to statistically evaluate the influence of socio-environmental parameters such as population density, social development index, road density, and non-forested areas. The results showed that population and road density are the most critical factors in explaining the spread of COVID-19, while population density and social development index can provide insights into disaster events at the municipal level in Costa Rica. This study provides valuable information for understanding municipal vulnerability and exposure to disasters in Costa Rica and can serve as a model for other countries to assess disaster risk.
6
Big Data, Decision Models, and Public Health. International Journal of Environmental Research and Public Health 2022; 19:ijerph19148543. [PMID: 35886394 PMCID: PMC9324609 DOI: 10.3390/ijerph19148543] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/05/2022] [Accepted: 07/08/2022] [Indexed: 12/10/2022]
7
Comparing the Impact of Road Networks on COVID-19 Severity between Delta and Omicron Variants: A Study Based on Greater Sydney (Australia) Suburbs. International Journal of Environmental Research and Public Health 2022; 19:ijerph19116551. [PMID: 35682134 PMCID: PMC9180306 DOI: 10.3390/ijerph19116551] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/10/2022] [Revised: 05/09/2022] [Accepted: 05/25/2022] [Indexed: 12/01/2022]
Abstract
The Omicron and Delta variants of COVID-19 have recently become the most dominant virus strains worldwide. A recent study on the Delta variant found that a suburban road network provides a reliable proxy for human mobility to explore COVID-19 severity. This study first examines the impact of road networks on COVID-19 severity for the Omicron variant using infection and road connection data from Greater Sydney, Australia. We then compare the findings of this study with those of a recent study that used infection data for the Delta variant in the same region. In analysing the road network, we used four centrality measures (degree, closeness, betweenness and eigenvector) and the coreness measure. We developed two multiple linear regression models for the Delta and Omicron variants using the same set of independent and dependent variables. Only eigenvector centrality is a statistically significant predictor of COVID-19 severity for the Omicron variant. On the other hand, both degree and eigenvector centrality are statistically significant predictors for the Delta variant, as found in the recent study considered for comparison. We further found a statistically significant difference (p < 0.05) between the R-squared values for these two multiple linear regression models. Our findings point to an important difference in the transmission nature of the Delta and Omicron variants, which could provide practical insights into understanding their infectious nature and developing appropriate control strategies accordingly.
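To make the network measures named in the abstract concrete, the following plain-Python sketch computes degree, closeness, and eigenvector centrality and coreness on a toy undirected graph. It is an illustration under stated assumptions, not the study's code: the four-node "suburb" graph is hypothetical, the function names are invented here, and betweenness centrality is left out to keep the example short.

```python
from collections import deque


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """Reachable nodes divided by the sum of shortest-path lengths (BFS)."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def eigenvector_centrality(adj, iterations=200):
    """Power iteration on (I + A); the identity shift avoids oscillation
    on bipartite graphs and leaves the eigenvectors unchanged."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        nxt = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = sum(val * val for val in nxt.values()) ** 0.5 or 1.0
        x = {v: val / norm for v, val in nxt.items()}
    return x


def coreness(adj):
    """k-core number of each node, found by repeatedly peeling the node
    with the smallest remaining degree."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=deg.get)
        k = max(k, deg[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                deg[w] -= 1
    return core


# Hypothetical road network: suburbs A, B, C form a triangle; D joins only C.
roads = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}

if __name__ == "__main__":
    print(degree_centrality(roads))
    print(closeness_centrality(roads))
    print(eigenvector_centrality(roads))
    print(coreness(roads))
```

In this toy graph, suburb C has the highest score on every measure, and the pendant suburb D has coreness 1 while the triangle has coreness 2. In a regression like the one described, such per-suburb scores would serve as the independent variables.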