1. Guo Y, Zhang L, Pang S, Cui X, Zhao X, Feng Y. Deep learning models for hepatitis E incidence prediction leveraging Baidu index. BMC Public Health 2024; 24:3014. [PMID: 39478514 PMCID: PMC11526602 DOI: 10.1186/s12889-024-20532-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Accepted: 10/28/2024] [Indexed: 11/02/2024] Open
Abstract
BACKGROUND Infectious diseases are major medical and social challenges of the 21st century. Accurately predicting incidence is of great significance for public health organizations seeking to prevent the spread of disease. Internet search engine data, such as the Baidu search index, may be useful for analyzing epidemics and improving prediction. METHODS We collected data on hepatitis E incidence and cases in Shandong province from January 2009 to December 2022, together with the Baidu index for the same period. Using Pearson correlation analysis, we validated the relationship between the Baidu index and hepatitis E incidence. We used several LSTM architectures (LSTM, stacked LSTM, attention-based LSTM, and attention-based stacked LSTM) to forecast hepatitis E incidence both with and without the Baidu index. We also introduced KAN into the LSTM models to improve their nonlinear learning capability. Model performance was evaluated with three standard metrics: root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE). RESULTS Adjusting for the Baidu index altered the correlation between hepatitis E incidence and the Baidu index from -0.1654 to 0.1733. Without the Baidu index, LSTM and attention-based stacked LSTM achieved MAPEs of 17.04±0.13% and 17.19±0.57%, respectively. With the Baidu index, the same methods achieved MAPEs of 15.36±0.16% and 15.15±0.07%, an improvement of about 2 percentage points. Adding KAN improved performance by 0.3%. More detailed results are given in the results section of the paper. CONCLUSIONS Our experiments reveal a weak correlation and similar trends between the Baidu index and hepatitis E incidence. The Baidu index proves valuable for predicting hepatitis E incidence. Furthermore, stacked layers and KAN can also improve the representational ability of LSTM models.
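The three error metrics named in this abstract (RMSE, MAPE, MAE) have standard textbook definitions, which can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the sample numbers are hypothetical, and the MAPE function assumes no observed value is zero.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every observed value is non-zero, since each residual is
    divided by the corresponding observation.
    """
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical monthly incidence values, for illustration only.
observed = [2.0, 4.0, 5.0]
forecast = [1.0, 5.0, 5.0]
print(rmse(observed, forecast), mae(observed, forecast), mape(observed, forecast))
```

Because MAPE is scale-free, it is the natural metric for comparing runs with and without the search index, as the abstract does; RMSE and MAE stay in the units of the incidence series.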
Affiliation(s)
- Yanhui Guo
  - School of Data and Computer Science, Shandong Women's University, 2399 Daxue Road, Changqing District, Ji'nan, 250300, Shandong, China
- Li Zhang
  - Shandong Provincial Key Laboratory of Infectious Disease Control and Prevention, Shandong Center for Disease Control and Prevention, 16992 Jingshi Road, Lixia District, Ji'nan, 250014, Shandong, China
- Shengnan Pang
  - School of Journalism and Communication, Tsinghua University, 30 Shuangqing Road, Haidian District, Beijing, 100018, Beijing, China
- Xiya Cui
  - School of Data and Computer Science, Shandong Women's University, 2399 Daxue Road, Changqing District, Ji'nan, 250300, Shandong, China
- Xuechen Zhao
  - School of Data and Computer Science, Shandong Women's University, 2399 Daxue Road, Changqing District, Ji'nan, 250300, Shandong, China
- Yi Feng
  - Shandong Provincial Key Laboratory of Infectious Disease Control and Prevention, Shandong Center for Disease Control and Prevention, 16992 Jingshi Road, Lixia District, Ji'nan, 250014, Shandong, China
2. Chu Y, Li W, Wang S, Jia G, Zhang Y, Dai H. Increasing public concern on insomnia during the COVID-19 outbreak in China: An info-demiology study. Heliyon 2022; 8:e11830. [PMID: 36439717 PMCID: PMC9681991 DOI: 10.1016/j.heliyon.2022.e11830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Revised: 09/19/2022] [Accepted: 11/16/2022] [Indexed: 11/24/2022] Open
Abstract
Background Since December 2019, an unexplained pneumonia has broken out in Wuhan, Hubei Province, China. To prevent the rapid spread of this disease, the Chinese government imposed quarantine and lockdown measures. These measures proved effective in containing the contagious disease. Nevertheless, social distancing measures, together with the disease itself, could cause certain health risks among the affected population, such as sleep disorders. We therefore conducted this web search analysis to examine the temporal and spatial changes in public search volume for the mental health topic of "insomnia" during the COVID-19 pandemic in China. Methods The data sources included the Baidu Index (BDI), used to analyze related search terms, and the official website of the National Health Commission of the People's Republic of China, used to collect the daily number of newly confirmed COVID-19 cases. Following a descriptive analysis of the overall search situation, Spearman's correlation analysis was used to analyze the relationship between the daily insomnia-related search values and the daily newly confirmed cases. The mean search volumes for insomnia-related terms during the COVID-19 outbreak period (January 23rd, 2020 to April 8th, 2020) were compared with those during 2016-2019 using Student's t test. Finally, by analyzing the overall daily mean of insomnia searches in various provinces, we evaluated whether there were regional differences in searching for insomnia during the COVID-19 outbreak period. Results During the COVID-19 outbreak period, the number of insomnia-related searches increased significantly, especially the average daily BDI for the term "1 min to fall asleep immediately". Spearman's correlation analysis showed that 6 out of the 10 insomnia-related keywords were significantly positively related to the daily newly confirmed cases.
Compared with the same period in the past four years, a significantly increased search volume was found for 60.0% (6/10) of insomnia-related terms during the COVID-19 outbreak period. We also found that Guangdong province had the highest number of insomnia-related searches during the pandemic. Conclusions The surge in the number of confirmed cases during the COVID-19 pandemic has led to an increase in concern and online searches about insomnia. Further studies are needed to determine whether the search behavior truly reflects the real-time prevalence profile of relevant mental disorders, and to establish a risk prediction model for the prevalence risk of psychopathological disorders, including insomnia, using insomnia-related BDI and other well-established risk factors.
Affiliation(s)
- Yuying Chu
  - School of Nursing, Jinzhou Medical University, Jinzhou, 121001, Liaoning, PR China
- Wenhui Li
  - Experimental Teaching Center of Basic Medicine, Jinzhou Medical University, Jinzhou, 121001, Liaoning, PR China
- Suyan Wang
  - Centre for Mental Health Guidance, Jinzhou Medical University, Jinzhou, 121001, Liaoning, PR China
- Guizhi Jia
  - Department of Physiology, Jinzhou Medical University, Jinzhou 121001, PR China
- Yuqiang Zhang
  - Department of Orthopaedics, First Affiliated Hospital, Jinzhou Medical University, Jinzhou 121001, PR China
- Hongliang Dai
  - School of Nursing, Jinzhou Medical University, Jinzhou, 121001, Liaoning, PR China
3. Liu K, Xie Z, Xie B, Chen S, Zhang Y, Wang W, Wu Q, Cai G, Chen B. Bridging the Gap in End Tuberculosis Targets in the Elderly Population in Eastern China: Observational Study From 2015 to 2020. JMIR Public Health Surveill 2022; 8:e39142. [PMID: 35904857 PMCID: PMC9377476 DOI: 10.2196/39142] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/29/2022] [Revised: 06/06/2022] [Accepted: 06/09/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND With a progressive increase in the aging process, the challenges posed by pulmonary tuberculosis (PTB) are also increasing for the elderly population. OBJECTIVE This study aimed to identify the epidemiological distribution of PTB among the elderly, forecast the achievement of the World Health Organization's 2025 goal in this specific group, and predict further advancement of PTB in the eastern area of China. METHODS All notified active PTB cases aged ≥65 years from Zhejiang Province were screened and analyzed. The general epidemiological characteristics were depicted and presented using the ArcGIS software. Further prediction of PTB was performed using R and SPSS software programs. RESULTS Altogether 41,431 cases aged ≥65 years were identified by the surveillance system from 2015 to 2020. After excluding extrapulmonary TB cases, we identified 39,832 PTB cases, including laboratory-confirmed (23,664, 59.41%) and clinically diagnosed (16,168, 40.59%) PTB. The notified PTB incidence indicated an evident downward trend with a reduction of 30%; however, the incidence of bacteriologically positive cases was steady at approximately 60/100,000. Based on the geographical distribution, Quzhou and Jinhua Cities had a higher PTB incidence among the elderly. The delay in PTB diagnosis was identified, and a significantly prolonged treatment course was observed in the elderly. Moreover, a 50% reduction of PTB incidence by the middle of 2024 was predicted using a linear regression model. It was found that using the exponential smoothing model would be better to predict the PTB trend in the elderly than a seasonal autoregressive integrated moving average model. CONCLUSIONS More comprehensive and effective interventions such as active PTB screening combined with physical checkup and succinct health education should be implemented and strengthened in the elderly. 
A more systematic assessment of the PTB epidemic trend in the elderly population should be considered to incorporate more predictive factors.
Affiliation(s)
- Kui Liu
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Bo Xie
  - School of Urban Design, Wuhan University, Wuhan, China
- Songhua Chen
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Yu Zhang
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Wei Wang
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Qian Wu
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Gaofeng Cai
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Bin Chen
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
4. Zhang Y, Bambrick H, Mengersen K, Tong S, Hu W. Using internet-based query and climate data to predict climate-sensitive infectious disease risks: a systematic review of epidemiological evidence. International Journal of Biometeorology 2021; 65:2203-2214. [PMID: 34075475 DOI: 10.1007/s00484-021-02155-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/07/2020] [Revised: 05/25/2021] [Accepted: 05/27/2021] [Indexed: 06/12/2023]
Abstract
The use of internet-based query data offers a novel approach to improve disease surveillance and provides timely disease information. This paper systematically reviewed the literature on infectious disease predictions using internet-based query data and climate factors, discussed the current research progress and challenges, and provided some recommendations for future studies. We searched the relevant articles in the PubMed, Scopus, and Web of Science databases between January 2000 and December 2019. We initially included studies that used internet-based query data to predict infectious disease epidemics, then we further filtered and appraised the studies that used both internet-based query data and climate factors. In total, 129 relevant papers were included in the review. The results showed that most studies used a simple descriptive approach (n=80; 62%) to detect epidemics of influenza (including influenza-like illness (ILI)) (n=88; 68%) and dengue (n=9; 7%). Most studies (n=61; 47%) purely used internet search metrics to predict the epidemics of infectious diseases, while only 3 out of the 129 papers included both climate variables and internet-based query data. Our research shows that including internet-based query data and climate variables could better predict climate-sensitive infectious disease epidemics; however, this method has not been widely used to date. Moreover, previous studies did not sufficiently consider the spatiotemporal uncertainty of infectious diseases. Our review suggests that further research should use both internet-based query and climate data to develop predictive models for climate-sensitive infectious diseases based on spatiotemporal models.
Affiliation(s)
- Yuzhou Zhang
  - School of Public Health and Social Work, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Queensland, Australia
- Hilary Bambrick
  - School of Public Health and Social Work, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Queensland, Australia
- Kerrie Mengersen
  - Science and Engineering Faculty, Mathematical Sciences and Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
- Shilu Tong
  - School of Public Health and Social Work, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Queensland, Australia
  - Shanghai Children's Medical Centre, Shanghai Jiao-Tong University, Shanghai, China
  - School of Public Health and Institute of Environment and Human Health, Anhui Medical University, Hefei, Anhui, China
  - Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
- Wenbiao Hu
  - School of Public Health and Social Work, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Queensland, Australia
5. Short-Term Impacts of Meteorology, Air Pollution, and Internet Search Data on Viral Diarrhea Infection among Children in Jilin Province, China. International Journal of Environmental Research and Public Health 2021; 18:ijerph182111615. [PMID: 34770125 PMCID: PMC8582928 DOI: 10.3390/ijerph182111615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2021] [Revised: 10/29/2021] [Accepted: 11/03/2021] [Indexed: 01/08/2023]
Abstract
The influence of natural environmental factors and social factors on children’s viral diarrhea remains inconclusive. This study aimed to evaluate the short-term effects of temperature, precipitation, air quality, and social attention on children’s viral diarrhea in temperate regions of China by using the distributed lag nonlinear model (DLNM). We found that low temperature increased the risk of children’s viral diarrhea infection for about 1 week, while high temperature and heavy precipitation increased the risk for at least 3 weeks. Because a rise in the air pollution index may change the public's daily routines, children’s viral diarrhea infection was restrained within 10 days, but the risk of infection increased after 2 weeks. Extremely high internet search volume may reflect a local outbreak of viral diarrhea, which significantly increases the infection risk. These factors can help epidemic prevention and control departments issue timely early warnings of high-risk outbreaks and help the public deal with outbreaks of children’s viral diarrhea.
6. Forecasting Teleconsultation Demand Using an Ensemble CNN Attention-Based BILSTM Model with Additional Variables. Healthcare (Basel) 2021; 9:healthcare9080992. [PMID: 34442130 PMCID: PMC8391747 DOI: 10.3390/healthcare9080992] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/24/2021] [Revised: 07/21/2021] [Accepted: 08/02/2021] [Indexed: 11/16/2022] Open
Abstract
To enhance the forecasting accuracy of daily teleconsultation demand, this study proposes an ensemble hybrid deep learning model. The proposed ensemble CNN attention-based BILSTM model (ECA-BILSTM) combines shallow convolutional neural networks (CNNs), attention mechanisms, and bidirectional long short-term memory (BILSTM). Moreover, additional variables are selected according to the characteristics of teleconsultation demand and added to the inputs of the forecasting models. To verify the superiority of ECA-BILSTM and the effectiveness of the additional variables, two actual teleconsultation datasets collected at the National Telemedicine Center of China (NTCC) are used as the experimental data. Results showed that ECA-BILSTMs significantly outperform the corresponding benchmark models, and two key additional variables were identified that improve teleconsultation demand prediction. Overall, the proposed ECA-BILSTM model with effective additional variables is a feasible and promising approach to teleconsultation demand forecasting.
7. Wang Q, Zhang W, Cai H, Cao Y. Understanding the perceptions of Chinese women of the commercially available domestic and imported HPV vaccine: A semantic network analysis. Vaccine 2020; 38:8334-8342. [PMID: 33190947 DOI: 10.1016/j.vaccine.2020.11.016] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 06/17/2020] [Revised: 10/08/2020] [Accepted: 11/03/2020] [Indexed: 12/14/2022]
Abstract
BACKGROUND A domestic human papillomavirus (HPV) vaccine, Cecolin, that protects against HPV strains 16 and 18 was introduced to the Chinese market at a relatively low price in May 2020. This study explored Chinese women's perceptions of both domestic and imported HPV vaccines, which differ in price and valency. METHODS Sentiment analysis and semantic network analyses were performed based on a sample of 45,729 domestic HPV vaccine-related posts from females on Sina Weibo between April 17 and May 2, 2020. The geographic distribution was also analyzed based on the users' locations, which were retrieved from the database. RESULTS Most of the posts were positive or neutral (85%), although 15% were negative (e.g., expressions of anger, sadness, fear and disgust). Semantic analyses of the negative posts revealed that Chinese women generally had positive attitudes towards the HPV vaccine and were willing to be vaccinated. However, obvious geographical variations were identified. Women who lived in economically developed areas expressed a stronger desire to obtain imported quadrivalent or nonavalent vaccines due to concerns regarding effectiveness and quality. The women expressed disgust and anger mainly regarding difficulties in making an appointment, age restrictions for the nonavalent vaccine and gender restrictions. However, the population targeted by the domestic vaccine, namely women who lived in economically undeveloped areas and had relatively low incomes, had a low awareness of the HPV vaccine. CONCLUSION Governments should provide programs that educate women that the bivalent HPV vaccine can offer protection against the majority of high-risk HPV types. Increasing awareness of the domestic vaccine among the population in economically undeveloped areas and providing free domestic bivalent HPV vaccination/screening for low-income high-risk women would help to prevent cervical carcinoma.
This issue also depends on rebuilding trust and repairing damage to the relationship between government/domestic vaccine manufacturers and the public.
Affiliation(s)
- Qi Wang
  - School of Industrial Design, Hubei University of Technology, 28 Nanli Road, Wuhan 430068, PR China
- Wen Zhang
  - School of Journalism and Culture Communication, Zhongnan University of Economics and Law, 182 Nanhu Avenue, Wuhan 430073, PR China
- Hongning Cai
  - Department of Gynecological Oncology, Hubei Maternal and Child Health Care Hospital, Wuhan 430070, Hubei, PR China
- Yuan Cao
  - Cloud-clone Diagnostic Reagents Institute, Wuhan 430056, Hubei, PR China
8. Huang Q, Kang YS. Mathematical Modeling of COVID-19 Control and Prevention Based on Immigration Population Data in China: Model Development and Validation. JMIR Public Health Surveill 2020; 6:e18638. [PMID: 32396132 PMCID: PMC7250064 DOI: 10.2196/18638] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/09/2020] [Revised: 04/26/2020] [Accepted: 05/12/2020] [Indexed: 12/22/2022] Open
Abstract
Background At the end of February 2020, the spread of coronavirus disease (COVID-19) in China had drastically slowed and appeared to be under control compared to the peak data in early February of that year. However, the outcomes of COVID-19 control and prevention measures varied between regions (ie, provinces and municipalities) in China; moreover, COVID-19 has become a global pandemic, and the spread of the disease has accelerated in countries outside China. Objective This study aimed to establish valid models to evaluate the effectiveness of COVID-19 control and prevention among various regions in China. These models also targeted regions with control and prevention problems by issuing immediate warnings. Methods We built a mathematical model, the Epidemic Risk Time Series Model, and used it to analyze two sets of data, including the daily COVID-19 incidence (ie, newly diagnosed cases) as well as the daily immigration population size. Results Based on the results of the model evaluation, some regions, such as Shanghai and Zhejiang, were successful in COVID-19 control and prevention, whereas other regions, such as Heilongjiang, yielded poor performance. The evaluation result was highly correlated with the basic reproduction number (R0) value, and the result was evaluated in a timely manner at the beginning of the disease outbreak. Conclusions The Epidemic Risk Time Series Model was designed to evaluate the effectiveness of COVID-19 control and prevention in different regions in China based on analysis of immigration population data. Compared to other methods, such as R0, this model enabled more prompt issue of early warnings. This model can be generalized and applied to other countries to evaluate their COVID-19 control and prevention.
Affiliation(s)
- Yu Sunny Kang
  - School of Health and Human Services, University of Baltimore, Baltimore, MD, United States
9. Search trends and prediction of human brucellosis using Baidu index data from 2011 to 2018 in China. Sci Rep 2020; 10:5896. [PMID: 32246053 PMCID: PMC7125199 DOI: 10.1038/s41598-020-62517-7] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 05/23/2019] [Accepted: 03/16/2020] [Indexed: 11/13/2022] Open
Abstract
Reporting on brucellosis, a relatively rare infectious disease caused by Brucella, is often delayed or incomplete in traditional disease surveillance systems in China. Internet search engine data related to brucellosis can provide an economical and efficient complement to a conventional surveillance system because people tend to seek brucellosis-related health information from Baidu, the largest search engine in China. In this study, brucellosis incidence data reported by the CDC of China and Baidu index data were gathered to evaluate the relationship between them. We applied an autoregressive integrated moving average (ARIMA) model and an ARIMA model with Baidu search index data as the external variable (ARIMAX) to predict the incidence of brucellosis. The two models based on brucellosis incidence data were then compared, and the ARIMAX model performed better in all the measurements we applied. Our results illustrate that Baidu index data can enhance the traditional surveillance system to monitor and predict brucellosis epidemics in China.
10. Huang R, Luo G, Duan Q, Zhang L, Zhang Q, Tang W, Smith MK, Li J, Zou H. Using Baidu search index to monitor and predict newly diagnosed cases of HIV/AIDS, syphilis and gonorrhea in China: estimates from a vector autoregressive (VAR) model. BMJ Open 2020; 10:e036098. [PMID: 32209633 PMCID: PMC7202716 DOI: 10.1136/bmjopen-2019-036098] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVES Internet search engine data have been widely used to monitor and predict infectious diseases. Existing studies have found correlations between search data and HIV/AIDS epidemics. We aimed to extend the literature by exploring the feasibility of using search data to monitor and predict the number of newly diagnosed cases of HIV/AIDS, syphilis and gonorrhoea in China. METHODS This paper used a vector autoregressive model to combine the number of newly diagnosed cases with the Baidu search index to predict monthly newly diagnosed cases of HIV/AIDS, syphilis and gonorrhoea in China. The procedures included: (1) keyword selection and filtering; (2) construction of a composite search index; (3) modelling with training data from January 2011 to October 2016 and calculating the prediction performance with validation data from November 2016 to October 2017. RESULTS The analysis showed a close correlation between the monthly number of newly diagnosed cases and the composite search index (the Spearman's rank correlation coefficients were 0.777 for HIV/AIDS, 0.590 for syphilis and 0.633 for gonorrhoea, p<0.05 for all). The R² values all exceeded 85% and the mean absolute percentage errors were less than 11%, indicating good fit and prediction performance of the vector autoregressive model in this field. CONCLUSIONS Our study indicated the potential feasibility of using Baidu search data to monitor and predict the number of newly diagnosed cases of HIV/AIDS, syphilis and gonorrhoea in China.
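Spearman's rank correlation, the statistic reported in this abstract, is the Pearson correlation computed on the ranks of the two series, with tied values sharing their average rank. The sketch below illustrates that standard definition only; it is not the authors' code, the function names are hypothetical, and the sample series are made-up numbers rather than study data. It does not compute a p-value.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed series."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical monthly case counts and composite search index values.
cases = [120, 135, 150, 149, 170]
search_index = [800, 950, 990, 1010, 1200]
print(spearman(cases, search_index))
```

Working on ranks makes the statistic insensitive to the scale of the search index and to any monotone transformation of it, which suits an index whose absolute units carry little meaning.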
Affiliation(s)
- Ruonan Huang
  - School of Public Health, Sun Yat-Sen University, Guangzhou, China
- Ganfeng Luo
  - School of Public Health (Shenzhen), Sun Yat-sen University, Shenzhen, China
- Qibin Duan
  - The Kirby Institute, University of New South Wales, Sydney, New South Wales, Australia
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
- Lei Zhang
  - China-Australia Joint Research Center for Infectious Diseases, School of Public Health, Xi'an Jiaotong University, Xi'an, China
  - Melbourne Sexual Health Centre, Alfred Health, Melbourne, Victoria, Australia
  - Central Clinical School, Faculty of Medicine Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
- Qingpeng Zhang
  - School of Data Science, City University of Hong Kong, Kowloon, Hong Kong
- Weiming Tang
  - University of North Carolina Project China, Guangzhou, China
  - Southern Medical University Dermatology Hospital, Guangzhou, China
- M Kumi Smith
  - Division of Epidemiology and Community Health, School of Public Health, University of Minnesota Twin Cities, Minneapolis, Minnesota, USA
- Jinghua Li
  - School of Public Health, Sun Yat-Sen University, Guangzhou, China
  - Sun Yat-sen Global Health Institute, Sun Yat-Sen University, Guangzhou, China
- Huachun Zou
  - School of Public Health (Shenzhen), Sun Yat-sen University, Shenzhen, China
  - The Kirby Institute, University of New South Wales, Sydney, New South Wales, Australia
11. Barros JM, Duggan J, Rebholz-Schuhmann D. The Application of Internet-Based Sources for Public Health Surveillance (Infoveillance): Systematic Review. J Med Internet Res 2020; 22:e13680. [PMID: 32167477 PMCID: PMC7101503 DOI: 10.2196/13680] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 02/24/2019] [Revised: 09/18/2019] [Accepted: 11/26/2019] [Indexed: 12/30/2022] Open
Abstract
Background Public health surveillance is based on the continuous and systematic collection, analysis, and interpretation of data. This informs the development of early warning systems to monitor epidemics and documents the impact of intervention measures. The introduction of digital data sources, and specifically sources available on the internet, has impacted the field of public health surveillance. New opportunities enabled by the underlying availability and scale of internet-based sources (IBSs) have paved the way for novel approaches for disease surveillance, exploration of health communities, and the study of epidemic dynamics. This field and approach is also known as infodemiology or infoveillance. Objective This review aimed to assess research findings regarding the application of IBSs for public health surveillance (infodemiology or infoveillance). To achieve this, we have presented a comprehensive systematic literature review with a focus on these sources and their limitations, the diseases targeted, and commonly applied methods. Methods A systematic literature review was conducted targeting publications between 2012 and 2018 that leveraged IBSs for public health surveillance, outbreak forecasting, disease characterization, diagnosis prediction, content analysis, and health-topic identification. The search results were filtered according to previously defined inclusion and exclusion criteria. Results Spanning a total of 162 publications, we determined infectious diseases to be the preferred case study (108/162, 66.7%). Of the eight categories of IBSs (search queries, social media, news, discussion forums, websites, web encyclopedia, and online obituaries), search queries and social media were applied in 95.1% (154/162) of the reviewed publications. We also identified limitations in representativeness and biased user age groups, as well as high susceptibility to media events by search queries, social media, and web encyclopedias. 
Conclusions IBSs are a valuable proxy to study illnesses affecting the general population; however, it is important to characterize which diseases are best suited for the available sources; the literature shows that the level of engagement among online platforms can be a potential indicator. There is a necessity to understand the population’s online behavior; in addition, the exploration of health information dissemination and its content is significantly unexplored. With this information, we can understand how the population communicates about illnesses online and, in the process, benefit public health.
Collapse
Affiliation(s)
- Joana M Barros
- Insight Centre for Data Analytics, National University of Ireland Galway, Galway, Ireland; School of Computer Science, National University of Ireland Galway, Galway, Ireland
- Jim Duggan
- School of Computer Science, National University of Ireland Galway, Galway, Ireland
|
12
|
Sips GJ, Dirven MJG, Donkervoort JT, van Kolfschoten FM, Schapendonk CME, Phan MVT, Bloem A, van Leeuwen AF, Trompenaars ME, Koopmans MPG, van der Eijk AA, de Graaf M, Fanoy EB. Norovirus outbreak in a natural playground: A One Health approach. Zoonoses Public Health 2020; 67:453-459. [PMID: 32037743 PMCID: PMC7318310 DOI: 10.1111/zph.12689] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/18/2019] [Revised: 12/24/2019] [Accepted: 01/10/2020] [Indexed: 12/03/2022]
Abstract
Norovirus constitutes the most frequently identified infectious cause of disease outbreaks associated with untreated recreational water. When investigating outbreaks related to surface water, a One Health approach is insightful. Historically, there has been a focus on potential contamination of recreational water by bird droppings, and a recent publication demonstrating human noroviruses in bird faeces suggested this should be investigated in future water-related norovirus outbreaks. Here, we describe a One Health approach investigating a norovirus outbreak in a natural playground. On social media, a large number of waterfowl were reported to defecate near the playground premises, leading to speculation about their potential involvement. Surface water, as well as human and bird faecal specimens, was tested for human noroviruses. Norovirus was found to be the most likely cause of the outbreak, but there was no evidence for transmission via waterfowl. Cases had become known on social media prior to notification to the public health service, underscoring the potential of online media as an early warning system. In view of known risk factors, advice was given for future outbreak investigations and natural playground design.
Affiliation(s)
- Gregorius J Sips
- Public Health Service Rotterdam-Rijnmond, Rotterdam, The Netherlands; Department of Medical Microbiology and Infectious Diseases, Erasmus Medical Center, Rotterdam, The Netherlands
- My V T Phan
- Department of Viroscience, Erasmus Medical Center, Rotterdam, The Netherlands
- Annemieke Bloem
- Department of Medical Microbiology and Infectious Diseases, Erasmus Medical Center, Rotterdam, The Netherlands
- Marion P G Koopmans
- Department of Viroscience, Erasmus Medical Center, Rotterdam, The Netherlands
- Miranda de Graaf
- Department of Viroscience, Erasmus Medical Center, Rotterdam, The Netherlands
- Ewout B Fanoy
- Public Health Service Rotterdam-Rijnmond, Rotterdam, The Netherlands
|
13
|
Liang F, Guan P, Wu W, Huang D. Forecasting influenza epidemics by integrating internet search queries and traditional surveillance data with the support vector machine regression model in Liaoning, from 2011 to 2015. PeerJ 2018; 6:e5134. [PMID: 29967755 PMCID: PMC6022725 DOI: 10.7717/peerj.5134] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 04/05/2018] [Accepted: 06/08/2018] [Indexed: 12/15/2022] Open
Abstract
Background Influenza epidemics pose significant social and economic challenges in China. Internet search query data have been identified as a valuable source for the detection of emerging influenza epidemics. However, the selection of the search queries and the adoption of prediction methods are crucial challenges when it comes to improving predictions. The purpose of this study was to explore the application of the Support Vector Machine (SVM) regression model in merging search engine query data and traditional influenza data. Methods The official monthly reported number of influenza cases in Liaoning province in China was acquired from the China National Scientific Data Center for Public Health from January 2011 to December 2015. Based on Baidu Index, a publicly available search engine database, search queries potentially related to influenza over the corresponding period were identified. An SVM regression model was built for prediction, and its three parameters (C, γ, ε) were chosen by leave-one-out cross-validation (LOOCV) during model construction. The model's performance was evaluated using Root Mean Square Error, Root Mean Square Percentage Error and Mean Absolute Percentage Error. Results In total, 17 search queries related to influenza were generated through the initial query selection approach and were adopted to construct the SVM regression model, including nine queries in the same month, three queries at a lag of one month, one query at a lag of two months and four queries at a lag of three months. The SVM model performed well with the parameters (C = 2, γ = 0.005, ε = 0.0001), based on the ensemble data integrating the influenza surveillance data and Baidu search query data.
Conclusions The results demonstrated the feasibility of using internet search engine query data as the complementary data source for influenza surveillance and the efficiency of SVM regression model in tracking the influenza epidemics in Liaoning.
Affiliation(s)
- Feng Liang
- Department of Epidemiology, School of Public Health, China Medical University, Shenyang, Liaoning, China
- Peng Guan
- Department of Epidemiology, School of Public Health, China Medical University, Shenyang, Liaoning, China
- Wei Wu
- Department of Epidemiology, School of Public Health, China Medical University, Shenyang, Liaoning, China
- Desheng Huang
- Department of Epidemiology, School of Public Health, China Medical University, Shenyang, Liaoning, China; Department of Mathematics, School of Fundamental Sciences, China Medical University, Shenyang, Liaoning, China
|
14
|
van de Belt TH, van Stockum PT, Engelen LJLPG, Lancee J, Schrijver R, Rodríguez-Baño J, Tacconelli E, Saris K, van Gelder MMHJ, Voss A. Social media posts and online search behaviour as early-warning system for MRSA outbreaks. Antimicrob Resist Infect Control 2018; 7:69. [PMID: 29876100 PMCID: PMC5977481 DOI: 10.1186/s13756-018-0359-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 01/15/2018] [Accepted: 05/15/2018] [Indexed: 01/02/2023] Open
Abstract
Background Despite many preventive measures, outbreaks with multi-drug resistant micro-organisms (MDROs) still occur. Moreover, current alert systems from healthcare organizations have shortcomings due to delayed or incomplete notifications, which may amplify the spread of MDROs by introducing infected patients into new healthcare settings and institutions. Additional sources of information about upcoming and current outbreaks may help to prevent further spread of MDROs. The study objective was to evaluate whether methicillin-resistant Staphylococcus aureus (MRSA) outbreaks could be detected via social media posts or online search behaviour; if so, this might allow earlier detection than the official notifications by healthcare organizations. Methods We conducted an exploratory study in which we compared information about MRSA outbreaks in the Netherlands derived from two online sources, Coosto for social media and Google Trends for search behaviour, to the mandatory Dutch outbreak notification system (SO-ZI/AMR). The latter provides information on MDRO outbreaks including the date of the outbreak, micro-organism involved, the region/location, and the type of health care organization. Results During the research period of 15 months (455 days), 49 notifications of outbreaks were recorded in SO-ZI/AMR. For Coosto, the number of unique potential outbreaks was 37 and for Google Trends 24. The use of social media and online search behaviour missed many of the hospital outbreaks that were reported to SO-ZI/AMR, but detected additional outbreaks in long-term care facilities. Conclusions Despite several limitations, using information from social media and online search behaviour allows rapid identification of potential MRSA outbreaks, especially in healthcare settings with a low notification compliance. When combined in an automated system with real-time updates, this approach might increase early discovery and subsequent implementation of preventive measures.
Affiliation(s)
- Tom H van de Belt
- Radboud REshape Innovation Center, Radboudumc University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, Netherlands
- Lucien J L P G Engelen
- Radboud REshape Innovation Center, Radboudumc University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, Netherlands
- Jules Lancee
- Radboud REshape Innovation Center, Radboudumc University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, Netherlands
- Jesús Rodríguez-Baño
- Unidad Clínica de Enfermedades Infecciosas y Microbiología Instituto de Biomedicina de Sevilla (IBiS) / Hospital Universitario Virgen Macarena / CSIC / Departamento de Medicina, Universidad de Sevilla, Sevilla, Spain
- Evelina Tacconelli
- Division of Infectious Diseases, Tübingen University Hospital, DZIF Center, Tübingen, Germany; Infectious Diseases, University of Verona, Verona, Italy
- Katja Saris
- Department of Medical Microbiology, Radboudumc, Nijmegen, The Netherlands; Department of Clinical Microbiology and Infectious Diseases, Canisius-Wilhelmina Hospital, Nijmegen, The Netherlands
- Marleen M H J van Gelder
- Radboud REshape Innovation Center, Radboudumc University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, Netherlands; Department for Health Evidence, Radboud Institute for Health Sciences, Radboudumc, Nijmegen, The Netherlands
- Andreas Voss
- Radboud REshape Innovation Center, Radboudumc University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, Netherlands; Department of Medical Microbiology, Radboudumc, Nijmegen, The Netherlands; Department of Clinical Microbiology and Infectious Diseases, Canisius-Wilhelmina Hospital, Nijmegen, The Netherlands
|
15
|
Mavragani A, Sampri A, Sypsa K, Tsagarakis KP. Integrating Smart Health in the US Health Care System: Infodemiology Study of Asthma Monitoring in the Google Era. JMIR Public Health Surveill 2018; 4:e24. [PMID: 29530839 PMCID: PMC5869181 DOI: 10.2196/publichealth.8726] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 08/11/2017] [Revised: 10/15/2017] [Accepted: 01/13/2018] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND With the internet's penetration and use constantly expanding, this vast amount of information can be employed in order to better assess issues in the US health care system. Google Trends, a popular tool in big data analytics, has been widely used in the past to examine interest in various medical and health-related topics and has shown great potential in forecasting, prediction, and nowcasting. As empirical relationships between online queries and human behavior have been shown to exist, a new opportunity to explore the behavior toward asthma, a common respiratory disease, is present. OBJECTIVE This study aimed at forecasting the online behavior toward asthma and examined the correlations between queries and reported cases in order to explore the possibility of nowcasting asthma prevalence in the United States using online search traffic data. METHODS Applying Holt-Winters exponential smoothing to Google Trends time series from 2004 to 2015 for the term "asthma," forecasts for online queries at state and national levels are estimated from 2016 to 2020 and validated against available Google query data from January 2016 to June 2017. Correlations among yearly Google queries and between Google queries and reported asthma cases are examined. RESULTS Our analysis shows that search queries exhibit seasonality within each year and the relationships between each 2 years' queries are statistically significant (P<.05). Estimated forecasting models for a 5-year period (2016 through 2020) for Google queries are robust and validated against available data from January 2016 to June 2017. Significant correlations were found between (1) online queries and National Health Interview Survey lifetime asthma (r=-.82, P=.001) and current asthma (r=-.77, P=.004) rates from 2004 to 2015 and (2) online queries and Behavioral Risk Factor Surveillance System lifetime (r=-.78, P=.003) and current asthma (r=-.79, P=.002) rates from 2004 to 2014.
The correlations are negative, but lag analysis to identify the period of response cannot be employed until short-interval data on asthma prevalence are made available. CONCLUSIONS Online behavior toward asthma can be accurately predicted, and significant correlations between online queries and reported cases exist. This method of forecasting Google queries can be used by health care officials to nowcast asthma prevalence by city, state, or nationally, subject to future availability of daily, weekly, or monthly data on reported cases. This method could therefore be used for improved monitoring and assessment of the needs surrounding the current population of patients with asthma.
Affiliation(s)
- Amaryllis Mavragani
- Department of Computing Science and Mathematics, Faculty of Natural Sciences, University of Stirling, Stirling, United Kingdom
- Alexia Sampri
- Department of Computing Science and Mathematics, Faculty of Natural Sciences, University of Stirling, Stirling, United Kingdom
- Karla Sypsa
- Department of Pharmacy and Forensic Science, King's College London, University of London, London, United Kingdom
- Konstantinos P Tsagarakis
- Business and Environmental Technology Economics Lab, Department of Environmental Engineering, Democritus University of Thrace, Xanthi, Greece
|