Cai R, Zhang J, Li Z, Zeng C, Qiao S, Li X. Using Twitter Data to Estimate the Prevalence of Symptoms of Mental Disorders in the United States During the COVID-19 Pandemic: Ecological Cohort Study. JMIR Form Res 2022; 6:e37582. [PMID: 36459569 PMCID: PMC9770024 DOI: 10.2196/37582] [Received: 02/25/2022] [Revised: 11/29/2022] [Accepted: 11/30/2022] [Indexed: 12/05/2022]
Abstract
BACKGROUND Existing research and national surveillance data suggest an increase in the prevalence of mental disorders during the COVID-19 pandemic. Social media platforms, such as Twitter, could be a source of data for estimation owing to their real-time nature, high availability, and large geographical coverage. However, there is a dearth of studies validating Twitter-based prevalence estimates of mental disorders against those reported by the Centers for Disease Control and Prevention (CDC).
OBJECTIVE This study aims to assess whether Twitter-based prevalence of mental disorder symptoms is a feasible instrument for prevalence estimation, where feasibility is gauged via correlations between Twitter-based prevalence of mental disorder symptoms (ie, anxiety and depressive symptoms) and that based on national surveillance data. In addition, this study aims to identify how the correlations changed over time (ie, the temporal trend).
METHODS State-level prevalence of anxiety and depressive symptoms was retrieved from the national Household Pulse Survey (HPS) of the CDC from April 2020 to July 2021. Tweets were retrieved from the Twitter streaming application programming interface during the same period and were used to estimate the prevalence of symptoms of mental disorders for each state using keyword analysis. Stratified linear mixed models were used to evaluate the correlations between the Twitter-based prevalence of symptoms of mental disorders and those reported by the CDC. The correlations were evaluated on the basis of the magnitude and significance of the model parameters. Temporal trends of correlations were tested after adding the time variable to the model. Geospatial differences were compared on the basis of random effects.
RESULTS Pearson correlation coefficients between the overall prevalence reported by the CDC and that on Twitter for anxiety and depressive symptoms were 0.587 (P<.001) and 0.368 (P<.001), respectively. Stratified by 4 phases (ie, April 2020, August 2020, October 2020, and April 2021) defined by the HPS, linear mixed models showed that Twitter-based prevalence for anxiety symptoms had a positive and significant correlation with CDC-reported prevalence in phases 2 and 3, while a significant correlation for depressive symptoms was identified in phases 1 and 3.
CONCLUSIONS Positive correlations were identified between Twitter-based and CDC-reported prevalence, and temporal trends of these correlations were found. Geospatial differences in the prevalence of symptoms of mental disorders were found between the northern and southern United States. Findings from this study could inform future investigation on leveraging social media platforms to estimate symptoms of mental disorders and the provision of immediate prevention measures to improve health outcomes.
Affiliation(s)
- Ruilie Cai
  - Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
- Jiajia Zhang
  - Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - South Carolina SmartState Center for Healthcare Quality, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - University of South Carolina Big Data Health Science Center, Columbia, SC, United States
- Zhenlong Li
  - South Carolina SmartState Center for Healthcare Quality, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - University of South Carolina Big Data Health Science Center, Columbia, SC, United States
  - Geoinformation and Big Data Research Lab, Department of Geography, University of South Carolina, Columbia, SC, United States
- Chengbo Zeng
  - South Carolina SmartState Center for Healthcare Quality, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - University of South Carolina Big Data Health Science Center, Columbia, SC, United States
  - Department of Health Promotion, Education and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
- Shan Qiao
  - South Carolina SmartState Center for Healthcare Quality, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - University of South Carolina Big Data Health Science Center, Columbia, SC, United States
  - Department of Health Promotion, Education and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
- Xiaoming Li
  - South Carolina SmartState Center for Healthcare Quality, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - University of South Carolina Big Data Health Science Center, Columbia, SC, United States
  - Department of Health Promotion, Education and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States