Curtis B, Giorgi S, Buffone AEK, Ungar LH, Ashford RD, Hemmons J, Summers D, Hamilton C, Schwartz HA. Can Twitter be used to predict county excessive alcohol consumption rates?
PLoS One 2018;13:e0194290. [PMID: 29617408 PMCID: PMC5884504 DOI: 10.1371/journal.pone.0194290]
[Received: 08/03/2017] [Accepted: 02/28/2018]
Abstract
Objectives
The current study analyzes a large set of Twitter data from 1,384 US counties to determine whether excessive alcohol consumption rates can be predicted by the words being posted from each county.
Methods
Data from over 138 million county-level tweets were analyzed using predictive modeling, differential language analysis, and mediating language analysis.
Results
Twitter language data capture cross-sectional patterns of excessive alcohol consumption beyond those captured by sociodemographic factors (e.g., age, gender, race, income, education), and can be used to accurately predict rates of excessive alcohol consumption. Additionally, mediation analysis found that Twitter topics (e.g., ‘ready gettin leave’) can explain much of the association between socioeconomic status and excessive alcohol consumption.
Conclusions
Twitter data can be used to predict public health concerns such as excessive drinking. Using mediation analysis in conjunction with predictive modeling allows a large portion of the variance associated with socioeconomic status to be explained.