26
|
Abdur Rehman N, Salje H, Kraemer MUG, Subramanian L, Saif U, Chunara R. Quantifying the localized relationship between vector containment activities and dengue incidence in a real-world setting: A spatial and time series modelling analysis based on geo-located data from Pakistan. PLoS Negl Trop Dis 2020; 14:e0008273. [PMID: 32392225 PMCID: PMC7241855 DOI: 10.1371/journal.pntd.0008273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2019] [Revised: 05/21/2020] [Accepted: 04/07/2020] [Indexed: 11/19/2022] Open
Abstract
Increasing urbanization is having a profound effect on infectious disease risk, posing significant challenges for governments to allocate limited resources for their optimal control at a sub-city scale. With recent advances in data collection practices, empirical evidence about the efficacy of highly localized containment and intervention activities, which can lead to optimal deployment of resources, is possible. However, there are several challenges in analyzing data from such real-world observational settings. Using data on 3.9 million instances of seven dengue vector containment activities collected between 2012 and 2017, here we develop and assess two frameworks for understanding how the generation of new dengue cases changes in space and time with respect to application of different types of containment activities. Accounting for the non-random deployment of each containment activity in relation to dengue cases and other types of containment activities, as well as deployment of activities in different epidemiological contexts, results from both frameworks reinforce existing knowledge about the efficacy of containment activities aimed at the adult phase of the mosquito lifecycle. Results show a 10% (95% CI: 1-19%) and a 20% (95% CI: 4-34%) reduction in the probability of a case occurring within 50 meters and 30 days of cases that had Indoor Residual Spraying (IRS) and fogging, respectively, performed in the immediate vicinity, compared to cases of similar epidemiological context that had no containment in their vicinity. Simultaneously, limitations due to the real-world nature of activity deployment are used to guide recommendations for future deployment of resources during outbreaks as well as data collection practices. Conclusions from this study will enable more robust and comprehensive analyses of localized containment activities in resource-scarce urban settings and lead to improved allocation of government resources in an outbreak setting.
|
27
|
Daughton AR, Chunara R, Paul MJ. Comparison of Social Media, Syndromic Surveillance, and Microbiologic Acute Respiratory Infection Data: Observational Study. JMIR Public Health Surveill 2020; 6:e14986. [PMID: 32329741 PMCID: PMC7210500 DOI: 10.2196/14986] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/10/2019] [Revised: 09/27/2019] [Accepted: 02/09/2020] [Indexed: 11/30/2022] Open
Abstract
Background Internet data can be used to improve infectious disease models. However, the representativeness and individual-level validity of internet-derived measures are largely unexplored as this requires ground truth data for study. Objective This study sought to identify relationships between Web-based behaviors and/or conversation topics and health status using a ground truth, survey-based dataset. Methods This study leveraged a unique dataset of self-reported surveys, microbiological laboratory tests, and social media data from the same individuals toward understanding the validity of individual-level constructs pertaining to influenza-like illness in social media data. Logistic regression models were used to identify illness in Twitter posts using user posting behaviors and topic model features extracted from users’ tweets. Results Of 396 original study participants, only 81 met the inclusion criteria for this study. Of these participants’ tweets, we identified only two instances that were related to health and occurred within 2 weeks (before or after) of a survey indicating symptoms. It was not possible to predict when participants reported symptoms using features derived from topic models (area under the curve [AUC]=0.51; P=.38), though it was possible using behavior features, albeit with a very small effect size (AUC=0.53; P≤.001). Individual symptoms were generally not predictable either. The study sample and a random sample from Twitter are predictably different on held-out data (AUC=0.67; P≤.001), meaning that the content posted by people who participated in this study was predictably different from that posted by random Twitter users. Individuals in the random sample and the GoViral sample used Twitter with similar frequencies (similar @ mentions, number of tweets, and number of retweets; AUC=0.50; P=.19).
Conclusions To our knowledge, this is the first instance of an attempt to use a ground truth dataset to validate infectious disease observations in social media data. The lack of signal, the lack of predictability among behaviors or topics, and the demonstrated volunteer bias in the study population are important findings for the large and growing body of disease surveillance using internet-sourced data.
|
28
|
Mhasawade V, Elghafari A, Duncan DT, Chunara R. Role of the Built and Online Social Environments on Expression of Dining on Instagram. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2020; 17:E735. [PMID: 31979291 PMCID: PMC7037839 DOI: 10.3390/ijerph17030735] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/01/2019] [Revised: 01/15/2020] [Accepted: 01/17/2020] [Indexed: 11/17/2022]
Abstract
Online social communities are becoming windows for learning more about the health of populations, through information about our health-related behaviors and outcomes from daily life. At the same time, just as public health data and theory have shown that aspects of the built environment can affect our health-related behaviors and outcomes, it is also possible that online social environments (e.g., posts and other attributes of our online social networks) can shape facets of our life. Given the important role of the online environment in public health research and implications, factors which contribute to the generation of such data must be well understood. Here we study the role of the built and online social environments in the expression of dining on Instagram in Abu Dhabi: a ubiquitous social media platform, a city with a vibrant dining culture, and a topic (food posts) that has been studied in relation to public health outcomes. Our study uses available data on user Instagram profiles and their Instagram networks, as well as the local food environment measured through the dining types (e.g., casual dining restaurants, food court restaurants, lounges etc.) by neighborhood. We find evidence that factors of the online social environment (profiles that post about dining versus profiles that do not post about dining) have different influences on the relationship between a user's built environment and the social dining expression, with effects also varying by dining types in the environment and time of day. We examine the mechanism of the relationships via moderation and mediation analyses. Overall, this study provides evidence that the interplay of online and built environments depends on attributes of said environments and can also vary by time of day. We discuss implications of this synergy for precisely targeting public health interventions, as well as for using online data for public health research.
|
29
|
Alburez-Gutierrez D, Chandrasekharan E, Chunara R, Gil-Clavel S, Hannak A, Interdonato R, Joseph K, Kalimeri K, Malik M, Mayer K, Mejova Y, Paolotti D, Zagheni E. Reports of the Workshops Held at the 2019 International AAAI Conference on Web and Social Media. AI MAG 2019. [DOI: 10.1609/aimag.v40i4.5287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2022]
Abstract
The workshop program of the Association for the Advancement of Artificial Intelligence’s 13th International Conference on Web and Social Media was held at the Bavarian School of Public Policy in Munich, Germany on June 11, 2019. There were five full-day workshops, one half-day workshop, and the annual evening Science Slam in the program. The proceedings of the workshops were published in a Research Topic of Frontiers in Big Data. This report contains summaries of those workshops.
|
30
|
Editor M, An J, Chunara R, Crandall DJ, Frajberg D, French M, Jansen BJ, Kulshrestha J, Mejova Y, Romero DM, Salminen J, Sharma A, Sheth A, Tan C, Taylor SH, Wijeratne S. Reports of the Workshops Held at the 2018 International AAAI Conference on Web and Social Media. AI MAG 2018. [DOI: 10.1609/aimag.v39i4.2835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2022]
Abstract
The Workshop Program of the Association for the Advancement of Artificial Intelligence’s 12th International Conference on Web and Social Media (AAAI-18) was held at Stanford University, Stanford, California USA, on Monday, June 25, 2018. There were fourteen workshops in the program: Algorithmic Personalization and News: Risks and Opportunities; Beyond Online Data: Tackling Challenging Social Science Questions; Bridging the Gaps: Social Media, Use and Well-Being; Chatbot; Data-Driven Personas and Human-Driven Analytics: Automating Customer Insights in the Era of Social Media; Designed Data for Bridging the Lab and the Field: Tools, Methods, and Challenges in Social Media Experiments; Emoji Understanding and Applications in Social Media; Event Analytics Using Social Media Data; Exploring Ethical Trade-Offs in Social Media Research; Making Sense of Online Data for Population Research; News and Public Opinion; Social Media and Health: A Focus on Methods for Linking Online and Offline Data; Social Web for Environmental and Ecological Monitoring; and The ICWSM Science Slam. Workshops were held on the first day of the conference. Workshop participants met and discussed issues with a selected focus, providing an informal setting for active exchange among researchers, developers, and users on topics of current interest. Organizers from nine of the workshops submitted reports, which are reproduced in this report. Brief summaries of the other five workshops have been reproduced from their website descriptions.
|
31
|
Relia K, Akbari M, Duncan D, Chunara R. Socio-spatial Self-organizing Maps: Using Social Media to Assess Relevant Geographies for Exposure to Social Processes. PROCEEDINGS OF THE ACM ON HUMAN-COMPUTER INTERACTION 2018; 2:145. [PMID: 30957076 PMCID: PMC6448781 DOI: 10.1145/3274414] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 10/28/2022]
Abstract
Social media offers a unique window into attitudes like racism and homophobia, exposure to which is an important, hard-to-measure, and understudied social determinant of health. However, individual geo-located observations from social media are noisy and geographically inconsistent. Existing areas by which exposures are measured, like Zip codes, average over irrelevant administratively-defined boundaries. Hence, in order to enable studies of online social environmental measures like attitudes on social media and their possible relationship to health outcomes, first there is a need for a method to define the collective, underlying degree of social media attitudes by region. To address this, we create the Socio-spatial Self-organizing Map ("SS-SOM") pipeline to best identify regions by their latent social attitude from Twitter posts. SS-SOMs use neural embedding for text-classification, and augment traditional SOMs to generate a controlled number of nonoverlapping, topologically-constrained and topically-similar clusters. We find that not only are SS-SOMs robust to missing data, but the exposure of a cohort of men who are susceptible to multiple racism- and homophobia-linked health outcomes changes by up to 42% using SS-SOM measures as compared to using Zip code-based measures.
|
32
|
Kolawole O, Oguntoye M, Dam T, Chunara R. Etiology of respiratory tract infections in the community and clinic in Ilorin, Nigeria. BMC Res Notes 2017; 10:712. [PMID: 29212531 PMCID: PMC5719735 DOI: 10.1186/s13104-017-3063-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/03/2017] [Accepted: 12/02/2017] [Indexed: 01/30/2023] Open
Abstract
Objective Recognizing increasing interest in community disease surveillance globally, the goal of this study was to investigate whether respiratory viruses circulating in the community may be represented through clinical (hospital) surveillance in Nigeria. Results Children were selected via convenience sampling from communities and a tertiary care center (n = 91) during spring 2017 in Ilorin, Nigeria. Nasal swabs were collected and tested using polymerase chain reaction. The majority (79.1%) of subjects were under 6 years old, of whom 46 were infected (63.9%). A total of 33 of the 91 subjects had one or more respiratory tract virus; there were 10 cases of triple infection and 5 of quadruple. Parainfluenza virus 4, respiratory syncytial virus B and enterovirus were the most common viruses in the clinical sample; present in 93.8% (15/16) of clinical subjects, and 6.7% (5/75) of community subjects (significant difference, p < 0.001). Coronavirus OC43 was the most common virus detected in community members (13.3%, 10/75). A different strain, Coronavirus OC 229 E/NL63 was detected among subjects from the clinic (2/16) and not detected in the community. This pilot study provides evidence that data from the community can potentially represent different information than that sourced clinically, suggesting the need for community surveillance to enhance public health efforts and scientific understanding of respiratory infections.
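The clinic-versus-community comparison above (15/16 versus 5/75, reported as p < 0.001) can be checked with a short sketch. The abstract does not say which test was used; a two-sided Fisher's exact test is assumed here purely for illustration, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table (with the same
    margins) that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, row1)

    def prob(k):
        # Probability that row 1 contains exactly k of the col1 "positives"
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    observed = prob(a)
    lo = max(0, row1 - (total - col1))
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= observed * (1 + 1e-9))

# Counts from the abstract: 15 of 16 clinical subjects and 5 of 75
# community subjects carried one of the three most common clinical viruses.
clinic_pos, clinic_neg = 15, 16 - 15
community_pos, community_neg = 5, 75 - 5

clinic_share = round(100 * clinic_pos / 16, 1)
community_share = round(100 * community_pos / 75, 1)
p_value = fisher_exact_two_sided(clinic_pos, clinic_neg,
                                 community_pos, community_neg)

print(clinic_share, community_share, p_value)
```

Under this assumed test the p-value falls well below 0.001, which is consistent with the significance level stated in the abstract.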
|
33
|
Huang T, Elghafari A, Relia K, Chunara R. High-resolution Temporal Representations of Alcohol and Tobacco Behaviors from Social Media Data. PROCEEDINGS OF THE ACM ON HUMAN-COMPUTER INTERACTION 2017; 1:54. [PMID: 29264592 PMCID: PMC5734092 DOI: 10.1145/3134689] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 10/18/2022]
Abstract
Understanding tobacco- and alcohol-related behavioral patterns is critical for uncovering risk factors and potentially designing targeted social computing intervention systems. Given that we make choices multiple times per day, hourly and daily patterns are critical for better understanding behaviors. Here, we combine natural language processing, machine learning and time series analyses to assess Twitter activity specifically related to alcohol and tobacco consumption and their sub-daily, daily and weekly cycles. Twitter self-reports of alcohol and tobacco use are compared to other data streams available at similar temporal resolution. We assess if discussion of drinking by inferred underage versus legal age people or discussion of use of different types of tobacco products can be differentiated using these temporal patterns. We find that time and frequency domain representations of behaviors on social media can provide meaningful and unique insights, and we discuss the types of behaviors for which the approach may be most useful.
|
34
|
Liu J, Weitzman ER, Chunara R. Assessing Behavioral Stages From Social Media Data. CSCW : PROCEEDINGS OF THE CONFERENCE ON COMPUTER-SUPPORTED COOPERATIVE WORK. CONFERENCE ON COMPUTER-SUPPORTED COOPERATIVE WORK 2017; 2017:1320-1333. [PMID: 29034371 DOI: 10.1145/2998181.2998336] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 10/20/2022]
Abstract
Important work rooted in psychological theory posits that health behavior change occurs through a series of discrete stages. Our work builds on the field of social computing by identifying how social media data can be used to resolve behavior stages at high resolution (e.g. hourly/daily) for key population subgroups and times. In essence this approach opens new opportunities to advance psychological theories and better understand how our health is shaped based on the real, dynamic, and rapid actions we make every day. To do so, we bring together domain knowledge and machine learning methods to form a hierarchical classification of Twitter data that resolves different stages of behavior. We identify and examine temporal patterns of the identified stages, with alcohol as a use case (planning or looking to drink, currently drinking, and reflecting on drinking). Known seasonal trends are compared with findings from our methods. We discuss the potential health policy implications of detecting high frequency behavior stages.
|
35
|
Baltrusaitis K, Santillana M, Crawley AW, Chunara R, Smolinski M, Brownstein JS. Determinants of Participants' Follow-Up and Characterization of Representativeness in Flu Near You, A Participatory Disease Surveillance System. JMIR Public Health Surveill 2017; 3:e18. [PMID: 28389417 PMCID: PMC5400887 DOI: 10.2196/publichealth.7304] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 01/11/2017] [Revised: 03/03/2017] [Accepted: 03/16/2017] [Indexed: 12/02/2022] Open
Abstract
Background Flu Near You (FNY) is an Internet-based participatory surveillance system in the United States and Canada that allows volunteers to report influenza-like symptoms using a brief weekly symptom report. Objective Our objective was to evaluate the representativeness of the FNY population compared with the general population of the United States, explore the demographic and behavioral characteristics associated with FNY’s high-participation users, and summarize results from a user survey of a cohort of FNY participants. Methods We compared (1) the representativeness of sex and age groups of FNY participants during the 2014-2015 flu season versus the general US population and (2) the distribution of Human Development Index (HDI) scores of FNY participants versus that of the general US population. We analyzed associations between demographic and behavioral factors and the level of participant follow-up (ie, high vs low). Finally, descriptive statistics of responses from FNY’s 2015 and 2016 end-of-season user surveys were calculated. Results During the 2014-2015 influenza season, 47,234 unique participants had at least one FNY symptom report that was either self-reported (users) or submitted on their behalf (household members). The proportion of female FNY participants was significantly higher than that of the general US population (n=28,906, 61.2% vs 51.1%, P<.001). Although each age group was represented in the FNY population, the age distribution was significantly different from that of the US population (P<.001). Compared with the US population, FNY had a greater proportion of individuals with HDI >5.0, signaling that the FNY user distribution was more affluent and educated than the US population baseline. 
We found that high-participation use (ie, higher participation in follow-up symptom reports) was associated with sex (females were 25% less likely than men to be high-participation users), higher HDI, not reporting an influenza-like illness at the first symptom report, older age, and reporting for household members (all differences between high- and low-participation users P<.001). Approximately 10% of FNY users completed an additional survey at the end of the flu season that assessed detailed user characteristics (3217/33,324 in 2015; 4850/44,313 in 2016). Of these users, most identified as being either retired or employed in the health, education, and social services sectors and indicated that they achieved a bachelor’s degree or higher. Conclusions The representativeness of the FNY population and characteristics of its high-participation users are consistent with what has been observed in other Internet-based influenza surveillance systems. With targeted recruitment of underrepresented populations, FNY may improve as a complementary system to timely tracking of flu activity, especially in populations that do not seek medical attention and in areas with poor official surveillance data.
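The proportions quoted in this abstract can be reproduced from the reported counts. The following is a minimal arithmetic sketch; the variable and function names are illustrative and not taken from the study.

```python
# Counts quoted in the abstract above
participants_2014_15 = 47_234   # unique FNY participants, 2014-2015 season
female_participants = 28_906    # of whom female

# End-of-season surveys: year -> (surveys completed, FNY users)
surveys = {2015: (3217, 33_324), 2016: (4850, 44_313)}

def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)

female_pct = pct(female_participants, participants_2014_15)
completion_pct = {year: pct(done, users)
                  for year, (done, users) in surveys.items()}

print(female_pct)       # share of female participants
print(completion_pct)   # survey completion rate per year
```

Both completion rates land close to 10%, matching the "approximately 10%" wording, and the female share matches the reported 61.2%.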
|
36
|
Chunara R, Wisk LE, Weitzman ER. Denominator Issues for Personally Generated Data in Population Health Monitoring. Am J Prev Med 2017; 52:549-553. [PMID: 28012811 PMCID: PMC5362284 DOI: 10.1016/j.amepre.2016.10.038] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 06/05/2016] [Revised: 10/13/2016] [Accepted: 10/31/2016] [Indexed: 01/14/2023]
|
37
|
Ray B, Ghedin E, Chunara R. Network inference from multimodal data: A review of approaches from infectious disease transmission. J Biomed Inform 2016; 64:44-54. [PMID: 27612975 PMCID: PMC7106161 DOI: 10.1016/j.jbi.2016.09.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 03/28/2016] [Revised: 07/10/2016] [Accepted: 09/03/2016] [Indexed: 02/02/2023]
Abstract
Network inference problems are commonly found in multiple biomedical subfields such as genomics, metagenomics, neuroscience, and epidemiology. Networks are useful for representing a wide range of complex interactions ranging from those between molecular biomarkers, neurons, and microbial communities, to those found in human or animal populations. Recent technological advances have resulted in an increasing amount of healthcare data in multiple modalities, increasing the preponderance of network inference problems. Multi-domain data can now be used to improve the robustness and reliability of recovered networks from unimodal data. For infectious diseases in particular, there is a body of knowledge that has been focused on combining multiple pieces of linked information. Combining or analyzing disparate modalities in concert has demonstrated greater insight into disease transmission than could be obtained from any single modality in isolation. This has been particularly helpful in understanding incidence and transmission at early stages of infections that have pandemic potential. Novel pieces of linked information in the form of spatial, temporal, and other covariates including high-throughput sequence data, clinical visits, social network information, pharmaceutical prescriptions, and clinical symptoms (reported as free-text data) also encourage further investigation of these methods. The purpose of this review is to provide an in-depth analysis of multimodal infectious disease transmission network inference methods with a specific focus on Bayesian inference. We focus on analytical Bayesian inference-based methods as this enables recovering multiple parameters simultaneously, for example, not just the disease transmission network, but also parameters of epidemic dynamics. Our review studies their assumptions, key inference parameters and limitations, and ultimately provides insights about improving future network inference methods in multiple applications.
|
38
|
Smolinski MS, Crawley AW, Baltrusaitis K, Chunara R, Olsen JM, Wójcik O, Santillana M, Nguyen A, Brownstein JS. Flu Near You: Crowdsourced Symptom Reporting Spanning 2 Influenza Seasons. Am J Public Health 2015; 105:2124-30. [PMID: 26270299 DOI: 10.2105/ajph.2015.302696] [Citation(s) in RCA: 116] [Impact Index Per Article: 12.9] [Indexed: 12/31/2022]
Abstract
OBJECTIVES We summarized Flu Near You (FNY) data from the 2012-2013 and 2013-2014 influenza seasons in the United States. METHODS FNY collects limited demographic characteristic information upon registration, and prompts users each Monday to report symptoms of influenza-like illness (ILI) experienced during the previous week. We calculated the descriptive statistics and rates of ILI for the 2012-2013 and 2013-2014 seasons. We compared raw and noise-filtered ILI rates with ILI rates from the Centers for Disease Control and Prevention ILINet surveillance system. RESULTS More than 61 000 participants submitted at least 1 report during the 2012-2013 season, totaling 327 773 reports. Nearly 40 000 participants submitted at least 1 report during the 2013-2014 season, totaling 336 933 reports. Rates of ILI as reported by FNY tracked closely with ILINet in both timing and magnitude. CONCLUSIONS With increased participation, FNY has the potential to serve as a viable complement to existing outpatient, hospital-based, and laboratory surveillance systems. Although many established systems have the benefits of specificity and credibility, participatory systems offer advantages in the areas of speed, sensitivity, and scalability.
|
39
|
McIver DJ, Hawkins JB, Chunara R, Chatterjee AK, Bhandari A, Fitzgerald TP, Jain SH, Brownstein JS. Characterizing Sleep Issues Using Twitter. J Med Internet Res 2015; 17:e140. [PMID: 26054530 PMCID: PMC4526927 DOI: 10.2196/jmir.4476] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Received: 03/27/2015] [Revised: 04/29/2015] [Accepted: 05/24/2015] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Sleep issues such as insomnia affect over 50 million Americans and can lead to serious health problems, including depression and obesity, and can increase risk of injury. Social media platforms such as Twitter offer exciting potential for their use in studying and identifying both diseases and social phenomena. OBJECTIVE Our aim was to determine whether social media can be used as a method to conduct research focusing on sleep issues. METHODS Twitter posts were collected and curated to determine whether a user exhibited signs of sleep issues based on the presence of several keywords in tweets such as insomnia, "can't sleep", Ambien, and others. Users whose tweets contained any of the keywords were designated as having self-identified sleep issues (sleep group). Users who did not have self-identified sleep issues (non-sleep group) were selected from tweets that did not contain pre-defined words or phrases used as a proxy for sleep issues. RESULTS User data such as number of tweets, friends, followers, and location were collected, as well as the time and date of tweets. Additionally, the sentiment of each tweet and average sentiment of each user were determined to investigate differences between non-sleep and sleep groups. It was found that sleep group users were significantly less active on Twitter (P=.04), had fewer friends (P<.001), and fewer followers (P<.001) compared to others, after adjusting for the length of time each user's account has been active. Sleep group users were more active during typical sleeping hours than others, which may suggest they were having difficulty sleeping. Sleep group users also had significantly lower sentiment in their tweets (P<.001), indicating a possible relationship between sleep and psychosocial issues. CONCLUSIONS We have demonstrated a novel method for studying sleep issues that allows for fast, cost-effective, and customizable data to be gathered.
|
40
|
Chunara R, Goldstein E, Patterson-Lomba O, Brownstein JS. Estimating influenza attack rates in the United States using a participatory cohort. Sci Rep 2015; 5:9540. [PMID: 25835538 PMCID: PMC4894435 DOI: 10.1038/srep09540] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Received: 12/18/2014] [Accepted: 03/09/2015] [Indexed: 11/09/2022] Open
Abstract
We considered how participatory syndromic surveillance data can be used to estimate influenza attack rates during the 2012-2013 and 2013-2014 seasons in the United States. Our inference is based on assessing the difference in the rates of self-reported influenza-like illness (ILI, defined as presence of fever and cough/sore throat) among the survey participants during periods of active vs. low influenza circulation as well as estimating the probability of self-reported ILI for influenza cases. Here, we combined Flu Near You data with additional sources (Hong Kong household studies of symptoms of influenza cases and the U.S. Centers for Disease Control and Prevention estimates of vaccine coverage and effectiveness) to estimate influenza attack rates. The estimated influenza attack rate for the early vaccinated Flu Near You members (vaccination reported by week 45) aged 20-64 between calendar weeks 47-12 was 14.7% (95% CI: 5.9%, 24.1%) for the 2012-2013 season and 3.6% (-3.3%, 10.3%) for the 2013-2014 season. The corresponding rates for the US population aged 20-64 were 30.5% (4.4%, 49.3%) in 2012-2013 and 7.1% (-5.1%, 32.5%) in 2013-2014. The attack rates in women and men were similar each season. Our findings demonstrate that participatory syndromic surveillance data can be used to gauge influenza attack rates during future influenza seasons.
|
41
|
Nagar R, Yuan Q, Freifeld CC, Santillana M, Nojima A, Chunara R, Brownstein JS. A case study of the New York City 2012-2013 influenza season with daily geocoded Twitter data from temporal and spatiotemporal perspectives. J Med Internet Res 2014; 16:e236. [PMID: 25331122 PMCID: PMC4259880 DOI: 10.2196/jmir.3416] [Citation(s) in RCA: 87] [Impact Index Per Article: 8.7] [Received: 03/22/2014] [Revised: 08/08/2014] [Accepted: 08/30/2014] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Twitter has shown some usefulness in predicting influenza cases on a weekly basis in multiple countries and on different geographic scales. Recently, Broniatowski and colleagues suggested Twitter's relevance at the city-level for New York City. Here, we look to dive deeper into the case of New York City by analyzing daily Twitter data from temporal and spatiotemporal perspectives. Also, through manual coding of all tweets, we look to gain qualitative insights that can help direct future automated searches. OBJECTIVE The intent of the study was first to validate the temporal predictive strength of daily Twitter data for influenza-like illness emergency department (ILI-ED) visits during the New York City 2012-2013 influenza season against other available and established datasets (Google search query, or GSQ), and second, to examine the spatial distribution and the spread of geocoded tweets as proxies for potential cases. METHODS From the Twitter Streaming API, 2972 tweets were collected in the New York City region matching the keywords "flu", "influenza", "gripe", and "high fever". The tweets were categorized according to the scheme developed by Lamb et al. A new fourth category was added as an evaluator guess for the probability of the subject(s) being sick to account for strength of confidence in the validity of the statement. Temporal correlations were made for tweets against daily ILI-ED visits and daily GSQ volume. The best models were used for linear regression for forecasting ILI visits. A weighted, retrospective Poisson model with SaTScan software (n=1484), and vector map were used for spatiotemporal analysis. RESULTS Infection-related tweets (R=.763) correlated better than GSQ time series (R=.683) for the same keywords and had a lower mean average percent error (8.4 vs 11.8) for ILI-ED visit prediction in January, the most volatile month of flu. 
SaTScan identified a primary outbreak cluster of high-probability infection tweets with a 2.74 relative risk ratio compared to medium-probability infection tweets at P=.001 in Northern Brooklyn, in a radius that includes Barclay's Center and the Atlantic Avenue Terminal. CONCLUSIONS While others have looked at weekly regional tweets, this study is the first to stress test Twitter for daily city-level data for New York City. Extraction of personal testimonies of infection-related tweets suggests Twitter's strength both qualitatively and quantitatively for ILI-ED prediction compared to alternative daily datasets mixed with awareness-based data such as GSQ. Additionally, granular Twitter data provide important spatiotemporal insights. A tweet vector-map may be useful for visualization of city-level spread when local gold standard data are otherwise unavailable.
|
42
|
Salimian PK, Chunara R, Weitzman ER. Averting the perfect storm: addressing youth substance use risk from social media use. Pediatr Ann 2014; 43:411. [PMID: 25290130 DOI: 10.3928/00904481-20140924-08] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 11/20/2022]
Abstract
Adolescents are developmentally sensitive to pathways that influence alcohol and other drug (AOD) use. In the absence of guidance, their routine engagement with social media may add a further layer of risk. There are several potential mechanisms for social media use to influence AOD risk, including exposure to peer portrayals of AOD use, socially amplified advertising, misinformation, and predatory marketing against a backdrop of lax regulatory systems and privacy controls. Here the authors summarize the influences of the social media world and suggest how pediatricians in everyday practice can alert youth and their parents to these risks to foster conversation, awareness, and harm reduction.
|
43
|
Wójcik OP, Brownstein JS, Chunara R, Johansson MA. Public health for the people: participatory infectious disease surveillance in the digital age. Emerg Themes Epidemiol 2014; 11:7. [PMID: 24991229 PMCID: PMC4078360 DOI: 10.1186/1742-7622-11-7] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Received: 04/01/2014] [Accepted: 06/09/2014] [Indexed: 11/20/2022] Open
Abstract
The 21st century has seen the rise of Internet-based participatory surveillance systems for infectious diseases. These systems capture voluntarily submitted symptom data from the general public and can aggregate and communicate that data in near real-time. We reviewed participatory surveillance systems currently running in 13 different countries. These systems have a growing evidence base showing a high degree of accuracy and increased sensitivity and timeliness relative to traditional healthcare-based systems. They have also proven useful for assessing risk factors, vaccine effectiveness, and patterns of healthcare utilization while being less expensive, more flexible, and more scalable than traditional systems. Nonetheless, they present important challenges including biases associated with the population that chooses to participate, difficulty in adjusting for confounders, and limited specificity because of reliance only on syndromic definitions of disease. Overall, participatory disease surveillance data provides unique disease information that is not available through traditional surveillance sources.
|
44
|
Ocampo AJ, Chunara R, Brownstein JS. Using search queries for malaria surveillance, Thailand. Malar J 2013; 12:390. [PMID: 24188069 PMCID: PMC4228243 DOI: 10.1186/1475-2875-12-390] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Received: 07/03/2013] [Accepted: 10/23/2013] [Indexed: 11/30/2022] Open
Abstract
Background Internet search query trends have been shown to correlate with incidence trends for select infectious diseases and countries. Herein, the first use of Google search queries for malaria surveillance is investigated. The research focuses on Thailand where real-time malaria surveillance is crucial as malaria is re-emerging and developing resistance to pharmaceuticals in the region. Methods Official Thai malaria case data was acquired from the World Health Organization (WHO) from 2005 to 2009. Using Google correlate, an openly available online tool, and by surveying Thai physicians, search queries potentially related to malaria prevalence were identified. Four linear regression models were built from different sub-sets of malaria-related queries to be used in future predictions. The models’ accuracies were evaluated by their ability to predict the malaria outbreak in 2009, their correlation with the entire available malaria case data, and by Akaike information criterion (AIC). Results Each model captured the bulk of the variability in officially reported malaria incidence. Correlation in the validation set ranged from 0.75 to 0.92 and AIC values ranged from 808 to 586 for the models. While models using malaria-related and general health terms were successful, one model using only microscopy-related terms obtained equally high correlations to malaria case data trends. The model built strictly of queries provided by Thai physicians was the only one that consistently captured the well-documented second seasonal malaria peak in Thailand. Conclusions Models built from Google search queries were able to adequately estimate malaria activity trends in Thailand, from 2005–2010, according to official malaria case counts reported by WHO. While presenting their own limitations, these search queries may be valid real-time indicators of malaria incidence in the population, as correlations were on par with those of related studies for other infectious diseases. 
Additionally, this methodology provides a cost-effective description of malaria prevalence that can act as a complement to traditional public health surveillance. This and future studies will continue to identify ways to leverage web-based data to improve public health.
|
45
|
Chunara R, Smolinski MS, Brownstein JS. Why we need crowdsourced data in infectious disease surveillance. Curr Infect Dis Rep 2013; 15:316-9. [PMID: 23689991 DOI: 10.1007/s11908-013-0341-5] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Indexed: 11/30/2022]
Abstract
In infectious disease surveillance, public health data such as environmental, hospital, or census data have been extensively explored to create robust models of disease dynamics. However, this information is also subject to its own biases, including latency, high cost, contributor biases, and imprecise resolution. Simultaneously, new technologies including Internet and mobile phone based tools, now enable information to be garnered directly from individuals at the point of care. Here, we consider how these crowdsourced data offer the opportunity to fill gaps in and augment current epidemiological models. Challenges and methods for overcoming limitations of the data are also reviewed. As more new information sources become mature, incorporating these novel data into epidemiological frameworks will enable us to learn more about infectious disease dynamics.
|
46
|
Cassa CA, Chunara R, Mandl K, Brownstein JS. Twitter as a sentinel in emergency situations: lessons from the Boston marathon explosions. PLOS CURRENTS 2013; 5. [PMID: 23852273 PMCID: PMC3706072 DOI: 10.1371/currents.dis.ad70cd1c8bc585e9470046cde334ee4b] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Indexed: 11/18/2022]
Abstract
Immediately following the Boston Marathon attacks, individuals near the scene posted a deluge of data to social media sites. Previous work has shown that these data can be leveraged to provide rapid insight during natural disasters, disease outbreaks and ongoing conflicts that can assist in the public health and medical response. Here, we examine and discuss the social media messages posted immediately after and around the Boston Marathon bombings, and find that specific keywords appear frequently prior to official public safety and news media reports. Individuals immediately adjacent to the explosions posted messages within minutes via Twitter which identify the location and specifics of events, demonstrating a role for social media in the early recognition and characterization of emergency events.
*Christopher Cassa and Rumi Chunara contributed equally to this work.
|
47
|
Chunara R, Aman S, Smolinski M, Brownstein JS. Flu Near You: An Online Self-reported Influenza Surveillance System in the USA. Online J Public Health Inform 2013. [PMCID: PMC3692780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022] Open
|
48
|
Chunara R, Andrews JR, Brownstein JS. Social and news media enable estimation of epidemiological patterns early in the 2010 Haitian cholera outbreak. Am J Trop Med Hyg 2012; 86:39-45. [PMID: 22232449 DOI: 10.4269/ajtmh.2012.11-0597] [Citation(s) in RCA: 186] [Impact Index Per Article: 15.5] [Indexed: 11/07/2022] Open
Abstract
During infectious disease outbreaks, data collected through health institutions and official reporting structures may not be available for weeks, hindering early epidemiologic assessment. By contrast, data from informal media are typically available in near real-time and could provide earlier estimates of epidemic dynamics. We assessed correlation of volume of cholera-related HealthMap news media reports, Twitter postings, and government cholera cases reported in the first 100 days of the 2010 Haitian cholera outbreak. Trends in volume of informal sources significantly correlated in time with official case data and were available up to 2 weeks earlier. Estimates of the reproductive number ranged from 1.54 to 6.89 (informal sources) and 1.27 to 3.72 (official sources) during the initial outbreak growth period, and 1.04 to 1.51 (informal) and 1.06 to 1.73 (official) when Hurricane Tomas afflicted Haiti. Informal data can be used complementarily with official data in an outbreak setting to get timely estimates of disease dynamics.
|
49
|
Chunara R, Chhaya V, Bane S, Mekaru SR, Chan EH, Freifeld CC, Brownstein JS. Online reporting for malaria surveillance using micro-monetary incentives, in urban India 2010-2011. Malar J 2012; 11:43. [PMID: 22330227 PMCID: PMC3305483 DOI: 10.1186/1475-2875-11-43] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 09/15/2011] [Accepted: 02/13/2012] [Indexed: 11/16/2022] Open
Abstract
Background The objective of this study was to investigate the use of novel surveillance tools in a malaria endemic region where prevalence information is limited. Specifically, online reporting for participatory epidemiology was used to gather information about malaria spread directly from the public. Individuals in India were incentivized to self-report their recent experience with malaria by micro-monetary payments. Methods Self-reports about malaria diagnosis status and related information were solicited online via Amazon's Mechanical Turk. Responders were paid $0.02 to answer survey questions regarding their recent experience with malaria. Timing of the peak volume of weekly self-reported malaria diagnosis in 2010 was compared to other available metrics such as the volume over time of and information about the epidemic from media sources. Distribution of Plasmodium species reports was compared with values from the literature. The study was conducted in summer 2010 during a malaria outbreak in Mumbai and expanded to other cities during summer 2011, and prevalence from self-reports in 2010 and 2011 was contrasted. Results Distribution of Plasmodium species diagnosis through self-report in 2010 revealed 59% for Plasmodium vivax, which is comparable to literature reports of the burden of P. vivax in India (between 50 and 69%). Self-reported Plasmodium falciparum diagnosis was 19% during the 2010 outbreak, and the estimated burden was between 10 and 15%. Prevalence between 2010 and 2011 via self-reports decreased significantly from 36.9% to 19.54% in Mumbai (p = 0.001), and official reports also confirmed a prevalence decrease in 2011. Conclusions With careful study design, micro-monetary incentives and online reporting are a rapid way to solicit malaria, and potentially other, public health information.
This methodology provides a cost-effective way of executing a field study that can act as a complement to traditional public health surveillance methods, offering an opportunity to obtain information about malaria activity, temporal progression, demographics affected or Plasmodium-specific diagnosis at a finer resolution than official reports can provide. The recent adoption of technologies, such as the Internet, supports self-reporting mediums, and self-reporting should continue to be studied as it can foster preventative health behaviours.
|
50
|
Bogich TL, Chunara R, Scales D, Chan E, Pinheiro LC, Chmura AA, Carroll D, Daszak P, Brownstein JS. Preventing pandemics via international development: a systems approach. PLoS Med 2012; 9:e1001354. [PMID: 23239944 PMCID: PMC3519898 DOI: 10.1371/journal.pmed.1001354] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 11/19/2022] Open
Abstract
Tiffany Bogich and colleagues find that breakdown or absence of public health infrastructure is most often the driver in pandemic outbreaks, whose prevention requires mainstream development funding rather than emergency funding.
|