An Evaluation of Geo-located Twitter Data for Measuring Human Migration.
International Journal of Geographical Information Science: IJGIS 2022; 36:1830-1852. [PMID: 36643847 PMCID: PMC9837860 DOI: 10.1080/13658816.2022.2075878]
[Received: 06/10/2021] [Revised: 05/05/2022] [Accepted: 05/06/2022] [Indexed: 06/17/2023]
Abstract
This study evaluates the spatial patterns of flows generated from geo-located Twitter data to measure human migration. Using geo-located tweets continuously collected in the U.S. from 2013 to 2015, we identified Twitter users who migrated, based on changes in county of residence every two years, and compared the Twitter-estimated county-to-county migration flows with those from the U.S. Internal Revenue Service (IRS). To evaluate how well the spatial patterns of Twitter migration flows represent their IRS counterparts, we developed a normalized difference representation index to visualize and identify counties that are over- or under-represented in the Twitter estimates. Further, we applied a multidimensional spatial scan statistic approach based on a Poisson process model to detect pairs of origin and destination regions where the over- or under-representation occurred. The results suggest that Twitter migration flows tend to under-represent the IRS estimates in regions with a large population and over-represent them in metropolitan regions adjacent to tourist attractions. This study demonstrates that geo-located Twitter data can be a sound statistical proxy for measuring human migration. Given that the spatial patterns of Twitter-estimated migration flows vary significantly across geographic space, related studies will benefit from our approach by identifying the regions where data calibration is necessary.
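The abstract does not state the formula for the normalized difference representation index, so the sketch below is only an illustration of the general idea. It assumes a form analogous to other normalized difference indices, (t - r) / (t + r), where t and r are a county's shares of the total Twitter and IRS flows. The function name, the input layout, and the use of shares are all assumptions, not the authors' definition.

```python
def normalized_difference_index(twitter_flows, irs_flows):
    """Compare each county's share of Twitter flows with its share of IRS flows.

    Assumed form (not taken from the paper): (t - r) / (t + r), where t and r
    are the county's shares of the Twitter and IRS totals. Values lie in
    [-1, 1]; positive means over-represented in the Twitter estimates.
    """
    twitter_total = sum(twitter_flows.values())
    irs_total = sum(irs_flows.values())
    if twitter_total <= 0 or irs_total <= 0:
        raise ValueError("both sources need a positive total flow")

    index = {}
    for county in set(twitter_flows) | set(irs_flows):
        t = twitter_flows.get(county, 0) / twitter_total
        r = irs_flows.get(county, 0) / irs_total
        if t + r == 0:
            continue  # no flow in either source, index undefined
        index[county] = (t - r) / (t + r)
    return index


# Hypothetical counts, for illustration only.
twitter = {"A": 30, "B": 10}
irs = {"A": 500, "B": 500}
result = normalized_difference_index(twitter, irs)
```

Under this assumed form, a county absent from the Twitter data scores -1, one absent from the IRS data scores 1, and a score near 0 means the two sources give it a similar share of flows.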