1. Fox L, Miller WC, Gesink D, Doherty I, Serre M. Enhancing insights in sexually transmitted infection mapping: Syphilis in Forsyth County, North Carolina, a case study. PLoS Comput Biol 2024; 20:e1012464. [PMID: 39480897 PMCID: PMC11774488 DOI: 10.1371/journal.pcbi.1012464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 01/28/2025] [Accepted: 09/05/2024] [Indexed: 11/02/2024]
Abstract
In 2008-2011, Forsyth County, North Carolina (NC) experienced a four-fold increase in syphilis, rising to over 35 cases per 100,000 and mirroring the 2021 state syphilis rate. Our methodology extends current models with: 1) donut geomasking to enhance resolution while protecting patient privacy; 2) a moving window uniform grid to control the modifiable areal unit problem and edge effects and to remove kriging islands; and 3) Uniform Model Bayesian Maximum Entropy (UMBME) to mitigate the "small number problem". The data are 2008-2011 early syphilis cases reported to the NC Department of Health and Human Services for Forsyth County. Results were assessed using latent rate theory cross validation. We show that combining a moving window and a UMBME analysis with geomasked data predicted the true, or latent, syphilis rate 5% to 26% more accurately than the traditional geopolitical boundary method. It removed kriging islands, reduced the background incidence rate to 0, relocated nine outbreak hotspots to more realistic locations, and elucidated hotspot connectivity, producing more realistic geographical patterns for targeted insights. Using the Forsyth outbreak as a case study, we showed how the outbreak emerged from endemic areas and spread through sexual core transmitters, and we contextualized it against current and past outbreaks. Although the dynamics of sexually transmitted infection spread have shifted toward online partnership selection and, demographically, toward including more women, partnership selection remains highly localized. Furthermore, it is important to present methods that increase the interpretability and accuracy of visual representations of data.
Affiliation(s)
- Lani Fox
  - Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, North Carolina, United States of America
  - Lani Fox Geostatistical Consulting, Claremont, California, United States of America
- William C. Miller
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, North Carolina, United States of America
- Dionne Gesink
  - Epidemiology Division, Dalla Lana School of Public Health, University of Toronto
- Irene Doherty
  - Julius L. Chambers Biomedical/Biotechnology Research Institute / North Carolina Central University, North Carolina, United States of America
- Marc Serre
  - Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, North Carolina, United States of America
2. Fox L, Peter BG, Frake AN, Messina JP. A Bayesian maximum entropy model for predicting tsetse ecological distributions. Int J Health Geogr 2023; 22:31. [PMID: 37974150 PMCID: PMC10655428 DOI: 10.1186/s12942-023-00349-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 10/10/2023] [Indexed: 11/19/2023]
Abstract
BACKGROUND African trypanosomiasis is a tsetse-borne parasitic infection that affects humans, wildlife, and domesticated animals. Tsetse flies are endemic to much of Sub-Saharan Africa, and a spatial and temporal understanding of tsetse habitat can aid surveillance and support disease risk management. Problematically, current fine spatial resolution remote sensing data are delivered with a temporal lag and have relatively coarse temporal resolution (e.g., 16 days), which results in disease control models often targeting incorrect places. The goal of this study was to devise a heuristic for identifying tsetse habitat (at a fine spatial resolution) into the future and in the temporal gaps where remote sensing and proximal data fail to supply information. METHODS This paper introduces a generalizable and scalable open-access version of the tsetse ecological distribution (TED) model used to predict tsetse distributions across space and time, and contributes a geospatial Bayesian Maximum Entropy (BME) prediction model, trained on TED output data, to forecast where tsetse (herein the Morsitans group) persist in Kenya, a method that mitigates the temporal lag problem. This model facilitates identification of tsetse habitat and provides critical information to control tsetse, mitigate the impact of trypanosomiasis on vulnerable human and animal populations, and guide disease minimization in places with ephemeral tsetse. Moreover, this BME analysis is one of the first to utilize cluster and parallel computing along with a Monte Carlo analysis to optimize BME computations. This allows for the analysis of an exceptionally large dataset (over 2 billion data points) at a finer resolution and larger spatiotemporal scale than had previously been possible. RESULTS Under the most conservative assessment for Kenya, the BME kriging analysis showed an overall prediction accuracy of 74.8% (limited to the maximum suitability extent). In predicting tsetse distribution outcomes for the entire country, the BME kriging analysis was 97% accurate in its forecasts. CONCLUSIONS This work offers a solution to the persistent temporal data gap in accurate and spatially precise rainfall predictions and to the delayed processing of remotely sensed data, collectively covering the temporal window from -45 days (past) to +180 days (future). As is shown here, the BME model is a reliable alternative for forecasting future tsetse distributions to allow preplanning for tsetse control. Furthermore, this model provides guidance on disease control that would otherwise not be available. These 'big data' BME methods are particularly useful for large domain studies, considering that past BME studies required reduction of the spatiotemporal grid to facilitate analysis. Both the GEE-TED and the BME libraries have been made open source to enable reproducibility and offer continual updates into the future as new remotely sensed data become available.
Affiliation(s)
- Lani Fox
  - Lani Fox Geostatistical Consulting, Claremont, CA, USA
  - Department of Environmental Sciences and Engineering, Gillings School of Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Brad G Peter
  - Department of Geosciences, University of Arkansas, Fayetteville, AR, USA
- April N Frake
  - Center for Global Change and Earth Observation, Michigan State University, East Lansing, MI, USA
  - Center for Healthy Communities, Michigan Public Health Institute, Okemos, MI, USA
- Joseph P Messina
  - Department of Geography, University of Alabama, Tuscaloosa, AL, USA
3. Zhang P, Kamel Boulos MN. Privacy-by-Design Environments for Large-Scale Health Research and Federated Learning from Data. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:11876. [PMID: 36231175 PMCID: PMC9565554 DOI: 10.3390/ijerph191911876] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/15/2022] [Revised: 09/07/2022] [Accepted: 09/16/2022] [Indexed: 06/16/2023]
Abstract
This article offers a brief overview of 'privacy-by-design (or data-protection-by-design) research environments', namely Trusted Research Environments (TREs, most commonly used in the United Kingdom) and Personal Health Trains (PHTs, most commonly used in mainland Europe). These secure environments are designed to enable the safe analysis of multiple, linked (and often big) data sources, including sensitive personal data and data owned by, and distributed across, different institutions. They take data protection and privacy requirements into account from the very start (conception phase, during system design) rather than as an afterthought or 'patch' implemented at a later stage on top of an existing environment. TREs and PHTs are becoming increasingly important for conducting large-scale privacy-preserving health research and for enabling federated learning and discoveries from big healthcare datasets. The paper also presents select examples of successful TRE and PHT implementations and of large-scale studies that used them.
Affiliation(s)
- Peng Zhang
- Data Science Institute & Department of Computer Science, Vanderbilt University, Nashville, TN 37240, USA
4. Kamel Boulos MN, Kwan MP, El Emam K, Chung ALL, Gao S, Richardson DB. Reconciling public health common good and individual privacy: new methods and issues in geoprivacy. Int J Health Geogr 2022; 21:1. [PMID: 35045864 PMCID: PMC8767534 DOI: 10.1186/s12942-022-00300-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 01/13/2022] [Indexed: 11/30/2022]
Abstract
This article provides a state-of-the-art summary of location privacy issues and geoprivacy-preserving methods in public health interventions and health research involving disaggregate geographic data about individuals. Synthetic data generation (from real data using machine learning) is discussed in detail as a promising privacy-preserving approach. To fully achieve their goals, privacy-preserving methods should form part of a wider comprehensive socio-technical framework for the appropriate disclosure, use and dissemination of data containing personal identifiable information. Select highlights are also presented from a related December 2021 AAG (American Association of Geographers) webinar that explored ethical and other issues surrounding the use of geospatial data to address public health issues during challenging crises, such as the COVID-19 pandemic.
Affiliation(s)
- Maged N Kamel Boulos
  - Institute for Preventive Medicine and Public Health, School of Medicine (FMUL), University of Lisbon, 1649-028, Lisbon, Portugal
- Mei-Po Kwan
  - Institute of Space and Earth Information Science, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Khaled El Emam
  - School of Epidemiology and Public Health, University of Ottawa, Ottawa, ON, K1G 5Z3, Canada
- Ada Lai-Ling Chung
  - Office of the Privacy Commissioner for Personal Data, Wanchai, Hong Kong, China
- Song Gao
  - Department of Geography, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Douglas B Richardson
  - Centre for Geographic Analysis, Institute for Quantitative Social Science, Harvard University, Cambridge, MA, 02138, USA
5. Ajayakumar J, Curtis AJ, Curtis J. Addressing the data guardian and geospatial scientist collaborator dilemma: how to share health records for spatial analysis while maintaining patient confidentiality. Int J Health Geogr 2019; 18:30. [PMID: 31864350 PMCID: PMC6925902 DOI: 10.1186/s12942-019-0194-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 09/21/2019] [Accepted: 12/13/2019] [Indexed: 12/12/2022]
Abstract
BACKGROUND The utility of being able to spatially analyze health care data in near-real time is a growing need. However, this potential is often limited by the level of in-house geospatial expertise. One solution is to form collaborative partnerships between the health and geoscience sectors. A challenge in achieving this is how to share data outside of a host institution's protection protocols without violating patient confidentiality, while still maintaining locational geographic integrity. Geomasking techniques have previously been championed as a solution, though they largely remain unavailable to institutions with limited geospatial expertise. This paper elaborates on the design, implementation, and testing of a new geomasking tool, Privy, which is designed to be a simple yet efficient mechanism for health practitioners to share health data with geospatial scientists while maintaining an acceptable level of confidentiality. The basic premise of Privy is to move the important coordinates to a different geography, perform the analysis, and then return the resulting hotspot outputs to the original landscape. RESULTS We show that by transporting coordinates through a combination of random translations and rotations, Privy is able to preserve location connectivity among spatial point data. Our experiments with typical analytical scenarios, including spatial point pattern analysis and density analysis, show that, along with protecting spatial privacy, Privy maintains the spatial integrity of data, which reduces the information loss created by data augmentation. CONCLUSION The results from this study suggest that, along with developing new mathematical techniques to augment geospatial health data for preserving confidentiality, simple yet efficient software solutions can be developed to enable collaborative research among custodians of medical and health data records and GIS experts. We have achieved this by developing Privy, a tool which is already being used in real-world situations to address the spatial confidentiality dilemma.
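The "move, analyze, return" premise described in this abstract can be illustrated with a rigid transform. The following is a minimal sketch, assuming planar (projected) coordinates and a single rotation about the origin followed by a translation; the function names and the `max_shift` parameter are hypothetical and are not taken from Privy itself, whose actual procedure may differ.

```python
import math
import random


def make_transform(rng, max_shift=1000.0):
    """Draw a random rigid transform (assumed form): angle plus x/y shift."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    dx = rng.uniform(-max_shift, max_shift)
    dy = rng.uniform(-max_shift, max_shift)
    return theta, dx, dy


def apply_transform(points, transform):
    """Rotate each point about the origin by theta, then translate it."""
    theta, dx, dy = transform
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]


def invert_transform(points, transform):
    """Undo the translation, then rotate by -theta to recover coordinates."""
    theta, dx, dy = transform
    c, s = math.cos(theta), math.sin(theta)
    return [
        (c * (x - dx) + s * (y - dy), -s * (x - dx) + c * (y - dy))
        for x, y in points
    ]


# The data custodian keeps the transform secret and shares only `moved`.
rng = random.Random(42)
original = [(0.0, 0.0), (3.0, 4.0), (10.0, -2.0)]
secret = make_transform(rng)
moved = apply_transform(original, secret)
restored = invert_transform(moved, secret)
```

Because rotation and translation preserve all pairwise distances, distance-based analyses such as point pattern or density analysis give the same relative results on the moved coordinates, and any output locations can be mapped back with `invert_transform`.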
Affiliation(s)
- Jayakrishnan Ajayakumar
  - Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Andrew J Curtis
  - Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Jacqueline Curtis
  - Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
6. Cardoso de Moraes JL, de Souza WL, Pires LF, do Prado AF. A methodology based on openEHR archetypes and software agents for developing e-health applications reusing legacy systems. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2016; 134:267-287. [PMID: 27480749 DOI: 10.1016/j.cmpb.2016.07.013] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 03/29/2016] [Accepted: 07/04/2016] [Indexed: 06/06/2023]
Abstract
BACKGROUND AND OBJECTIVE In Pervasive Healthcare, novel information and communication technologies are applied to support the provision of health services anywhere, at any time and to anyone. Since health systems may offer their health records in different electronic formats, the openEHR Foundation prescribes the use of archetypes for describing clinical knowledge in order to achieve semantic interoperability between these systems. Software agents have been applied to simulate human skills in some healthcare procedures. This paper presents a methodology, based on the use of openEHR archetypes and agent technology, which aims to overcome the weaknesses typically found in legacy healthcare systems, thereby adding value to the systems. METHODS This methodology was applied in the design of an agent-based system, which was used in a realistic healthcare scenario to support a medical staff meeting held to prepare for a cardiac surgery. We conducted experiments with this system in a distributed environment composed of three cardiology clinics and a center of cardiac surgery, all located in the city of Marília (São Paulo, Brazil). We evaluated this system according to the Technology Acceptance Model. RESULTS The case study confirmed the acceptance of our agent-based system by healthcare professionals and patients, who reacted positively with respect to the usefulness of this system in particular, and with respect to task delegation to software agents in general. The case study also showed that both a software agent-based interface and a tools-based alternative must be provided to the end users, allowing them to perform the tasks themselves or to delegate these tasks to other people. CONCLUSIONS A Pervasive Healthcare model requires efficient and secure information exchange between healthcare providers. The proposed methodology allows designers to build communication systems for message exchange among heterogeneous healthcare systems, and to shift from systems that rely on informal communication between actors to a more automated and less error-prone agent-based system. Our methodology preserves the significant investment of many years in legacy systems and allows developers to extend them with new features, providing proactive assistance to end users and increasing user mobility with appropriate support.
Affiliation(s)
- João Luís Cardoso de Moraes
  - Federal University of São Carlos, Computer Department, Rodovia Washington Luís-Km 235, 13565-905 São Carlos-SP, Brazil
- Wanderley Lopes de Souza
  - Federal University of São Carlos, Computer Department, Rodovia Washington Luís-Km 235, 13565-905 São Carlos-SP, Brazil
- Luís Ferreira Pires
  - University of Twente, Centre for Telematics and Information Technology, Drienerlolaan 5, 7522 NB, Enschede, The Netherlands
- Antonio Francisco do Prado
  - Federal University of São Carlos, Computer Department, Rodovia Washington Luís-Km 235, 13565-905 São Carlos-SP, Brazil
7. Influence of Demographic and Health Survey Point Displacements on Distance-Based Analyses. SPATIAL DEMOGRAPHY 2015; 4:155-173. [PMID: 27453935 DOI: 10.1007/s40980-015-0014-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
Abstract
We evaluate the impacts of random spatial displacements on analyses that involve distance measures from displaced Demographic and Health Survey (DHS) clusters to nearest ancillary point or line features, such as health resources or roads. We use simulation and case studies to address the effects of this introduced error, and propose use of regression calibration (RC) to reduce its impact. Results suggest that RC outperforms analyses involving naive distance-based covariate assignments by reducing the bias and MSE of the main estimator in most settings. Proposed guidelines also address the effect of the spatial density of destination features on observed bias.
8. Sarpatwari A, Kesselheim AS, Malin BA, Gagne JJ, Schneeweiss S. Ensuring patient privacy in data sharing for postapproval research. N Engl J Med 2014; 371:1644-9. [PMID: 25337755 DOI: 10.1056/nejmsb1405487] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 11/19/2022]
Affiliation(s)
- Ameet Sarpatwari
- From the Program on Regulation, Therapeutics, and Law, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston (A.S., A.S.K., J.J.G., S.S.); and the Department of Biomedical Informatics, School of Medicine, and the Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville (B.A.M.)
9. Ensuring Confidentiality of Geocoded Health Data: Assessing Geographic Masking Strategies for Individual-Level Data. Adv Med 2014; 2014:567049. [PMID: 26556417 PMCID: PMC4590956 DOI: 10.1155/2014/567049] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Received: 08/12/2013] [Revised: 10/25/2013] [Accepted: 10/27/2013] [Indexed: 11/18/2022]
Abstract
Public health datasets increasingly use geographic identifiers such as an individual's address. Geocoding these addresses often provides new insights since it becomes possible to examine spatial patterns and associations. Address information is typically considered confidential and is therefore not released or shared with others. Publishing maps with the locations of individuals, however, may also breach confidentiality since addresses and associated identities can be discovered through reverse geocoding. One commonly used technique to protect confidentiality when releasing individual-level geocoded data is geographic masking. This typically consists of applying a certain amount of random perturbation in a systematic manner to reduce the risk of reidentification. A number of geographic masking techniques have been developed, as well as methods to quantify the risk of reidentification associated with a particular masking method. This paper presents a review of the current state of the art in geographic masking, summarizing the various methods and their strengths and weaknesses. Despite recent progress, no universally accepted or endorsed geographic masking technique has emerged. Researchers, on the other hand, are publishing maps using geographic masking of confidential locations. Any researcher publishing such maps is advised to become familiar with the different masking techniques available and their associated reidentification risks.
10
Abstract
Scholarly communication is at an unprecedented turning point created in part by the increasing saliency of data stewardship and data sharing. Formal data management plans represent a new emphasis in research, enabling access to data at higher volumes and more quickly, and the potential for replication and augmentation of existing research. Data sharing has recently transformed the practice, scope, content, and applicability of research in several disciplines, in particular in relation to spatially specific data. This lends exciting potentiality, but the most effective ways in which to implement such changes, particularly for disciplines involving human subjects and other sensitive information, demand consideration. Data management plans, stewardship, and sharing, impart distinctive technical, sociological, and ethical challenges that remain to be adequately identified and remedied. Here, we consider these and propose potential solutions for their amelioration.
11. AbdelMalik P, Kamel Boulos MN. Multidimensional point transform for public health practice. Methods Inf Med 2011; 51:63-73. [PMID: 21691675 DOI: 10.3414/me11-01-0001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 01/11/2011] [Accepted: 05/10/2011] [Indexed: 11/09/2022]
Abstract
BACKGROUND With increases in spatial information and enabling technologies, location-privacy concerns have been on the rise. A commonly proposed solution in public health involves random perturbation; however, consideration for individual dimensions (attributes) has been weak. OBJECTIVES The current study proposes a multidimensional point transform (MPT) that integrates the spatial dimension with other dimensions of interest to comprehensively anonymise data. METHODS The MPT relies on the availability of a base population, a subset patient dataset, and shared dimensions of interest. Perturbation distance and anonymity thresholds are defined, as are allowable dimensional perturbations. A preliminary implementation is presented using sex, age and location as the three dimensions of interest, with a maximum perturbation distance of 1 kilometre and an anonymity threshold of 20%. A synthesised New York county population is used for testing with 1000 iterations for each of 25, 50, 100, 200 and 400 patient dataset sizes. RESULTS The MPT consistently yielded a mean perturbation distance of 46 metres with no sex or age perturbation required. Displacement of the spatial mean decreased with patient dataset size and averaged 5.6 metres overall. CONCLUSIONS The MPT presents a flexible, customisable and adaptive algorithm for perturbing datasets for public health, allowing tweaking and optimisation of the trade-offs for different datasets and purposes. It is not, however, a substitute for secure and ethical conduct, and a public health framework for the appropriate disclosure, use and dissemination of data containing personal identifiable information is required. The MPT presents an important component of such a framework.
Affiliation(s)
- P AbdelMalik
- Faculty of Health and Education, University of Plymouth, Drake Circus, Plymouth, Devon, PL4 8AA, UK.
12. Allshouse WB, Fitch MK, Hampton KH, Gesink DC, Doherty IA, Leone PA, Serre ML, Miller WC. Geomasking sensitive health data and privacy protection: an evaluation using an E911 database. GEOCARTO INTERNATIONAL 2010; 25:443-452. [PMID: 20953360 PMCID: PMC2952889 DOI: 10.1080/10106049.2010.496496] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 05/21/2023]
Abstract
Geomasking is used to provide privacy protection for individual address information while maintaining spatial resolution for mapping purposes. Donut geomasking and other random perturbation geomasking algorithms rely on the assumption of a homogeneously distributed population to calculate displacement distances, leading to possible under-protection of individuals when this condition is not met. Using household data from 2007, we evaluated the performance of donut geomasking in Orange County, North Carolina. We calculated the estimated k-anonymity for every household based on the assumption of uniform household distribution. We then determined the actual k-anonymity by revealing household locations contained in the county E911 database. Census block groups in mixed-use areas with high population distribution heterogeneity were the most likely to have privacy protection below selected criteria. For heterogeneous populations, we suggest tripling the minimum displacement area in the donut to protect privacy with a less than 1% error rate.
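The donut displacement and the uniform-density estimate of k-anonymity mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: coordinates are planar and in metres, the displacement is drawn uniformly over the area of the annulus, and the estimated k is taken to be the expected number of households inside the donut given a constant household density. The function and parameter names are hypothetical.

```python
import math
import random


def donut_mask(x, y, r_min, r_max, rng):
    """Displace (x, y) by a random distance in [r_min, r_max], random direction."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # Sampling r^2 uniformly makes the masked point uniform over the annulus area.
    r = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    return x + r * math.cos(angle), y + r * math.sin(angle)


def estimated_k_anonymity(r_min, r_max, household_density):
    """Expected households in the donut, ASSUMING uniform household density.

    household_density is households per square metre (an assumed unit).
    """
    donut_area = math.pi * (r_max ** 2 - r_min ** 2)
    return household_density * donut_area


rng = random.Random(7)
masked = donut_mask(500.0, 500.0, r_min=50.0, r_max=200.0, rng=rng)
k_est = estimated_k_anonymity(50.0, 200.0, household_density=0.0005)
```

The abstract's caution applies directly to `estimated_k_anonymity`: where households are clustered rather than evenly spread, the true count inside the donut can fall well below this estimate, which is why the estimate is compared against actual household locations.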
Affiliation(s)
- William B Allshouse
- The University of North Carolina at Chapel Hill, Gillings School of Global Public Health, Department of Environmental Sciences and Engineering, Chapel Hill, 27599-7431 United States
13. Boulos DNK, Ghali RR, Ibrahim EM, Boulos MNK, AbdelMalik P. An eight-year snapshot of geospatial cancer research (2002-2009): clinico-epidemiological and methodological findings and trends. Med Oncol 2010; 28:1145-62. [PMID: 20589539 DOI: 10.1007/s12032-010-9607-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 04/19/2010] [Accepted: 06/16/2010] [Indexed: 12/14/2022]
Abstract
Geographic information systems (GIS) offer a very rich toolbox of methods and technologies, and powerful research tools that extend far beyond the mere production of maps, making it possible to cross-link and study the complex interaction of disease data and factors originating from a wide range of disparate sources. Despite their potentially indispensable role in cancer prevention and control programmes, GIS are underrepresented in specialised oncology literature. This gap provided the impetus for the current review. The review provides an eight-year snapshot of geospatial cancer research in peer-reviewed literature (2002-2009), presenting the clinico-epidemiological and methodological findings and trends in the covered corpus (93 papers). The authors concluded that understanding the relationship between location and cancer/cancer care services can play a crucial role in disease control and prevention, in better service planning, and in appropriate resource utilisation. Nevertheless, there are still barriers that hinder the wide-scale adoption of GIS and related technologies in everyday oncology practice.
Affiliation(s)
- Dina N Kamel Boulos
- Department of Community, Environmental and Occupational Medicine, Faculty of Medicine, Ain Shams University, Abbassia, Cairo, Egypt
14. Katsaliaki K, Mustafee N. Improving decision making in healthcare services through the use of existing simulation modelling tools and new technologies. TRANSFORMING GOVERNMENT- PEOPLE PROCESS AND POLICY 2010. [DOI: 10.1108/17506161011047389] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/17/2022]
Abstract
PURPOSE The purpose of this paper is to investigate the viability of using distributed simulation to execute large and complex healthcare simulation models which help government take informed decisions. DESIGN/METHODOLOGY/APPROACH The paper compares the execution time of a standalone healthcare supply chain simulation with its distributed counterpart. Both the standalone and the distributed models are built using a commercial simulation package (CSP). FINDINGS The results show that the execution time of the standalone healthcare supply chain simulation increases exponentially as the size and complexity of the system being modelled increases. On the other hand, using a distributed simulation approach decreases the run time for large and complex models. RESEARCH LIMITATIONS/IMPLICATIONS The distributed approach of executing different parts of a single simulation model over different computers is only viable when the model (1) can be divided into logical parts, with the exchange of information between these parts occurring at constant simulated time intervals, and (2) is sufficiently large and complicated that executing the model over a single processor is very time consuming. PRACTICAL IMPLICATIONS Based on a feasibility study of the UK National Blood Service we demonstrate the effectiveness of distributed simulation and argue that it is a vital technique in healthcare informatics with respect to supporting decision making in large healthcare systems. ORIGINALITY/VALUE To the best of the authors' knowledge, this is the first feasibility study in healthcare which shows the outcome of modelling and executing a distributed simulation using unmodified CSPs and a software/middleware for distributed simulation.
15. Rainham D, McDowell I, Krewski D, Sawada M. Conceptualizing the healthscape: contributions of time geography, location technologies and spatial ecology to place and health research. Soc Sci Med 2009; 70:668-76. [PMID: 19963310 DOI: 10.1016/j.socscimed.2009.10.035] [Citation(s) in RCA: 122] [Impact Index Per Article: 7.6] [Received: 11/28/2008] [Indexed: 10/20/2022]
Abstract
Geomatics and related technologies allow for the application of integrated approaches to the analysis of individual spatial and temporal activities in the context of place and health research. The ability to track individuals as they make decisions and negotiate space may provide a fundamental advance. This paper introduces the need to move beyond conventional place-based perspectives in health research, and invokes the theoretical contributions of time geography and spatial ecology as opportunities to integrate human agency into contextual models of health. Issues around the geographical representation of place are reviewed, and the concept of the healthscape is introduced as an approach to operationalizing context as expressed by the spatial and temporal activities of individuals. We also discuss how these concepts have the potential to influence and contribute to empirical place and health research.
Affiliation(s)
- Daniel Rainham
- Environmental Programs, Faculty of Science, Dalhousie University, 1355 Oxford Street, Halifax, Nova Scotia, Canada B3H4J1.
16
Jeffery C, Ozonoff A, White LF, Nuño M, Pagano M. Power to detect spatial disturbances under different levels of geographic aggregation. J Am Med Inform Assoc 2009; 16:847-54. [PMID: 19717807 DOI: 10.1197/jamia.m2788] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022] Open
Abstract
OBJECTIVE Spatial and/or temporal surveillance systems are designed to monitor the ongoing appearance of disease cases in space and time, and to detect potential disturbances in either dimension. Patient addresses are sometimes reported at some level of geographic aggregation, for example by ZIP code or census tract. While this aggregation has the advantage of protecting patient privacy, it also risks compromising statistical efficiency. This paper investigated the variation in power to detect a change in the spatial distribution in the presence of spatial aggregation. METHODS The authors generated 400,000 spatial datasets with varying location and spread of simulated spatial disturbances, both on a purely synthetic uniform population, and on a heterogeneous population, representing hospital admissions to three community hospitals in Cape Cod, Massachusetts. The authors evaluated the power of the M-statistic to detect spatial disturbances, comparing the use of exact spatial locations versus twelve different levels of aggregation, where the M-statistic is a comparison of two distributions of interpoint distances between locations. RESULTS When the spread of simulated spatial disturbances was confined to a small portion of the study region or affected a large proportion of the population at risk, power was highest when exact locations were reported. If the spatial disturbance was a more modest signal, the best power was attained at an aggregated level. CONCLUSIONS The precision at which patients' locations are reported has the potential to affect the power of detection significantly.
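The idea of comparing two distributions of interpoint distances can be illustrated with a small sketch. This is not the M-statistic itself, which additionally standardizes the difference by the covariance of the binned proportions; it is a simplified, unweighted squared gap between binned distance profiles. The function names, the bin edges, and the toy point sets are all assumptions made for illustration.

```python
import bisect
import math
from itertools import combinations


def interpoint_distances(points):
    """All pairwise Euclidean distances between the given (x, y) points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def binned_proportions(distances, bin_edges):
    """Fraction of distances in each bin.

    bin_edges must be ascending and start at 0; bin i covers
    [bin_edges[i], bin_edges[i + 1]) and the last bin is open-ended.
    """
    counts = [0] * len(bin_edges)
    for d in distances:
        counts[bisect.bisect_right(bin_edges, d) - 1] += 1
    return [c / len(distances) for c in counts]


def distance_profile_gap(baseline, observed, bin_edges):
    """Unweighted squared difference between two binned distance profiles.

    Hypothetical simplification: the real M-statistic weights this
    difference by an estimated covariance matrix.
    """
    base = binned_proportions(interpoint_distances(baseline), bin_edges)
    obs = binned_proportions(interpoint_distances(observed), bin_edges)
    return sum((o - b) ** 2 for o, b in zip(obs, base))


if __name__ == "__main__":
    # Toy data: a regular 5x5 grid versus the same grid with a tight cluster.
    grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
    clustered = grid[:15] + [(2.0 + 0.01 * i, 2.0) for i in range(10)]
    edges = [0.0, 1.0, 2.0, 4.0]
    print(distance_profile_gap(grid, grid, edges))
    print(distance_profile_gap(grid, clustered, edges))
```

A tight cluster adds many very short interpoint distances, so the first bin of the observed profile grows relative to the baseline. Aggregating locations to area centroids changes the same profile, which is one way to see why aggregation can alter detection power.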
Affiliation(s)
- Caroline Jeffery
- Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA
17
Boulos MNK, Curtis AJ, AbdelMalik P. Musings on privacy issues in health research involving disaggregate geographic data about individuals. Int J Health Geogr 2009; 8:46. [PMID: 19619311 PMCID: PMC2716332 DOI: 10.1186/1476-072x-8-46] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Received: 07/05/2009] [Accepted: 07/20/2009] [Indexed: 11/17/2022] Open
Abstract
This paper offers a state-of-the-art overview of the intertwined privacy, confidentiality, and security issues that are commonly encountered in health research involving disaggregate geographic data about individuals. Key definitions are provided, along with some examples of actual and potential security and confidentiality breaches and related incidents that captured mainstream media and public interest in recent months and years. The paper then goes on to present a brief survey of the research literature on location privacy/confidentiality concerns and on privacy-preserving solutions in conventional health research and beyond, touching on the emerging privacy issues associated with online consumer geoinformatics and location-based services. The 'missing ring' (in many treatments of the topic) of data security is also discussed. Personal information and privacy legislations in two countries, Canada and the UK, are covered, as well as some examples of recent research projects and events about the subject. Select highlights from a June 2009 URISA (Urban and Regional Information Systems Association) workshop entitled 'Protecting Privacy and Confidentiality of Geographic Data in Health Research' are then presented. The paper concludes by briefly charting the complexity of the domain and the many challenges associated with it, and proposing a novel, 'one stop shop' case-based reasoning framework to streamline the provision of clear and individualised guidance for the design and approval of new research projects (involving geographical identifiers about individuals), including crisp recommendations on which specific privacy-preserving solutions and approaches would be suitable in each case.
Affiliation(s)
- Maged N Kamel Boulos
- Faculty of Health and Social Work, University of Plymouth, Drake Circus, Plymouth, Devon, PL4 8AA, UK
- Andrew J Curtis
- GIS Research Laboratory, Department of Geography, University of Southern California, Kaprielian Hall (KAP), Room 416, 3620 South Vermont Avenue, Los Angeles, CA 90089-0255, USA
- Philip AbdelMalik
- Faculty of Health and Social Work, University of Plymouth, Drake Circus, Plymouth, Devon, PL4 8AA, UK
18
Cassa CA, Wieland SC, Mandl KD. Re-identification of home addresses from spatial locations anonymized by Gaussian skew. Int J Health Geogr 2008; 7:45. [PMID: 18700031 PMCID: PMC2526988 DOI: 10.1186/1476-072x-7-45] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Received: 04/25/2008] [Accepted: 08/12/2008] [Indexed: 11/15/2022] Open
Abstract
Background Knowledge of the geographical locations of individuals is fundamental to the practice of spatial epidemiology. One approach to preserving the privacy of individual-level addresses in a data set is to de-identify the data using a non-deterministic blurring algorithm that shifts the geocoded values. We investigate a vulnerability in this approach which enables an adversary to re-identify individuals using multiple anonymized versions of the original data set. If several such versions are available, each can be used to incrementally refine estimates of the original geocoded location. Results We produce multiple anonymized data sets using a single set of addresses and then progressively average the anonymized results related to each address, characterizing the steep decline in distance from the re-identified point to the original location (and the corresponding reduction in privacy). With ten anonymized copies of an original data set, we find a substantial decrease in average distance from 0.7 km to 0.2 km between the estimated, re-identified address and the original address. With fifty anonymized copies of an original data set, we find a decrease in average distance from 0.7 km to 0.1 km. Conclusion We demonstrate that multiple versions of the same data, each anonymized by non-deterministic Gaussian skew, can be used to ascertain original geographic locations. We explore solutions to this problem that include infrastructure to support the safe disclosure of anonymized medical data to prevent inference or re-identification of original address data, and the use of a Markov-process based algorithm to mitigate this risk.
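The averaging vulnerability described in this abstract can be sketched in a few lines. The following is a minimal simulation under stated assumptions, not the authors' code: points live on a flat plane measured in km, each anonymized copy adds independent zero-mean Gaussian noise to both coordinates, and the noise scale `sigma_km`, the sample size, and all function names are hypothetical.

```python
import math
import random


def gaussian_mask(point, sigma_km, rng):
    """Return a copy of `point` shifted by independent Gaussian noise per axis."""
    x, y = point
    return (x + rng.gauss(0.0, sigma_km), y + rng.gauss(0.0, sigma_km))


def averaged_estimate(point, n_copies, sigma_km, rng):
    """Average n independently masked copies of the same true location."""
    copies = [gaussian_mask(point, sigma_km, rng) for _ in range(n_copies)]
    mean_x = sum(c[0] for c in copies) / n_copies
    mean_y = sum(c[1] for c in copies) / n_copies
    return (mean_x, mean_y)


def mean_error(n_copies, sigma_km=0.5, n_points=2000, seed=0):
    """Mean distance (km) between true locations and their averaged estimates."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_points):
        # Assumed toy study area: a 10 km x 10 km square.
        true_point = (rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0))
        estimate = averaged_estimate(true_point, n_copies, sigma_km, rng)
        total += math.dist(true_point, estimate)
    return total / n_points


if __name__ == "__main__":
    for n in (1, 10, 50):
        print(n, round(mean_error(n), 3))
```

Because the noise in each copy is independent, the error of the averaged estimate shrinks roughly in proportion to 1/sqrt(n). That scaling is consistent with the reported drop from 0.7 km to about 0.2 km with ten copies and about 0.1 km with fifty.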
Affiliation(s)
- Christopher A Cassa
- Children's Hospital Informatics Program, Children's Hospital Boston, Boston, MA, USA.
19
Providing Spatial Data for Secondary Analysis: Issues and Current Practices relating to Confidentiality. Population Research and Policy Review 2008; 27:639-665. [PMID: 19122860 DOI: 10.1007/s11113-008-9095-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 10/21/2022]
Abstract
Spatially explicit data pose a series of opportunities and challenges for all the actors involved in providing data for long-term preservation and secondary analysis -- the data producer, the data archive, and the data user. We report on opportunities and challenges for each of the three players, and then turn to a summary of current thinking about how best to prepare, archive, disseminate, and make use of social science data that have spatially explicit identification. The core issue that runs through the paper is the risk of the disclosure of the identity of respondents. If we know where they live, where they work, or where they own property, it is possible to find out who they are. Those involved in collecting, archiving, and using data need to be aware of the risks of disclosure and become familiar with best practices to avoid disclosures that will be harmful to respondents.
20
AbdelMalik P, Boulos MNK, Jones R. The perceived impact of location privacy: a web-based survey of public health perspectives and requirements in the UK and Canada. BMC Public Health 2008; 8:156. [PMID: 18471295 PMCID: PMC2396622 DOI: 10.1186/1471-2458-8-156] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 12/18/2007] [Accepted: 05/09/2008] [Indexed: 11/22/2022] Open
Abstract
BACKGROUND The "place-consciousness" of public health professionals is on the rise as spatial analyses and Geographic Information Systems (GIS) are rapidly becoming key components of their toolbox. However, "place" is most useful at its most precise, granular scale - which increases identification risks, thereby clashing with privacy issues. This paper describes the views and requirements of public health professionals in Canada and the UK on privacy issues and spatial data, as collected through a web-based survey. METHODS Perceptions on the impact of privacy were collected through a web-based survey administered between November 2006 and January 2007. The survey targeted government, non-government and academic GIS labs and research groups involved in public health, as well as public health units (Canada), ministries, and observatories (UK). Potential participants were invited to participate through personally addressed, standardised emails. RESULTS Of 112 invitees in Canada and 75 in the UK, 66 and 28 participated in the survey, respectively. The completion proportion for Canada was 91%, and 86% for the UK. No response differences were observed between the two countries. Ninety three percent of participants indicated a requirement for personally identifiable data (PID) in their public health activities, including geographic information. Privacy was identified as an obstacle to public health practice by 71% of respondents. The overall self-rated median score for knowledge of privacy legislation and policies was 7 out of 10. Those who rated their knowledge of privacy as high (at the median or above) also rated it significantly more severe as an obstacle to research (P < 0.001). The most critical cause cited by participants in both countries was bureaucracy. 
CONCLUSION The clash between PID requirements - including granular geography - and limitations imposed by privacy and its associated bureaucracy requires immediate attention and solutions, particularly given the increasing utilisation of GIS in public health. Solutions include harmonization of privacy legislation with public health requirements, bureaucratic simplification, increased multidisciplinary discourse, education, and development of toolsets, algorithms and guidelines for using and reporting on disaggregate data.
Affiliation(s)
- Philip AbdelMalik
- Faculty of Health and Social Work, University of Plymouth, Centre Court, 73 Exeter Street, Drake Circus, Plymouth, Devon PL4 8AA, UK
- Office of Public Health Practice, Public Health Agency of Canada, 120 Colonnade Road, AL6702A, Ottawa, Ontario, K1A 0K9, Canada
- Maged N Kamel Boulos
- Faculty of Health and Social Work, University of Plymouth, Centre Court, 73 Exeter Street, Drake Circus, Plymouth, Devon PL4 8AA, UK
- Ray Jones
- Faculty of Health and Social Work, University of Plymouth, Centre Court, 73 Exeter Street, Drake Circus, Plymouth, Devon PL4 8AA, UK
21
Ozonoff A, Jeffery C, Manjourides J, White LF, Pagano M. Effect of spatial resolution on cluster detection: a simulation study. Int J Health Geogr 2007; 6:52. [PMID: 18042281 PMCID: PMC2213641 DOI: 10.1186/1476-072x-6-52] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.8] [Received: 08/07/2007] [Accepted: 11/27/2007] [Indexed: 11/18/2022] Open
Abstract
Background Aggregation of spatial data is intended to protect privacy, but some effects of aggregation on spatial methods have not yet been quantified. Methods We generated 3,000 spatial data sets and evaluated power of detection at 12 different levels of aggregation using the spatial scan statistic implemented in SaTScan v6.0. Results Power to detect clusters decreased from nearly 100% when using exact locations to roughly 40% at the coarsest level of spatial resolution. Conclusion Aggregation has the potential for obfuscation.
Affiliation(s)
- Al Ozonoff
- Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA.
22
Siffel C, Strickland MJ, Gardner BR, Kirby RS, Correa A. Role of geographic information systems in birth defects surveillance and research. Birth Defects Res A Clin Mol Teratol 2007; 76:825-33. [PMID: 17094141 DOI: 10.1002/bdra.20325] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/09/2022]
Abstract
BACKGROUND With the significant advancement of geographic information systems (GIS), mapping and evaluating the spatial distribution of health events has become easier. We examine the role of GIS in birth defects surveillance and research. METHODS We briefly describe the geocoding process and potential problems in accuracy of the obtained geocodes, and some of the capabilities and limitations of GIS. We illustrate how GIS has been applied using the Metropolitan Atlanta Congenital Defects Program geocoded dataset. We provide some comments on potential data quality and confidentiality issues with birth defects in relation to GIS. RESULTS It is desirable to geocode addresses using a multistrategy approach to achieve a high-quality and accurate GIS dataset. Beyond the basic but important function of mapping, sophisticated statistical approaches and software are available to analyze the spatial or spatial-temporal occurrence of birth defects, alone or in association with environmental hazards, and to present this information without compromising the confidentiality of the subjects. CONCLUSIONS We recommend a broad and systematic use of GIS in birth defects spatial surveillance and research.
Affiliation(s)
- Csaba Siffel
- Division of Birth Defects and Developmental Disabilities, National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, Georgia 30333, USA.
23
Lhotska L. Multi-agent system as a platform for management of medical documentation. Annu Int Conf IEEE Eng Med Biol Soc 2007; 2007:3661-3664. [PMID: 18002791 DOI: 10.1109/iembs.2007.4353125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]
Abstract
The paper describes an ongoing project: a pilot study and implementation of a multi-agent system for the management of medical documentation in a hospital. First, we analyzed the problem and divided it into four groups of tasks: storing and retrieving stored data, user interaction, data archiving, and system security. These tasks are performed by corresponding agents, namely a user interface agent, a database agent, an archive agent, and a security agent. Communication between the agents is crucial to the system's operation. The system has been designed as an open system, and we assume that it will be extended with additional agents providing new functions, e.g. decision support, biomedical signal evaluation, laboratory test evaluation.
Affiliation(s)
- Lenka Lhotska
- Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University, Prague, Technicka 2, 166 27 Prague 6, Czech Republic.
24
Brownstein JS, Cassa CA, Kohane IS, Mandl KD. An unsupervised classification method for inferring original case locations from low-resolution disease maps. Int J Health Geogr 2006; 5:56. [PMID: 17156451 PMCID: PMC1702538 DOI: 10.1186/1476-072x-5-56] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Received: 10/25/2006] [Accepted: 12/08/2006] [Indexed: 12/02/2022] Open
Abstract
Background Widespread availability of geographic information systems software has facilitated the use of disease mapping in academia, government and private sector. Maps that display the address of affected patients are often exchanged in public forums, and published in peer-reviewed journal articles. As previously reported, a search of figure legends in five major medical journals found 19 articles from 1994–2004 that identify over 19,000 patient addresses. In this report, a method is presented to evaluate whether patient privacy is being breached in the publication of low-resolution disease maps. Results To demonstrate the effect, a hypothetical low-resolution map of geocoded patient addresses was created and the accuracy with which patient addresses can be resolved is described. Through georeferencing and unsupervised classification of the original image, the method precisely re-identified 26% (144/550) of the patient addresses from a presentation quality map and 79% (432/550) from a publication quality map. For the presentation quality map, 99.8% of the addresses were within 70 meters (approximately one city block length) of the predicted patient location, 51.6% of addresses were identified within five buildings, 70.7% within ten buildings and 93% within twenty buildings. For the publication quality map, all addresses were within 14 meters and 11 buildings of the predicted patient location. Conclusion This study demonstrates that lowering the resolution of a map displaying geocoded patient addresses does not sufficiently protect patient addresses from re-identification. Guidelines to protect patient privacy, including those of medical journals, should reflect policies that ensure privacy protection when spatial data are displayed or published.
Affiliation(s)
- John S Brownstein
- Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology, 1 Autumn St, Boston, MA, USA
- Division of Emergency Medicine, 300 Longwood Ave, Children's Hospital Boston, Boston, MA, USA
- Department of Pediatrics, 300 Longwood Ave, Harvard Medical School, Boston, MA, USA
- Christopher A Cassa
- Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology, 1 Autumn St, Boston, MA, USA
- Division of Emergency Medicine, 300 Longwood Ave, Children's Hospital Boston, Boston, MA, USA
- Isaac S Kohane
- Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology, 1 Autumn St, Boston, MA, USA
- Department of Pediatrics, 300 Longwood Ave, Harvard Medical School, Boston, MA, USA
- Kenneth D Mandl
- Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology, 1 Autumn St, Boston, MA, USA
- Division of Emergency Medicine, 300 Longwood Ave, Children's Hospital Boston, Boston, MA, USA
- Department of Pediatrics, 300 Longwood Ave, Harvard Medical School, Boston, MA, USA
25
Pickle LW, Szczur M, Lewis DR, Stinchcomb DG. The crossroads of GIS and health information: a workshop on developing a research agenda to improve cancer control. Int J Health Geogr 2006; 5:51. [PMID: 17118204 PMCID: PMC1665447 DOI: 10.1186/1476-072x-5-51] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 09/08/2006] [Accepted: 11/21/2006] [Indexed: 11/18/2022] Open
Abstract
Cancer control researchers seek to reduce the burden of cancer by studying interventions, their impact in defined populations, and the means by which they can be better used. The first step in cancer control is identifying where the cancer burden is elevated, which suggests locations where interventions are needed. Geographic information systems (GIS) and other spatial analytic methods provide such a solution and thus can play a major role in cancer control. This report presents findings from a workshop held June 16-17, 2005, to bring together experts and stakeholders to address current issues in GIScience and cancer control. A broad range of areas of expertise and interest was represented, including epidemiology, geography, statistics, environmental health, social science, cancer control, cancer registry operations, and cancer advocacy. The goals of this workshop were to build consensus on important policy and research questions, identify roadblocks to future progress in this field, and provide recommendations to overcome these roadblocks.
Affiliation(s)
- Linda Williams Pickle
- Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, MD USA
- Martha Szczur
- Division of Specialized Information Services, National Library of Medicine, Bethesda, MD USA
- Denise Riedel Lewis
- Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, MD USA
- David G Stinchcomb
- Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, MD USA
26
Olson KL, Grannis SJ, Mandl KD. Privacy protection versus cluster detection in spatial epidemiology. Am J Public Health 2006; 96:2002-8. [PMID: 17018828 PMCID: PMC1751806 DOI: 10.2105/ajph.2005.069526] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.2] [Accepted: 01/22/2006] [Indexed: 11/04/2022]
Abstract
OBJECTIVES Patient data that includes precise locations can reveal patients' identities, whereas data aggregated into administrative regions may preserve privacy and confidentiality. We investigated the effect of varying degrees of address precision (exact latitude and longitude vs the center points of zip code or census tracts) on detection of spatial clusters of cases. METHODS We simulated disease outbreaks by adding supplementary spatially clustered emergency department visits to authentic hospital emergency department syndromic surveillance data. We identified clusters with a spatial scan statistic and evaluated detection rate and accuracy. RESULTS More clusters were identified, and clusters were more accurately detected, when exact locations were used. That is, these clusters contained at least half of the simulated points and involved few additional emergency department visits. These results were especially apparent when the synthetic clustered points crossed administrative boundaries and fell into multiple zip code or census tracts. CONCLUSIONS The spatial cluster detection algorithm performed better when addresses were analyzed as exact locations than when they were analyzed as center points of zip code or census tracts, particularly when the clustered points crossed administrative boundaries. Use of precise addresses offers improved performance, but this practice must be weighed against privacy concerns in the establishment of public health data exchange policies.
Affiliation(s)
- Karen L Olson
- Children's Hospital Informatics Program, Harvard-MIT Division of Health Sciences and Technology, Children's Hospital Boston, Boston, Mass 02215, USA.
27
Curtis AJ, Mills JW, Leitner M. Spatial confidentiality and GIS: re-engineering mortality locations from published maps about Hurricane Katrina. Int J Health Geogr 2006; 5:44. [PMID: 17032448 PMCID: PMC1626452 DOI: 10.1186/1476-072x-5-44] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.9] [Received: 08/16/2006] [Accepted: 10/10/2006] [Indexed: 11/10/2022] Open
Abstract
Background Geographic Information Systems (GIS) can provide valuable insight into patterns of human activity. Online spatial display applications, such as Google Earth, can democratise this information by disseminating it to the general public. Although this is a generally positive advance for society, there is a legitimate concern involving the disclosure of confidential information through spatial display. Although guidelines exist for aggregated data, little has been written concerning the display of point level information. The concern is that a map containing points representing cases of cancer or an infectious disease could be re-engineered back to identify an actual residence. This risk is investigated using point mortality locations from Hurricane Katrina re-engineered from a map published in the Baton Rouge Advocate newspaper, and a field team validating these residences using search and rescue building markings. Results We show that the residence of an individual, visualized as a generalized point covering approximately one and a half city blocks on a map, can be re-engineered back to identify the actual house location, or at least a close neighbour, even if the map contains little spatial reference information. The degree of re-engineering success is also shown to depend on the urban characteristic of the neighborhood. Conclusion The results in this paper suggest a need to re-evaluate current guidelines for the display of point (address level) data. Examples of other point maps displaying health data extracted from the academic literature are presented where a similar re-engineering approach might cause concern with respect to violating confidentiality. More research is also needed into the role urban structure plays in the accuracy of re-engineering. We suggest that health and spatial scientists should be proactive and suggest a series of point level spatial confidentiality guidelines before governmental decisions are made which may be reactionary toward the threat of revealing confidential information, thereby imposing draconian limits on research using a GIS.
Affiliation(s)
- Andrew J Curtis
- World Health Organization Collaborating Center for Remote Sensing and GIS for Public Health, Department of Geography and Anthropology, Louisiana State University, Baton Rouge, USA
- Jacqueline W Mills
- LSU GIS Clearinghouse Cooperative, Disaster Science Management Louisiana State University, Baton Rouge, USA
- Michael Leitner
- World Health Organization Collaborating Center for Remote Sensing and GIS for Public Health, Department of Geography and Anthropology, Louisiana State University, Baton Rouge, USA