1
Kondylakis H, Kalokyri V, Sfakianakis S, Marias K, Tsiknakis M, Jimenez-Pastor A, Camacho-Ramos E, Blanquer I, Segrelles JD, López-Huguet S, Barelle C, Kogut-Czarkowska M, Tsakou G, Siopis N, Sakellariou Z, Bizopoulos P, Drossou V, Lalas A, Votis K, Mallol P, Marti-Bonmati L, Alberich LC, Seymour K, Boucher S, Ciarrocchi E, Fromont L, Rambla J, Harms A, Gutierrez A, Starmans MPA, Prior F, Gelpi JL, Lekadir K. Data infrastructures for AI in medical imaging: a report on the experiences of five EU projects. Eur Radiol Exp 2023; 7:20. [PMID: 37150779; PMCID: PMC10164664; DOI: 10.1186/s41747-023-00336-x]
Abstract
Artificial intelligence (AI) is transforming the field of medical imaging and has the potential to bring medicine from the era of 'sick-care' to the era of healthcare and prevention. The development of AI requires access to large, complete, and harmonized real-world datasets, representative of population and disease diversity. However, to date, efforts are fragmented, based on single-institution, size-limited, and annotation-limited datasets. Available public datasets (e.g., The Cancer Imaging Archive, TCIA, USA) are limited in scope, making model generalizability difficult to achieve. In this direction, five European Union projects are currently working on the development of big data infrastructures that will enable European, ethically and General Data Protection Regulation-compliant, quality-controlled, cancer-related medical imaging platforms, in which both large-scale data and AI algorithms will coexist. The vision is to create sustainable AI cloud-based platforms for the development, implementation, verification, and validation of trustworthy, usable, and reliable AI models that address specific unmet needs in cancer care provision. In this paper, we present an overview of these development efforts, highlighting the challenges encountered and the approaches selected, to provide valuable feedback for future attempts in the area.
Key points
• Artificial intelligence models for health imaging require access to large amounts of harmonized imaging data and metadata.
• The main infrastructures adopted either collect anonymized data centrally or enable access to pseudonymized distributed data.
• Developing a common data model for storing all relevant information is a challenge.
• The trust of data providers in data-sharing initiatives is essential.
• An online European Union meta-tool repository is needed to minimize effort duplication across projects in the area.
Affiliation(s)
- Kostas Marias
- FORTH-ICS, N. Plastira 100, Heraklion, Crete, Greece
- Gianna Tsakou
- MAGGIOLI S.P.A., Research and Development Lab, Marousi, Greece
- Nikolaos Siopis
- Centre of Research & Technology - Hellas, Information Technologies Institute, Thermi - Thessaloniki, Greece
- Zisis Sakellariou
- Centre of Research & Technology - Hellas, Information Technologies Institute, Thermi - Thessaloniki, Greece
- Paschalis Bizopoulos
- Centre of Research & Technology - Hellas, Information Technologies Institute, Thermi - Thessaloniki, Greece
- Vicky Drossou
- Centre of Research & Technology - Hellas, Information Technologies Institute, Thermi - Thessaloniki, Greece
- Antonios Lalas
- Centre of Research & Technology - Hellas, Information Technologies Institute, Thermi - Thessaloniki, Greece
- Konstantinos Votis
- Centre of Research & Technology - Hellas, Information Technologies Institute, Thermi - Thessaloniki, Greece
- Pedro Mallol
- La Fe Health Research Institute, Valencia, Spain
- Lauren Fromont
- European Genome-Phenome Archive, Centre for Genomic Regulation, Barcelona, Spain
- Jordi Rambla
- European Genome-Phenome Archive, Centre for Genomic Regulation, Barcelona, Spain
- Fred Prior
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
2
Cabezón Ruiz S, Morilla Romero de la Osa R. [Big Data in health: a new paradigm to regulate, a challenge for social justice]. Rev Esp Salud Publica 2021; 95:e202110150. [PMID: 34617519]
Abstract
In addition to the opportunities posed by the use of Big Data in health, it also generates important challenges in the field of research, especially from the point of view of its management and ethical considerations. The European Union has been promoting different initiatives that allow the exploitation of these data in the context of the knowledge economy. The UNESCO Ethics Committee has identified three ethical principles to take into account regarding the application of Big Data in health: independence, privacy, and justice. The protection of privacy and patient safety is questionable in a context where cybersecurity is far from complete. In addition, an imbalance in the exploitation of these data by the public and private sectors could generate inequalities that would represent a significant problem of social justice. This article follows a qualitative methodology based on the documentary analysis of current legislative texts, especially the recently approved General Data Protection Regulation (GDPR), as well as non-legislative documents from projects and parliamentary communications throughout the last two legislatures, with the aim of analyzing them and evaluating how they conform to the principles outlined by UNESCO, especially the principle of social justice. The most representative national projects that have started to be adopted are also reviewed.
Affiliation(s)
- Rubén Morilla Romero de la Osa
- Hospital Universitario Virgen del Rocío, Sevilla, Spain
- Departamento de Enfermería de la Universidad de Sevilla, Sevilla, Spain
- Instituto de Biomedicina de Sevilla, Sevilla, Spain
- Consejo Superior de Investigaciones Científicas (CSIC) - Universidad de Sevilla, Sevilla, Spain
3
Lee H, Chung YD. Differentially private release of medical microdata: an efficient and practical approach for preserving informative attribute values. BMC Med Inform Decis Mak 2020; 20:155. [PMID: 32641043; PMCID: PMC7346516; DOI: 10.1186/s12911-020-01171-5]
Abstract
Background Various methods based on k-anonymity have been proposed for publishing medical data while preserving privacy. However, the k-anonymity property assumes that adversaries possess fixed background knowledge. Although differential privacy overcomes this limitation, it is specialized for aggregated results, making it difficult to obtain high-quality microdata. To address this issue, we propose a differentially private medical microdata release method featuring high utility. Methods We propose a method of anonymizing medical data under differential privacy. To improve data utility, especially by preserving informative attribute values, the proposed method adopts three data perturbation approaches: (1) generalization, (2) suppression, and (3) insertion. The proposed method produces an anonymized dataset that is nearly optimal with regard to utility while preserving privacy. Results The proposed method achieves lower information loss than existing methods. In a real-world case study, we show that the results of data analyses using the original dataset and those obtained using a dataset anonymized via the proposed method are highly similar. Conclusions We propose a novel differentially private anonymization method that preserves informative values for the release of medical data. Through experiments, we show that the utility of medical data anonymized via the proposed method is significantly better than that of existing methods.
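The differential-privacy guarantee behind methods like this one is usually introduced via the Laplace mechanism for counting queries. Note that the paper's own algorithm perturbs microdata through generalization, suppression, and insertion rather than additive noise; the sketch below only illustrates the underlying mechanism, and the function names are ours, not the authors':

```python
import math
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample from Laplace(0, scale) via the inverse-CDF transform."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count under epsilon-differential privacy. A counting
    query has sensitivity 1, so the noise scale is 1 / epsilon."""
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Smaller epsilon (stronger privacy) gives noisier answers.
noisy = dp_count(120, epsilon=0.5, rng=random.Random(42))
```

Because the noise is calibrated to the query's sensitivity, any single patient's presence or absence changes the output distribution by at most a factor of e^epsilon.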
Affiliation(s)
- Hyukki Lee
- Department of Computer Science and Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul, 02841, Republic of Korea
- Yon Dohn Chung
- Department of Computer Science and Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul, 02841, Republic of Korea
4
Hauswaldt J, Demmer I, Heinemann S, Himmel W, Hummers E, Pung J, Schlegelmilch F, Drepper J. [The risk of re-identification when analyzing electronic health records: a critical appraisal and possible solutions]. Z Evid Fortbild Qual Gesundhwes 2019; 149:22-31. [PMID: 32165110; DOI: 10.1016/j.zefq.2020.01.002]
Abstract
BACKGROUND AND OBJECTIVES The use of primary care data gathered from electronic health records in local practices could be an important building block for the future of health services research. However, the risks and reservations associated with using these data for research purposes should not be underestimated. We show the data protection and privacy problems that may arise through secondary analysis of routine primary care data and describe the technical solutions that are available to address these concerns as a trust-building measure. METHODS We screened 40 variables that are deemed important for documentation in the electronic health records of primary care physicians and rated the risk of patient re-identification when using these records for research purposes. The criteria used to rate the risk of re-identification were "expert perception" (inferences of a professional observer from the phenotypical characteristics documented in the 40 variables), "researchable additional knowledge" (knowledge of characteristics of a person obtained through publicly available information and social media networks), and "statistical frequency" according to diagnosis and medication statistics. RESULTS Diagnoses and reasons for contacting a general practitioner can contain particularly identifiable characteristics such as "obesity" (ICD-10 E66) and "nicotine dependence" (F17). About half of all ICD codes documented in primary care fall below a critical threshold value in their absolute frequency; this is all the more problematic if diagnoses allow for re-identification due to phenotypical characteristics. Medication information carries little risk of re-identifying a person, although the way a medication is applied could be a source of re-identification, e.g., self-injection of insulin or use of inhalers. Information about times and dates is especially sensitive for the re-identification of a person. Sex and age of a patient generally pose no problems, except in the case of very young or very old individuals when these age groups are seldom represented in the practice. DISCUSSION Routine health data are, in principle, sensitive data. Knowledge of the variables in primary care data gathered from electronic health records in local practices, and the evaluation of these data, allow us to more accurately estimate the risk of re-identification for the persons concerned. In particular, chronic diagnoses and/or free-text diagnoses, calendar dates of patient contacts, and therapies carry a high risk of re-identification. Technical measures such as removing data, masking values, and coding should make re-identification considerably more difficult. There will always be a residual risk of re-identification, which should be openly discussed to counteract concerns about a lack of data protection or a sweeping critique of digitization in healthcare.
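The critical-frequency idea in the abstract above can be sketched in a few lines: any code whose absolute count in a dataset falls below a chosen threshold is flagged as a re-identification risk. The field name, example codes, and threshold here are illustrative, not taken from the study:

```python
from collections import Counter

def flag_rare_codes(records, field="icd10", threshold=5):
    """Return the set of codes occurring fewer than `threshold` times,
    which therefore carry an elevated re-identification risk."""
    counts = Counter(r[field] for r in records)
    return {code for code, n in counts.items() if n < threshold}

records = (
    [{"icd10": "E66"}] * 40      # obesity: frequent, low risk
    + [{"icd10": "F17"}] * 12    # nicotine dependence: frequent
    + [{"icd10": "E75.2"}] * 2   # rare code: below threshold, flagged
)
rare = flag_rare_codes(records)  # → {"E75.2"}
```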
5
Eicher J, Bild R, Spengler H, Kuhn KA, Prasser F. A comprehensive tool for creating and evaluating privacy-preserving biomedical prediction models. BMC Med Inform Decis Mak 2020; 20:29. [PMID: 32046701; PMCID: PMC7014648; DOI: 10.1186/s12911-020-1041-3]
Abstract
Background Modern data-driven medical research promises to provide new insights into the development and course of disease and to enable novel methods of clinical decision support. To realize this, machine learning models can be trained to make predictions from clinical, paraclinical and biomolecular data. In this process, privacy protection and regulatory requirements need careful consideration, as the resulting models may leak sensitive personal information. To counter this threat, a wide range of methods for integrating machine learning with formal methods of privacy protection have been proposed. However, there is a significant lack of practical tools to create and evaluate such privacy-preserving models. In this software article, we report on our ongoing efforts to bridge this gap. Results We have extended the well-known ARX anonymization tool for biomedical data with machine learning techniques to support the creation of privacy-preserving prediction models. Our methods are particularly well suited for applications in biomedicine, as they preserve the truthfulness of data (e.g., no noise is added) and they are intuitive and relatively easy to explain to non-experts. Moreover, our implementation is highly versatile, as it supports binomial and multinomial target variables, different types of prediction models and a wide range of privacy protection techniques. All methods have been integrated into a sound framework that supports the creation, evaluation and refinement of models through intuitive graphical user interfaces. To demonstrate the broad applicability of our solution, we present three case studies in which we created and evaluated different types of privacy-preserving prediction models for breast cancer diagnosis, diagnosis of acute inflammation of the urinary system and prediction of the contraceptive method used by women.
In this process, we also used a wide range of different privacy models (k-anonymity, differential privacy and a game-theoretic approach) as well as different data transformation techniques. Conclusions With the tool presented in this article, accurate prediction models can be created that preserve the privacy of individuals represented in the training set in a variety of threat scenarios. Our implementation is available as open source software.
Affiliation(s)
- Johanna Eicher
- School of Medicine, Technical University of Munich, Ismaninger Str. 22, Munich, 81675, Germany.
- Raffael Bild
- School of Medicine, Technical University of Munich, Ismaninger Str. 22, Munich, 81675, Germany
- Helmut Spengler
- School of Medicine, Technical University of Munich, Ismaninger Str. 22, Munich, 81675, Germany
- Klaus A Kuhn
- School of Medicine, Technical University of Munich, Ismaninger Str. 22, Munich, 81675, Germany
- Fabian Prasser
- Berlin Institute of Health (BIH), Anna-Louisa-Karsch-Straße 2, Berlin, 10178, Germany
- Charité - Universitätsmedizin Berlin, Charitéplatz 1, Berlin, 10117, Germany
6
Abstract
The irreproducibility of research is a major concern in academia, affecting all study designs regardless of scientific field. Without testing reproducibility and replicability, it is almost impossible to repeat a piece of research and obtain the same or similar results. In addition, irreproducibility limits the translation of research findings into practice, where the same results are expected. To find solutions, the Interacademy Partnership for Health gathered academics from established networks of science, medicine, and engineering to introduce seven strategies that can enhance reproducibility: pre-registration, open methods, open data, collaboration, automation, reporting guidelines, and post-publication reviews. The current editorial discusses the generalisability and practicality of these strategies for systematic reviews and argues that systematic reviews have an even greater potential than other research designs to lead the movement toward reproducible research. I also discuss the potential of reproducibility, in turn, to upgrade the systematic review from review to research. Finally, I point to successful and ongoing practices from collaborative efforts around the world to encourage systematic reviewers, journal editors and publishers, organizations linked to evidence synthesis, and funders and policy makers to facilitate this movement and to gain public trust in research.
Affiliation(s)
- Farhad Shokraneh
- Division of Psychiatry and Applied Psychology, Institute of Mental Health, School of Medicine, University of Nottingham, Nottingham NG7 2TU, United Kingdom
7
Abstract
BACKGROUND Publishing raw electronic health records (EHRs) may be considered a breach of the privacy of individuals because they usually contain sensitive information. A common practice in privacy-preserving data publishing is to anonymize the data before publishing so as to satisfy privacy models such as k-anonymity. Among the various anonymization techniques, generalization is the most commonly used in medical/health data processing. Generalization inevitably causes information loss, and various methods have been proposed to reduce it. However, existing generalization-based data anonymization methods cannot avoid excessive information loss and thus fail to preserve data utility. METHODS We propose a utility-preserving anonymization method for privacy-preserving data publishing (PPDP). To preserve data utility, the proposed method comprises three parts: (1) a utility-preserving model, (2) counterfeit record insertion, and (3) a catalog of the counterfeit records. We also propose an anonymization algorithm using the proposed method, which applies a full-domain generalization algorithm. We evaluate our method against an existing method on two aspects: information loss, measured through various quality metrics, and the error rate of analysis results. RESULTS For all types of quality metrics, our proposed method shows lower information loss than the existing method. In an analysis of real-world EHRs, the results show only a small error between the data anonymized through the proposed method and the original data. CONCLUSIONS We propose a new utility-preserving anonymization method and an anonymization algorithm using it. Through experiments on various datasets, we show that the utility of EHRs anonymized by the proposed method is significantly better than that of those anonymized by previous approaches.
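The k-anonymity property that generalization-based methods like the one above target can be sketched as follows: after generalization, every combination of quasi-identifier values must be shared by at least k records. The generalization hierarchy (decade age bands, truncated ZIP codes) and the data are illustrative, not the paper's:

```python
from collections import Counter

QUASI_IDENTIFIERS = ("age_group", "zip_prefix")

def is_k_anonymous(records, k):
    """True if every quasi-identifier combination appears >= k times."""
    groups = Counter(tuple(r[q] for q in QUASI_IDENTIFIERS) for r in records)
    return all(n >= k for n in groups.values())

def generalize(record):
    """Full-domain generalization: coarsen age to a decade band and
    truncate the ZIP code to its first three digits."""
    decade = record["age"] // 10 * 10
    return {
        "age_group": f"{decade}-{decade + 9}",
        "zip_prefix": record["zip"][:3],
        "diagnosis": record["diagnosis"],  # sensitive value, kept as-is
    }

raw = [
    {"age": 31, "zip": "02841", "diagnosis": "E66"},
    {"age": 34, "zip": "02833", "diagnosis": "F17"},
    {"age": 38, "zip": "02812", "diagnosis": "E66"},
]
anon = [generalize(r) for r in raw]  # all three fall into one group
```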
Affiliation(s)
- Hyukki Lee
- Department of Computer Science and Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul, 02841 Republic of Korea
- Soohyung Kim
- Department of IT Convergence, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul, 02841 Republic of Korea
- Jong Wook Kim
- Department of Media Software, Seoul, 20-Gil, Hongji-dong, Seongbuk-gu, 03016 Republic of Korea
- Yon Dohn Chung
- Department of Computer Science and Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul, 02841 Republic of Korea
8
Eicher J, Kuhn KA, Prasser F. An Experimental Comparison of Quality Models for Health Data De-Identification. Stud Health Technol Inform 2017; 245:704-708. [PMID: 29295189]
Abstract
When individual-level health data are shared in biomedical research, the privacy of patients must be protected. This is typically achieved by data de-identification methods, which transform data in such a way that formal privacy requirements are met. In the process, it is important to minimize the loss of information to maintain data quality. Although several models have been proposed for measuring this aspect, it remains unclear which model is best suited for which application. We have therefore performed an extensive experimental comparison. We first implemented several common quality models in the ARX de-identification tool for biomedical data. We then used each model to de-identify a patient discharge dataset covering almost 4 million cases, and the outputs were analyzed to measure the impact of different quality models on real-world applications. Our results show that different models are best suited for specific applications, but that one model (Non-Uniform Entropy) is particularly well suited for general-purpose use.
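Non-Uniform Entropy, the quality model singled out above, charges each generalized cell with the information (in bits) needed to recover the original value within its generalized group. This is a simplified single-attribute sketch of one common formulation, not ARX's implementation:

```python
import math
from collections import Counter

def non_uniform_entropy(original, generalized):
    """Information loss in bits: for each cell, -log2 of the probability
    of the original value given its generalized group."""
    assert len(original) == len(generalized)
    pair_counts = Counter(zip(original, generalized))
    gen_counts = Counter(generalized)
    loss = 0.0
    for o, g in zip(original, generalized):
        loss += -math.log2(pair_counts[(o, g)] / gen_counts[g])
    return loss

# Three distinct ages collapsed into one band lose log2(3) bits each;
# the lone 60-69 value is fully determined by its band and loses nothing.
ages = [31, 34, 38, 62]
bands = ["30-39", "30-39", "30-39", "60-69"]
```

A coarser generalization hierarchy makes the groups larger and the per-cell probabilities smaller, so the measured loss grows; unmodified data scores zero.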
Affiliation(s)
- Johanna Eicher
- Institute of Medical Statistics and Epidemiology, University Hospital rechts der Isar, Technical University of Munich, Germany
- Klaus A Kuhn
- Institute of Medical Statistics and Epidemiology, University Hospital rechts der Isar, Technical University of Munich, Germany
- Fabian Prasser
- Institute of Medical Statistics and Epidemiology, University Hospital rechts der Isar, Technical University of Munich, Germany
9
Abstract
Preserving privacy and utility during data publishing and data mining is essential for individuals, data providers, and researchers. However, studies in this area typically assume that each individual has only one record in a dataset, which is unrealistic in many applications. Having multiple records per individual leads to new privacy leakages; we call such a dataset a 1:M dataset. In this paper, we propose a novel privacy model called (k, l)-diversity that addresses disclosure risks in 1:M data publishing. Based on this model, we develop an efficient algorithm named 1:M-Generalization to preserve privacy and data utility, and compare it with alternative approaches. Extensive experiments on real-world data show that our approach outperforms the state-of-the-art technique in terms of data utility and computational cost.
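The exact definition of (k, l)-diversity is given in the paper; as an illustration only, one plausible reading for 1:M data is that every quasi-identifier group must cover at least k distinct individuals and contain at least l distinct sensitive values. All names and data below are hypothetical:

```python
from collections import defaultdict

def satisfies_kl_diversity(records, k, l):
    """Illustrative check for 1:M data: group records by their
    quasi-identifier tuple; each group must contain >= k distinct
    individuals and >= l distinct sensitive values."""
    groups = defaultdict(list)
    for r in records:
        groups[r["qi"]].append(r)
    for rows in groups.values():
        if len({r["pid"] for r in rows}) < k:
            return False
        if len({r["sensitive"] for r in rows}) < l:
            return False
    return True

data = [
    {"pid": 1, "qi": ("30-39", "028"), "sensitive": "E66"},
    {"pid": 1, "qi": ("30-39", "028"), "sensitive": "F17"},  # same person
    {"pid": 2, "qi": ("30-39", "028"), "sensitive": "J45"},
]
```

The point the paper makes is visible here: a plain group-size count would see three records, but only two individuals are actually hidden in the group.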
Affiliation(s)
- Ming Yang
- Southeast University, Nanjing, China
- Weiwei Ni
- Southeast University, Nanjing, China
- Xiao-Bai Li
- University of Massachusetts Lowell, Massachusetts, USA
10
Abstract
Background To facilitate long-term safety surveillance of marketed drugs, many spontaneous reporting systems (SRSs) for ADR events have been established worldwide. Since the data collected by SRSs contain sensitive personal health information that should be protected to prevent the identification of individuals, this raises the issue of privacy-preserving data publishing (PPDP), that is, how to sanitize (anonymize) raw data before publishing. Although much work has been done on PPDP, very few studies have focused on protecting the privacy of SRS data, and none of the existing anonymization methods is well suited to SRS datasets, because they contain characteristics such as rare events, multiple records per individual, and multi-valued sensitive attributes. Methods We propose a new privacy model called MS(k, θ*)-bounding for protecting published spontaneous ADE reporting data from privacy attacks. Our model has the flexibility of varying privacy thresholds, i.e., θ*, for different sensitive values and takes the characteristics of SRS data into consideration. We also propose an anonymization algorithm for sanitizing the raw data to meet the requirements specified through the proposed model. Our algorithm adopts a greedy clustering strategy to group the records into clusters, conforming to an innovative anonymization metric that aims to minimize the privacy risk while maintaining the data utility for ADR detection. An empirical study was conducted using the FAERS dataset from 2004Q1 to 2011Q4. We compared our model with four prevailing methods, including k-anonymity, (X, Y)-anonymity, multi-sensitive l-diversity, and (α, k)-anonymity, evaluated via two measures, Danger Ratio (DR) and Information Loss (IL), and considered three different scenarios of threshold setting for θ*: uniform, level-wise, and frequency-based. We also conducted experiments to inspect the impact of anonymized data on the strengths of discovered ADR signals.
Results With all three threshold settings for sensitive values, our method successfully prevents the disclosure of sensitive values (nearly all observed DRs are zero) without sacrificing too much data utility. With non-uniform threshold settings, level-wise or frequency-based, our MS(k, θ*)-bounding exhibits the best data utility and the least privacy risk among all the models. The experiments conducted on selected ADR signals from MedWatch show only very small differences in signal strength (PRR or ROR). These results show that our method can effectively prevent the disclosure of patients' sensitive information without sacrificing data utility for ADR signal detection. Conclusions We propose a new privacy model for protecting SRS data, which possess characteristics overlooked by contemporary models, and an anonymization algorithm to sanitize SRS data in accordance with the proposed model. Empirical evaluation on a real SRS dataset, FAERS, shows that our method can effectively solve the privacy problem in SRS data without influencing ADR signal strength.
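PRR and ROR, the signal-strength measures mentioned above, are standard disproportionality statistics computed from a 2×2 table of report counts; the counts in the example are made up:

```python
def prr(a, b, c, d):
    """Proportional reporting ratio from a 2x2 report table:
    a = target drug & target event, b = target drug & other events,
    c = other drugs & target event, d = other drugs & other events."""
    return (a / (a + b)) / (c / (c + d))

def ror(a, b, c, d):
    """Reporting odds ratio from the same 2x2 table."""
    return (a * d) / (b * c)

# Event reported in 20 of 100 reports for the drug vs 10 of 900 otherwise.
signal_prr = prr(20, 80, 10, 890)  # 18x disproportionate reporting
signal_ror = ror(20, 80, 10, 890)
```

An anonymization method "preserves signal strength" exactly when these ratios computed on the sanitized table stay close to the ratios on the raw table.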