Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Chevrier R, Foufi V, Gaudet-Blavignac C, Robert A, Lovis C. Use and Understanding of Anonymization and De-Identification in the Biomedical Literature: Scoping Review. J Med Internet Res 2019;21:e13484. [PMID: 31152528 PMCID: PMC6658290 DOI: 10.2196/13484] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 01/24/2019] [Revised: 03/29/2019] [Accepted: 04/26/2019] [Indexed: 01/19/2023] Open

Cited by Other Article(s)

Pesapane F, Cuocolo R, Sardanelli F. The Picasso's skepticism on computer science and the dawn of generative AI: questions after the answers to keep "machines-in-the-loop". Eur Radiol Exp 2024;8:81. [PMID: 39046535 PMCID: PMC11269548 DOI: 10.1186/s41747-024-00485-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Accepted: 06/16/2024] [Indexed: 07/25/2024] Open

Kondylakis H, Catalan R, Alabart SM, Barelle C, Bizopoulos P, Bobowicz M, Bona J, Fotiadis DI, Garcia T, Gomez I, Jimenez-Pastor A, Karatzanis G, Lekadir K, Kogut-Czarkowska M, Lalas A, Marias K, Marti-Bonmati L, Munuera J, Nikiforaki K, Pelissier M, Prior F, Rutherford M, Saint-Aubert L, Sakellariou Z, Seymour K, Trouillard T, Votis K, Tsiknakis M. Documenting the de-identification process of clinical and imaging data for AI for health imaging projects. Insights Imaging 2024;15:130. [PMID: 38816658 PMCID: PMC11139818 DOI: 10.1186/s13244-024-01711-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 04/26/2024] [Indexed: 06/01/2024] Open

Affiliation(s)

Haridimos Kondylakis FORTH-ICS, Heraklion, Crete, Greece.
Rocio Catalan La Fe University and Polytechnic Hospital, La Fe Health Research Institute, Valencia, Spain
Sara Martinez Alabart TIC Salut Social Foundation, Barcelona, Spain
Caroline Barelle European Dynamics, Luxembourg, Luxembourg
Paschalis Bizopoulos Centre for Research & Technology Hellas, Information Technologies Institute (CERTH-ITI), Central Directorate, Thermi, Thessaloniki, Greece
Maciej Bobowicz Medical University of Gdańsk, Gdańsk, Poland
Jonathan Bona University of Arkansas for Medical Sciences, Little Rock, AR, USA
Dimitrios I Fotiadis Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina, Greece
Teresa Garcia Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
Ignacio Gomez La Fe University and Polytechnic Hospital, La Fe Health Research Institute, Valencia, Spain
Ana Jimenez-Pastor Quantitative Imaging Biomarkers in Medicine, Quibim, Valencia, Spain
Giannis Karatzanis FORTH-ICS, Heraklion, Crete, Greece
Karim Lekadir Artificial Intelligence in Medicine Lab, Universitat de Barcelona, Barcelona, Spain
Magdalena Kogut-Czarkowska Timelex BV/SRL, Brussels, Belgium
Antonios Lalas Centre for Research & Technology Hellas, Information Technologies Institute (CERTH-ITI), Central Directorate, Thermi, Thessaloniki, Greece
Kostas Marias FORTH-ICS, Heraklion, Crete, Greece
Luis Marti-Bonmati Hospital Universitario y Politécnico La Fe, Grupo de Investigación Biomédica en Imagen IIS La Fe, Valencia, España
Jose Munuera Quantitative Imaging Biomarkers in Medicine, Quibim, Valencia, Spain
Katerina Nikiforaki FORTH-ICS, Heraklion, Crete, Greece
Manon Pelissier MEDEXPRIM, Labège, France
Fred Prior University of Arkansas for Medical Sciences, Little Rock, AR, USA
Michael Rutherford University of Arkansas for Medical Sciences, Little Rock, AR, USA
Laure Saint-Aubert MEDEXPRIM, Labège, France
Zisis Sakellariou Centre for Research & Technology Hellas, Information Technologies Institute (CERTH-ITI), Central Directorate, Thermi, Thessaloniki, Greece
Karine Seymour MEDEXPRIM, Labège, France
Thomas Trouillard MEDEXPRIM, Labège, France
Konstantinos Votis Centre for Research & Technology Hellas, Information Technologies Institute (CERTH-ITI), Central Directorate, Thermi, Thessaloniki, Greece
Manolis Tsiknakis FORTH-ICS, Heraklion, Crete, Greece

Cengiz N, Kabanda SM, Moodley K. Cross-border data sharing through the lens of research ethics committee members in sub-Saharan Africa. PLoS One 2024;19:e0303828. [PMID: 38781141 PMCID: PMC11115285 DOI: 10.1371/journal.pone.0303828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 05/01/2024] [Indexed: 05/25/2024] Open

Abstract

BACKGROUND

Several factors thwart successful data sharing: ambiguous or fragmented regulatory landscapes, conflicting institutional/researcher interests and varying levels of data science-related expertise are among these. Traditional ethics oversight mechanisms and practices may not be well placed to guarantee adequate research oversight given the unique challenges presented by digital technologies and artificial intelligence (AI). Data-intensive research has raised new, contextual ethics and legal challenges that are particularly relevant in an African research setting. Yet, no empirical research has been conducted to explore these challenges.

MATERIALS AND METHODS

We explored REC members' views and experiences on data sharing by conducting 20 semi-structured interviews online between June 2022 and February 2023. Using purposive sampling and snowballing, we recruited representatives across sub-Saharan Africa (SSA). We transcribed verbatim and thematically analysed the data with Atlas.ti V22.

RESULTS

Three dominant themes were identified: (i) experiences in reviewing data sharing protocols, (ii) perceptions of data transfer tools and (iii) ethical, legal and social challenges of data sharing. Several sub-themes emerged as: (i.a) frequency of and approaches used in reviewing data sharing protocols, (i.b) practical/technical challenges, (i.c) training, (ii.a) ideal structure of data transfer tools, (ii.b) key elements of data transfer tools, (ii.c) implementation level, (ii.d) key stakeholders in developing and reviewing a data transfer agreement (DTA), (iii.a) confidentiality and anonymity, (iii.b) consent, (iii.c) regulatory frameworks, and (iii.d) stigmatisation and discrimination.

CONCLUSIONS

Our results indicated variability in REC members' perceptions, suboptimal awareness of the existence of data protection laws and a unanimously expressed need for REC member training. To promote efficient data sharing within and across SSA, guidelines that incorporate ethical, legal and social elements need to be developed in consultation with relevant stakeholders and field experts, along with the training accreditation of REC members in the review of data-intensive protocols.

Kovačević A, Bašaragin B, Milošević N, Nenadić G. De-identification of clinical free text using natural language processing: A systematic review of current approaches. Artif Intell Med 2024;151:102845. [PMID: 38555848 DOI: 10.1016/j.artmed.2024.102845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 03/13/2024] [Accepted: 03/18/2024] [Indexed: 04/02/2024]

Abstract

BACKGROUND

Electronic health records (EHRs) are a valuable resource for data-driven medical research. However, the presence of protected health information (PHI) makes EHRs unsuitable to be shared for research purposes. De-identification, i.e. the process of removing PHI, is a critical step in making EHR data accessible. Natural language processing has repeatedly demonstrated its feasibility in automating the de-identification process.

OBJECTIVES

Our study aims to provide systematic evidence on how the de-identification of clinical free text written in English has evolved in the last thirteen years, and to report on the performances and limitations of the current state-of-the-art systems for the English language. In addition, we aim to identify challenges and potential research opportunities in this field.

METHODS

A systematic search in PubMed, Web of Science, and the DBLP was conducted for studies published between January 2010 and February 2023. Titles and abstracts were examined to identify the relevant studies. Selected studies were then analysed in-depth, and information was collected on de-identification methodologies, data sources, and measured performance.

RESULTS

A total of 2125 publications were identified for the title and abstract screening. 69 studies were found to be relevant. Machine learning (37 studies) and hybrid (26 studies) approaches are predominant, while six studies relied only on rules. The majority of the approaches were trained and evaluated on public corpora. The 2014 i2b2/UTHealth corpus is the most frequently used (36 studies), followed by the 2006 i2b2 (18 studies) and 2016 CEGS N-GRID (10 studies) corpora.

CONCLUSION

Earlier de-identification approaches aimed at English were mainly rule and machine learning hybrids with extensive feature engineering and post-processing, while more recent performance improvements are due to feature-inferring recurrent neural networks. Current leading performance is achieved using attention-based neural models. Recent studies report state-of-the-art F1-scores (over 98%) when evaluated in the manner usually adopted by the clinical natural language processing community. However, their performance needs to be more thoroughly assessed with different measures to judge their reliability to safely de-identify data in a real-world setting. Without additional manually labeled training data, state-of-the-art systems fail to generalise well across a wide range of clinical sub-domains.
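
To make the F1-score mentioned in this abstract concrete, the following is a minimal sketch of token-level precision, recall, and F1 for PHI detection. The label names ("PHI", "O") and the toy label sequences are assumptions made for illustration; they are not taken from the reviewed studies, which may score at the entity level or use other conventions.

```python
def precision_recall_f1(gold, pred, positive="PHI"):
    """Token-level precision, recall and F1 for one positive label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: one PHI token is missed, one non-PHI token is flagged.
gold = ["PHI", "O", "PHI", "O", "PHI", "O"]
pred = ["PHI", "O", "O", "O", "PHI", "PHI"]
precision, recall, f1 = precision_recall_f1(gold, pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Reporting recall alongside F1 is one way to follow the abstract's call for additional measures, since every missed PHI token is a potential disclosure even when the aggregate F1 is high.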

Pilgram L, Meurers T, Malin B, Schaeffner E, Eckardt KU, Prasser F. The Costs of Anonymization: Case Study Using Clinical Data. J Med Internet Res 2024;26:e49445. [PMID: 38657232 PMCID: PMC11079766 DOI: 10.2196/49445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 01/14/2024] [Accepted: 02/13/2024] [Indexed: 04/26/2024] Open

Abstract

BACKGROUND

Sharing data from clinical studies can accelerate scientific progress, improve transparency, and increase the potential for innovation and collaboration. However, privacy concerns remain a barrier to data sharing. Certain concerns, such as reidentification risk, can be addressed through the application of anonymization algorithms, whereby data are altered so that they are no longer reasonably related to a person. Yet, such alterations have the potential to influence the data set's statistical properties, such that the privacy-utility trade-off must be considered. This has been studied in theory, but evidence based on real-world individual-level clinical data is rare, and anonymization has not broadly been adopted in clinical practice.

OBJECTIVE

The goal of this study is to contribute to a better understanding of anonymization in the real world by comprehensively evaluating the privacy-utility trade-off of differently anonymized data using data and scientific results from the German Chronic Kidney Disease (GCKD) study.

METHODS

The GCKD data set extracted for this study consists of 5217 records and 70 variables. A 2-step procedure was followed to determine which variables constituted reidentification risks. To capture a large portion of the risk-utility space, we decided on risk thresholds ranging from 0.02 to 1. The data were then transformed via generalization and suppression, and the anonymization process was varied using a generic and a use case-specific configuration. To assess the utility of the anonymized GCKD data, general-purpose metrics (ie, data granularity and entropy), as well as use case-specific metrics (ie, reproducibility), were applied. Reproducibility was assessed by measuring the overlap of the 95% CI lengths between anonymized and original results.

RESULTS

Reproducibility measured by 95% CI overlap was higher than utility obtained from general-purpose metrics. For example, granularity varied between 68.2% and 87.6%, and entropy varied between 25.5% and 46.2%, whereas the average 95% CI overlap was above 90% for all risk thresholds applied. A nonoverlapping 95% CI was detected in 6 estimates across all analyses, but the overwhelming majority of estimates exhibited an overlap over 50%. The use case-specific configuration outperformed the generic one in terms of actual utility (ie, reproducibility) at the same level of privacy.

CONCLUSIONS

Our results illustrate the challenges that anonymization faces when aiming to support multiple likely and possibly competing uses, while use case-specific anonymization can provide greater utility. This aspect should be taken into account when evaluating the associated costs of anonymized data and attempting to maintain sufficiently high levels of privacy for anonymized data.

TRIAL REGISTRATION

German Clinical Trials Register DRKS00003971; https://drks.de/search/en/trial/DRKS00003971.

INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID)

RR2-10.1093/ndt/gfr456.
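
The reproducibility measure described in the Methods above (overlap of 95% CIs between anonymized and original results) can be illustrated with a short sketch. The definition used here, the mean of the overlap length relative to each interval's own length, is one common formulation and is an assumption; the abstract does not state the exact formula the authors used, and the intervals below are invented for illustration.

```python
def ci_overlap(original, anonymized):
    """Relative overlap of two confidence intervals, in [0, 1].

    Each interval is a (lower, upper) tuple. The result is the mean of
    overlap/length(original) and overlap/length(anonymized).
    """
    o_lo, o_hi = original
    a_lo, a_hi = anonymized
    if o_hi <= o_lo or a_hi <= a_lo:
        raise ValueError("each interval needs upper > lower")
    overlap = min(o_hi, a_hi) - max(o_lo, a_lo)
    if overlap <= 0:
        return 0.0  # disjoint intervals: a nonoverlapping CI
    return 0.5 * (overlap / (o_hi - o_lo) + overlap / (a_hi - a_lo))


# Hypothetical estimates from an original and an anonymized analysis.
print(ci_overlap((0.0, 10.0), (5.0, 15.0)))  # partial overlap
print(ci_overlap((0.0, 10.0), (0.0, 10.0)))  # identical intervals
print(ci_overlap((0.0, 1.0), (2.0, 3.0)))    # no overlap
```

Averaging this value over all estimates in an analysis gives a single summary per risk threshold, comparable in spirit to the average overlap reported in the Results.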

Brunette CA, Harris EJ, Antwi AA, Lemke AA, Kerman BJ, Vassy JL. Data from a national survey of United States primary care physicians on genetic risk scores for common disease prevention. Data Brief 2024;52:109930. [PMID: 38093856 PMCID: PMC10716767 DOI: 10.1016/j.dib.2023.109930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 12/05/2023] [Indexed: 02/01/2024] Open

Ormond KE, Bavamian S, Becherer C, Currat C, Joerger F, Geiger TR, Hiendlmeyer E, Maurer J, Staub T, Vayena E. What are the bottlenecks to health data sharing in Switzerland? An interview study. Swiss Med Wkly 2024;154:3538. [PMID: 38579329 DOI: 10.57187/s.3538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2024] Open

Abstract

BACKGROUND

While health data sharing for research purposes is strongly supported in principle, it can be challenging to implement in practice. Little is known about the actual bottlenecks to health data sharing in Switzerland.

AIMS OF THE STUDY

This study aimed to assess the obstacles to Swiss health data sharing, including legal, ethical and logistical bottlenecks.

METHODS

We identified 37 key stakeholders in data sharing via the Swiss Personalised Health Network ecosystem, defined as being an expert on sharing sensitive health data for research purposes at a Swiss university hospital (or a Swiss disease cohort) or being a stakeholder in data sharing at a public or private institution that uses such data. We conducted semi-structured interviews, which were transcribed, translated when necessary, and de-identified. The entire research team discussed the transcripts and notes taken during each interview before an inductive coding process occurred.

RESULTS

Eleven semi-structured interviews were conducted (primarily in English) with 17 individuals representing lawyers, data protection officers, ethics committee members, scientists, project managers, bioinformaticians, clinical trials unit members, and biobank stakeholders. Most respondents felt that it was not the actual data transfer that was the bottleneck but rather the processes and systems around it, which were considered time-intensive and confusing. The templates developed by the Swiss Personalised Health Network and the Swiss General Consent process were generally felt to have streamlined processes significantly. However, these logistics and data quality issues remain practical bottlenecks in Swiss health data sharing. Areas of legal uncertainty include privacy laws when sharing data internationally, questions of "who owns the data", inconsistencies created because the Swiss general consent is perceived as being implemented differently across different institutions, and definitions and operationalisation of anonymisation and pseudo-anonymisation. Many participants desired to create a "culture of data sharing" and to recognise that data sharing is a process with many steps, not an event, that requires sustainability efforts and personnel. Some participants also stressed a desire to move away from data sharing and the current privacy focus towards processes that facilitate data access.

CONCLUSIONS

Facilitating a data access culture in Switzerland may require legal clarifications, further education about the process and resources to support data sharing, and further investment in sustainable infrastructure by funders and institutions.

Monachino G, Zanchi B, Fiorillo L, Conte G, Auricchio A, Tzovara A, Faraci FD. Deep Generative Models: The winning key for large and easily accessible ECG datasets? Comput Biol Med 2023;167:107655. [PMID: 37976830 DOI: 10.1016/j.compbiomed.2023.107655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 10/04/2023] [Accepted: 10/31/2023] [Indexed: 11/19/2023]

Affiliation(s)

Giuliana Monachino Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland; Institute of Informatics, University of Bern, Neubrückstrasse 10, Bern 3012, Switzerland.
Beatrice Zanchi Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland; Department of Quantitative Biomedicine, University of Zurich, Schmelzbergstrasse 26, Zurich 8091, Switzerland
Luigi Fiorillo Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland
Giulio Conte Division of Cardiology, Fondazione Cardiocentro Ticino, Via Tesserete 48, Lugano 6900, Switzerland; Centre for Computational Medicine in Cardiology, Faculty of Informatics, Università della Svizzera Italiana, Via la Santa 1, Lugano 6900, Switzerland
Angelo Auricchio Division of Cardiology, Fondazione Cardiocentro Ticino, Via Tesserete 48, Lugano 6900, Switzerland; Centre for Computational Medicine in Cardiology, Faculty of Informatics, Università della Svizzera Italiana, Via la Santa 1, Lugano 6900, Switzerland
Athina Tzovara Institute of Informatics, University of Bern, Neubrückstrasse 10, Bern 3012, Switzerland; Sleep Wake Epilepsy Center | NeuroTec, Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16, Bern 3010, Switzerland
Francesca Dalia Faraci Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland

Hurvitz N, Ilan Y. The Constrained-Disorder Principle Assists in Overcoming Significant Challenges in Digital Health: Moving from "Nice to Have" to Mandatory Systems. Clin Pract 2023;13:994-1014. [PMID: 37623270 PMCID: PMC10453547 DOI: 10.3390/clinpract13040089] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/19/2023] [Revised: 08/16/2023] [Accepted: 08/18/2023] [Indexed: 08/26/2023] Open

Geva R, Gusev A, Polyakov Y, Liram L, Rosolio O, Alexandru A, Genise N, Blatt M, Duchin Z, Waissengrin B, Mirelman D, Bukstein F, Blumenthal DT, Wolf I, Pelles-Avraham S, Schaffer T, Lavi LA, Micciancio D, Vaikuntanathan V, Badawi AA, Goldwasser S. Collaborative privacy-preserving analysis of oncological data using multiparty homomorphic encryption. Proc Natl Acad Sci U S A 2023;120:e2304415120. [PMID: 37549296 PMCID: PMC10437415 DOI: 10.1073/pnas.2304415120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Accepted: 06/09/2023] [Indexed: 08/09/2023] Open

Marschik PB, Kulvicius T, Flügge S, Widmann C, Nielsen-Saines K, Schulte-Rüther M, Hüning B, Bölte S, Poustka L, Sigafoos J, Wörgötter F, Einspieler C, Zhang D. Open video data sharing in developmental science and clinical practice. iScience 2023;26:106348. [PMID: 36994082 PMCID: PMC10040728 DOI: 10.1016/j.isci.2023.106348] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/12/2022] [Revised: 12/19/2022] [Accepted: 03/02/2023] [Indexed: 03/08/2023] Open

Affiliation(s)

Peter B. Marschik Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany Center of Neurodevelopmental Disorders (KIND), Centre for Psychiatry Research; Department of Women’s and Children’s Health, Karolinska Institutet, 11330 Stockholm, Sweden iDN – interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, 8036 Graz, Austria Leibniz-ScienceCampus Primate Cognition, 37075 Göttingen, Germany
Tomas Kulvicius Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany Department for Computational Neuroscience, Third Institute of Physics-Biophysics, Georg-August-University of Göttingen, 37077 Göttingen, Germany
Sarah Flügge Department for Computational Neuroscience, Third Institute of Physics-Biophysics, Georg-August-University of Göttingen, 37077 Göttingen, Germany
Claudius Widmann Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany
Karin Nielsen-Saines Division of Pediatric Infectious Diseases, David Geffen UCLA School of Medicine Los Angeles, CA 90095, USA
Martin Schulte-Rüther Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany Leibniz-ScienceCampus Primate Cognition, 37075 Göttingen, Germany
Britta Hüning Department of Pediatrics I, Neonatology, University Children’s Hospital Essen, University Duisburg-Essen, 45147 Essen, Germany
Sven Bölte Center of Neurodevelopmental Disorders (KIND), Centre for Psychiatry Research; Department of Women’s and Children’s Health, Karolinska Institutet, 11330 Stockholm, Sweden Child and Adolescent Psychiatry, Stockholm Health Care Services, Region Stockholm, 11861 Stockholm, Sweden Curtin Autism Research Group, Curtin School of Allied Health, Curtin University, 6102 Perth, WA
Luise Poustka Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany Leibniz-ScienceCampus Primate Cognition, 37075 Göttingen, Germany
Jeff Sigafoos School of Education, Victoria University of Wellington, 6012 Wellington, New Zealand
Florentin Wörgötter Leibniz-ScienceCampus Primate Cognition, 37075 Göttingen, Germany Department for Computational Neuroscience, Third Institute of Physics-Biophysics, Georg-August-University of Göttingen, 37077 Göttingen, Germany
Christa Einspieler iDN – interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, 8036 Graz, Austria
Dajie Zhang Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany iDN – interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, 8036 Graz, Austria Leibniz-ScienceCampus Primate Cognition, 37075 Göttingen, Germany

Clunie DA, Flanders A, Taylor A, Erickson B, Bialecki B, Brundage D, Gutman D, Prior F, Seibert JA, Perry J, Gichoya JW, Kirby J, Andriole K, Geneslaw L, Moore S, Fitzgerald TJ, Tellis W, Xiao Y, Farahani K, Luo J, Rosenthal A, Kandarpa K, Rosen R, Goetz K, Babcock D, Xu B, Hsiao J. Report of the Medical Image De-Identification (MIDI) Task Group - Best Practices and Recommendations. ARXIV 2023:arXiv:2303.10473v2. [PMID: 37033463 PMCID: PMC10081345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2023]

Lathe R. Restricted access data in the neurosciences: Are the restrictions always justified? Front Neurosci 2023;16:975795. [PMID: 36760799 PMCID: PMC9904205 DOI: 10.3389/fnins.2022.975795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 11/25/2022] [Indexed: 01/26/2023] Open

Scheibner J, Ienca M, Vayena E. Health data privacy through homomorphic encryption and distributed ledger computing: an ethical-legal qualitative expert assessment study. BMC Med Ethics 2022;23:121. [PMID: 36451210 PMCID: PMC9713155 DOI: 10.1186/s12910-022-00852-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Accepted: 10/28/2022] [Indexed: 12/03/2022] Open

Abstract

BACKGROUND

Increasingly, hospitals and research institutes are developing technical solutions for sharing patient data in a privacy preserving manner. Two of these technical solutions are homomorphic encryption and distributed ledger technology. Homomorphic encryption allows computations to be performed on data without this data ever being decrypted. Therefore, homomorphic encryption represents a potential solution for conducting feasibility studies on cohorts of sensitive patient data stored in distributed locations. Distributed ledger technology provides a permanent record on all transfers and processing of patient data, allowing data custodians to audit access. A significant portion of the current literature has examined how these technologies might comply with data protection and research ethics frameworks. In the Swiss context, these instruments include the Federal Act on Data Protection and the Human Research Act. There are also institutional frameworks that govern the processing of health related and genetic data at different universities and hospitals. Given Switzerland's geographical proximity to European Union (EU) member states, the General Data Protection Regulation (GDPR) may impose additional obligations.

METHODS

To conduct this assessment, we carried out a series of qualitative interviews with key stakeholders at Swiss hospitals and research institutions. These included legal and clinical data management staff, as well as clinical and research ethics experts. These interviews were carried out with two series of vignettes that focused on data discovery using homomorphic encryption and data erasure from a distributed ledger platform.

RESULTS

For our first set of vignettes, interviewees were prepared to allow data discovery requests if patients had provided general consent or ethics committee approval, depending on the types of data made available. Our interviewees highlighted the importance of protecting against the risk of reidentification given different types of data. For our second set, there was disagreement amongst interviewees on whether they would delete patient data locally, or delete data linked to a ledger with cryptographic hashes. Our interviewees were also willing to delete data locally or on the ledger, subject to local legislation.

CONCLUSION

Our findings can help guide the deployment of these technologies, as well as determine ethics and legal requirements for such technologies.
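
The Background of this abstract states that homomorphic encryption allows computations on data that are never decrypted. As a minimal sketch of that idea, the toy example below uses the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The choice of Paillier, the tiny primes, and the scenario of two sites contributing patient counts are all assumptions for illustration; the abstract does not name a scheme, and this code is not secure for real use.

```python
import math
import random


def keygen(p, q):
    """Build a toy Paillier key pair from two distinct primes (g = n + 1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; needs Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # (1 + n)^m mod n^2 equals 1 + m*n, so no big exponentiation is needed.
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2


def decrypt(n, private_key, c):
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n


def add_encrypted(n, c1, c2):
    """Combine two ciphertexts into an encryption of the plaintext sum."""
    return c1 * c2 % (n * n)


# Two hypothetical sites each encrypt a local cohort count.
n, private_key = keygen(1009, 1013)
site_a = encrypt(n, 120)
site_b = encrypt(n, 85)
total = decrypt(n, private_key, add_encrypted(n, site_a, site_b))
print(total)
```

Only the holder of the private key learns the combined count; the party that adds the ciphertexts sees neither site's individual value, which is the property that makes such schemes candidates for feasibility queries over distributed patient data.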

Samlali K, Thornbury M, Venter A. Community-led risk analysis of direct-to-consumer whole-genome sequencing. Biochem Cell Biol 2022;100:499-509. [PMID: 35939839 DOI: 10.1139/bcb-2021-0506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022] Open

Wahid KA, Glerean E, Sahlsten J, Jaskari J, Kaski K, Naser MA, He R, Mohamed ASR, Fuller CD. Artificial Intelligence for Radiation Oncology Applications Using Public Datasets. Semin Radiat Oncol 2022;32:400-414. [PMID: 36202442 PMCID: PMC9587532 DOI: 10.1016/j.semradonc.2022.06.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 11/05/2022]

Roguljić M, Šimunović D, Poklepović Peričić T, Viđak M, Utrobičić A, Marušić M, Marušić A. Publishing Identifiable Patient Photographs in Scientific Journals: Scoping Review of Policies and Practices. J Med Internet Res 2022;24:e37594. [PMID: 36044262 PMCID: PMC9475410 DOI: 10.2196/37594] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/26/2022] [Revised: 05/27/2022] [Accepted: 06/23/2022] [Indexed: 11/20/2022] Open

Abstract

Background

Publishing identifiable patient data in scientific journals may jeopardize patient privacy and confidentiality if best ethical practices are not followed. Current journal practices show considerable diversity in the publication of identifiable patient photographs, and different stakeholders may have different opinions of and practices in publishing patient photographs.

Objective

This scoping review aimed to identify existing evidence and map knowledge gaps in medical research on the policies and practices of publishing identifiable photographs in scientific articles.

Methods

We performed a comprehensive search of the Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, CINAHL with Full Text, Database of Abstracts of Reviews of Effects, Ovid MEDLINE, and Scopus. The Open Science Framework, PROSPERO, BASE, Google Scholar, OpenGrey, ClinicalTrials.gov, the Campbell Collaboration Library, and Science.gov were also searched.

Results

After screening the initial 15,949 titles and abstracts, 98 (0.61%) publications were assessed for eligibility at the full-text level, and 30 (0.19%) publications were included in this review. The studies were published between 1994 and 2020; most had a cross-sectional design and were published in journals covering different medical disciplines. We identified 3 main topics. The first included ethical aspects of the use of facial photographs in publications. In different clinical settings, the consent process was not conducted properly, and health professionals did not recognize the importance of obtaining written patient consent for taking and using patient medical photographs. They often considered verbal consent sufficient or even used the photographs without consent. The second topic included studies that investigated the practices and use of medical photography in publishing. Both patients and doctors asked for confidential storage and maintenance of medical photographs. Patients preferred to be photographed by their physicians using an institutional camera and preferred nonidentifiable medical photographs not only for publication but also in general. Conventional methods of deidentification of facial photographs concealing the eye area were recognized as unsuccessful in protecting patient privacy. The third topic emerged from studies investigating medical photography in journal articles. These studies showed great diversity in publishing practices regarding consent for publication of medical photographs. Journal policies regarding the consent process and consent forms were insufficient, and existing ethical professional guidelines were not fully implemented in actual practices. Patients’ photographs from open-access medical journals were found on public web-based platforms.

Conclusions

This scoping review showed a diversity of practices in publishing identifiable patient photographs and an unsatisfactory level of knowledge of this issue among different stakeholders despite existing standards. Emerging issues include the availability of patients’ photographs from open-access journals or preprints in the digital environment. There is a need to improve standards and processes to obtain proper consent to fully protect the privacy of patients in published articles.

Mawji A, Longstaff H, Trawin J, Dunsmuir D, Komugisha C, Novakowski SK, Wiens MO, Akech S, Tagoola A, Kissoon N, Ansermino JM. A proposed de-identification framework for a cohort of children presenting at a health facility in Uganda. PLOS DIGITAL HEALTH 2022;1:e0000027. [PMID: 36812586 PMCID: PMC9931294 DOI: 10.1371/journal.pdig.0000027] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/24/2022] [Accepted: 07/08/2022] [Indexed: 11/18/2022]

Abstract

Data sharing has enormous potential to accelerate and improve the accuracy of research, strengthen collaborations, and restore trust in the clinical research enterprise. Nevertheless, there remains reluctance to openly share raw data sets, in part due to concerns regarding research participant confidentiality and privacy. Statistical data de-identification is an approach that can be used to preserve privacy and facilitate open data sharing. We have proposed a standardized framework for the de-identification of data generated from cohort studies in children in a low- and middle-income country. We applied a standardized de-identification framework to a data set comprising 241 health-related variables collected from a cohort of 1750 children with acute infections from Jinja Regional Referral Hospital in Eastern Uganda. Variables were labeled as direct and quasi-identifiers based on conditions of replicability, distinguishability, and knowability with consensus from two independent evaluators. Direct identifiers were removed from the data set, while a statistical risk-based de-identification approach using the k-anonymity model was applied to quasi-identifiers. Qualitative assessment of the level of privacy invasion associated with data set disclosure was used to determine an acceptable re-identification risk threshold and the corresponding k-anonymity requirement. A de-identification model using generalization followed by suppression was applied using a logical stepwise approach to achieve k-anonymity. The utility of the de-identified data was demonstrated using a typical clinical regression example. The de-identified data set was published on the Pediatric Sepsis Data CoLaboratory Dataverse, which provides moderated data access. Researchers are faced with many challenges when providing access to clinical data. We provide a standardized de-identification framework that can be adapted and refined based on specific context and risks. This process will be combined with moderated access to foster coordination and collaboration in the clinical research community.
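The generalization-then-suppression step described in this abstract can be sketched on a toy table. This is a minimal illustration, not the authors' framework: the records, the quasi-identifier names (`age`, `district`), the 5-year age bands, and k = 2 are all hypothetical.

```python
from collections import Counter

# Hypothetical toy records; field names and values are illustrative only.
RECORDS = [
    {"age": 1, "district": "A", "outcome": "recovered"},
    {"age": 2, "district": "A", "outcome": "recovered"},
    {"age": 3, "district": "A", "outcome": "admitted"},
    {"age": 7, "district": "B", "outcome": "recovered"},
    {"age": 8, "district": "B", "outcome": "admitted"},
    {"age": 12, "district": "C", "outcome": "recovered"},
]
QUASI_IDENTIFIERS = ("age", "district")


def generalize_age(record, band=5):
    """Replace an exact age with a coarser band such as '0-4'."""
    low = (record["age"] // band) * band
    return {**record, "age": f"{low}-{low + band - 1}"}


def class_sizes(records):
    """Count records sharing each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in QUASI_IDENTIFIERS) for r in records)


def k_anonymize(records, k):
    """Generalize first, then suppress records in classes smaller than k."""
    generalized = [generalize_age(r) for r in records]
    sizes = class_sizes(generalized)
    return [
        r for r in generalized
        if sizes[tuple(r[q] for q in QUASI_IDENTIFIERS)] >= k
    ]


def smallest_class(records):
    """Return the achieved k: the size of the smallest equivalence class."""
    return min(class_sizes(records).values())


released = k_anonymize(RECORDS, k=2)
```

In this toy table every raw record is unique on its quasi-identifiers. After banding the ages, the single record from district C still stands alone, so it is suppressed rather than released.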

Affiliation(s)

Alishah Mawji: Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, British Columbia, Canada; Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
Holly Longstaff: Privacy and Access, PHSA Research and New Initiatives, Provincial Health Services Authority, Vancouver, British Columbia, Canada
Jessica Trawin: Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
Dustin Dunsmuir: Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, British Columbia, Canada; Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
Clare Komugisha: WALIMU, Kololo, Kampala, Uganda
Stefanie K. Novakowski: Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, British Columbia, Canada
Matthew O. Wiens: Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada; WALIMU, Kololo, Kampala, Uganda; Mbarara University of Science and Technology, Mbarara, Uganda
Samuel Akech: Kenya Medical Research Institute/Wellcome Trust Research Programme, Nairobi, Kenya
Abner Tagoola: Department of Pediatrics, Jinja Regional Referral Hospital, Rotary Rd, Jinja, Uganda
Niranjan Kissoon: Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada; Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada
J. Mark Ansermino: Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, British Columbia, Canada; Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada

Dammen LV, Finseth TT, McCurdy BH, Barnett NP, Conrady RA, Leach AG, Deick AF, Van Steenis AL, Gardner R, Smith BL, Kay A, Shirtcliff EA. Evoking stress reactivity in virtual reality: A systematic review and meta-analysis. Neurosci Biobehav Rev 2022;138:104709. [PMID: 35644278 DOI: 10.1016/j.neubiorev.2022.104709] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2021] [Revised: 04/08/2022] [Accepted: 05/21/2022] [Indexed: 01/04/2023]

Rodriguez A, Tuck C, Dozier MF, Lewis SC, Eldridge S, Jackson T, Murray A, Weir CJ. Current recommendations/practices for anonymising data from clinical trials in order to make it available for sharing: A scoping review. Clin Trials 2022;19:452-463. [PMID: 35730910 PMCID: PMC9373195 DOI: 10.1177/17407745221087469] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]

Auwerx C, Sadler MC, Reymond A, Kutalik Z. From pharmacogenetics to pharmaco-omics: Milestones and future directions. HGG ADVANCES 2022;3:100100. [PMID: 35373152 PMCID: PMC8971318 DOI: 10.1016/j.xhgg.2022.100100] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022] Open

Sampurno F, Kowalski C, Connor SE, Nguyen AV, Acuña ÀP, Ng CF, Foster C, Feick G, Boronat OG, Dieng S, Brglevska S, Ferrante S, Leung S, Villanti P, Moore CM, Graham ID, Millar JL, Litwin MS, Papa N. Knowledge and insights from a maturing international clinical quality registry. J Am Med Inform Assoc 2022;29:964-969. [PMID: 35048976 PMCID: PMC9006702 DOI: 10.1093/jamia/ocab281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2021] [Revised: 11/17/2021] [Accepted: 12/06/2021] [Indexed: 01/22/2023] Open

Shin SY, Kim HS. Data Pseudonymization in a Range That Does Not Affect Data Quality: Correlation with the Degree of Participation of Clinicians. J Korean Med Sci 2021;36:e299. [PMID: 34783216 PMCID: PMC8593412 DOI: 10.3346/jkms.2021.36.e299] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/04/2021] [Accepted: 10/18/2021] [Indexed: 12/28/2022] Open

Sung M, Cha D, Park YR. Local Differential Privacy in the Medical Domain to Protect Sensitive Information: Algorithm Development and Real-World Validation. JMIR Med Inform 2021;9:e26914. [PMID: 34747711 PMCID: PMC8663640 DOI: 10.2196/26914] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/04/2021] [Revised: 02/10/2021] [Accepted: 09/06/2021] [Indexed: 01/25/2023] Open

Abstract

Background

Privacy is of increasing interest in the present big data era, particularly the privacy of medical data. Specifically, differential privacy has emerged as the standard method for preservation of privacy during data analysis and publishing.

Objective

Using machine learning techniques, we applied differential privacy to medical data with diverse parameters and checked the feasibility of our algorithms with synthetic data as well as the balance between data privacy and utility.

Methods

All data were normalized to a range between –1 and 1, and the bounded Laplacian method was applied to prevent the generation of out-of-bound values after applying the differential privacy algorithm. To preserve the cardinality of the categorical variables, we performed postprocessing via discretization. The algorithm was evaluated using both synthetic and real-world data (from the eICU Collaborative Research Database). We evaluated the difference between the original data and the perturbed data using misclassification rates and the mean squared error for categorical data and continuous data, respectively. Further, we compared the performance of classification models that predict in-hospital mortality using real-world data.
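The pipeline described in these Methods (perturb values normalized to [–1, 1], keep them in bounds, snap categorical values back to valid codes, then measure the distortion) can be sketched as follows. This is a simplified illustration and not the authors' algorithm. The sensitivity is assumed to be the width of the range, the three category codes are hypothetical, and out-of-bound draws are simply redrawn. Plain redrawing with the standard scale does not by itself carry the formal ε guarantee of a properly calibrated bounded Laplace mechanism.

```python
import random


def bounded_laplace(value, epsilon, lower=-1.0, upper=1.0, rng=random):
    """Add Laplace noise, redrawing until the result lies in [lower, upper].

    Assumption: scale = sensitivity / epsilon with sensitivity = upper - lower.
    """
    scale = (upper - lower) / epsilon
    while True:
        # The difference of two exponentials with mean `scale` is Laplace(0, scale).
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy = value + noise
        if lower <= noisy <= upper:
            return noisy


def discretize(noisy, codes):
    """Snap a perturbed value to the nearest valid category code."""
    return min(codes, key=lambda c: abs(c - noisy))


def misclassification_rate(original, perturbed):
    """Fraction of categorical values that changed after perturbation."""
    return sum(o != p for o, p in zip(original, perturbed)) / len(original)


def mean_squared_error(original, perturbed):
    """Average squared difference, used for continuous values."""
    return sum((o - p) ** 2 for o, p in zip(original, perturbed)) / len(original)


rng = random.Random(0)
codes = [-1.0, 0.0, 1.0]  # hypothetical: three categories mapped into [-1, 1]
original = [rng.choice(codes) for _ in range(500)]
rates = {}
for eps in (0.1, 10_000.0):
    perturbed = [discretize(bounded_laplace(v, eps, rng=rng), codes) for v in original]
    rates[eps] = misclassification_rate(original, perturbed)
```

With a small ε the noise scale dwarfs the data range, so the perturbed codes bear little relation to the originals. With a very large ε the noise is negligible and discretization restores every code.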

Results

The misclassification rate of categorical variables ranged between 0.49 and 0.85 when the value of ε was 0.1, and it converged to 0 as ε increased. When ε was between 10² and 10³, the misclassification rate rapidly dropped to 0. Similarly, the mean squared error of the continuous variables decreased as ε increased. The performance of the model developed from perturbed data converged to that of the model developed from original data as ε increased. In particular, the accuracy of a random forest model developed from the original data was 0.801, and this value ranged from 0.757 to 0.81 when ε was 10⁻¹ and 10⁴, respectively.

Conclusions

We applied local differential privacy to medical domain data, which are diverse and high dimensional. Higher noise may offer enhanced privacy, but it simultaneously hinders utility. We should choose an appropriate degree of noise for data perturbation to balance privacy and utility depending on specific situations.

Arefolov A, Adam L, Brown S, Budovskaya Y, Chen C, Das D, Farhy C, Ferguson R, Huang H, Kanigel K, Lu C, Polesskaya O, Staton T, Tajhya R, Whitley M, Wong JY, Zeng X, McCreary M. Implementation of the FAIR Data Principles for Exploratory Biomarker Data from Clinical Trials. DATA INTELLIGENCE 2021. [DOI: 10.1162/dint_a_00106] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022] Open

Zuo Z, Watson M, Budgen D, Hall R, Kennelly C, Al Moubayed N. Data Anonymization for Pervasive Health Care: Systematic Literature Mapping Study. JMIR Med Inform 2021;9:e29871. [PMID: 34652278 PMCID: PMC8556642 DOI: 10.2196/29871] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2021] [Revised: 06/21/2021] [Accepted: 08/02/2021] [Indexed: 01/29/2023] Open

Abstract

BACKGROUND

Data science offers an unparalleled opportunity to identify new insights into many aspects of human life with recent advances in health care. Using data science in digital health raises significant challenges regarding data privacy, transparency, and trustworthiness. Recent regulations enforce the need for a clear legal basis for collecting, processing, and sharing data, for example, the European Union's General Data Protection Regulation (2016) and the United Kingdom's Data Protection Act (2018). For health care providers, legal use of the electronic health record (EHR) is permitted only in clinical care cases. Any other use of the data requires thoughtful considerations of the legal context and direct patient consent. Identifiable personal and sensitive information must be sufficiently anonymized. Raw data are commonly anonymized to be used for research purposes, with risk assessment for reidentification and utility. Although health care organizations have internal policies defined for information governance, there is a significant lack of practical tools and intuitive guidance about the use of data for research and modeling. Off-the-shelf data anonymization tools are developed frequently, but privacy-related functionalities are often incomparable with regard to use in different problem domains. In addition, tools to support measuring the risk of the anonymized data with regard to reidentification against the usefulness of the data exist, but there are question marks over their efficacy.

OBJECTIVE

In this systematic literature mapping study, we aim to alleviate the aforementioned issues by reviewing the landscape of data anonymization for digital health care.

METHODS

We used Google Scholar, Web of Science, Elsevier Scopus, and PubMed to retrieve academic studies published in English up to June 2020. Noteworthy gray literature was also used to initialize the search. We focused on review questions covering 5 bottom-up aspects: basic anonymization operations, privacy models, reidentification risk and usability metrics, off-the-shelf anonymization tools, and the lawful basis for EHR data anonymization.

RESULTS

We identified 239 eligible studies, of which 60 were chosen for general background information; 16 were selected for 7 basic anonymization operations; 104 covered 72 conventional and machine learning-based privacy models; 4 and 19 papers included 7 and 15 metrics, respectively, for measuring the reidentification risk and degree of usability; and 36 explored 20 data anonymization software tools. In addition, we evaluated the practical feasibility of performing anonymization on EHR data with reference to their usability in medical decision-making. Furthermore, we summarized the lawful basis for delivering guidance on practical EHR data anonymization.

CONCLUSIONS

This systematic literature mapping study indicates that anonymization of EHR data is theoretically achievable; yet, it requires more research efforts in practical implementations to balance privacy preservation and usability to ensure more reliable health care applications.

Islam SMS, Mishra V, Siddiqui MU, Moses JC, Adibi S, Nguyen L, Wickramasinghe N. Smartphone Apps for Diabetes Medication Adherence: A Systematic Review (Preprint). JMIR Diabetes 2021;7:e33264. [PMID: 35727613 PMCID: PMC9257622 DOI: 10.2196/33264] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2021] [Revised: 02/24/2022] [Accepted: 04/08/2022] [Indexed: 11/13/2022] Open

Cole CL, Sengupta S, Rossetti Née Collins S, Vawdrey DK, Halaas M, Maddox TM, Gordon G, Dave T, Payne PRO, Williams AE, Estrin D. Ten principles for data sharing and commercialization. J Am Med Inform Assoc 2021;28:646-649. [PMID: 33186458 PMCID: PMC7936510 DOI: 10.1093/jamia/ocaa260] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2020] [Revised: 08/07/2020] [Accepted: 10/02/2020] [Indexed: 11/18/2022] Open

Rutherford M, Mun SK, Levine B, Bennett W, Smith K, Farmer P, Jarosz Q, Wagner U, Freyman J, Blake G, Tarbox L, Farahani K, Prior F. A DICOM dataset for evaluation of medical image de-identification. Sci Data 2021;8:183. [PMID: 34272388 PMCID: PMC8285420 DOI: 10.1038/s41597-021-00967-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2021] [Accepted: 06/07/2021] [Indexed: 11/23/2022] Open

Recent Radiomics Advancements in Breast Cancer: Lessons and Pitfalls for the Next Future. Curr Oncol 2021;28:2351-2372. [PMID: 34202321 PMCID: PMC8293249 DOI: 10.3390/curroncol28040217] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2021] [Revised: 06/14/2021] [Accepted: 06/21/2021] [Indexed: 12/13/2022]

Hertz DL, Arwood MJ, Stocco G, Singh S, Karnes JH, Ramsey LB. Planning and Conducting a Pharmacogenetics Association Study. Clin Pharmacol Ther 2021;110:688-701. [PMID: 33880756 DOI: 10.1002/cpt.2270] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2021] [Accepted: 04/04/2021] [Indexed: 12/13/2022]

Reichold M, Dietzel N, Chmelirsch C, Kolominsky-Rabas PL, Graessel E, Prokosch HU. Designing and Implementing an IT Architecture for a Digital Multicenter Dementia Registry: digiDEM Bayern. Appl Clin Inform 2021;12:551-563. [PMID: 34134149 PMCID: PMC8208839 DOI: 10.1055/s-0041-1731286] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023] Open

Abstract

Background

Registries are an essential research tool to investigate the long-term course of diseases and their impact on the affected. The project digiDEM Bayern will set up a prospective dementia registry to collect long-term data of people with dementia and their caregivers in Bavaria (Germany) supported by more than 300 research partners.

Objective

The objective of this article is to outline an information technology (IT) architecture for the integration of a registry and comprehensive participant management in a dementia study. Measures to ensure high data quality, study governance, data privacy, and security are to be included in the architecture.

Methods

The architecture was developed based on an iterative, stakeholder-oriented process. The development was inspired by the Twin Peaks Model that focuses on the codevelopment of requirements and architecture. We gradually moved from a general to a detailed understanding of both the requirements and design through a series of iterations. The experience gained from the pilot phase was integrated into a further iterative process of continuous improvement of the architecture.

Results

The infrastructure provides a standardized workflow to support the electronic data collection and trace each participant's study process. The implementation consists of three systems: (1) electronic data capture system for Web-based or offline app-based data collection; (2) participant management system for the administration of the identity data of participants and research partners as well as of the overall study governance process; and (3) videoconferencing software for conducting interviews online. First experiences in the pilot phase have proven the feasibility of the framework.

Conclusion

This article outlines an IT architecture to integrate a registry and participant management in a dementia research project. The framework was discussed and developed with the involvement of numerous stakeholders. Owing to the adaptability of the software systems used, a transfer to other projects should be easily possible.

Swales L. The Protection of Personal Information Act and data de-identification. S AFR J SCI 2021. [DOI: 10.17159/sajs.2021/10808] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022] Open

Zegers CML, Witteveen A, Schulte MHJ, Henrich JF, Vermeij A, Klever B, Dekker A. Mind Your Data: Privacy and Legal Matters in eHealth. JMIR Form Res 2021;5:e17456. [PMID: 33729163 PMCID: PMC8075039 DOI: 10.2196/17456] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2019] [Revised: 05/01/2020] [Accepted: 01/17/2021] [Indexed: 01/30/2023] Open

Syed S, Syed M, Syeda HB, Garza M, Bennett W, Bona J, Begum S, Baghal A, Zozus M, Prior F. API Driven On-Demand Participant ID Pseudonymization in Heterogeneous Multi-Study Research. Healthc Inform Res 2021;27:39-47. [PMID: 33611875 PMCID: PMC7921568 DOI: 10.4258/hir.2021.27.1.39] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2020] [Revised: 09/23/2020] [Accepted: 10/18/2020] [Indexed: 11/29/2022] Open

Abstract

OBJECTIVES

To facilitate clinical and translational research, imaging and non-imaging clinical data from multiple disparate systems must be aggregated for analysis. Study participant records from various sources are linked together and to patient records when possible to address research questions while ensuring patient privacy. This paper presents a novel tool that pseudonymizes participant identifiers (PIDs) using a researcher-driven automated process that takes advantage of an application programming interface (API) and the Perl Open-Source Digital Imaging and Communications in Medicine Archive (POSDA) to further de-identify PIDs. The tool, on-demand cohort and API participant identifier pseudonymization (O-CAPP), employs a pseudonymization method based on the type of incoming research data.

METHODS

For images, pseudonymization of PIDs is done using API calls that receive PIDs present in Digital Imaging and Communications in Medicine (DICOM) headers and return the pseudonymized identifiers. For non-imaging clinical research data, PIDs provided by study principal investigators (PIs) are pseudonymized using a nightly automated process. The pseudonymized PIDs (P-PIDs) along with other protected health information are further de-identified using POSDA.
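The abstract does not say how O-CAPP derives its pseudonyms, so the sketch below substitutes a common technique, keyed hashing with HMAC-SHA256, purely to illustrate the idea of one consistent PID-to-P-PID mapping shared by an on-demand imaging path and a batch path. Every name here (`pseudonymize_pid`, `pseudonymize_header`, `pseudonymize_batch`, the `P-` prefix, the secret key, and the dict standing in for a DICOM header) is hypothetical and not part of O-CAPP or POSDA.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored outside the code.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize_pid(pid: str, study: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable pseudonym for a participant ID within one study."""
    message = f"{study}:{pid}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"P-{digest[:16]}"


def pseudonymize_header(header: dict, study: str) -> dict:
    """On-demand path: return a copy of a DICOM-like header dict with PatientID replaced."""
    cleaned = dict(header)
    cleaned["PatientID"] = pseudonymize_pid(header["PatientID"], study)
    return cleaned


def pseudonymize_batch(pids, study):
    """Batch path for non-imaging data: map each PID to its pseudonym."""
    return {pid: pseudonymize_pid(pid, study) for pid in pids}


header = {"PatientID": "12345", "Modality": "MR"}
image_result = pseudonymize_header(header, "study-a")
batch_result = pseudonymize_batch(["12345", "67890"], "study-a")
```

Because both paths call the same keyed function, an image and a clinical record for the same participant receive the same P-PID and stay linkable. Including the study name in the hashed message keeps pseudonyms from matching across studies.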

RESULTS

A sample of 250 PIDs pseudonymized by O-CAPP were selected and successfully validated. Of those, 125 PIDs that were pseudonymized by the nightly automated process were validated by multiple clinical trial investigators (CTIs). For the other 125, CTIs validated radiologic image pseudonymization by API request based on the provided PID and P-PID mappings.

CONCLUSIONS

We developed a novel approach to on-demand pseudonymization that will aid researchers in obtaining a comprehensive and holistic view of study participant data without compromising patient privacy.

Jeong YU, Yoo S, Kim YH, Shim WH. De-Identification of Facial Features in Magnetic Resonance Images: Software Development Using Deep Learning Technology. J Med Internet Res 2020;22:e22739. [PMID: 33208302 PMCID: PMC7759440 DOI: 10.2196/22739] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2020] [Revised: 09/09/2020] [Accepted: 11/12/2020] [Indexed: 12/14/2022] Open

Abstract

Background

High-resolution medical images that include facial regions can be used to recognize the subject’s face when reconstructing 3-dimensional (3D)-rendered images from 2-dimensional (2D) sequential images, which might constitute a risk of infringement of personal information when sharing data. According to the Health Insurance Portability and Accountability Act (HIPAA) privacy rules, full-face photographic images and any comparable image are direct identifiers and considered as protected health information. Moreover, the General Data Protection Regulation (GDPR) categorizes facial images as biometric data and stipulates that special restrictions should be placed on the processing of biometric data.

Objective

This study aimed to develop software that can remove the header information from Digital Imaging and Communications in Medicine (DICOM) format files and facial features (eyes, nose, and ears) at the 2D sliced-image level to anonymize personal information in medical images.

Methods

A total of 240 cranial magnetic resonance (MR) images were used to train the deep learning model (144, 48, and 48 for the training, validation, and test sets, respectively, from the Alzheimer's Disease Neuroimaging Initiative [ADNI] database). To overcome the small sample size problem, we used a data augmentation technique to create 576 images per epoch. We used attention-gated U-net for the basic structure of our deep learning model. To validate the performance of the software, we adapted an external test set comprising 100 cranial MR images from the Open Access Series of Imaging Studies (OASIS) database.

Results

The facial features (eyes, nose, and ears) were successfully detected and anonymized in both test sets (48 from ADNI and 100 from OASIS). Each result was manually validated in both the 2D image plane and the 3D-rendered images. Furthermore, the ADNI test set was verified using Microsoft Azure's face recognition artificial intelligence service. By adding a user interface, we developed and distributed (via GitHub) software named “Deface program” for medical images as an open-source project.

Conclusions

We developed deep learning–based software for the anonymization of MR images that distorts the eyes, nose, and ears to prevent facial identification of the subject in reconstructed 3D images. It could be used to share medical big data for secondary research while making both data providers and recipients compliant with the relevant privacy regulations.

Parker W, Jaremko JL, Cicero M, Azar M, El-Emam K, Gray BG, Hurrell C, Lavoie-Cardinal F, Desjardins B, Lum A, Sheremeta L, Lee E, Reinhold C, Tang A, Bromwich R. Canadian Association of Radiologists White Paper on De-Identification of Medical Imaging: Part 1, General Principles. Can Assoc Radiol J 2020;72:13-24. [PMID: 33138621 DOI: 10.1177/0846537120967349] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open

Pung J, Rienhoff O. Key components and IT assistance of participant management in clinical research: a scoping review. JAMIA Open 2020;3:449-458. [PMID: 33215078 PMCID: PMC7660951 DOI: 10.1093/jamiaopen/ooaa041] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2019] [Revised: 07/16/2020] [Accepted: 08/24/2020] [Indexed: 01/05/2023] Open

Steinkamp JM, Pomeranz T, Adleberg J, Kahn CE, Cook TS. Evaluation of Automated Public De-Identification Tools on a Corpus of Radiology Reports. Radiol Artif Intell 2020;2:e190137. [PMID: 33937843 DOI: 10.1148/ryai.2020190137] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2019] [Revised: 05/05/2020] [Accepted: 05/14/2020] [Indexed: 11/11/2022]

Abstract

Purpose

To evaluate publicly available de-identification tools on a large corpus of narrative-text radiology reports.

Materials and Methods

In this retrospective study, 21 categories of protected health information (PHI) in 2503 radiology reports were annotated from a large multihospital academic health system, collected between January 1, 2012 and January 8, 2019. A subset consisting of 1023 reports served as a test set; the remainder were used as domain-specific training data. The types and frequencies of PHI present within the reports were tallied. Five public de-identification tools were evaluated: MITRE Identification Scrubber Toolkit, U.S. National Library of Medicine‒Scrubber, Massachusetts Institute of Technology de-identification software, Emory Health Information DE-identification (HIDE) software, and Neuro named-entity recognition (NeuroNER). The tools were compared using metrics including recall, precision, and F1 score (the harmonic mean of recall and precision) for each category of PHI.

Results

The annotators identified 3528 spans of PHI text within the 2503 reports. Cohen κ for interrater agreement was 0.938. Dates accounted for the majority of PHI found in the dataset of radiology reports (n = 2755 [78%]). The two best-performing tools, NeuroNER (precision, 94.5%; recall, 92.6%; microaveraged F1 score [F1], 93.6%) and Emory HIDE (precision, 96.6%; recall, 88.2%; F1, 92.2%), both used machine learning methods, but none exceeded 50% F1 on the important patient names category.
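The F1 score is the harmonic mean of precision and recall, and microaveraging pools true-positive, false-positive, and false-negative counts over all PHI categories before scoring. The sketch below recomputes F1 from the rounded precision and recall reported above. The per-category counts in `toy_counts` are hypothetical and serve only to show the pooling step.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; inputs and output share units."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_average(counts_by_category):
    """Pool (tp, fp, fn) counts over all PHI categories, then score once."""
    tp = sum(c[0] for c in counts_by_category.values())
    fp = sum(c[1] for c in counts_by_category.values())
    fn = sum(c[2] for c in counts_by_category.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


# Precision and recall (in percent) as reported in the abstract.
reported = {"NeuroNER": (94.5, 92.6), "Emory HIDE": (96.6, 88.2)}
recomputed = {tool: round(f1_score(p, r), 1) for tool, (p, r) in reported.items()}

# Hypothetical per-category counts (tp, fp, fn), for illustration only.
toy_counts = {"date": (90, 5, 10), "patient_name": (2, 3, 8)}
micro_p, micro_r, micro_f1 = micro_average(toy_counts)
```

Recomputing from the rounded figures gives 92.2 for Emory HIDE, matching the abstract, and 93.5 for NeuroNER against the reported 93.6. A gap of that size is plausibly due to the published F1 being computed from unrounded precision and recall. The toy counts also show how a frequent category such as dates can dominate a microaveraged score even when a rare category such as patient names performs poorly.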

Conclusion

PHI appeared infrequently within the corpus of reports studied, which created difficulties for training machine learning systems. Out-of-the-box de-identification tools achieved limited performance on the corpus of radiology reports, suggesting the need for further advancements in public datasets and trained models. Supplemental material is available for this article. See also the commentary by Tenenholtz and Wood in this issue. © RSNA, 2020.

Solomonides A. Review of Clinical Research Informatics. Yearb Med Inform 2020;29:193-202. [PMID: 32823316 PMCID: PMC7442526 DOI: 10.1055/s-0040-1701988] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/31/2022] Open

Abstract

OBJECTIVES

Clinical Research Informatics (CRI) declares its scope in its name, but its content, both in terms of the clinical research it supports (and sometimes initiates) and the methods it has developed over time, reaches much further than the name suggests. The goal of this review is to celebrate the extraordinary diversity of activity and of results, not as a prize-giving pageant, but in recognition of the field, the community that both serves and is sustained by it, and of its interdisciplinarity and its international dimension.

METHODS

Beyond personal awareness of a range of work commensurate with the author's own research, it is clear that, even with a thorough literature search, a comprehensive review is impossible. Moreover, the field has grown and subdivided to an extent that makes it very hard for one individual to be familiar with every branch or with more than a few branches in any depth. A literature survey was conducted that focused on informatics-related terms in the general biomedical and healthcare literature, and specific concerns ("artificial intelligence", "data models", "analytics", etc.) in the biomedical informatics (BMI) literature. In addition to a selection from the results from these searches, suggestive references within them were also considered.

RESULTS

The substantive sections of the paper (Artificial Intelligence, Machine Learning, and "Big Data" Analytics; Common Data Models, Data Quality, and Standards; Phenotyping and Cohort Discovery; Privacy: Deidentification, Distributed Computation, Blockchain; Causal Inference and Real-World Evidence) provide broad coverage of these active research areas, with, no doubt, a bias towards this reviewer's interests and preferences, landing on a number of papers that stood out in one way or another, or, alternatively, exemplified a particular line of work.

CONCLUSIONS

CRI is thriving, not only in the familiar major centers of research, but more widely, throughout the world. This is not to pretend that the distribution is uniform, but to highlight the potential for this domain to play a prominent role in supporting progress in medicine, healthcare, and wellbeing everywhere. We conclude with the observation that CRI and its practitioners would make apt stewards of the new medical knowledge that their methods will bring forward.

Petersen C, Subbian V. Special Section on Ethics in Health Informatics. Yearb Med Inform 2020;29:77-80. [PMID: 32823299 PMCID: PMC7442530 DOI: 10.1055/s-0040-1702014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022] Open

Larson DB, Magnus DC, Lungren MP, Shah NH, Langlotz CP. Ethics of Using and Sharing Clinical Imaging Data for Artificial Intelligence: A Proposed Framework. Radiology 2020;295:675-682. [PMID: 32208097 DOI: 10.1148/radiol.2020192536] [Citation(s) in RCA: 78] [Impact Index Per Article: 19.5] [Indexed: 11/11/2022]

Abstract

In this article, the authors propose an ethical framework for using and sharing clinical data for the development of artificial intelligence (AI) applications. The philosophical premise is as follows: when clinical data are used to provide care, the primary purpose for acquiring the data is fulfilled. At that point, clinical data should be treated as a form of public good, to be used for the benefit of future patients. In their 2013 article, Faden et al argued that all who participate in the health care system, including patients, have a moral obligation to contribute to improving that system. The authors extend that framework to questions surrounding the secondary use of clinical data for AI applications. Specifically, the authors propose that all individuals and entities with access to clinical data become data stewards, with fiduciary (or trust) responsibilities to patients to carefully safeguard patient privacy, and to the public to ensure that the data are made widely available for the development of knowledge and tools to benefit future patients. According to this framework, the authors maintain that it is unethical for providers to "sell" clinical data to other parties by granting access to clinical data, especially under exclusive arrangements, in exchange for monetary or in-kind payments that exceed costs. The authors also propose that patient consent is not required before the data are used for secondary purposes when obtaining such consent is prohibitively costly or burdensome, as long as mechanisms are in place to ensure that ethical standards are strictly followed. Rather than debate whether patients or provider organizations "own" the data, the authors propose that clinical data are not owned at all in the traditional sense, but rather that all who interact with or control the data have an obligation to ensure that the data are used for the benefit of future patients and society.

Lovis C. Unlocking the Power of Artificial Intelligence and Big Data in Medicine. J Med Internet Res 2019;21:e16607. [PMID: 31702565 PMCID: PMC6874800 DOI: 10.2196/16607] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 10/09/2019] [Revised: 10/18/2019] [Accepted: 10/20/2019] [Indexed: 12/17/2022] Open