1
Pesapane F, Cuocolo R, Sardanelli F. The Picasso's skepticism on computer science and the dawn of generative AI: questions after the answers to keep "machines-in-the-loop". Eur Radiol Exp 2024; 8:81. [PMID: 39046535 PMCID: PMC11269548 DOI: 10.1186/s41747-024-00485-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Accepted: 06/16/2024] [Indexed: 07/25/2024] Open
Abstract
Starting from Picasso's quote ("Computers are useless. They can only give you answers"), we discuss the introduction of generative artificial intelligence (AI), including generative adversarial networks (GANs) and transformer-based architectures such as large language models (LLMs) in radiology, where their potential in reporting, image synthesis, and analysis is notable. However, the need for improvements, evaluations, and regulations prior to clinical use is also clear. Integration of LLMs into clinical workflow needs cautiousness, to avoid or at least mitigate risks associated with false diagnostic suggestions. We highlight challenges in synthetic image generation, inherent biases in AI models, and privacy concerns, stressing the importance of diverse training datasets and robust data privacy measures. We examine the regulatory landscape, including the 2023 Executive Order on AI in the United States and the 2024 AI Act in the European Union, which set standards for AI applications in healthcare. This manuscript contributes to the field by emphasizing the necessity of maintaining the human element in medical procedures while leveraging generative AI, advocating for a "machines-in-the-loop" approach.
Affiliation(s)
- Filippo Pesapane
- Breast Imaging Division, IEO European Institute of Oncology IRCCS, Milan, Italy.
- Renato Cuocolo
- Department of Medicine, Surgery and Dentistry, University of Salerno, Via Salvador Allende 43, Baronissi, 84081, Salerno, Italy
- Francesco Sardanelli
- Unit of Radiology, IRCCS Policlinico San Donato, Via Morandi 30, San Donato Milanese, 20097, Milan, Italy
- Lega Italiana Tumori (LILT) Milano Monza Brianza, Piazzale Gorini 22, 20133, Milan, Italy
2
Kondylakis H, Catalan R, Alabart SM, Barelle C, Bizopoulos P, Bobowicz M, Bona J, Fotiadis DI, Garcia T, Gomez I, Jimenez-Pastor A, Karatzanis G, Lekadir K, Kogut-Czarkowska M, Lalas A, Marias K, Marti-Bonmati L, Munuera J, Nikiforaki K, Pelissier M, Prior F, Rutherford M, Saint-Aubert L, Sakellariou Z, Seymour K, Trouillard T, Votis K, Tsiknakis M. Documenting the de-identification process of clinical and imaging data for AI for health imaging projects. Insights Imaging 2024; 15:130. [PMID: 38816658 PMCID: PMC11139818 DOI: 10.1186/s13244-024-01711-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 04/26/2024] [Indexed: 06/01/2024] Open
Abstract
Artificial intelligence (AI) is revolutionizing the field of medical imaging, holding the potential to shift medicine from a reactive "sick-care" approach to a proactive focus on healthcare and prevention. The successful development of AI in this domain relies on access to large, comprehensive, and standardized real-world datasets that accurately represent diverse populations and diseases. However, images and data are sensitive, and as such, before using them in any way the data needs to be modified to protect the privacy of the patients. This paper explores the approaches in the domain of five EU projects working on the creation of ethically compliant and GDPR-regulated European medical imaging platforms, focused on cancer-related data. It presents the individual approaches to the de-identification of imaging data, and describes the problems and the solutions adopted in each case. Further, lessons learned are provided, enabling future projects to optimally handle the problem of data de-identification. CRITICAL RELEVANCE STATEMENT: This paper presents key approaches from five flagship EU projects for the de-identification of imaging and clinical data offering valuable insights and guidelines in the domain. KEY POINTS: AI models for health imaging require access to large amounts of data. Access to large imaging datasets requires an appropriate de-identification process. This paper provides de-identification guidelines from the AI for health imaging (AI4HI) projects.
Affiliation(s)
- Rocio Catalan
- La Fe University and Polytechnic Hospital, La Fe Health Research Institute, Valencia, Spain
- Paschalis Bizopoulos
- Centre for Research & Technology Hellas, Information Technologies Institute (CERTH-ITI), Central Directorate, Thermi, Thessaloniki, Greece
- Jonathan Bona
- University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Dimitrios I Fotiadis
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina, Greece
- Teresa Garcia
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Ignacio Gomez
- La Fe University and Polytechnic Hospital, La Fe Health Research Institute, Valencia, Spain
- Karim Lekadir
- Artificial Intelligence in Medicine Lab, Universitat de Barcelona, Barcelona, Spain
- Antonios Lalas
- Centre for Research & Technology Hellas, Information Technologies Institute (CERTH-ITI), Central Directorate, Thermi, Thessaloniki, Greece
- Luis Marti-Bonmati
- Hospital Universitario y Politécnico La Fe, Grupo de Investigación Biomédica en Imagen IIS La Fe, Valencia, España
- Jose Munuera
- Quantitative Imaging Biomarkers in Medicine, Quibim, Valencia, Spain
- Fred Prior
- University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Zisis Sakellariou
- Centre for Research & Technology Hellas, Information Technologies Institute (CERTH-ITI), Central Directorate, Thermi, Thessaloniki, Greece
- Konstantinos Votis
- Centre for Research & Technology Hellas, Information Technologies Institute (CERTH-ITI), Central Directorate, Thermi, Thessaloniki, Greece
3
Cengiz N, Kabanda SM, Moodley K. Cross-border data sharing through the lens of research ethics committee members in sub-Saharan Africa. PLoS One 2024; 19:e0303828. [PMID: 38781141 PMCID: PMC11115285 DOI: 10.1371/journal.pone.0303828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 05/01/2024] [Indexed: 05/25/2024] Open
Abstract
BACKGROUND Several factors thwart successful data sharing: ambiguous or fragmented regulatory landscapes, conflicting institutional/researcher interests and varying levels of data science-related expertise are among these. Traditional ethics oversight mechanisms and practices may not be well placed to guarantee adequate research oversight given the unique challenges presented by digital technologies and artificial intelligence (AI). Data-intensive research has raised new, contextual ethics and legal challenges that are particularly relevant in an African research setting. Yet, no empirical research has been conducted to explore these challenges. MATERIALS AND METHODS We explored REC members' views and experiences on data sharing by conducting 20 semi-structured interviews online between June 2022 and February 2023. Using purposive sampling and snowballing, we recruited representatives across sub-Saharan Africa (SSA). We transcribed verbatim and thematically analysed the data with Atlas.ti V22. RESULTS Three dominant themes were identified: (i) experiences in reviewing data sharing protocols, (ii) perceptions of data transfer tools and (iii) ethical, legal and social challenges of data sharing. Several sub-themes emerged as: (i.a) frequency of and approaches used in reviewing data sharing protocols, (i.b) practical/technical challenges, (i.c) training, (ii.a) ideal structure of data transfer tools, (ii.b) key elements of data transfer tools, (ii.c) implementation level, (ii.d) key stakeholders in developing and reviewing a data transfer agreement (DTA), (iii.a) confidentiality and anonymity, (iii.b) consent, (iii.c) regulatory frameworks, and (iii.d) stigmatisation and discrimination. CONCLUSIONS Our results indicated variability in REC members' perceptions, suboptimal awareness of the existence of data protection laws and a unanimously expressed need for REC member training.
To promote efficient data sharing within and across SSA, guidelines that incorporate ethical, legal and social elements need to be developed in consultation with relevant stakeholders and field experts, along with the training accreditation of REC members in the review of data-intensive protocols.
Affiliation(s)
- Nezerith Cengiz
- Department of Medicine, Division for Medical Ethics and Law, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Siti M. Kabanda
- Department of Medicine, Division for Medical Ethics and Law, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Keymanthri Moodley
- Department of Medicine, Division for Medical Ethics and Law, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
4
Kovačević A, Bašaragin B, Milošević N, Nenadić G. De-identification of clinical free text using natural language processing: A systematic review of current approaches. Artif Intell Med 2024; 151:102845. [PMID: 38555848 DOI: 10.1016/j.artmed.2024.102845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 03/13/2024] [Accepted: 03/18/2024] [Indexed: 04/02/2024]
Abstract
BACKGROUND Electronic health records (EHRs) are a valuable resource for data-driven medical research. However, the presence of protected health information (PHI) makes EHRs unsuitable to be shared for research purposes. De-identification, i.e. the process of removing PHI is a critical step in making EHR data accessible. Natural language processing has repeatedly demonstrated its feasibility in automating the de-identification process. OBJECTIVES Our study aims to provide systematic evidence on how the de-identification of clinical free text written in English has evolved in the last thirteen years, and to report on the performances and limitations of the current state-of-the-art systems for the English language. In addition, we aim to identify challenges and potential research opportunities in this field. METHODS A systematic search in PubMed, Web of Science, and the DBLP was conducted for studies published between January 2010 and February 2023. Titles and abstracts were examined to identify the relevant studies. Selected studies were then analysed in-depth, and information was collected on de-identification methodologies, data sources, and measured performance. RESULTS A total of 2125 publications were identified for the title and abstract screening. 69 studies were found to be relevant. Machine learning (37 studies) and hybrid (26 studies) approaches are predominant, while six studies relied only on rules. The majority of the approaches were trained and evaluated on public corpora. The 2014 i2b2/UTHealth corpus is the most frequently used (36 studies), followed by the 2006 i2b2 (18 studies) and 2016 CEGS N-GRID (10 studies) corpora. CONCLUSION Earlier de-identification approaches aimed at English were mainly rule and machine learning hybrids with extensive feature engineering and post-processing, while more recent performance improvements are due to feature-inferring recurrent neural networks. 
Current leading performance is achieved using attention-based neural models. Recent studies report state-of-the-art F1-scores (over 98 %) when evaluated in the manner usually adopted by the clinical natural language processing community. However, their performance needs to be more thoroughly assessed with different measures to judge their reliability to safely de-identify data in a real-world setting. Without additional manually labeled training data, state-of-the-art systems fail to generalise well across a wide range of clinical sub-domains.
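Illustrative note (not part of the cited abstract): the F1-scores mentioned above are commonly computed by comparing predicted PHI spans against manually labeled ones. The sketch below assumes exact-match scoring on `(start, end, label)` tuples; the function name `span_f1` and the sample spans are hypothetical, and the reviewed studies may use other matching criteria (for example token-level or relaxed matching).

```python
def span_f1(gold, predicted):
    """Return (precision, recall, F1) for exact-match PHI spans.

    Each span is a (start, end, label) tuple; a prediction counts as
    correct only if all three fields match a gold span exactly.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations for one clinical note.
gold = [(0, 8, "NAME"), (20, 30, "DATE"), (41, 52, "PHONE")]
predicted = [(0, 8, "NAME"), (20, 30, "DATE"), (60, 65, "NAME")]

precision, recall, f1 = span_f1(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Reporting recall alongside F1 is one way to follow the abstract's call for additional measures, since every missed span is PHI left in the text.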
Affiliation(s)
- Aleksandar Kovačević
- The University of Novi Sad, Faculty of Technical Sciences, Trg Dositeja Obradovića 6, 21002 Novi Sad, Serbia
- Bojana Bašaragin
- The Institute for Artificial Intelligence Research and Development of Serbia, Fruškogorska 1, 21000 Novi Sad, Serbia
- Nikola Milošević
- The Institute for Artificial Intelligence Research and Development of Serbia, Fruškogorska 1, 21000 Novi Sad, Serbia; Bayer A.G., Research and Development, Mullerstrasse 173, Berlin 13342, Germany
- Goran Nenadić
- The University of Manchester, Department of Computer Science, Manchester, United Kingdom
5
Pilgram L, Meurers T, Malin B, Schaeffner E, Eckardt KU, Prasser F. The Costs of Anonymization: Case Study Using Clinical Data. J Med Internet Res 2024; 26:e49445. [PMID: 38657232 PMCID: PMC11079766 DOI: 10.2196/49445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 01/14/2024] [Accepted: 02/13/2024] [Indexed: 04/26/2024] Open
Abstract
BACKGROUND Sharing data from clinical studies can accelerate scientific progress, improve transparency, and increase the potential for innovation and collaboration. However, privacy concerns remain a barrier to data sharing. Certain concerns, such as reidentification risk, can be addressed through the application of anonymization algorithms, whereby data are altered so that it is no longer reasonably related to a person. Yet, such alterations have the potential to influence the data set's statistical properties, such that the privacy-utility trade-off must be considered. This has been studied in theory, but evidence based on real-world individual-level clinical data is rare, and anonymization has not broadly been adopted in clinical practice. OBJECTIVE The goal of this study is to contribute to a better understanding of anonymization in the real world by comprehensively evaluating the privacy-utility trade-off of differently anonymized data using data and scientific results from the German Chronic Kidney Disease (GCKD) study. METHODS The GCKD data set extracted for this study consists of 5217 records and 70 variables. A 2-step procedure was followed to determine which variables constituted reidentification risks. To capture a large portion of the risk-utility space, we decided on risk thresholds ranging from 0.02 to 1. The data were then transformed via generalization and suppression, and the anonymization process was varied using a generic and a use case-specific configuration. To assess the utility of the anonymized GCKD data, general-purpose metrics (ie, data granularity and entropy), as well as use case-specific metrics (ie, reproducibility), were applied. Reproducibility was assessed by measuring the overlap of the 95% CI lengths between anonymized and original results. RESULTS Reproducibility measured by 95% CI overlap was higher than utility obtained from general-purpose metrics. 
For example, granularity varied between 68.2% and 87.6%, and entropy varied between 25.5% and 46.2%, whereas the average 95% CI overlap was above 90% for all risk thresholds applied. A nonoverlapping 95% CI was detected in 6 estimates across all analyses, but the overwhelming majority of estimates exhibited an overlap over 50%. The use case-specific configuration outperformed the generic one in terms of actual utility (ie, reproducibility) at the same level of privacy. CONCLUSIONS Our results illustrate the challenges that anonymization faces when aiming to support multiple likely and possibly competing uses, while use case-specific anonymization can provide greater utility. This aspect should be taken into account when evaluating the associated costs of anonymized data and attempting to maintain sufficiently high levels of privacy for anonymized data. TRIAL REGISTRATION German Clinical Trials Register DRKS00003971; https://drks.de/search/en/trial/DRKS00003971. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) RR2-10.1093/ndt/gfr456.
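Illustrative note (not part of the cited abstract): generalization and suppression under a risk threshold can be sketched as follows. This is a minimal example that assumes re-identification risk is measured as 1 divided by the size of a record's equivalence class (the records sharing the same quasi-identifier values); the abstract does not state the risk model or tooling the authors used, and the function names, field names, and sample records here are hypothetical.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a fixed-width interval."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def suppress_risky(records, quasi_identifiers, risk_threshold):
    """Suppression: drop records whose assumed risk exceeds the threshold.

    Risk is taken to be 1 / (equivalence class size), so a threshold of
    0.02 requires classes of at least 50 records and a threshold of 1
    suppresses nothing.
    """
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    return [
        r for r, k in zip(records, keys)
        if 1.0 / class_sizes[k] <= risk_threshold
    ]


# Hypothetical toy records; real data sets are far larger.
records = [
    {"age": 34, "sex": "F", "egfr": 52.1},
    {"age": 37, "sex": "F", "egfr": 48.7},
    {"age": 52, "sex": "M", "egfr": 61.0},
]
generalized = [{**r, "age": generalize_age(r["age"])} for r in records]
kept = suppress_risky(generalized, ["age", "sex"], risk_threshold=0.5)
print(len(kept), "of", len(records), "records retained")
```

Coarser intervals keep more records but lower granularity, which is the privacy-utility trade-off the abstract evaluates.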
Affiliation(s)
- Lisa Pilgram
- Junior Digital Clinician Scientist Program, Biomedical Innovation Academy, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
- Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Thierry Meurers
- Medical Informatics Group, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
- Bradley Malin
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- Elke Schaeffner
- Institute of Public Health, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Kai-Uwe Eckardt
- Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Department of Nephrology and Hypertension, Universitätsklinikum Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Fabian Prasser
- Medical Informatics Group, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
6
Brunette CA, Harris EJ, Antwi AA, Lemke AA, Kerman BJ, Vassy JL. Data from a national survey of United States primary care physicians on genetic risk scores for common disease prevention. Data Brief 2024; 52:109930. [PMID: 38093856 PMCID: PMC10716767 DOI: 10.1016/j.dib.2023.109930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 12/05/2023] [Indexed: 02/01/2024] Open
Abstract
Genetic risk scores (GRS) are an emerging and rapidly evolving genomic medicine innovation that may contribute to more precise risk stratification for disease prevention. Inclusion of GRS in routine medical care is imminent, and understanding how physicians perceive and intend to utilize GRS in practice is an important first step in facilitating uptake. This dataset was derived from an electronic survey and comprises one of the first, largest, and broadest samples of United States primary care physician perceptions on the clinical decision-making, benefits, barriers, and utility of GRS to date. The dataset is nearly complete (<1% missing data) and contains responses from 369 PCPs spanning 58 column variables. The public repository includes minimally filtered, de-identified data, all underlying survey versions and items, a data dictionary, and associated analytic files.
Affiliation(s)
- Charles A. Brunette
- Veterans Affairs Boston Healthcare System, Boston, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Elizabeth J. Harris
- Veterans Affairs Boston Healthcare System, Boston, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Amy A. Lemke
- Norton Children's Research Institute, University of Louisville School of Medicine, Louisville, KY, USA
- Benjamin J. Kerman
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
- Jason L. Vassy
- Veterans Affairs Boston Healthcare System, Boston, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
- Precision Population Health, Ariadne Labs, Boston, MA, USA
7
Ormond KE, Bavamian S, Becherer C, Currat C, Joerger F, Geiger TR, Hiendlmeyer E, Maurer J, Staub T, Vayena E. What are the bottlenecks to health data sharing in Switzerland? An interview study. Swiss Med Wkly 2024; 154:3538. [PMID: 38579329 DOI: 10.57187/s.3538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2024] Open
Abstract
BACKGROUND While health data sharing for research purposes is strongly supported in principle, it can be challenging to implement in practice. Little is known about the actual bottlenecks to health data sharing in Switzerland. AIMS OF THE STUDY This study aimed to assess the obstacles to Swiss health data sharing, including legal, ethical and logistical bottlenecks. METHODS We identified 37 key stakeholders in data sharing via the Swiss Personalised Health Network ecosystem, defined as being an expert on sharing sensitive health data for research purposes at a Swiss university hospital (or a Swiss disease cohort) or being a stakeholder in data sharing at a public or private institution that uses such data. We conducted semi-structured interviews, which were transcribed, translated when necessary, and de-identified. The entire research team discussed the transcripts and notes taken during each interview before an inductive coding process occurred. RESULTS Eleven semi-structured interviews were conducted (primarily in English) with 17 individuals representing lawyers, data protection officers, ethics committee members, scientists, project managers, bioinformaticians, clinical trials unit members, and biobank stakeholders. Most respondents felt that it was not the actual data transfer that was the bottleneck but rather the processes and systems around it, which were considered time-intensive and confusing. The templates developed by the Swiss Personalised Health Network and the Swiss General Consent process were generally felt to have streamlined processes significantly. However, these logistics and data quality issues remain practical bottlenecks in Swiss health data sharing. 
Areas of legal uncertainty include privacy laws when sharing data internationally, questions of "who owns the data", inconsistencies created because the Swiss general consent is perceived as being implemented differently across different institutions, and definitions and operationalisation of anonymisation and pseudo-anonymisation. Many participants desired to create a "culture of data sharing" and to recognise that data sharing is a process with many steps, not an event, that requires sustainability efforts and personnel. Some participants also stressed a desire to move away from data sharing and the current privacy focus towards processes that facilitate data access. CONCLUSIONS Facilitating a data access culture in Switzerland may require legal clarifications, further education about the process and resources to support data sharing, and further investment in sustainable infrastructure by funders and institutions.
Affiliation(s)
- Kelly E Ormond
- D-HEST, Health Ethics and Policy Lab, ETH-Zurich, Zurich, Switzerland
- Claudia Becherer
- Swiss Clinical Trial Organisation, Bern, Switzerland
- Department Clinical Research (DKF), University Basel, University Hospital Basel, Basel, Switzerland
- Francisca Joerger
- Swiss Clinical Trial Organisation, Bern, Switzerland
- Clinical Trials Center, University Hospital Zurich, Zurich, Switzerland
- Thomas R Geiger
- Swiss Personalized Health Network (SPHN), Swiss Academy of Medical Sciences, Bern, Switzerland
- Elke Hiendlmeyer
- Swiss Clinical Trial Organisation, Bern, Switzerland
- Clinical trials unit (CTU), Kantonsspital St. Gallen, St. Gallen, Switzerland
- Julia Maurer
- Personalized Health Informatics Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Timo Staub
- Bern Center for Precision Medicine, University of Bern, Bern, Switzerland
- Effy Vayena
- D-HEST, Health Ethics and Policy Lab, ETH-Zurich, Zurich, Switzerland
8
Monachino G, Zanchi B, Fiorillo L, Conte G, Auricchio A, Tzovara A, Faraci FD. Deep Generative Models: The winning key for large and easily accessible ECG datasets? Comput Biol Med 2023; 167:107655. [PMID: 37976830 DOI: 10.1016/j.compbiomed.2023.107655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 10/04/2023] [Accepted: 10/31/2023] [Indexed: 11/19/2023]
Abstract
Large high-quality datasets are essential for building powerful artificial intelligence (AI) algorithms capable of supporting advancement in cardiac clinical research. However, researchers working with electrocardiogram (ECG) signals struggle to get access and/or to build one. The aim of the present work is to shed light on a potential solution to address the lack of large and easily accessible ECG datasets. Firstly, the main causes of such a lack are identified and examined. Afterward, the potentials and limitations of cardiac data generation via deep generative models (DGMs) are deeply analyzed. These very promising algorithms have been found capable not only of generating large quantities of ECG signals but also of supporting data anonymization processes, to simplify data sharing while respecting patients' privacy. Their application could help research progress and cooperation in the name of open science. However several aspects, such as a standardized synthetic data quality evaluation and algorithm stability, need to be further explored.
Affiliation(s)
- Giuliana Monachino
- Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland; Institute of Informatics, University of Bern, Neubrückstrasse 10, Bern 3012, Switzerland.
- Beatrice Zanchi
- Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland; Department of Quantitative Biomedicine, University of Zurich, Schmelzbergstrasse 26, Zurich 8091, Switzerland
- Luigi Fiorillo
- Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland
- Giulio Conte
- Division of Cardiology, Fondazione Cardiocentro Ticino, Via Tesserete 48, Lugano 6900, Switzerland; Centre for Computational Medicine in Cardiology, Faculty of Informatics, Università della Svizzera Italiana, Via la Santa 1, Lugano 6900, Switzerland
- Angelo Auricchio
- Division of Cardiology, Fondazione Cardiocentro Ticino, Via Tesserete 48, Lugano 6900, Switzerland; Centre for Computational Medicine in Cardiology, Faculty of Informatics, Università della Svizzera Italiana, Via la Santa 1, Lugano 6900, Switzerland
- Athina Tzovara
- Institute of Informatics, University of Bern, Neubrückstrasse 10, Bern 3012, Switzerland; Sleep Wake Epilepsy Center | NeuroTec, Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16, Bern 3010, Switzerland
- Francesca Dalia Faraci
- Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland
9
Hurvitz N, Ilan Y. The Constrained-Disorder Principle Assists in Overcoming Significant Challenges in Digital Health: Moving from "Nice to Have" to Mandatory Systems. Clin Pract 2023; 13:994-1014. [PMID: 37623270 PMCID: PMC10453547 DOI: 10.3390/clinpract13040089] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/19/2023] [Revised: 08/16/2023] [Accepted: 08/18/2023] [Indexed: 08/26/2023] Open
Abstract
The success of artificial intelligence depends on whether it can penetrate the boundaries of evidence-based medicine, the lack of policies, and the resistance of medical professionals to its use. The failure of digital health to meet expectations requires rethinking some of the challenges faced. We discuss some of the most significant challenges faced by patients, physicians, payers, pharmaceutical companies, and health systems in the digital world. The goal of healthcare systems is to improve outcomes. Assisting in diagnosing, collecting data, and simplifying processes is a "nice to have" tool, but it is not essential. Many of these systems have yet to be shown to improve outcomes. Current outcome-based expectations and economic constraints make "nice to have," "assists," and "ease processes" insufficient. Complex biological systems are defined by their inherent disorder, bounded by dynamic boundaries, as described by the constrained disorder principle (CDP). It provides a platform for correcting systems' malfunctions by regulating their degree of variability. A CDP-based second-generation artificial intelligence system provides solutions to some challenges digital health faces. Therapeutic interventions are held to improve outcomes with these systems. In addition to improving clinically meaningful endpoints, CDP-based second-generation algorithms ensure patient and physician engagement and reduce the health system's costs.
Affiliation(s)
- Yaron Ilan
- Hadassah Medical Center, Department of Medicine, Faculty of Medicine, Hebrew University, POB 1200, Jerusalem IL91120, Israel
10
Geva R, Gusev A, Polyakov Y, Liram L, Rosolio O, Alexandru A, Genise N, Blatt M, Duchin Z, Waissengrin B, Mirelman D, Bukstein F, Blumenthal DT, Wolf I, Pelles-Avraham S, Schaffer T, Lavi LA, Micciancio D, Vaikuntanathan V, Badawi AA, Goldwasser S. Collaborative privacy-preserving analysis of oncological data using multiparty homomorphic encryption. Proc Natl Acad Sci U S A 2023; 120:e2304415120. [PMID: 37549296 PMCID: PMC10437415 DOI: 10.1073/pnas.2304415120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Accepted: 06/09/2023] [Indexed: 08/09/2023] Open
Abstract
Real-world healthcare data sharing is instrumental in constructing broader-based and larger clinical datasets that may improve clinical decision-making research and outcomes. Stakeholders are frequently reluctant to share their data without guaranteed patient privacy, proper protection of their datasets, and control over the usage of their data. Fully homomorphic encryption (FHE) is a cryptographic capability that can address these issues by enabling computation on encrypted data without intermediate decryptions, so the analytics results are obtained without revealing the raw data. This work presents a toolset for collaborative privacy-preserving analysis of oncological data using multiparty FHE. Our toolset supports survival analysis, logistic regression training, and several common descriptive statistics. We demonstrate using oncological datasets that the toolset achieves high accuracy and practical performance, which scales well to larger datasets. As part of this work, we propose a cryptographic protocol for interactive bootstrapping in multiparty FHE, which is of independent interest. The toolset we develop is general-purpose and can be applied to other collaborative medical and healthcare application domains.
Affiliation(s)
- Ravit Geva
- Tel Aviv Sorasky Medical Center, Tel Aviv 64239, Israel
| | - Alexander Gusev
- Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215
| | | | - Lior Liram
- Duality Technologies, Inc., Hoboken, NJ 07103
| | | | | | | | | | | | | | - Dan Mirelman
- Tel Aviv Sorasky Medical Center, Tel Aviv 64239, Israel
| | | | | | - Ido Wolf
- Tel Aviv Sorasky Medical Center, Tel Aviv 64239, Israel
| | | | - Tali Schaffer
- Tel Aviv Sorasky Medical Center, Tel Aviv 64239, Israel
| | - Lee A. Lavi
- Tel Aviv Sorasky Medical Center, Tel Aviv 64239, Israel
| | - Daniele Micciancio
- Duality Technologies, Inc., Hoboken, NJ 07103
- University of California, San Diego, CA 92093
| | - Vinod Vaikuntanathan
- Duality Technologies, Inc., Hoboken, NJ 07103
- Massachusetts Institute of Technology, Cambridge, MA 02139
| | | | - Shafi Goldwasser
- Duality Technologies, Inc., Hoboken, NJ 07103
- Simons Institute for the Theory of Computing, University of California, Berkeley, CA 94720
| |
|
11
|
Marschik PB, Kulvicius T, Flügge S, Widmann C, Nielsen-Saines K, Schulte-Rüther M, Hüning B, Bölte S, Poustka L, Sigafoos J, Wörgötter F, Einspieler C, Zhang D. Open video data sharing in developmental science and clinical practice. iScience 2023; 26:106348. [PMID: 36994082 PMCID: PMC10040728 DOI: 10.1016/j.isci.2023.106348] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/12/2022] [Revised: 12/19/2022] [Accepted: 03/02/2023] [Indexed: 03/08/2023] Open
Abstract
In behavioral research and clinical practice, video data has rarely been shared or pooled across sites due to ethical concerns about confidentiality, although the need for shared large-scale datasets keeps increasing. This demand is even more imperative when data-heavy computer-based approaches are involved. To share data while abiding by privacy protection rules, a critical question arises: do efforts at data de-identification reduce data utility? We addressed this question by showcasing an established video-based diagnostic tool for detecting neurological deficits. We demonstrated for the first time that, for analyzing infant neuromotor functions, pseudonymization by face-blurring video recordings is a viable approach. The redaction did not affect classification accuracy for either human assessors or artificial intelligence methods, suggesting an adequate and easy-to-apply solution for sharing behavioral video data. Our work shall encourage more innovative solutions to share and merge stand-alone video datasets into large data pools to advance science and public health.
Affiliation(s)
- Peter B. Marschik
- Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany
- Center of Neurodevelopmental Disorders (KIND), Centre for Psychiatry Research; Department of Women’s and Children’s Health, Karolinska Institutet, 11330 Stockholm, Sweden
- iDN – interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, 8036 Graz, Austria
- Leibniz-ScienceCampus Primate Cognition, 37075 Göttingen, Germany
| | - Tomas Kulvicius
- Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany
- Department for Computational Neuroscience, Third Institute of Physics-Biophysics, Georg-August-University of Göttingen, 37077 Göttingen, Germany
| | - Sarah Flügge
- Department for Computational Neuroscience, Third Institute of Physics-Biophysics, Georg-August-University of Göttingen, 37077 Göttingen, Germany
| | - Claudius Widmann
- Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany
| | - Karin Nielsen-Saines
- Division of Pediatric Infectious Diseases, David Geffen UCLA School of Medicine, Los Angeles, CA 90095, USA
| | - Martin Schulte-Rüther
- Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany
- Leibniz-ScienceCampus Primate Cognition, 37075 Göttingen, Germany
| | - Britta Hüning
- Department of Pediatrics I, Neonatology, University Children’s Hospital Essen, University Duisburg-Essen, 45147 Essen, Germany
| | - Sven Bölte
- Center of Neurodevelopmental Disorders (KIND), Centre for Psychiatry Research; Department of Women’s and Children’s Health, Karolinska Institutet, 11330 Stockholm, Sweden
- Child and Adolescent Psychiatry, Stockholm Health Care Services, Region Stockholm, 11861 Stockholm, Sweden
- Curtin Autism Research Group, Curtin School of Allied Health, Curtin University, 6102 Perth, WA
| | - Luise Poustka
- Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany
- Leibniz-ScienceCampus Primate Cognition, 37075 Göttingen, Germany
| | - Jeff Sigafoos
- School of Education, Victoria University of Wellington, 6012 Wellington, New Zealand
| | - Florentin Wörgötter
- Leibniz-ScienceCampus Primate Cognition, 37075 Göttingen, Germany
- Department for Computational Neuroscience, Third Institute of Physics-Biophysics, Georg-August-University of Göttingen, 37077 Göttingen, Germany
| | - Christa Einspieler
- iDN – interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, 8036 Graz, Austria
| | - Dajie Zhang
- Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, 37075 Göttingen, Germany
- iDN – interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, 8036 Graz, Austria
- Leibniz-ScienceCampus Primate Cognition, 37075 Göttingen, Germany
| |
|
12
|
Clunie DA, Flanders A, Taylor A, Erickson B, Bialecki B, Brundage D, Gutman D, Prior F, Seibert JA, Perry J, Gichoya JW, Kirby J, Andriole K, Geneslaw L, Moore S, Fitzgerald TJ, Tellis W, Xiao Y, Farahani K, Luo J, Rosenthal A, Kandarpa K, Rosen R, Goetz K, Babcock D, Xu B, Hsiao J. Report of the Medical Image De-Identification (MIDI) Task Group - Best Practices and Recommendations. ARXIV 2023:arXiv:2303.10473v2. [PMID: 37033463 PMCID: PMC10081345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2023]
Affiliation(s)
| | | | | | | | | | | | | | - Fred Prior
- University of Arkansas for Medical Sciences
| | | | | | | | - Justin Kirby
- Frederick National Laboratory for Cancer Research
| | | | | | | | | | | | - Ying Xiao
- University of Pennsylvania Health System
| | | | - James Luo
- National Heart, Lung, and Blood Institute (NHLBI)
| | - Alex Rosenthal
- National Institute of Allergy and Infectious Diseases (NIAID)
| | - Kris Kandarpa
- National Institute of Biomedical Imaging and Bioengineering (NIBIB)
| | - Rebecca Rosen
- Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD)
| | | | - Debra Babcock
- National Institute of Neurological Disorders and Stroke (NINDS)
| | - Ben Xu
- National Institute on Alcohol Abuse and Alcoholism (NIAAA)
| | | |
|
13
|
Lathe R. Restricted access data in the neurosciences: Are the restrictions always justified? Front Neurosci 2023; 16:975795. [PMID: 36760799 PMCID: PMC9904205 DOI: 10.3389/fnins.2022.975795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 11/25/2022] [Indexed: 01/26/2023] Open
|
14
|
Scheibner J, Ienca M, Vayena E. Health data privacy through homomorphic encryption and distributed ledger computing: an ethical-legal qualitative expert assessment study. BMC Med Ethics 2022; 23:121. [PMID: 36451210 PMCID: PMC9713155 DOI: 10.1186/s12910-022-00852-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Accepted: 10/28/2022] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND Increasingly, hospitals and research institutes are developing technical solutions for sharing patient data in a privacy preserving manner. Two of these technical solutions are homomorphic encryption and distributed ledger technology. Homomorphic encryption allows computations to be performed on data without this data ever being decrypted. Therefore, homomorphic encryption represents a potential solution for conducting feasibility studies on cohorts of sensitive patient data stored in distributed locations. Distributed ledger technology provides a permanent record on all transfers and processing of patient data, allowing data custodians to audit access. A significant portion of the current literature has examined how these technologies might comply with data protection and research ethics frameworks. In the Swiss context, these instruments include the Federal Act on Data Protection and the Human Research Act. There are also institutional frameworks that govern the processing of health related and genetic data at different universities and hospitals. Given Switzerland's geographical proximity to European Union (EU) member states, the General Data Protection Regulation (GDPR) may impose additional obligations. METHODS To conduct this assessment, we carried out a series of qualitative interviews with key stakeholders at Swiss hospitals and research institutions. These included legal and clinical data management staff, as well as clinical and research ethics experts. These interviews were carried out with two series of vignettes that focused on data discovery using homomorphic encryption and data erasure from a distributed ledger platform. RESULTS For our first set of vignettes, interviewees were prepared to allow data discovery requests if patients had provided general consent or ethics committee approval, depending on the types of data made available. Our interviewees highlighted the importance of protecting against the risk of reidentification given different types of data. For our second set, there was disagreement amongst interviewees on whether they would delete patient data locally, or delete data linked to a ledger with cryptographic hashes. Our interviewees were also willing to delete data locally or on the ledger, subject to local legislation. CONCLUSION Our findings can help guide the deployment of these technologies, as well as determine ethics and legal requirements for such technologies.
Affiliation(s)
- James Scheibner
- Health Ethics and Policy Laboratory, Department of Health Sciences and Technology (D-HEST), ETH Zürich, Zurich, Switzerland; College of Business, Government and Law, Flinders University, Adelaide, Australia
| | - Marcello Ienca
- Health Ethics and Policy Laboratory, Department of Health Sciences and Technology (D-HEST), ETH Zürich, Zurich, Switzerland; College of Humanities, EPFL, Lausanne, Switzerland
| | - Effy Vayena
- Health Ethics and Policy Laboratory, Department of Health Sciences and Technology (D-HEST), ETH Zürich, Zurich, Switzerland; Department of Health Sciences and Technology, ETH Zürich, Zurich, Switzerland
| |
|
15
|
Samlali K, Thornbury M, Venter A. Community-led risk analysis of direct-to-consumer whole-genome sequencing. Biochem Cell Biol 2022; 100:499-509. [PMID: 35939839 DOI: 10.1139/bcb-2021-0506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022] Open
Abstract
Direct-to-consumer (DTC) genetic testing is cheaper and more accessible than ever before; however, the intention to combine, reuse, and resell this genetic information as powerful data sets is generally hidden from the consumer. This financial gain is creating a competitive DTC market, reducing the price of whole-genome sequencing (WGS) to under 300 USD. Entering this transition from single-nucleotide polymorphism-based DTC testing to WGS DTC testing, individuals looking for access to their whole-genomic information face new privacy and security risks. Differences between WGS and other methods of consumer genetic tests are left unexplored by regulation, leading to the application of legal data anonymization methods on whole-genome data, and questionable consent methods. Large representative genomic data sets are important for research and improve the standard of medicine and personalized care. However, these data can also be used by market players, law enforcement, and governments for surveillance, population analyses, marketing purposes, and discrimination. Here, we present a summary of the state of WGS DTC genetic testing and its current regulation, through a community-based lens to expose dual-use risks in consumer-facing biotechnologies.
Affiliation(s)
- Kenza Samlali
- BricoBio Community Biology Lab, Montréal, QC, Canada; Centre for Applied Synthetic Biology, Concordia University, Montréal, QC, Canada; Department of Electrical and Computer Engineering, Concordia University, Montréal, QC, Canada
| | - Mackenzie Thornbury
- BricoBio Community Biology Lab, Montréal, QC, Canada; Centre for Applied Synthetic Biology, Concordia University, Montréal, QC, Canada; Department of Biology, Concordia University, Montréal, QC, Canada
| | - Andrei Venter
- BricoBio Community Biology Lab, Montréal, QC, Canada
| |
|
16
|
Wahid KA, Glerean E, Sahlsten J, Jaskari J, Kaski K, Naser MA, He R, Mohamed ASR, Fuller CD. Artificial Intelligence for Radiation Oncology Applications Using Public Datasets. Semin Radiat Oncol 2022; 32:400-414. [PMID: 36202442 PMCID: PMC9587532 DOI: 10.1016/j.semradonc.2022.06.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 11/05/2022]
Abstract
Artificial intelligence (AI) has exceptional potential to positively impact the field of radiation oncology. However, large curated datasets - often involving imaging data and corresponding annotations - are required to develop radiation oncology AI models. Importantly, the recent establishment of Findable, Accessible, Interoperable, Reusable (FAIR) principles for scientific data management have enabled an increasing number of radiation oncology related datasets to be disseminated through data repositories, thereby acting as a rich source of data for AI model building. This manuscript reviews the current and future state of radiation oncology data dissemination, with a particular emphasis on published imaging datasets, AI data challenges, and associated infrastructure. Moreover, we provide historical context of FAIR data dissemination protocols, difficulties in the current distribution of radiation oncology data, and recommendations regarding data dissemination for eventual utilization in AI models. Through FAIR principles and standardized approaches to data dissemination, radiation oncology AI research has nothing to lose and everything to gain.
Affiliation(s)
- Kareem A Wahid
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Enrico Glerean
- Department of Neuroscience and Biomedical Engineering, Aalto University School of Science, Espoo, Finland; Department of Computer Science, Aalto University School of Science, Espoo, Finland
| | - Jaakko Sahlsten
- Department of Computer Science, Aalto University School of Science, Espoo, Finland
| | - Joel Jaskari
- Department of Computer Science, Aalto University School of Science, Espoo, Finland
| | - Kimmo Kaski
- Department of Computer Science, Aalto University School of Science, Espoo, Finland
| | - Mohamed A Naser
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Renjie He
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Abdallah S R Mohamed
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Clifton D Fuller
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA.
| |
|
17
|
Roguljić M, Šimunović D, Poklepović Peričić T, Viđak M, Utrobičić A, Marušić M, Marušić A. Publishing Identifiable Patient Photographs in Scientific Journals: Scoping Review of Policies and Practices. J Med Internet Res 2022; 24:e37594. [PMID: 36044262 PMCID: PMC9475410 DOI: 10.2196/37594] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/26/2022] [Revised: 05/27/2022] [Accepted: 06/23/2022] [Indexed: 11/20/2022] Open
Abstract
Background: Publishing identifiable patient data in scientific journals may jeopardize patient privacy and confidentiality if best ethical practices are not followed. Current journal practices show considerable diversity in the publication of identifiable patient photographs, and different stakeholders may have different opinions of and practices in publishing patient photographs. Objective: This scoping review aimed to identify existing evidence and map knowledge gaps in medical research on the policies and practices of publishing identifiable photographs in scientific articles. Methods: We performed a comprehensive search of the Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, CINAHL with Full Text, Database of Abstracts of Reviews of Effects, Ovid MEDLINE, and Scopus. The Open Science Framework, PROSPERO, BASE, Google Scholar, OpenGrey, ClinicalTrials.gov, the Campbell Collaboration Library, and Science.gov were also searched. Results: After screening the initial 15,949 titles and abstracts, 98 (0.61%) publications were assessed for eligibility at the full-text level, and 30 (0.19%) publications were included in this review. The studies were published between 1994 and 2020; most had a cross-sectional design and were published in journals covering different medical disciplines. We identified 3 main topics. The first included ethical aspects of the use of facial photographs in publications. In different clinical settings, the consent process was not conducted properly, and health professionals did not recognize the importance of obtaining written patient consent for taking and using patient medical photographs. They often considered verbal consent sufficient or even used the photographs without consent. The second topic included studies that investigated the practices and use of medical photography in publishing. Both patients and doctors asked for confidential storage and maintenance of medical photographs. Patients preferred to be photographed by their physicians using an institutional camera and preferred nonidentifiable medical photographs not only for publication but also in general. Conventional methods of deidentification of facial photographs concealing the eye area were recognized as unsuccessful in protecting patient privacy. The third topic emerged from studies investigating medical photography in journal articles. These studies showed great diversity in publishing practices regarding consent for publication of medical photographs. Journal policies regarding the consent process and consent forms were insufficient, and existing ethical professional guidelines were not fully implemented in actual practices. Patients’ photographs from open-access medical journals were found on public web-based platforms. Conclusions: This scoping review showed a diversity of practices in publishing identifiable patient photographs and an unsatisfactory level of knowledge of this issue among different stakeholders despite existing standards. Emerging issues include the availability of patients’ photographs from open-access journals or preprints in the digital environment. There is a need to improve standards and processes to obtain proper consent to fully protect the privacy of patients in published articles.
Affiliation(s)
- Marija Roguljić
- Department of Oral Medicine and Periodontology, University of Split School of Medicine, Split, Croatia
| | | | - Tina Poklepović Peričić
- Department of Prosthodontics, Study of Dental Medicine, School of Medicine, University of Split Library, Split, Croatia; Department of Research in Biomedicine and Health, Center for Evidence-based Medicine, University of Split School of Medicine, Split, Croatia
| | - Marin Viđak
- Department of Research in Biomedicine and Health, Center for Evidence-based Medicine, University of Split School of Medicine, Split, Croatia
| | | | - Matko Marušić
- Department of Research in Biomedicine and Health, Center for Evidence-based Medicine, University of Split School of Medicine, Split, Croatia
| | - Ana Marušić
- Department of Research in Biomedicine and Health, Center for Evidence-based Medicine, University of Split School of Medicine, Split, Croatia
| |
|
18
|
Mawji A, Longstaff H, Trawin J, Dunsmuir D, Komugisha C, Novakowski SK, Wiens MO, Akech S, Tagoola A, Kissoon N, Ansermino JM. A proposed de-identification framework for a cohort of children presenting at a health facility in Uganda. PLOS DIGITAL HEALTH 2022; 1:e0000027. [PMID: 36812586 PMCID: PMC9931294 DOI: 10.1371/journal.pdig.0000027] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/24/2022] [Accepted: 07/08/2022] [Indexed: 11/18/2022]
Abstract
Data sharing has enormous potential to accelerate and improve the accuracy of research, strengthen collaborations, and restore trust in the clinical research enterprise. Nevertheless, there remains reluctance to openly share raw data sets, in part due to concerns regarding research participant confidentiality and privacy. Statistical data de-identification is an approach that can be used to preserve privacy and facilitate open data sharing. We have proposed a standardized framework for the de-identification of data generated from cohort studies in children in a low- and middle-income country. We applied a standardized de-identification framework to a data set comprising 241 health-related variables collected from a cohort of 1750 children with acute infections from Jinja Regional Referral Hospital in Eastern Uganda. Variables were labeled as direct and quasi-identifiers based on conditions of replicability, distinguishability, and knowability with consensus from two independent evaluators. Direct identifiers were removed from the data set, while a statistical risk-based de-identification approach using the k-anonymity model was applied to quasi-identifiers. Qualitative assessment of the level of privacy invasion associated with data set disclosure was used to determine an acceptable re-identification risk threshold, and corresponding k-anonymity requirement. A de-identification model using generalization followed by suppression was applied using a logical stepwise approach to achieve k-anonymity. The utility of the de-identified data was demonstrated using a typical clinical regression example. The de-identified data set was published on the Pediatric Sepsis Data CoLaboratory Dataverse, which provides moderated data access. Researchers are faced with many challenges when providing access to clinical data. We provide a standardized de-identification framework that can be adapted and refined based on specific context and risks. This process will be combined with moderated access to foster coordination and collaboration in the clinical research community.
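The three steps named in this abstract (remove direct identifiers, generalize quasi-identifiers, suppress what still falls short of k) can be sketched as follows. This is a minimal illustration and not the authors' framework: the field names, the 6-month age bands, the sample records, and the choice of k = 2 are all hypothetical.

```python
from collections import Counter

def generalize_age(age_months, width=6):
    """Replace an exact age with a band such as '0-5' (generalization)."""
    lo = (age_months // width) * width
    return f"{lo}-{lo + width - 1}"

def deidentify(records, direct_ids, quasi_ids, k, age_field="age_months"):
    """Return (kept_rows, number_suppressed) satisfying k-anonymity on quasi_ids."""
    cleaned = []
    for rec in records:
        # Step 1: drop direct identifiers outright.
        row = {f: v for f, v in rec.items() if f not in direct_ids}
        # Step 2: generalize the age quasi-identifier into bands.
        row[age_field] = generalize_age(row[age_field])
        cleaned.append(row)

    def key(row):
        return tuple(row[q] for q in quasi_ids)

    # Step 3: suppress rows whose quasi-identifier combination occurs < k times.
    sizes = Counter(key(r) for r in cleaned)
    kept = [r for r in cleaned if sizes[key(r)] >= k]
    return kept, len(cleaned) - len(kept)

# Hypothetical records; none of these values come from the study.
records = [
    {"name": "A", "age_months": 3,  "sex": "F", "district": "D1"},
    {"name": "B", "age_months": 4,  "sex": "F", "district": "D1"},
    {"name": "C", "age_months": 7,  "sex": "M", "district": "D1"},
    {"name": "D", "age_months": 8,  "sex": "M", "district": "D1"},
    {"name": "E", "age_months": 14, "sex": "F", "district": "D2"},
]
kept, suppressed = deidentify(
    records,
    direct_ids={"name"},
    quasi_ids=["age_months", "sex", "district"],
    k=2,
)
print(len(kept), suppressed)  # 4 1
```

Wider bands lower the number of suppressed rows at the cost of precision, which is the trade-off between privacy and utility that the abstract evaluates with a regression example.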
Affiliation(s)
- Alishah Mawji
- Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, British Columbia, Canada
- Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
| | - Holly Longstaff
- Privacy and Access, PHSA Research and New Initiatives, Provincial Health Services Authority, Vancouver, British Columbia, Canada
| | - Jessica Trawin
- Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
| | - Dustin Dunsmuir
- Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, British Columbia, Canada
- Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
| | | | - Stefanie K. Novakowski
- Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, British Columbia, Canada
| | - Matthew O. Wiens
- Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
- WALIMU, Kololo, Kampala, Uganda
- Mbarara University of Science and Technology, Mbarara, Uganda
| | - Samuel Akech
- Kenya Medical Research Institute/Wellcome Trust Research Programme, Nairobi, Kenya
| | - Abner Tagoola
- Department of Pediatrics, Jinja Regional Referral Hospital, Rotary Rd, Jinja, Uganda
| | - Niranjan Kissoon
- Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
- Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada
| | - J. Mark Ansermino
- Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, British Columbia, Canada
- Centre for International Child Health, BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
| |
|
19
|
Dammen LV, Finseth TT, McCurdy BH, Barnett NP, Conrady RA, Leach AG, Deick AF, Van Steenis AL, Gardner R, Smith BL, Kay A, Shirtcliff EA. Evoking stress reactivity in virtual reality: A systematic review and meta-analysis. Neurosci Biobehav Rev 2022; 138:104709. [PMID: 35644278 DOI: 10.1016/j.neubiorev.2022.104709] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/11/2021] [Revised: 04/08/2022] [Accepted: 05/21/2022] [Indexed: 01/04/2023]
Abstract
BACKGROUND Virtual reality (VR) research probes stress environments that are infeasible to create in the real world. However, because research simulations are applied to narrow populations, it remains unclear if VR simulations can stimulate a broadly applicable stress-response. This systematic review and meta-analysis was conducted on studies using VR stress tasks and biomarkers. METHODS Included papers (N = 52) measured cortisol, heart rate (HR), galvanic skin response (GSR), systolic blood pressure (SBP), diastolic blood pressure (DBP), respiratory sinus arrhythmia (RSA), parasympathetic activity (RMSSD), sympathovagal balance (LF/HF), and/or salivary alpha-amylase (sAA). Effect sizes (ES) and confidence intervals (CI) were calculated based on standardized mean change of baseline-to-peak biomarker levels. RESULTS From baseline-to-peak (ES, CI), analyses showed a statistically significant change in cortisol (0.56, 0.28-0.83), HR (0.68, 0.53-0.82), GSR (0.59, 0.36-0.82), SBP (.55, 0.19-0.90), DBP (.64, 0.23-1.05), RSA (-0.59, -0.88 to -0.30), and sAA (0.27, 0.092-0.45). There was no effect for RMSSD and LF/HF. CONCLUSION VR stress tasks elicited a varied magnitude of physiological stress reactivity. VR may be an effective tool in stress research.
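A standardized mean change of the kind mentioned in the METHODS can be computed as below. This is a sketch under stated assumptions: the abstract does not give the exact estimator, so the code uses the common raw-score form (change divided by the baseline standard deviation) with a large-sample variance approximation, no small-sample bias correction, and an assumed pre-post correlation r = 0.5. The input numbers are invented.

```python
import math

def standardized_mean_change(mean_base, mean_peak, sd_base, n, r=0.5):
    """Return (effect size, (ci_low, ci_high)) for a baseline-to-peak change.

    r is the assumed correlation between baseline and peak measurements;
    it is rarely reported, so 0.5 here is only a placeholder assumption.
    """
    d = (mean_peak - mean_base) / sd_base
    # Large-sample variance of the raw-score standardized mean change.
    var = 2 * (1 - r) / n + d ** 2 / (2 * n)
    half_width = 1.96 * math.sqrt(var)  # 95% confidence interval
    return d, (d - half_width, d + half_width)

# Hypothetical cortisol values for one study with 25 participants.
es, (ci_low, ci_high) = standardized_mean_change(
    mean_base=10.0, mean_peak=13.0, sd_base=5.0, n=25
)
print(round(es, 2), round(ci_low, 2), round(ci_high, 2))
```

A confidence interval that excludes zero corresponds to the "statistically significant change" wording used in the RESULTS; pooling such per-study values across papers would additionally require a meta-analytic model, which this sketch does not cover.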
Affiliation(s)
- Lotte van Dammen
- Iowa State University, Virtual Reality Applications Center, Ames, IA, USA
| | - Tor T Finseth
- Iowa State University, Virtual Reality Applications Center, Ames, IA, USA.
| | - Bethany H McCurdy
- Iowa State University, Virtual Reality Applications Center, Ames, IA, USA
| | - Neil P Barnett
- Iowa State University, Virtual Reality Applications Center, Ames, IA, USA
| | - Roselynn A Conrady
- Iowa State University, Virtual Reality Applications Center, Ames, IA, USA
| | - Alexis G Leach
- Iowa State University, Virtual Reality Applications Center, Ames, IA, USA
| | - Andrew F Deick
- Iowa State University, Virtual Reality Applications Center, Ames, IA, USA
| | | | - Reece Gardner
- Iowa State University, Virtual Reality Applications Center, Ames, IA, USA
| | - Brandon L Smith
- Iowa State University, Virtual Reality Applications Center, Ames, IA, USA
| | - Anita Kay
- Iowa State University, Virtual Reality Applications Center, Ames, IA, USA
| | | |
|
20
|
Rodriguez A, Tuck C, Dozier MF, Lewis SC, Eldridge S, Jackson T, Murray A, Weir CJ. Current recommendations/practices for anonymising data from clinical trials in order to make it available for sharing: A scoping review. Clin Trials 2022; 19:452-463. [PMID: 35730910 PMCID: PMC9373195 DOI: 10.1177/17407745221087469] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/20/2022]
Abstract
Background/Aims: There are increasing pressures for anonymised datasets from clinical trials to be shared across the scientific community, and differing recommendations exist on how to perform anonymisation prior to sharing. We aimed to systematically identify, describe and synthesise existing recommendations for anonymising clinical trial datasets to prepare for data sharing. Methods: We systematically searched MEDLINE®, EMBASE and Web of Science from inception to 8 February 2021. We also searched other resources to ensure the comprehensiveness of our search. Any publication reporting recommendations on anonymisation to enable data sharing from clinical trials was included. Two reviewers independently screened titles, abstracts and full text for eligibility. One reviewer extracted data from included papers using thematic synthesis, which then was sense-checked by a second reviewer. Results were summarised by narrative analysis. Results: Fifty-nine articles (from 43 studies) were eligible for inclusion. Three distinct themes are emerging: anonymisation, de-identification and pseudonymisation. The most commonly used anonymisation techniques are: removal of direct patient identifiers; and careful evaluation and modification of indirect identifiers to minimise the risk of identification. Anonymised datasets joined with controlled access was the preferred method for data sharing. Conclusions: There is no single standardised set of recommendations on how to anonymise clinical trial datasets for sharing. However, this systematic review shows a developing consensus on techniques used to achieve anonymisation. Researchers in clinical trials still consider that anonymisation techniques by themselves are insufficient to protect patient privacy, and they need to be paired with controlled access.
Affiliation(s)
- Aryelly Rodriguez
- Edinburgh Clinical Trials Unit, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, UK
| | - Christopher Tuck
- Centre for Cardiovascular Science, The University of Edinburgh, Edinburgh, UK
| | - Marshall F Dozier
- Library & University Collections, Information Services, The University of Edinburgh, Edinburgh, UK
| | - Stephanie C Lewis
- Edinburgh Clinical Trials Unit, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, UK
| | - Sandra Eldridge
- Pragmatic Clinical Trials Unit, Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | - Tracy Jackson
- Asthma UK Centre for Applied Research, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, UK
| | | | - Christopher J Weir
- Edinburgh Clinical Trials Unit, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, UK
|
21
|
Auwerx C, Sadler MC, Reymond A, Kutalik Z. From pharmacogenetics to pharmaco-omics: Milestones and future directions. HGG ADVANCES 2022; 3:100100. [PMID: 35373152 PMCID: PMC8971318 DOI: 10.1016/j.xhgg.2022.100100] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 11/04/2022] Open
Abstract
The origins of pharmacogenetics date back to the 1950s, when it was established that inter-individual differences in drug response are partially determined by genetic factors. Since then, pharmacogenetics has grown into its own field, motivated by the translation of identified gene-drug interactions into therapeutic applications. Despite numerous challenges ahead, our understanding of the human pharmacogenetic landscape has greatly improved thanks to the integration of tools originating from disciplines as diverse as biochemistry, molecular biology, statistics, and computer sciences. In this review, we discuss past, present, and future developments of pharmacogenetics methodology, focusing on three milestones: how early research established the genetic basis of drug responses, how technological progress made it possible to assess the full extent of pharmacological variants, and how multi-dimensional omics datasets can improve the identification, functional validation, and mechanistic understanding of the interplay between genes and drugs. We outline novel strategies to repurpose and integrate molecular and clinical data originating from biobanks to gain insights analogous to those obtained from randomized controlled trials. Emphasizing the importance of increased diversity, we envision future directions for the field that should pave the way to the clinical implementation of pharmacogenetics.
Affiliation(s)
- Chiara Auwerx
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- University Center for Primary Care and Public Health, Lausanne, Switzerland
| | - Marie C. Sadler
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- University Center for Primary Care and Public Health, Lausanne, Switzerland
| | - Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
| | - Zoltán Kutalik
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- University Center for Primary Care and Public Health, Lausanne, Switzerland
|
22
|
Sampurno F, Kowalski C, Connor SE, Nguyen AV, Acuña ÀP, Ng CF, Foster C, Feick G, Boronat OG, Dieng S, Brglevska S, Ferrante S, Leung S, Villanti P, Moore CM, Graham ID, Millar JL, Litwin MS, Papa N. Knowledge and insights from a maturing international clinical quality registry. J Am Med Inform Assoc 2022; 29:964-969. [PMID: 35048976 PMCID: PMC9006702 DOI: 10.1093/jamia/ocab281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2021] [Revised: 11/17/2021] [Accepted: 12/06/2021] [Indexed: 01/22/2023] Open
Abstract
Since 2017, the TrueNTH Global Registry (TNGR) has aimed to drive improvement in patient outcomes for individuals with localized prostate cancer by collating data from healthcare institutions across 13 countries. As TNGR matures, a systematic evaluation of existing processes and documents is necessary to determine whether the registry is operating as intended. The main supporting documents, the protocol and the data dictionary, were comprehensively reviewed in a series of meetings over a 10-month period by an international working group. In parallel, individual consultations with local institutions regarding a benchmarking quality-of-care report were conducted. Four consensus areas for improvement emerged: updating operational definitions, appraisal of the recruitment process, refinement of data elements, and improvement of data quality and reporting. Recommendations presented were drawn from our collective experience and accumulated knowledge in operating an international registry. These can be readily generalized to other health-related reporting programs beyond clinical registries.
Affiliation(s)
- Fanny Sampurno
- Corresponding Author: Fanny Sampurno, BA, BSc (Hons), School of Public Health and Preventive Medicine, Monash University, 553 St Kilda Road, Melbourne, Victoria 3004, Australia
| | | | - Sarah E Connor
- Department of Urology, David Geffen School of Medicine, University of California, Los Angeles, California, USA
| | - Anissa V Nguyen
- Department of Urology, David Geffen School of Medicine, University of California, Los Angeles, California, USA
| | - Àngels Pont Acuña
- Health Services Research Group, IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain
| | - Chi-Fai Ng
- SH Ho Urology Centre, The Chinese University of Hong Kong, Hong Kong, China
| | - Claire Foster
- School of Health Sciences, University of Southampton, Southampton, UK
| | - Günter Feick
- Patient Support Association Bundesverband Prostatakrebs Selbsthilfe, Bonn, Germany
| | - Olatz Garin Boronat
- Health Services Research Group, IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain
| | | | | | - Stephanie Ferrante
- Department of Urology, University of Michigan (on behalf of MUSIC), Ann Arbor, Michigan, USA
| | - Steven Leung
- SH Ho Urology Centre, The Chinese University of Hong Kong, Hong Kong, China
| | | | - Caroline M Moore
- Department of Urology, Division of Surgical and Interventional Science, University College London, London, UK
| | - Ian D Graham
- School of Epidemiology and Public Health, University of Ottawa, Ottawa, Ontario, Canada
| | - Jeremy L Millar
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
|
23
|
Shin SY, Kim HS. Data Pseudonymization in a Range That Does Not Affect Data Quality: Correlation with the Degree of Participation of Clinicians. J Korean Med Sci 2021; 36:e299. [PMID: 34783216 PMCID: PMC8593412 DOI: 10.3346/jkms.2021.36.e299] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/04/2021] [Accepted: 10/18/2021] [Indexed: 12/28/2022] Open
Abstract
Personal medical information is an essential resource for research; however, there are laws that regulate its use, and it typically has to be pseudonymized or anonymized. When data are anonymized, the quantity and quality of extractable information decrease significantly. From the perspective of a clinical researcher, a method of achieving pseudonymized data without degrading data quality while also preventing data loss is proposed herein. As the level of pseudonymization varies according to the research purpose, the pseudonymization method applied should be carefully chosen. Therefore, the active participation of clinicians is crucial to transform the data according to the research purpose. This can contribute to data security by simply transforming the data through secondary data processing. Case studies demonstrated that, compared with the initial baseline data, there was a clinically significant difference in the number of datapoints added with the participation of a clinician (from 267,979 to 280,127 points, P < 0.001). Thus, depending on the degree of clinician participation, data anonymization may not affect data quality and quantity, and proper data quality management along with data security are emphasized. Although the pseudonymization level and clinical use of data have a trade-off relationship, it is possible to create pseudonymized data while maintaining the data quality required for a given research purpose. Therefore, rather than relying solely on security guidelines, the active participation of clinicians is important.
Affiliation(s)
- Soo-Yong Shin
- Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Seoul, Korea
- Center for Research Resource Standardization, Samsung Medical Center, Seoul, Korea
| | - Hun-Sung Kim
- Department of Medical Informatics, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Division of Endocrinology and Metabolism, Department of Internal Medicine, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.
|
24
|
Sung M, Cha D, Park YR. Local Differential Privacy in the Medical Domain to Protect Sensitive Information: Algorithm Development and Real-World Validation. JMIR Med Inform 2021; 9:e26914. [PMID: 34747711 PMCID: PMC8663640 DOI: 10.2196/26914] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/04/2021] [Revised: 02/10/2021] [Accepted: 09/06/2021] [Indexed: 01/25/2023] Open
Abstract
Background: Privacy is of increasing interest in the present big data era, particularly the privacy of medical data. Specifically, differential privacy has emerged as the standard method for preservation of privacy during data analysis and publishing.
Objective: Using machine learning techniques, we applied differential privacy to medical data with diverse parameters and checked the feasibility of our algorithms with synthetic data as well as the balance between data privacy and utility.
Methods: All data were normalized to a range between –1 and 1, and the bounded Laplacian method was applied to prevent the generation of out-of-bound values after applying the differential privacy algorithm. To preserve the cardinality of the categorical variables, we performed postprocessing via discretization. The algorithm was evaluated using both synthetic and real-world data (from the eICU Collaborative Research Database). We evaluated the difference between the original data and the perturbed data using misclassification rates and the mean squared error for categorical data and continuous data, respectively. Further, we compared the performance of classification models that predict in-hospital mortality using real-world data.
Results: The misclassification rate of categorical variables ranged between 0.49 and 0.85 when the value of ε was 0.1, and it converged to 0 as ε increased. When ε was between 10^2 and 10^3, the misclassification rate rapidly dropped to 0. Similarly, the mean squared error of the continuous variables decreased as ε increased. The performance of the model developed from perturbed data converged to that of the model developed from original data as ε increased. In particular, the accuracy of a random forest model developed from the original data was 0.801, and this value ranged from 0.757 to 0.81 when ε was 10^-1 and 10^4, respectively.
Conclusions: We applied local differential privacy to medical domain data, which are diverse and high dimensional. Higher noise may offer enhanced privacy, but it simultaneously hinders utility. We should choose an appropriate degree of noise for data perturbation to balance privacy and utility depending on specific situations.
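The continuous-variable pipeline described in this abstract (normalize to [–1, 1], add Laplace noise, keep the result in bounds) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the bound is enforced, so the sketch assumes simple rejection sampling, takes the width of the bounded range as the sensitivity, and uses a hypothetical heart-rate value. Note also that redrawing out-of-bound samples may require adjusting the noise scale to retain a strict ε guarantee, which this sketch does not attempt.

```python
import random


def normalize(value, lo, hi):
    """Linearly rescale value from [lo, hi] to [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def bounded_laplace(x, epsilon, rng, lower=-1.0, upper=1.0, max_tries=100_000):
    """Perturb x, redrawing until the result stays inside [lower, upper]."""
    scale = (upper - lower) / epsilon  # assumed sensitivity: width of the range
    for _ in range(max_tries):
        y = x + laplace_noise(scale, rng)
        if lower <= y <= upper:
            return y
    raise RuntimeError("no in-bounds sample drawn; check epsilon and bounds")


def mean_squared_error(x, epsilon, rng, n=2000):
    """Empirical MSE between x and n perturbed copies of it."""
    return sum((bounded_laplace(x, epsilon, rng) - x) ** 2 for _ in range(n)) / n


rng = random.Random(0)            # fixed seed so the sketch is repeatable
x = normalize(72.0, 40.0, 120.0)  # hypothetical heart rate rescaled to [-1, 1]
for eps in (0.1, 1.0, 10.0, 100.0):
    print(eps, round(mean_squared_error(x, eps, rng), 4))
```

Running the loop shows the trade-off the abstract reports qualitatively: the error shrinks as ε grows, at the cost of weaker privacy.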
Affiliation(s)
- MinDong Sung
- Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
| | - Dongchul Cha
- Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
- Department of Otorhinolaryngology, Yonsei University College of Medicine, Seoul, Republic of Korea
| | - Yu Rang Park
- Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
|
25
|
Arefolov A, Adam L, Brown S, Budovskaya Y, Chen C, Das D, Farhy C, Ferguson R, Huang H, Kanigel K, Lu C, Polesskaya O, Staton T, Tajhya R, Whitley M, Wong JY, Zeng X, McCreary M. Implementation of the FAIR Data Principles for Exploratory Biomarker Data from Clinical Trials. DATA INTELLIGENCE 2021. [DOI: 10.1162/dint_a_00106] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/04/2022] Open
Abstract
The FAIR data guiding principles have been recently developed and widely adopted to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets in the face of an exponential increase of data volume and complexity. The FAIR data principles have been formulated on a general level and the technological implementation of these principles remains up to the industries and organizations working on maximizing the value of their data. Here, we describe the data management and curation methodologies and best practices developed for FAIRification of clinical exploratory biomarker data collected from over 250 clinical studies. We discuss the data curation effort involved, the resulting output, and the business and scientific impact of our work. Finally, we propose prospective planning for FAIR data to optimize data management efforts and maximize data value.
Affiliation(s)
| | - Laura Adam
- Rancho BioSciences LLC., San Diego, CA 92127, USA
| | | | | | - Cong Chen
- Rancho BioSciences LLC., San Diego, CA 92127, USA
| | - Diya Das
- Development Sciences Informatics, Genentech Inc., South San Francisco, CA 94080-4990, USA
| | - Chen Farhy
- Rancho BioSciences LLC., San Diego, CA 92127, USA
| | | | - Hongmei Huang
- Development Sciences Informatics, Genentech Inc., South San Francisco, CA 94080-4990, USA
| | | | - Christina Lu
- Development Sciences Informatics, Genentech Inc., South San Francisco, CA 94080-4990, USA
| | | | - Tracy Staton
- Development Sciences OMNI-Biomarker Development, Genentech Inc., South San Francisco, CA 94080-4990, USA
| | | | | | - Jee-Yeon Wong
- Development Sciences Informatics, Genentech Inc., South San Francisco, CA 94080-4990, USA
| | | | - Mark McCreary
- Development Sciences Informatics, Genentech Inc., South San Francisco, CA 94080-4990, USA
|
26
|
Zuo Z, Watson M, Budgen D, Hall R, Kennelly C, Al Moubayed N. Data Anonymization for Pervasive Health Care: Systematic Literature Mapping Study. JMIR Med Inform 2021; 9:e29871. [PMID: 34652278 PMCID: PMC8556642 DOI: 10.2196/29871] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/23/2021] [Revised: 06/21/2021] [Accepted: 08/02/2021] [Indexed: 01/29/2023] Open
Abstract
BACKGROUND: Data science offers an unparalleled opportunity to identify new insights into many aspects of human life with recent advances in health care. Using data science in digital health raises significant challenges regarding data privacy, transparency, and trustworthiness. Recent regulations enforce the need for a clear legal basis for collecting, processing, and sharing data, for example, the European Union's General Data Protection Regulation (2016) and the United Kingdom's Data Protection Act (2018). For health care providers, legal use of the electronic health record (EHR) is permitted only in clinical care cases. Any other use of the data requires thoughtful considerations of the legal context and direct patient consent. Identifiable personal and sensitive information must be sufficiently anonymized. Raw data are commonly anonymized to be used for research purposes, with risk assessment for reidentification and utility. Although health care organizations have internal policies defined for information governance, there is a significant lack of practical tools and intuitive guidance about the use of data for research and modeling. Off-the-shelf data anonymization tools are developed frequently, but privacy-related functionalities are often incomparable with regard to use in different problem domains. In addition, tools to support measuring the risk of the anonymized data with regard to reidentification against the usefulness of the data exist, but there are question marks over their efficacy.
OBJECTIVE: In this systematic literature mapping study, we aim to alleviate the aforementioned issues by reviewing the landscape of data anonymization for digital health care.
METHODS: We used Google Scholar, Web of Science, Elsevier Scopus, and PubMed to retrieve academic studies published in English up to June 2020. Noteworthy gray literature was also used to initialize the search. We focused on review questions covering 5 bottom-up aspects: basic anonymization operations, privacy models, reidentification risk and usability metrics, off-the-shelf anonymization tools, and the lawful basis for EHR data anonymization.
RESULTS: We identified 239 eligible studies, of which 60 were chosen for general background information; 16 were selected for 7 basic anonymization operations; 104 covered 72 conventional and machine learning-based privacy models; four and 19 papers included seven and 15 metrics, respectively, for measuring the reidentification risk and degree of usability; and 36 explored 20 data anonymization software tools. In addition, we also evaluated the practical feasibility of performing anonymization on EHR data with reference to their usability in medical decision-making. Furthermore, we summarized the lawful basis for delivering guidance on practical EHR data anonymization.
CONCLUSIONS: This systematic literature mapping study indicates that anonymization of EHR data is theoretically achievable; yet, it requires more research efforts in practical implementations to balance privacy preservation and usability to ensure more reliable health care applications.
Affiliation(s)
- Zheming Zuo
- Department of Computer Science, Durham University, Durham, United Kingdom
| | - Matthew Watson
- Department of Computer Science, Durham University, Durham, United Kingdom
| | - David Budgen
- Department of Computer Science, Durham University, Durham, United Kingdom
| | - Robert Hall
- Cievert Ltd, Newcastle upon Tyne, United Kingdom
| | | | - Noura Al Moubayed
- Department of Computer Science, Durham University, Durham, United Kingdom
|
27
|
Islam SMS, Mishra V, Siddiqui MU, Moses JC, Adibi S, Nguyen L, Wickramasinghe N. Smartphone Apps for Diabetes Medication Adherence: A Systematic Review (Preprint). JMIR Diabetes 2021; 7:e33264. [PMID: 35727613 PMCID: PMC9257622 DOI: 10.2196/33264] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/30/2021] [Revised: 02/24/2022] [Accepted: 04/08/2022] [Indexed: 11/13/2022] Open
Affiliation(s)
- Sheikh Mohammed Shariful Islam
- Institute for Physical Activity and Nutrition, School of Exercise and Nutrition Sciences, Faculty of Health, Deakin University, Melbourne, Australia
| | - Vinaytosh Mishra
- College of Healthcare Management and Economics, Gulf Medical University, Ajman, United Arab Emirates
| | - Muhammad Umer Siddiqui
- Department of Internal Medicine, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, United States
| | | | - Sasan Adibi
- School of Information Technology, Deakin University, Burwood, Australia
| | - Lemai Nguyen
- School of Information Technology, Deakin University, Burwood, Australia
| | - Nilmini Wickramasinghe
- Iverson Health Innovation Research Institute, Swinburne University of Technology, Melbourne, Australia
|
28
|
Cole CL, Sengupta S, Rossetti Née Collins S, Vawdrey DK, Halaas M, Maddox TM, Gordon G, Dave T, Payne PRO, Williams AE, Estrin D. Ten principles for data sharing and commercialization. J Am Med Inform Assoc 2021; 28:646-649. [PMID: 33186458 PMCID: PMC7936510 DOI: 10.1093/jamia/ocaa260] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/07/2020] [Revised: 08/07/2020] [Accepted: 10/02/2020] [Indexed: 11/18/2022] Open
Abstract
Digital medical records have enabled us to employ clinical data in many new and innovative ways. However, these advances have brought with them a complex set of demands for healthcare institutions regarding data sharing with topics such as data ownership, the loss of privacy, and the protection of the intellectual property. The lack of clear guidance from government entities often creates conflicting messages about data policy, leaving institutions to develop guidelines themselves. Through discussions with multiple stakeholders at various institutions, we have generated a set of guidelines with 10 key principles to guide the responsible and appropriate use and sharing of clinical data for the purposes of care and discovery. Industry, universities, and healthcare institutions can build upon these guidelines toward creating a responsible, ethical, and practical response to data sharing.
Affiliation(s)
- Curtis L Cole
- Healthcare Policy and Research, Cornell University, New York, New York, USA
| | | | | | | | - Michael Halaas
- Stanford University School of Medicine, Palo Alto, California, USA
| | - Thomas M Maddox
- Healthcare Innovation Lab, BJC HealthCare/Washington University School of Medicine, St. Louis, Missouri, USA
| | - Geoff Gordon
- Informatics Institute, University of Alabama-Birmingham, Birmingham, Alabama, USA
| | - Trushna Dave
- IT Business Solutions, NewYork-Presbyterian Hospital, New York, New York, USA
| | - Philip R O Payne
- Institute for Informatics (I2), Washington University School of Medicine, St. Louis, Missouri, USA
| | | | - Deborah Estrin
- Cornell Tech, Cornell University, New York, New York, USA
|
29
|
Rutherford M, Mun SK, Levine B, Bennett W, Smith K, Farmer P, Jarosz Q, Wagner U, Freyman J, Blake G, Tarbox L, Farahani K, Prior F. A DICOM dataset for evaluation of medical image de-identification. Sci Data 2021; 8:183. [PMID: 34272388 PMCID: PMC8285420 DOI: 10.1038/s41597-021-00967-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/08/2021] [Accepted: 06/07/2021] [Indexed: 11/23/2022] Open
Abstract
We developed a DICOM dataset that can be used to evaluate the performance of de-identification algorithms. DICOM objects (a total of 1,693 CT, MRI, PET, and digital X-ray images) were selected from datasets published in the Cancer Imaging Archive (TCIA). Synthetic Protected Health Information (PHI) was generated and inserted into selected DICOM Attributes to mimic typical clinical imaging exams. The DICOM Standard and TCIA curation audit logs guided the insertion of synthetic PHI into standard and non-standard DICOM data elements. A TCIA curation team tested the utility of the evaluation dataset. With this publication, the evaluation dataset (containing synthetic PHI) and de-identified evaluation dataset (the result of TCIA curation) are released on TCIA in advance of a competition, sponsored by the National Cancer Institute (NCI), for algorithmic de-identification of medical image datasets. The competition will use a much larger evaluation dataset constructed in the same manner. This paper describes the creation of the evaluation datasets and guidelines for their use.
Measurement(s): Deidentification • Clinical Data
Technology Type(s): data synthesis • digital curation
Factor Type(s): imaging type
Sample Characteristic - Organism: Homo sapiens
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.14802774
Affiliation(s)
- Michael Rutherford
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
| | - Seong K Mun
- Arlington Innovation Center: Health Research, Virginia Tech, Arlington, Virginia, USA
| | - Betty Levine
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
| | - William Bennett
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
| | - Kirk Smith
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
| | - Phil Farmer
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
| | - Quasar Jarosz
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
| | - Ulrike Wagner
- Frederick National Laboratory for Cancer Research, Frederick, Maryland, USA
| | - John Freyman
- Frederick National Laboratory for Cancer Research, Frederick, Maryland, USA
| | - Geri Blake
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
| | - Lawrence Tarbox
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
| | - Keyvan Farahani
- Center for Biomedical Informatics and Information Technology, National Cancer Institute, Bethesda, Maryland, USA
| | - Fred Prior
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
- Department of Radiology, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
|
30
|
Recent Radiomics Advancements in Breast Cancer: Lessons and Pitfalls for the Next Future. Curr Oncol 2021; 28:2351-2372. [PMID: 34202321 PMCID: PMC8293249 DOI: 10.3390/curroncol28040217] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 05/12/2021] [Revised: 06/14/2021] [Accepted: 06/21/2021] [Indexed: 12/13/2022]
Abstract
Radiomics is an emerging translational field of medicine based on the extraction of high-dimensional data from radiological images, with the aim of building reliable models that can be applied in clinical practice for diagnosis, prognosis and evaluation of disease response to treatment. We aim to provide the basic information on radiomics to radiologists and clinicians who are focused on breast cancer care, encouraging cooperation with scientists to mine data for a better application in clinical practice. We investigate the workflow and clinical application of radiomics in breast cancer care, as well as the outlook and challenges based on recent studies. Currently, radiomics has the potential to distinguish between benign and malignant breast lesions and to predict breast cancer’s molecular subtypes, the response to neoadjuvant chemotherapy and lymph node metastases. Even though radiomics has been used in tumor diagnosis and prognosis, it is still in the research phase, and some challenges must be addressed before clinical translation. In this review, we discuss the current limitations and promises of radiomics to guide further research.
|
31
|
Hertz DL, Arwood MJ, Stocco G, Singh S, Karnes JH, Ramsey LB. Planning and Conducting a Pharmacogenetics Association Study. Clin Pharmacol Ther 2021; 110:688-701. [PMID: 33880756 DOI: 10.1002/cpt.2270] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/09/2021] [Accepted: 04/04/2021] [Indexed: 12/13/2022]
Abstract
Pharmacogenetics (PGx) association studies are used to discover, replicate, and validate the association between an inherited genotype and a treatment outcome. The objective of this tutorial is to provide trainees and novice PGx researchers with an overview of the major decisions that need to be made when designing and conducting a PGx association study. The first critical decision is to determine whether the objective of the study is discovery, replication, or validation. Next, the researcher must identify a patient cohort that has all of the data necessary to conduct the intended analysis. Then, the investigator must select and define the treatment outcome, or phenotype, that will be analyzed. Next, the investigator must determine what genotyping approach and genetic data will be included in the analysis. Finally, the association between the genotype and phenotype is tested using some statistical analysis methodology. This tutorial is divided into five sections; each section describes commonly used approaches and provides suggestions and resources for designing and conducting a PGx association study. Successful PGx association studies are necessary to discover and validate associations between inherited genetic variation and treatment outcomes, which enable clinical translation to improve efficacy and reduce toxicity of treatment.
Affiliation(s)
- Daniel L Hertz
- Department of Clinical Pharmacy, University of Michigan College of Pharmacy, Ann Arbor, Michigan, USA
| | - Meghan J Arwood
- Tabula Rasa HealthCare, Precision Pharmacotherapy Research and Development Institute, Orlando, Florida, USA
| | - Gabriele Stocco
- Department of Life Sciences, University of Trieste, Trieste, Italy
| | - Sonal Singh
- Takeda California, San Diego, California, USA
| | - Jason H Karnes
- Department of Pharmacy Practice and Science, University of Arizona College of Pharmacy, Tucson, Arizona, USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Laura B Ramsey
- Divisions of Clinical Pharmacology & Research in Patient Services, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA
|
32
|
Reichold M, Dietzel N, Chmelirsch C, Kolominsky-Rabas PL, Graessel E, Prokosch HU. Designing and Implementing an IT Architecture for a Digital Multicenter Dementia Registry: digiDEM Bayern. Appl Clin Inform 2021; 12:551-563. [PMID: 34134149 PMCID: PMC8208839 DOI: 10.1055/s-0041-1731286] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/28/2023] Open
Abstract
Background
Registries are an essential research tool to investigate the long-term course of diseases and their impact on the affected. The project digiDEM Bayern will set up a prospective dementia registry to collect long-term data of people with dementia and their caregivers in Bavaria (Germany) supported by more than 300 research partners.
Objective
The objective of this article is to outline an information technology (IT) architecture for the integration of a registry and comprehensive participant management in a dementia study. Measures to ensure high data quality, study governance, along with data privacy, and security are to be included in the architecture.
Methods
The architecture was developed based on an iterative, stakeholder-oriented process. The development was inspired by the Twin Peaks Model, which focuses on the codevelopment of requirements and architecture. We gradually moved from a general to a detailed understanding of both the requirements and design through a series of iterations. The experience gained from the pilot phase was integrated into a further iterative process of continuous improvement of the architecture.
Results
The infrastructure provides a standardized workflow to support the electronic data collection and trace each participant's study process. Therefore, the implementation consists of three systems: (1) electronic data capture system for Web-based or offline app-based data collection; (2) participant management system for the administration of the identity data of participants and research partners as well as of the overall study governance process; and (3) videoconferencing software for conducting interviews online. First experiences in the pilot phase have proven the feasibility of the framework.
Conclusion
This article outlines an IT architecture to integrate a registry and participant management in a dementia research project. The framework was discussed and developed with the involvement of numerous stakeholders. Because the software systems used are adaptable, transfer to other projects should be straightforward.
Affiliation(s)
- Michael Reichold
- Department of Medical Informatics, Friedrich-Alexander-University Erlangen-Nürnberg (FAU), Erlangen, Germany
| | - Nikolas Dietzel
- Interdisciplinary Center for Health Technology Assessment (HTA) and Public Health (IZPH), Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany
| | - Christina Chmelirsch
- Interdisciplinary Center for Health Technology Assessment (HTA) and Public Health (IZPH), Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany
| | - Peter L Kolominsky-Rabas
- Interdisciplinary Center for Health Technology Assessment (HTA) and Public Health (IZPH), Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany
| | - Elmar Graessel
- Center for Health Services Research in Medicine, Department of Psychiatry and Psychotherapy, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany
| | - Hans-Ulrich Prokosch
- Department of Medical Informatics, Friedrich-Alexander-University Erlangen-Nürnberg (FAU), Erlangen, Germany
| |
|
33
|
Affiliation(s)
- Lee Swales
- School of Law, University of KwaZulu-Natal, Durban, South Africa
| |
|
34
|
Zegers CML, Witteveen A, Schulte MHJ, Henrich JF, Vermeij A, Klever B, Dekker A. Mind Your Data: Privacy and Legal Matters in eHealth. JMIR Form Res 2021; 5:e17456. [PMID: 33729163 PMCID: PMC8075039 DOI: 10.2196/17456] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/13/2019] [Revised: 05/01/2020] [Accepted: 01/17/2021] [Indexed: 01/30/2023] Open
Abstract
The health care sector can benefit considerably from developments in digital technology. Consequently, eHealth applications are rapidly increasing in number and sophistication. For successful development and implementation of eHealth, it is paramount to guarantee the privacy and safety of patients and their collected data. At the same time, anonymized data that are collected through eHealth could be used in the development of innovative and personalized diagnostic, prognostic, and treatment tools. To address the needs of researchers, health care providers, and eHealth developers for more information and practical tools to handle privacy and legal matters in eHealth, the Dutch national Digital Society Research Programme organized the “Mind Your Data: Privacy and Legal Matters in eHealth” conference. In this paper, we share the key take-home messages from the conference based on the following five tradeoffs: (1) privacy versus independence, (2) informed consent versus convenience, (3) clinical research versus clinical routine data, (4) responsibility and standardization, and (5) privacy versus solidarity.
Affiliation(s)
- Catharina M L Zegers
- Digital Society Health & Well-being, The Hague, Netherlands; Institute of Data Science, Maastricht University, Maastricht, Netherlands; Department of Radiation Oncology (MAASTRO), GROW - School for Oncology and Development Biology, Maastricht University Medical Centre, Maastricht, Netherlands
| | - Annemieke Witteveen
- Digital Society Health & Well-being, The Hague, Netherlands; Department of Biomedical Signals and Systems, TechMed Centre, University of Twente, Enschede, Netherlands
| | - Mieke H J Schulte
- Digital Society Health & Well-being, The Hague, Netherlands; Department of Clinical, Neuro and Developmental Psychology, Vrije Universiteit, Amsterdam, Netherlands
| | - Julia F Henrich
- Digital Society Health & Well-being, The Hague, Netherlands; Unit of Health, Medical and Neuropsychology, Institute of Psychology, Leiden University, Leiden, Netherlands
| | - Anouk Vermeij
- Digital Society Health & Well-being, The Hague, Netherlands; Department of Cognitive Neuropsychology, Tilburg University, Tilburg, Netherlands
| | - Brigit Klever
- Digital Society Health & Well-being, The Hague, Netherlands; University Medical Center, University of Groningen, Groningen, Netherlands
| | - Andre Dekker
- Digital Society Health & Well-being, The Hague, Netherlands; Department of Radiation Oncology (MAASTRO), GROW - School for Oncology and Development Biology, Maastricht University Medical Centre, Maastricht, Netherlands
| |
|
35
|
Syed S, Syed M, Syeda HB, Garza M, Bennett W, Bona J, Begum S, Baghal A, Zozus M, Prior F. API Driven On-Demand Participant ID Pseudonymization in Heterogeneous Multi-Study Research. Healthc Inform Res 2021; 27:39-47. [PMID: 33611875 PMCID: PMC7921568 DOI: 10.4258/hir.2021.27.1.39] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/17/2020] [Revised: 09/23/2020] [Accepted: 10/18/2020] [Indexed: 11/29/2022] Open
Abstract
OBJECTIVES: To facilitate clinical and translational research, imaging and non-imaging clinical data from multiple disparate systems must be aggregated for analysis. Study participant records from various sources are linked together and to patient records when possible to address research questions while ensuring patient privacy. This paper presents a novel tool that pseudonymizes participant identifiers (PIDs) using a researcher-driven automated process that takes advantage of an application programming interface (API) and the Perl Open-Source Digital Imaging and Communications in Medicine Archive (POSDA) to further de-identify PIDs. The tool, on-demand cohort and API participant identifier pseudonymization (O-CAPP), employs a pseudonymization method based on the type of incoming research data. METHODS: For images, pseudonymization of PIDs is done using API calls that receive PIDs present in Digital Imaging and Communications in Medicine (DICOM) headers and return the pseudonymized identifiers. For non-imaging clinical research data, PIDs provided by study principal investigators (PIs) are pseudonymized using a nightly automated process. The pseudonymized PIDs (P-PIDs), along with other protected health information, are further de-identified using POSDA. RESULTS: A sample of 250 PIDs pseudonymized by O-CAPP were selected and successfully validated. Of those, 125 PIDs that were pseudonymized by the nightly automated process were validated by multiple clinical trial investigators (CTIs). For the other 125, CTIs validated radiologic image pseudonymization by API request based on the provided PID and P-PID mappings. CONCLUSIONS: We developed a novel on-demand pseudonymization process that will aid researchers in obtaining a comprehensive and holistic view of study participant data without compromising patient privacy.
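As a rough illustration of the kind of participant-identifier pseudonymization this abstract describes, the sketch below derives a stable pseudonym from a PID with a keyed hash (HMAC-SHA256). This is not the O-CAPP method, which the abstract does not specify; the function name `pseudonymize_pid`, the `P-` prefix, the truncation length, and the demo key are all hypothetical.

```python
import hashlib
import hmac


def pseudonymize_pid(pid: str, secret_key: bytes, length: int = 16) -> str:
    """Return a deterministic pseudonym for a participant ID.

    Assumption: a keyed hash (HMAC-SHA256) stands in for whatever
    mapping the real tool uses; the abstract does not describe it.
    """
    digest = hmac.new(secret_key, pid.encode("utf-8"), hashlib.sha256).hexdigest()
    # Hypothetical output format: "P-" followed by a truncated hex digest.
    return "P-" + digest[:length]


# Demo key for illustration only; a real key would be stored securely.
DEMO_KEY = b"demo-key-not-for-production"

p1 = pseudonymize_pid("STUDY01-0001", DEMO_KEY)
p2 = pseudonymize_pid("STUDY01-0002", DEMO_KEY)
print(p1, p2)
```

Because the same key always maps a given PID to the same pseudonym, records from imaging and non-imaging sources could in principle be linked without exposing the original identifier. Truncating the digest raises the collision risk, so the length is a trade-off rather than a recommendation.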
Affiliation(s)
- Shorabuddin Syed
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - Mahanazuddin Syed
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - Hafsa Bareen Syeda
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - Maryam Garza
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - William Bennett
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - Jonathan Bona
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - Salma Begum
- Department of Information Technology, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - Ahmad Baghal
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - Meredith Zozus
- Department of Population Health Sciences, University of Texas Health Science Center at San Antonio, San Antonio, TX, USA
| | - Fred Prior
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| |
|
36
|
Jeong YU, Yoo S, Kim YH, Shim WH. De-Identification of Facial Features in Magnetic Resonance Images: Software Development Using Deep Learning Technology. J Med Internet Res 2020; 22:e22739. [PMID: 33208302 PMCID: PMC7759440 DOI: 10.2196/22739] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/22/2020] [Revised: 09/09/2020] [Accepted: 11/12/2020] [Indexed: 12/14/2022] Open
Abstract
Background: High-resolution medical images that include facial regions can be used to recognize the subject’s face when reconstructing 3-dimensional (3D)-rendered images from 2-dimensional (2D) sequential images, which might constitute a risk of infringement of personal information when sharing data. According to the Health Insurance Portability and Accountability Act (HIPAA) privacy rules, full-face photographic images and any comparable image are direct identifiers and considered as protected health information. Moreover, the General Data Protection Regulation (GDPR) categorizes facial images as biometric data and stipulates that special restrictions should be placed on the processing of biometric data. Objective: This study aimed to develop software that can remove the header information from Digital Imaging and Communications in Medicine (DICOM) format files and facial features (eyes, nose, and ears) at the 2D sliced-image level to anonymize personal information in medical images. Methods: A total of 240 cranial magnetic resonance (MR) images were used to train the deep learning model (144, 48, and 48 for the training, validation, and test sets, respectively, from the Alzheimer's Disease Neuroimaging Initiative [ADNI] database). To overcome the small sample size problem, we used a data augmentation technique to create 576 images per epoch. We used attention-gated U-net for the basic structure of our deep learning model. To validate the performance of the software, we adapted an external test set comprising 100 cranial MR images from the Open Access Series of Imaging Studies (OASIS) database. Results: The facial features (eyes, nose, and ears) were successfully detected and anonymized in both test sets (48 from ADNI and 100 from OASIS). Each result was manually validated in both the 2D image plane and the 3D-rendered images. Furthermore, the ADNI test set was verified using Microsoft Azure's face recognition artificial intelligence service. By adding a user interface, we developed and distributed (via GitHub) software named “Deface program” for medical images as an open-source project. Conclusions: We developed deep learning–based software for the anonymization of MR images that distorts the eyes, nose, and ears to prevent facial identification of the subject in reconstructed 3D images. It could be used to share medical big data for secondary research while making both data providers and recipients compliant with the relevant privacy regulations.
Affiliation(s)
- Yeon Uk Jeong
- Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| | - Soyoung Yoo
- Human Research Protection Center, Asan Institute of Life Sciences, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| | - Young-Hak Kim
- Division of Cardiology, Department of Internal Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea; Department of Information Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| | - Woo Hyun Shim
- Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea; Department of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| |
|
37
|
Parker W, Jaremko JL, Cicero M, Azar M, El-Emam K, Gray BG, Hurrell C, Lavoie-Cardinal F, Desjardins B, Lum A, Sheremeta L, Lee E, Reinhold C, Tang A, Bromwich R. Canadian Association of Radiologists White Paper on De-Identification of Medical Imaging: Part 1, General Principles. Can Assoc Radiol J 2020; 72:13-24. [PMID: 33138621 DOI: 10.1177/0846537120967349] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022] Open
Abstract
The application of big data, radiomics, machine learning, and artificial intelligence (AI) algorithms in radiology requires access to large data sets containing personal health information. Because machine learning projects often require collaboration between different sites or data transfer to a third party, precautions are required to safeguard patient privacy. Safety measures are required to prevent inadvertent access to and transfer of identifiable information. The Canadian Association of Radiologists (CAR) is the national voice of radiology committed to promoting the highest standards in patient-centered imaging, lifelong learning, and research. The CAR has created an AI Ethical and Legal standing committee with the mandate to guide the medical imaging community in terms of best practices in data management, access to health care data, de-identification, and accountability practices. Part 1 of this article will inform CAR members on principles of de-identification, pseudonymization, encryption, direct and indirect identifiers, k-anonymization, risks of reidentification, implementations, data set release models, and validation of AI algorithms, with a view to developing appropriate standards to safeguard patient information effectively.
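To make the k-anonymization concept named in this abstract concrete, the sketch below checks whether a small table satisfies k-anonymity: every combination of quasi-identifier values must be shared by at least k records. It is a minimal illustration, not a procedure taken from the white paper; the column names, the toy records, and the helper names `min_group_size` and `is_k_anonymous` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def min_group_size(records: List[Dict[str, str]],
                   quasi_identifiers: Sequence[str]) -> int:
    """Size of the smallest group of records sharing the same
    quasi-identifier values (0 if there are no records)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def is_k_anonymous(records: List[Dict[str, str]],
                   quasi_identifiers: Sequence[str], k: int) -> bool:
    """True if every quasi-identifier combination occurs at least k times."""
    return min_group_size(records, quasi_identifiers) >= k


# Toy, made-up records: age band and sex are treated as quasi-identifiers.
records = [
    {"age_band": "60-69", "sex": "F", "finding": "nodule"},
    {"age_band": "60-69", "sex": "F", "finding": "normal"},
    {"age_band": "70-79", "sex": "M", "finding": "normal"},
    {"age_band": "70-79", "sex": "M", "finding": "mass"},
    {"age_band": "70-79", "sex": "M", "finding": "normal"},
]

print(min_group_size(records, ["age_band", "sex"]))
print(is_k_anonymous(records, ["age_band", "sex"], k=2))
```

In this toy table the smallest group has two records, so the data are 2-anonymous but not 3-anonymous with respect to the chosen quasi-identifiers. Which columns count as quasi-identifiers is a judgment call that depends on the data set and release model.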
Affiliation(s)
- William Parker
- Department of Radiology, University of British Columbia, Vancouver, British Columbia, Canada; SapienML Corp, Vancouver, British Columbia, Canada
| | - Jacob L Jaremko
- Department of Radiology & Diagnostic Imaging, University of Alberta, Edmonton, Canada
| | - Mark Cicero
- 16 Bit Inc, Toronto, Ontario, Canada; True North Imaging, Thornhill, Ontario, Canada
| | - Marleine Azar
- Department of Medicine, Université de Montréal, Montréal, Quebec, Canada
| | - Khaled El-Emam
- School of Epidemiology and Public Health, University of Ottawa, Ontario, Canada
| | - Bruce G Gray
- Department of Medical Imaging, University of Toronto, Toronto, Canada
| | - Casey Hurrell
- Canadian Association of Radiologists, Ottawa, Canada
| | | | | | - Andrea Lum
- Department of Medical Imaging, Western University, London, Ontario, Canada
| | - Lori Sheremeta
- Northern Alberta Institute of Technology, Alberta, Canada
| | - Emil Lee
- Fraser Health Authority, Vancouver, British Columbia, Canada
| | - Caroline Reinhold
- McGill University Health Center, McGill University, Montreal, Canada; Augmented Intelligence & Precision Health Laboratory of the Research Institute, McGill University Health Center, McGill University, Montreal, Canada
| | - An Tang
- Department of Radiology, Radio-oncology, and Nuclear Medicine, Universite de Montreal, Montreal, Quebec, Canada
| | - Rebecca Bromwich
- Department of Law and Legal Studies, Carleton University, Ottawa, Canada
| |
|
38
|
Pung J, Rienhoff O. Key components and IT assistance of participant management in clinical research: a scoping review. JAMIA Open 2020; 3:449-458. [PMID: 33215078 PMCID: PMC7660951 DOI: 10.1093/jamiaopen/ooaa041] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/12/2019] [Revised: 07/16/2020] [Accepted: 08/24/2020] [Indexed: 01/05/2023] Open
Abstract
Objectives: Managing participants and their data is fundamental to the success of a clinical trial. Our review identifies and describes processes that deal with management of trial participants and highlights information technology (IT) assistance for clinical research in the context of participant management. Methods: A scoping literature review design, based on the Preferred Reporting Items for Systematic Reviews and Meta-analyses statement, was used to identify literature on trial participant-related proceedings, work procedures, or workflows, and assisting electronic systems. Results: The literature search identified 1329 articles, of which 111 were included for analysis. Participant-related procedures were categorized into 4 major trial processes: recruitment, obtaining informed consent, managing identities, and managing administrative data. Our results demonstrated that management of trial participants is considered in nearly every step of clinical trials, and that IT was successfully introduced to all participant-related areas of a clinical trial to facilitate processes. Discussion: There is no precise definition of participant management, so a broad search strategy was necessary, resulting in a high number of articles that had to be excluded. Nevertheless, this review provides a comprehensive overview of participant management-related components, which has been lacking so far. The review contributes to a better understanding of how computer-assisted management of participants in clinical trials is possible.
Affiliation(s)
- Johannes Pung
- Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
| | - Otto Rienhoff
- Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
| |
|
39
|
Steinkamp JM, Pomeranz T, Adleberg J, Kahn CE, Cook TS. Evaluation of Automated Public De-Identification Tools on a Corpus of Radiology Reports. Radiol Artif Intell 2020; 2:e190137. [PMID: 33937843 DOI: 10.1148/ryai.2020190137] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/05/2019] [Revised: 05/05/2020] [Accepted: 05/14/2020] [Indexed: 11/11/2022]
Abstract
Purpose: To evaluate publicly available de-identification tools on a large corpus of narrative-text radiology reports. Materials and Methods: In this retrospective study, 21 categories of protected health information (PHI) in 2503 radiology reports were annotated from a large multihospital academic health system, collected between January 1, 2012 and January 8, 2019. A subset consisting of 1023 reports served as a test set; the remainder were used as domain-specific training data. The types and frequencies of PHI present within the reports were tallied. Five public de-identification tools were evaluated: MITRE Identification Scrubber Toolkit, U.S. National Library of Medicine‒Scrubber, Massachusetts Institute of Technology de-identification software, Emory Health Information DE-identification (HIDE) software, and Neuro named-entity recognition (NeuroNER). The tools were compared using metrics including recall, precision, and F1 score (the harmonic mean of recall and precision) for each category of PHI. Results: The annotators identified 3528 spans of PHI text within the 2503 reports. Cohen κ for interrater agreement was 0.938. Dates accounted for the majority of PHI found in the dataset of radiology reports (n = 2755 [78%]). The two best-performing tools both used machine learning methods: NeuroNER (precision, 94.5%; recall, 92.6%; microaveraged F1 score [F1], 93.6%) and Emory HIDE (precision, 96.6%; recall, 88.2%; F1, 92.2%). However, none exceeded 50% F1 on the important patient names category. Conclusion: PHI appeared infrequently within the corpus of reports studied, which created difficulties for training machine learning systems. Out-of-the-box de-identification tools achieved limited performance on the corpus of radiology reports, suggesting the need for further advancements in public datasets and trained models. Supplemental material is available for this article. See also the commentary by Tenenholtz and Wood in this issue. © RSNA, 2020.
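The F1 score used in this abstract is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below applies that formula to the precision and recall figures quoted above; the function name `f1_score` is a hypothetical helper, and the inputs are the rounded percentages from the abstract rather than the study's underlying counts.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # convention for the degenerate case, avoids division by zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall (in percent) as quoted in the abstract.
neuroner_f1 = f1_score(94.5, 92.6)
emory_hide_f1 = f1_score(96.6, 88.2)

print(f"NeuroNER F1 from rounded inputs:   {neuroner_f1:.2f}")
print(f"Emory HIDE F1 from rounded inputs: {emory_hide_f1:.2f}")
```

Recomputing from the rounded figures gives values within about 0.1 percentage points of the reported F1 scores. Any small gap presumably reflects rounding of the published precision and recall, since the study's scores were computed from unrounded counts.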
Affiliation(s)
- Jackson M Steinkamp
- Department of Radiology, Hospital of the University of Pennsylvania, 3400 Spruce St, Philadelphia, PA 19104 (J.M.S., T.P., J.A., C.E.K., T.S.C.); and Boston University School of Medicine, Boston, Mass (J.M.S.)
| | - Taylor Pomeranz
- Department of Radiology, Hospital of the University of Pennsylvania, 3400 Spruce St, Philadelphia, PA 19104 (J.M.S., T.P., J.A., C.E.K., T.S.C.); and Boston University School of Medicine, Boston, Mass (J.M.S.)
| | - Jason Adleberg
- Department of Radiology, Hospital of the University of Pennsylvania, 3400 Spruce St, Philadelphia, PA 19104 (J.M.S., T.P., J.A., C.E.K., T.S.C.); and Boston University School of Medicine, Boston, Mass (J.M.S.)
| | - Charles E Kahn
- Department of Radiology, Hospital of the University of Pennsylvania, 3400 Spruce St, Philadelphia, PA 19104 (J.M.S., T.P., J.A., C.E.K., T.S.C.); and Boston University School of Medicine, Boston, Mass (J.M.S.)
| | - Tessa S Cook
- Department of Radiology, Hospital of the University of Pennsylvania, 3400 Spruce St, Philadelphia, PA 19104 (J.M.S., T.P., J.A., C.E.K., T.S.C.); and Boston University School of Medicine, Boston, Mass (J.M.S.)
| |
|
40
|
Abstract
OBJECTIVES: Clinical Research Informatics (CRI) declares its scope in its name, but its content, both in terms of the clinical research it supports (and sometimes initiates) and the methods it has developed over time, reaches much further than the name suggests. The goal of this review is to celebrate the extraordinary diversity of activity and of results, not as a prize-giving pageant, but in recognition of the field, the community that both serves and is sustained by it, and of its interdisciplinarity and its international dimension. METHODS: Beyond personal awareness of a range of work commensurate with the author's own research, it is clear that, even with a thorough literature search, a comprehensive review is impossible. Moreover, the field has grown and subdivided to an extent that makes it very hard for one individual to be familiar with every branch or with more than a few branches in any depth. A literature survey was conducted that focused on informatics-related terms in the general biomedical and healthcare literature, and specific concerns ("artificial intelligence", "data models", "analytics", etc.) in the biomedical informatics (BMI) literature. In addition to a selection from the results from these searches, suggestive references within them were also considered. RESULTS: The substantive sections of the paper (Artificial Intelligence, Machine Learning, and "Big Data" Analytics; Common Data Models, Data Quality, and Standards; Phenotyping and Cohort Discovery; Privacy: Deidentification, Distributed Computation, Blockchain; Causal Inference and Real-World Evidence) provide broad coverage of these active research areas, with, no doubt, a bias towards this reviewer's interests and preferences, landing on a number of papers that stood out in one way or another, or, alternatively, exemplified a particular line of work. CONCLUSIONS: CRI is thriving, not only in the familiar major centers of research, but more widely, throughout the world. This is not to pretend that the distribution is uniform, but to highlight the potential for this domain to play a prominent role in supporting progress in medicine, healthcare, and wellbeing everywhere. We conclude with the observation that CRI and its practitioners would make apt stewards of the new medical knowledge that their methods will bring forward.
Affiliation(s)
- Anthony Solomonides
- Outcomes Research Network, Research Institute, NorthShore University HealthSystem, Evanston, IL, USA
| |
|
41
|
Abstract
OBJECTIVE: To summarize significant research contributions on ethics in medical informatics published in 2019. METHODS: An extensive search using PubMed/Medline was conducted to identify the scientific contributions published in 2019 that address ethics issues in medical informatics. The selection process comprised three steps: 1) 15 candidate best papers were first selected by the two section editors; 2) external reviewers from internationally renowned research teams reviewed each candidate best paper; and 3) the final selection of three best papers was conducted by the editorial committee of the Yearbook. RESULTS: The three selected best papers explore timely issues of concern to the community and demonstrate how ethics considerations influence applied informatics. CONCLUSION: With regard to ethics in informatics, data sharing and privacy remain primary areas of concern. Ethics issues related to the development and implementation of artificial intelligence are an emerging topic of interest.
Affiliation(s)
- Carolyn Petersen
- Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota, USA
| | - Vignesh Subbian
- College of Engineering, The University of Arizona, Tucson, Arizona, USA
| | | |
|
42
|
Larson DB, Magnus DC, Lungren MP, Shah NH, Langlotz CP. Ethics of Using and Sharing Clinical Imaging Data for Artificial Intelligence: A Proposed Framework. Radiology 2020; 295:675-682. [PMID: 32208097 DOI: 10.1148/radiol.2020192536] [Citation(s) in RCA: 78] [Impact Index Per Article: 19.5] [Indexed: 11/11/2022]
Abstract
In this article, the authors propose an ethical framework for using and sharing clinical data for the development of artificial intelligence (AI) applications. The philosophical premise is as follows: when clinical data are used to provide care, the primary purpose for acquiring the data is fulfilled. At that point, clinical data should be treated as a form of public good, to be used for the benefit of future patients. In their 2013 article, Faden et al argued that all who participate in the health care system, including patients, have a moral obligation to contribute to improving that system. The authors extend that framework to questions surrounding the secondary use of clinical data for AI applications. Specifically, the authors propose that all individuals and entities with access to clinical data become data stewards, with fiduciary (or trust) responsibilities to patients to carefully safeguard patient privacy, and to the public to ensure that the data are made widely available for the development of knowledge and tools to benefit future patients. According to this framework, the authors maintain that it is unethical for providers to "sell" clinical data to other parties by granting access to clinical data, especially under exclusive arrangements, in exchange for monetary or in-kind payments that exceed costs. The authors also propose that patient consent is not required before the data are used for secondary purposes when obtaining such consent is prohibitively costly or burdensome, as long as mechanisms are in place to ensure that ethical standards are strictly followed. Rather than debate whether patients or provider organizations "own" the data, the authors propose that clinical data are not owned at all in the traditional sense, but rather that all who interact with or control the data have an obligation to ensure that the data are used for the benefit of future patients and society.
Affiliation(s)
- David B Larson
- From the Department of Radiology, Stanford University School of Medicine, 300 Pasteur Dr, Stanford, CA 94305-5105
| | - David C Magnus
- From the Department of Radiology, Stanford University School of Medicine, 300 Pasteur Dr, Stanford, CA 94305-5105
| | - Matthew P Lungren
- From the Department of Radiology, Stanford University School of Medicine, 300 Pasteur Dr, Stanford, CA 94305-5105
| | - Nigam H Shah
- From the Department of Radiology, Stanford University School of Medicine, 300 Pasteur Dr, Stanford, CA 94305-5105
| | - Curtis P Langlotz
- From the Department of Radiology, Stanford University School of Medicine, 300 Pasteur Dr, Stanford, CA 94305-5105
| |
|
43
|
Lovis C. Unlocking the Power of Artificial Intelligence and Big Data in Medicine. J Med Internet Res 2019; 21:e16607. [PMID: 31702565 PMCID: PMC6874800 DOI: 10.2196/16607] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 10/09/2019] [Revised: 10/18/2019] [Accepted: 10/20/2019] [Indexed: 12/17/2022] Open
Abstract
Data-driven science and its corollaries in machine learning and the wider field of artificial intelligence have the potential to drive important changes in medicine. However, medicine is not a science like any other: It is deeply and tightly bound with a large and wide network of legal, ethical, regulatory, economical, and societal dependencies. As a consequence, scientific and technological progress in handling information and its further processing and cross-linking for decision support and predictive systems must be accompanied by parallel changes in the global environment, with numerous stakeholders, including citizens and society. What can be seen at first glance as a barrier and a mechanism slowing down the progression of data science must, however, be considered an important asset. Only global adoption can transform the potential of big data and artificial intelligence into effective breakthroughs in handling health and medicine. This requires science and society, scientists and citizens, to progress together.
Affiliation(s)
- Christian Lovis
- Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
- Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
| |
|