1
Tajabadi M, Martin R, Heider D. Privacy-preserving decentralized learning methods for biomedical applications. Comput Struct Biotechnol J 2024; 23:3281-3287. [PMID: 39296807 PMCID: PMC11408144 DOI: 10.1016/j.csbj.2024.08.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2024] [Revised: 08/26/2024] [Accepted: 08/26/2024] [Indexed: 09/21/2024] Open
Abstract
In recent years, decentralized machine learning has emerged as a significant advancement in biomedical applications, offering robust solutions for data privacy, security, and collaboration across diverse healthcare environments. In this review, we examine various decentralized learning methodologies, including federated learning, split learning, swarm learning, gossip learning, edge learning, and some of their applications in the biomedical field. We delve into the underlying principles, network topologies, and communication strategies of each approach, highlighting their advantages and limitations. Ultimately, the selection of a suitable method should be based on specific needs, infrastructures, and computational capabilities.
Affiliation(s)
- Mohammad Tajabadi
  - Institute of Computer Science, Heinrich-Heine-University Duesseldorf, Graf-Adolf-Str. 63, Duesseldorf, 40215, North Rhine-Westphalia, Germany
  - Center for Digital Medicine, Heinrich-Heine-University Duesseldorf, Moorenstr. 5, Duesseldorf, 40215, North Rhine-Westphalia, Germany
- Roman Martin
  - Institute of Computer Science, Heinrich-Heine-University Duesseldorf, Graf-Adolf-Str. 63, Duesseldorf, 40215, North Rhine-Westphalia, Germany
  - Center for Digital Medicine, Heinrich-Heine-University Duesseldorf, Moorenstr. 5, Duesseldorf, 40215, North Rhine-Westphalia, Germany
- Dominik Heider
  - Institute of Computer Science, Heinrich-Heine-University Duesseldorf, Graf-Adolf-Str. 63, Duesseldorf, 40215, North Rhine-Westphalia, Germany
  - Center for Digital Medicine, Heinrich-Heine-University Duesseldorf, Moorenstr. 5, Duesseldorf, 40215, North Rhine-Westphalia, Germany
2
Alves CL, Martinelli T, Sallum LF, Rodrigues FA, Toutain TGLDO, Porto JAM, Thielemann C, Aguiar PMDC, Moeckel M. Multiclass classification of Autism Spectrum Disorder, attention deficit hyperactivity disorder, and typically developed individuals using fMRI functional connectivity analysis. PLoS One 2024; 19:e0305630. [PMID: 39418298 PMCID: PMC11486369 DOI: 10.1371/journal.pone.0305630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 06/03/2024] [Indexed: 10/19/2024] Open
Abstract
Neurodevelopmental conditions, such as Autism Spectrum Disorder (ASD) and Attention Deficit Hyperactivity Disorder (ADHD), present unique challenges due to overlapping symptoms, making an accurate diagnosis and targeted intervention difficult. Our study employs advanced machine learning techniques to analyze functional magnetic resonance imaging (fMRI) data from individuals with ASD, ADHD, and typically developed (TD) controls, totaling 120 subjects in the study. Leveraging multiclass classification (ML) algorithms, we achieve superior accuracy in distinguishing between ASD, ADHD, and TD groups, surpassing existing benchmarks with an area under the ROC curve near 98%. Our analysis reveals distinct neural signatures associated with ASD and ADHD: individuals with ADHD exhibit altered connectivity patterns of regions involved in attention and impulse control, whereas those with ASD show disruptions in brain regions critical for social and cognitive functions. The observed connectivity patterns, on which the ML classification rests, agree with established diagnostic approaches based on clinical symptoms. Furthermore, complex network analyses highlight differences in brain network integration and segregation among the three groups. Our findings pave the way for refined, ML-enhanced diagnostics in accordance with established practices, offering a promising avenue for developing trustworthy clinical decision-support systems.
Affiliation(s)
- Caroline L. Alves
  - Laboratory for Hybrid Modeling, Aschaffenburg University of Applied Sciences, Aschaffenburg, Bayern, Germany
- Tiago Martinelli
  - Institute of Mathematical and Computer Sciences, University of São Paulo, São Paulo, São Paulo, Brazil
- Loriz Francisco Sallum
  - Institute of Mathematical and Computer Sciences, University of São Paulo, São Paulo, São Paulo, Brazil
- Joel Augusto Moura Porto
  - Institute of Physics of São Carlos (IFSC), University of São Paulo (USP), São Carlos, São Paulo, Brazil
  - Institute of Biological Information Processing, Heinrich Heine University Düsseldorf, Düsseldorf, North Rhine–Westphalia Land, Germany
- Christiane Thielemann
  - BioMEMS Lab, Aschaffenburg University of Applied Sciences, Aschaffenburg, Bayern, Germany
- Patrícia Maria de Carvalho Aguiar
  - Hospital Israelita Albert Einstein, São Paulo, São Paulo, Brazil
  - Department of Neurology and Neurosurgery, Federal University of São Paulo, São Paulo, São Paulo, Brazil
- Michael Moeckel
  - Laboratory for Hybrid Modeling, Aschaffenburg University of Applied Sciences, Aschaffenburg, Bayern, Germany
3
Pollard S, Ehman M, Hermansen A, Weymann D, Krebs E, Ho C, Lim HJ, Jones S, Bombard Y, Hanna TP, Hessels C, Longstaff H, Cook-Deegan R, Bubela T, Regier DA. "I Just Assumed This Was Already Being Done": Canadian Patient Preferences for Enhanced Data Sharing for Precision Oncology. JCO Precis Oncol 2024; 8:e2400184. [PMID: 39116357 PMCID: PMC11371116 DOI: 10.1200/po.24.00184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 05/02/2024] [Accepted: 06/11/2024] [Indexed: 08/10/2024] Open
Abstract
PURPOSE In Canada, health data are siloed, slowing bioinnovation and evidence generation for personalized cancer care. Secured data-sharing platforms (SDSPs) can enable data analysis across silos through rapid concatenation across trial and real-world settings and timely researcher access. To motivate patient participation and trust in research, it is critical to ensure that SDSP design and oversight align with patients' values and address their concerns. We sought to qualitatively characterize patient preferences for the design of a pan-Canadian SDSP. METHODS Between January 2022 and July 2023, we conducted pan-Canadian virtual focus groups with individuals who had a personal history of cancer. Following each focus group, participants were invited to provide feedback on early-phase analysis results via a member-checking survey. Three trained qualitative researchers analyzed data using thematic analysis. RESULTS Twenty-eight individuals participated across five focus groups. Four focus groups were conducted in English and one in French. Thematic analysis generated two major and five minor themes. Analytic themes spanned personal and population implications of data sharing and willingness to manage perceived risks. Participants were supportive of increasing access to health data for precision oncology research, while voicing concerns about unintended data use, reidentification, and inequitable access to costly therapeutics. To mitigate perceived risks, participants highlighted the value of data access oversight and governance and informational transparency. CONCLUSION Strategies for secured data sharing should anticipate and mitigate the risks that patients perceive. Participants supported enhancing timely research capability while ensuring safeguards to protect patient autonomy and privacy. Our study informs the development of data-governance and data-sharing frameworks that integrate real-world and trial data, informed by evidence from direct patient input.
Affiliation(s)
- Samantha Pollard
  - Cancer Control Research, BC Cancer Research Institute, Vancouver, BC, Canada
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Morgan Ehman
  - Cancer Control Research, BC Cancer Research Institute, Vancouver, BC, Canada
- Anna Hermansen
  - Cancer Control Research, BC Cancer Research Institute, Vancouver, BC, Canada
  - School of Population and Public Health, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Deirdre Weymann
  - Cancer Control Research, BC Cancer Research Institute, Vancouver, BC, Canada
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Emanuel Krebs
  - Cancer Control Research, BC Cancer Research Institute, Vancouver, BC, Canada
- Cheryl Ho
  - Department of Medical Oncology, BC Cancer, Vancouver, BC, Canada
  - Department of Medicine, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Howard J. Lim
  - Department of Medical Oncology, BC Cancer, Vancouver, BC, Canada
  - Department of Medicine, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Steven Jones
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Yvonne Bombard
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada
  - Genomics Health Services Research Program, Li Ka Shing Knowledge Institute of St Michael's Hospital, Unity Health Toronto, Toronto, ON, Canada
- Timothy P. Hanna
  - Department of Oncology, Queen's University, Kingston, ON, Canada
  - Department of Public Health Science, Queen's University, Kingston, ON, Canada
- Chiquita Hessels
  - Li-Fraumeni Syndrome Association Canada, British Columbia, Canada
- Tania Bubela
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Dean A. Regier
  - Cancer Control Research, BC Cancer Research Institute, Vancouver, BC, Canada
  - School of Population and Public Health, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
4
Pirmani A, Oldenhof M, Peeters LM, De Brouwer E, Moreau Y. Accessible Ecosystem for Clinical Research (Federated Learning for Everyone): Development and Usability Study. JMIR Form Res 2024; 8:e55496. [PMID: 39018557 PMCID: PMC11292148 DOI: 10.2196/55496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 04/25/2024] [Accepted: 05/15/2024] [Indexed: 07/19/2024] Open
Abstract
BACKGROUND The integrity and reliability of clinical research outcomes rely heavily on access to vast amounts of data. However, the fragmented distribution of these data across multiple institutions, along with ethical and regulatory barriers, presents significant challenges to accessing relevant data. While federated learning offers a promising solution to leverage insights from fragmented data sets, its adoption faces hurdles due to implementation complexities, scalability issues, and inclusivity challenges. OBJECTIVE This paper introduces Federated Learning for Everyone (FL4E), an accessible framework facilitating multistakeholder collaboration in clinical research. It focuses on simplifying federated learning through an innovative ecosystem-based approach. METHODS The "degree of federation" is a fundamental concept of FL4E, allowing for flexible integration of federated and centralized learning models. This feature provides a customizable solution by enabling users to choose the level of data decentralization based on specific health care settings or project needs, making federated learning more adaptable and efficient. By using an ecosystem-based collaborative learning strategy, FL4E encourages a comprehensive platform for managing real-world data, enhancing collaboration and knowledge sharing among its stakeholders. RESULTS Evaluating FL4E's effectiveness using real-world health care data sets has highlighted its ecosystem-oriented and inclusive design. By applying hybrid models to 2 distinct analytical tasks (classification and survival analysis) within real-world settings, we have effectively measured the "degree of federation" across various contexts. These evaluations show that FL4E's hybrid models not only match the performance of fully federated models but also avoid the substantial overhead usually linked with these models. Achieving this balance greatly enhances collaborative initiatives and broadens the scope of analytical possibilities within the ecosystem. CONCLUSIONS FL4E represents a significant step forward in collaborative clinical research by merging the benefits of centralized and federated learning. Its modular ecosystem-based design and the "degree of federation" feature make it an inclusive, customizable framework suitable for a wide array of clinical research scenarios, promising to revolutionize the field through improved collaboration and data use. Detailed implementation and analyses are available on the associated GitHub repository.
Affiliation(s)
- Ashkan Pirmani
  - ESAT-STADIUS, KU Leuven, Leuven, Belgium
  - Data Science Institute, Hasselt University, Diepenbeek, Belgium
  - University Multiple Sclerosis Center, Hasselt University, Diepenbeek, Belgium
  - Biomedical Research Institute, Hasselt University, Diepenbeek, Belgium
- Liesbet M Peeters
  - Data Science Institute, Hasselt University, Diepenbeek, Belgium
  - University Multiple Sclerosis Center, Hasselt University, Diepenbeek, Belgium
  - Biomedical Research Institute, Hasselt University, Diepenbeek, Belgium
5
Zhang F, Kreuter D, Chen Y, Dittmer S, Tull S, Shadbahr T, Preller J, Rudd JH, Aston JA, Schönlieb CB, Gleadall N, Roberts M. Recent methodological advances in federated learning for healthcare. Patterns (N Y) 2024; 5:101006. [PMID: 39005485 PMCID: PMC11240178 DOI: 10.1016/j.patter.2024.101006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
For healthcare datasets, it is often impossible to combine data samples from multiple sites due to ethical, privacy, or logistical concerns. Federated learning allows for the utilization of powerful machine learning algorithms without requiring the pooling of data. Healthcare data have many simultaneous challenges, such as highly siloed data, class imbalance, missing data, distribution shifts, and non-standardized variables, that require new methodologies to address. Federated learning adds significant methodological complexity to conventional centralized machine learning, requiring distributed optimization, communication between nodes, aggregation of models, and redistribution of models. In this systematic review, we consider all papers on Scopus published between January 2015 and February 2023 that describe new federated learning methodologies for addressing challenges with healthcare data. We reviewed 89 papers meeting these criteria. Significant systemic issues were identified throughout the literature, compromising many methodologies reviewed. We give detailed recommendations to help improve methodology development for federated learning in healthcare.
Affiliation(s)
- Fan Zhang
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- Daniel Kreuter
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- Yichen Chen
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- Sören Dittmer
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
  - ZeTeM, University of Bremen, Bremen, Germany
- Samuel Tull
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- Tolou Shadbahr
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Jacobus Preller
  - Addenbrooke’s Hospital, Cambridge University Hospitals NHS Trust, Cambridge, UK
- James H.F. Rudd
  - Department of Medicine, University of Cambridge, Cambridge, UK
- John A.D. Aston
  - Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, UK
- Carola-Bibiane Schönlieb
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- Michael Roberts
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
  - Department of Medicine, University of Cambridge, Cambridge, UK
6
D'Amico S, Dall’Olio L, Rollo C, Alonso P, Prada-Luengo I, Dall’Olio D, Sala C, Sauta E, Asti G, Lanino L, Maggioni G, Campagna A, Zazzetti E, Delleani M, Bicchieri ME, Morandini P, Savevski V, Arroyo B, Parras J, Zhao LP, Platzbecker U, Diez-Campelo M, Santini V, Fenaux P, Haferlach T, Krogh A, Zazo S, Fariselli P, Sanavia T, Della Porta MG, Castellani G. MOSAIC: An Artificial Intelligence-Based Framework for Multimodal Analysis, Classification, and Personalized Prognostic Assessment in Rare Cancers. JCO Clin Cancer Inform 2024; 8:e2400008. [PMID: 38875514 PMCID: PMC11371092 DOI: 10.1200/cci.24.00008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 03/14/2024] [Accepted: 04/15/2024] [Indexed: 06/16/2024] Open
Abstract
PURPOSE Rare cancers constitute over 20% of human neoplasms, often affecting patients with unmet medical needs. The development of effective classification and prognostication systems is crucial to improve the decision-making process and drive innovative treatment strategies. We have created and implemented MOSAIC, an artificial intelligence (AI)-based framework designed for multimodal analysis, classification, and personalized prognostic assessment in rare cancers. Clinical validation was performed on myelodysplastic syndrome (MDS), a rare hematologic cancer with clinical and genomic heterogeneities. METHODS We analyzed 4,427 patients with MDS divided into training and validation cohorts. Deep learning methods were applied to integrate and impute clinical/genomic features. Clustering was performed by combining Uniform Manifold Approximation and Projection for Dimension Reduction + Hierarchical Density-Based Spatial Clustering of Applications with Noise (UMAP + HDBSCAN) methods, compared with the conventional Hierarchical Dirichlet Process (HDP). Linear and AI-based nonlinear approaches were compared for survival prediction. Explainable AI (Shapley Additive Explanations approach [SHAP]) and federated learning were used to improve the interpretation and the performance of the clinical models, integrating them into distributed infrastructure. RESULTS UMAP + HDBSCAN clustering obtained a more granular patient stratification, achieving a higher average silhouette coefficient (0.16) with respect to HDP (0.01) and higher balanced accuracy in cluster classification by Random Forest (92.7% ± 1.3% and 85.8% ± 0.8%). AI methods for survival prediction outperform conventional statistical techniques and the reference prognostic tool for MDS. Nonlinear Gradient Boosting Survival stands out in the internal (Concordance-Index [C-Index], 0.77; SD, 0.01) and external validation (C-Index, 0.74; SD, 0.02). SHAP analysis revealed that similar features drove patients' subgroups and outcomes in both training and validation cohorts. Federated implementation improved the accuracy of developed models. CONCLUSION MOSAIC provides an explainable and robust framework to optimize classification and prognostic assessment of rare cancers. AI-based approaches demonstrated superior accuracy in capturing genomic similarities and providing individual prognostic information compared with conventional statistical methods. Its federated implementation ensures broad clinical application, guaranteeing high performance and data protection.
Affiliation(s)
- Saverio D'Amico
  - Humanitas Clinical and Research Center – IRCCS, Milan, Italy
  - Train s.r.l., Milan, Italy
- Cesare Rollo
  - Computational Biomedicine Unit, Department of Medical Sciences, University of Turin, Turin, Italy
- Patricia Alonso
  - Department of Signals, Systems and Radiocommunications, Polytechnic University of Madrid, Madrid, Spain
- Claudia Sala
  - Experimental, Diagnostic and Specialty Medicine – DIMES, Bologna, Italy
- Gianluca Asti
  - Humanitas Clinical and Research Center – IRCCS, Milan, Italy
- Luca Lanino
  - Humanitas Clinical and Research Center – IRCCS, Milan, Italy
- Elena Zazzetti
  - Humanitas Clinical and Research Center – IRCCS, Milan, Italy
- Borja Arroyo
  - Department of Signals, Systems and Radiocommunications, Polytechnic University of Madrid, Madrid, Spain
- Juan Parras
  - Department of Signals, Systems and Radiocommunications, Polytechnic University of Madrid, Madrid, Spain
- Lin Pierre Zhao
  - Hematology and Bone Marrow Transplantation, Hôpital Saint-Louis/University Paris 7, Paris, France
- Uwe Platzbecker
  - Medical Clinic and Policlinic 1, Hematology and Cellular Therapy, University Hospital Leipzig, Leipzig, Germany
- Maria Diez-Campelo
  - Hematology Department, Hospital Universitario de Salamanca, Salamanca, Spain
- Valeria Santini
  - Hematology, Azienda Ospedaliero-Universitaria Careggi & University of Florence, Florence, Italy
- Pierre Fenaux
  - Hematology and Bone Marrow Transplantation, Hôpital Saint-Louis/University Paris 7, Paris, France
- Santiago Zazo
  - Department of Signals, Systems and Radiocommunications, Polytechnic University of Madrid, Madrid, Spain
- Piero Fariselli
  - Computational Biomedicine Unit, Department of Medical Sciences, University of Turin, Turin, Italy
- Tiziana Sanavia
  - Computational Biomedicine Unit, Department of Medical Sciences, University of Turin, Turin, Italy
- Matteo Giovanni Della Porta
  - Humanitas Clinical and Research Center – IRCCS, Milan, Italy
  - Department of Biomedical Sciences, Humanitas University, Milan, Italy
- Gastone Castellani
  - Department of Physics and Astronomy (DIFA), Bologna, Italy
  - Experimental, Diagnostic and Specialty Medicine – DIMES, Bologna, Italy
7
Acharya N, Natarajan K. Development and Validation of an Individual Socioeconomic Deprivation Index (ISDI) in the NIH's All of Us Data Network. AMIA Jt Summits Transl Sci Proc 2024; 2024:36-45. [PMID: 38827060 PMCID: PMC11141807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
Many of the existing composite social determinant of health (SDOH) indices, such as the Area Deprivation Index, are constrained by their reliance on geographic approximations and American Community Survey data. This study builds on the body of literature around deprivation indices to construct an individual socioeconomic deprivation index (ISDI) within the NIH's All of Us Data Network by using weighted multiple correspondence analysis on SDOH data elements collected at the participant level. In this study, the correlation between ISDI and another area-approximated index is assessed to the extent possible, along with the changes in an AI model's performance due to stratified sampling based on ISDI quintiles. Individual-level deprivation indices may have a wide range of utility, particularly in the context of precision medicine in both centralized and distributed data networks.
Affiliation(s)
- Nripendra Acharya
  - Columbia University Medical Center, Department of Biomedical Informatics, New York, New York
- Karthik Natarajan
  - Columbia University Medical Center, Department of Biomedical Informatics, New York, New York
8
Su C, Wei J, Lei Y, Xuan H, Li J. Empowering precise advertising with Fed-GANCC: A novel federated learning approach leveraging Generative Adversarial Networks and group clustering. PLoS One 2024; 19:e0298261. [PMID: 38598458 PMCID: PMC11006173 DOI: 10.1371/journal.pone.0298261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Accepted: 01/22/2024] [Indexed: 04/12/2024] Open
Abstract
In the realm of targeted advertising, the demand for precision is paramount, and the traditional centralized machine learning paradigm fails to address this necessity effectively. Two critical challenges persist in the current advertising ecosystem: the data privacy concerns leading to isolated data islands and the complexity in handling non-Independent and Identically Distributed (non-IID) data and concept drift due to the specificity and diversity in user behavior data. Current federated learning frameworks struggle to overcome these hurdles satisfactorily. This paper introduces Fed-GANCC, an innovative federated learning framework that synergizes Generative Adversarial Networks (GANs) and Group Clustering. The framework incorporates a user data augmentation algorithm predicated on adversarial generative networks to enrich user behavior data, curtail the impact of non-uniform data distribution, and enhance the applicability of the global machine learning model. Unlike traditional approaches, our framework offers user data augmentation algorithms based on adversarial generative networks, which not only enriches user behavior data but also reduces the challenges posed by non-uniform data distribution, thereby enhancing the applicability of the global machine learning (ML) model. The effectiveness of Fed-GANCC is distinctly showcased through experimental results, outperforming contemporary methods like FED-AVG and FED-SGD in terms of accuracy, loss value, and receiver operating characteristic (ROC) indicators within the same computing time. Experimental results vindicate the effectiveness of Fed-GANCC, revealing substantial enhancements in accuracy, loss value, and receiver operating characteristic (ROC) metrics compared to FED-AVG and FED-SGD given the same computational time. These outcomes underline Fed-GANCC's exceptional prowess in mitigating issues such as isolated data islands, non-IID data, and concept drift. With its novel approach to addressing the prevailing challenges in targeted advertising such as isolated data islands, non-IID data, and concept drift, the Fed-GANCC framework stands as a benchmark, paving the way for future advancements in federated learning solutions tailored for the advertising domain. The Fed-GANCC framework promises to offer pivotal insights for the future development of efficient and advanced federated learning solutions for targeted advertising.
Affiliation(s)
- Caiyu Su
  - Guangxi Vocational & Technical Institute of Industry, Nanning, Guangxi, China
- Jinri Wei
  - Guangxi Vocational & Technical Institute of Industry, Nanning, Guangxi, China
- Yuan Lei
  - Universiti Pendidikan Sultan Idris, Tanjong Malim, Perak, Malaysia
- Hongkun Xuan
  - Guangxi University of Foreign Languages, Nanning, Guangxi, China
- Jiahui Li
  - Guangxi University of Foreign Languages, Nanning, Guangxi, China
9
Teo ZL, Jin L, Li S, Miao D, Zhang X, Ng WY, Tan TF, Lee DM, Chua KJ, Heng J, Liu Y, Goh RSM, Ting DSW. Federated machine learning in healthcare: A systematic review on clinical applications and technical architecture. Cell Rep Med 2024; 5:101419. [PMID: 38340728 PMCID: PMC10897620 DOI: 10.1016/j.xcrm.2024.101419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 11/17/2023] [Accepted: 01/18/2024] [Indexed: 02/12/2024]
Abstract
Federated learning (FL) is a distributed machine learning framework that is gaining traction in view of increasing health data privacy protection needs. By conducting a systematic review of FL applications in healthcare, we identify relevant articles in scientific, engineering, and medical journals in English up to August 31st, 2023. Out of a total of 22,693 articles under review, 612 articles are included in the final analysis. The majority of articles are proof-of-concept studies, and only 5.2% are studies with real-life application of FL. Radiology and internal medicine are the most common specialties involved in FL. FL is robust to a variety of machine learning models and data types, with neural networks and medical imaging being the most common, respectively. We highlight the need to address the barriers to clinical translation and to assess its real-world impact in this new digital data-driven healthcare scene.
Affiliation(s)
- Zhen Ling Teo
  - Singapore National Eye Centre, Singapore, Singapore; Singapore Eye Research Institute, Singapore, Singapore
- Liyuan Jin
  - Singapore Eye Research Institute, Singapore, Singapore; Duke-NUS Medical School, Singapore, Singapore
- Siqi Li
  - Singapore Eye Research Institute, Singapore, Singapore; Duke-NUS Medical School, Singapore, Singapore
- Di Miao
  - Singapore Eye Research Institute, Singapore, Singapore; Duke-NUS Medical School, Singapore, Singapore
- Xiaoman Zhang
  - Singapore Eye Research Institute, Singapore, Singapore; Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- Wei Yan Ng
  - Singapore National Eye Centre, Singapore, Singapore; Singapore Eye Research Institute, Singapore, Singapore
- Ting Fang Tan
  - Singapore National Eye Centre, Singapore, Singapore; Singapore Eye Research Institute, Singapore, Singapore
- Deborah Meixuan Lee
  - Singapore Eye Research Institute, Singapore, Singapore; Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore, Singapore
- Kai Jie Chua
  - Singapore National Eye Centre, Singapore, Singapore; Singapore Eye Research Institute, Singapore, Singapore
- John Heng
  - Singapore National Eye Centre, Singapore, Singapore; Singapore Eye Research Institute, Singapore, Singapore
- Yong Liu
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- Rick Siow Mong Goh
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- Daniel Shu Wei Ting
  - Singapore National Eye Centre, Singapore, Singapore; Singapore Eye Research Institute, Singapore, Singapore; Duke-NUS Medical School, Singapore, Singapore
10
Li A, Mullin S, Elkin PL. Improving Prediction of Survival for Extremely Premature Infants Born at 23 to 29 Weeks Gestational Age in the Neonatal Intensive Care Unit: Development and Evaluation of Machine Learning Models. JMIR Med Inform 2024; 12:e42271. [PMID: 38354033 PMCID: PMC10902770 DOI: 10.2196/42271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 02/02/2023] [Accepted: 12/28/2023] [Indexed: 03/02/2024] Open
Abstract
BACKGROUND Infants born at extremely preterm gestational ages are typically admitted to the neonatal intensive care unit (NICU) after initial resuscitation. The subsequent hospital course can be highly variable, and despite counseling aided by available risk calculators, there are significant challenges with shared decision-making regarding life support and transition to end-of-life care. Improving predictive models can help providers and families navigate these unique challenges. OBJECTIVE Machine learning methods have previously demonstrated added predictive value for determining intensive care unit outcomes, and their use allows consideration of a greater number of factors that potentially influence newborn outcomes, such as maternal characteristics. Machine learning-based models were analyzed for their ability to predict the survival of extremely preterm neonates at initial admission. METHODS Maternal and newborn information was extracted from the health records of infants born between 23 and 29 weeks of gestation in the Medical Information Mart for Intensive Care III (MIMIC-III) critical care database. Applicable machine learning models predicting survival during the initial NICU admission were developed and compared. The same type of model was also examined using only features that would be available prepartum for the purpose of survival prediction prior to an anticipated preterm birth. Features most correlated with the predicted outcome were determined when possible for each model. RESULTS Of included patients, 37 of 459 (8.1%) expired. The resulting random forest model showed higher predictive performance than the frequently used Score for Neonatal Acute Physiology With Perinatal Extension II (SNAPPE-II) NICU model when considering extremely preterm infants of very low birth weight. Several other machine learning models were found to have good performance but did not show a statistically significant difference from previously available models in this study. Feature importance varied by model, and those of greater importance included gestational age; birth weight; initial oxygenation level; elements of the APGAR (appearance, pulse, grimace, activity, and respiration) score; and amount of blood pressure support. Important prepartum features also included maternal age, steroid administration, and the presence of pregnancy complications. CONCLUSIONS Machine learning methods have the potential to provide robust prediction of survival in the context of extremely preterm births and allow for consideration of additional factors such as maternal clinical and socioeconomic information. Evaluation of larger, more diverse data sets may provide additional clarity on comparative performance.
Affiliation(s)
- Angie Li
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, United States
| | - Sarah Mullin
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, United States
| | - Peter L Elkin
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, United States
| |
|
11
|
Soltan AAS, Thakur A, Yang J, Chauhan A, D'Cruz LG, Dickson P, Soltan MA, Thickett DR, Eyre DW, Zhu T, Clifton DA. A scalable federated learning solution for secondary care using low-cost microcomputing: privacy-preserving development and evaluation of a COVID-19 screening test in UK hospitals. Lancet Digit Health 2024; 6:e93-e104. [PMID: 38278619 DOI: 10.1016/s2589-7500(23)00226-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/04/2023] [Revised: 10/17/2023] [Accepted: 10/30/2023] [Indexed: 01/28/2024]
Abstract
BACKGROUND Multicentre training could reduce biases in medical artificial intelligence (AI); however, ethical, legal, and technical considerations can constrain the ability of hospitals to share data. Federated learning enables institutions to participate in algorithm development while retaining custody of their data, but uptake in hospitals has been limited, possibly as deployment requires specialist software and technical expertise at each site. We previously developed an artificial intelligence-driven screening test for COVID-19 in emergency departments, known as CURIAL-Lab, which uses vital signs and blood tests that are routinely available within 1 h of a patient's arrival. Here we aimed to federate our COVID-19 screening test by developing an easy-to-use embedded system, which we introduce as full-stack federated learning, to train and evaluate machine learning models across four UK hospital groups without centralising patient data. METHODS We supplied a Raspberry Pi 4 Model B preloaded with our federated learning software pipeline to four National Health Service (NHS) hospital groups in the UK: Oxford University Hospitals NHS Foundation Trust (OUH; through the locally linked research University, University of Oxford), University Hospitals Birmingham NHS Foundation Trust (UHB), Bedfordshire Hospitals NHS Foundation Trust (BH), and Portsmouth Hospitals University NHS Trust (PUH). OUH, PUH, and UHB participated in federated training, training a deep neural network and logistic regressor over 150 rounds to form and calibrate a global model to predict COVID-19 status, using clinical data from patients admitted before the pandemic (COVID-19-negative) and testing positive for COVID-19 during the first wave of the pandemic. We conducted a federated evaluation of the global model for admissions during the second wave of the pandemic at OUH, PUH, and externally at BH.
For OUH and PUH, we additionally performed local fine-tuning of the global model using the sites' individual training data, forming a site-tuned model, and evaluated the resultant model for admissions during the second wave of the pandemic. This study included data collected between Dec 1, 2018, and March 1, 2021; the exact date ranges used varied by site. The primary outcome was overall model performance, measured as the area under the receiver operating characteristic curve (AUROC). Removable micro secure digital (microSD) storage was destroyed on study completion. FINDINGS Clinical data from 130 941 patients (1772 COVID-19-positive), routinely collected across three hospital groups (OUH, PUH, and UHB), were included in federated training. The evaluation step included data from 32 986 patients (3549 COVID-19-positive) attending OUH, PUH, or BH during the second wave of the pandemic. Federated training of a global deep neural network classifier improved upon performance of models trained locally in terms of AUROC by a mean of 27·6% (SD 2·2): AUROC increased from 0·574 (95% CI 0·560-0·589) at OUH and 0·622 (0·608-0·637) at PUH using the locally trained models to 0·872 (0·862-0·882) at OUH and 0·876 (0·865-0·886) at PUH using the federated global model. Performance improvement was smaller for a logistic regression model, with a mean increase in AUROC of 13·9% (0·5%). During federated external evaluation at BH, AUROC for the global deep neural network model was 0·917 (0·893-0·942), with 89·7% sensitivity (83·6-93·6) and 76·6% specificity (73·9-79·1). Site-specific tuning of the global model did not significantly improve performance (change in AUROC <0·01). INTERPRETATION We developed an embedded system for federated learning, using microcomputing to optimise for ease of deployment. We deployed full-stack federated learning across four UK hospital groups to develop a COVID-19 screening test without centralising patient data. 
Federation improved model performance, and the resultant global models were generalisable. Full-stack federated learning could enable hospitals to contribute to AI development at low cost and without specialist technical expertise at each site. FUNDING The Wellcome Trust, University of Oxford Medical and Life Sciences Translational Fund.
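The primary outcome in the abstract above is the area under the receiver operating characteristic curve (AUROC). As a reading aid, the following minimal sketch computes AUROC from its pairwise definition: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. It is not the study's code; the function name `auroc` and the toy labels and scores are hypothetical.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = COVID-19-positive, 0 = negative.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 0.75
```

The pairwise form is quadratic in the number of cases, so it suits illustration only; rank-based implementations are the usual choice for cohorts of the size reported here.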
Affiliation(s)
- Andrew A S Soltan
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK; Department of Oncology, University of Oxford, Oxford, UK; Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK; Big Data Institute, Nuffield Department of Population Health, University of Oxford, Oxford, UK; Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK.
| | - Anshul Thakur
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
| | - Jenny Yang
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
| | - Anoop Chauhan
- Portsmouth Hospitals University NHS Trust, Portsmouth, UK
| | - Leon G D'Cruz
- Portsmouth Hospitals University NHS Trust, Portsmouth, UK
| | | | - Marina A Soltan
- The Queen Elizabeth Hospital, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
| | - David R Thickett
- The Queen Elizabeth Hospital, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
| | - David W Eyre
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK; Big Data Institute, Nuffield Department of Population Health, University of Oxford, Oxford, UK; NIHR Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford and Public Health England, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford, UK
| | - Tingting Zhu
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
| | - David A Clifton
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford, UK; Oxford-Suzhou Centre for Advanced Research, Suzhou, China
| |
|
12
|
Pan W, Xu Z, Rajendran S, Wang F. An adaptive federated learning framework for clinical risk prediction with electronic health records from multiple hospitals. Patterns (N Y) 2024; 5:100898. [PMID: 38264713 PMCID: PMC10801228 DOI: 10.1016/j.patter.2023.100898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Revised: 09/06/2023] [Accepted: 11/21/2023] [Indexed: 01/25/2024]
Abstract
Clinical risk prediction with electronic health records (EHR) using machine learning has attracted considerable attention in recent years, and one of the key challenges is how to protect data privacy. Federated learning (FL) provides a promising framework for building predictive models by leveraging data from multiple institutions without sharing them. However, data distribution drift across institutions greatly impacts the performance of FL. In this paper, an adaptive FL framework was proposed to address this challenge. Our framework separated the input features into stable, domain-specific, and conditional-irrelevant parts according to their relationships to clinical outcomes. We evaluated this framework on the tasks of predicting the onset risk of sepsis and acute kidney injury (AKI) for patients in the intensive care unit (ICU) from multiple clinical institutions. The results showed that our framework can achieve better prediction performance than existing FL baselines and provide reasonable feature interpretations.
Affiliation(s)
- Weishen Pan
- Department of Population Health Sciences, Weill Cornell Medical College, Cornell University, New York, NY 10065, USA
- Institute of Artificial Intelligence for Digital Health, Weill Cornell Medical College, Cornell University, New York, NY 10065, USA
| | - Zhenxing Xu
- Department of Population Health Sciences, Weill Cornell Medical College, Cornell University, New York, NY 10065, USA
- Institute of Artificial Intelligence for Digital Health, Weill Cornell Medical College, Cornell University, New York, NY 10065, USA
| | - Suraj Rajendran
- Tri-Institutional Computational Biology & Medicine Program, Weill Cornell Medical College, Cornell University, New York, NY 10065, USA
| | - Fei Wang
- Department of Population Health Sciences, Weill Cornell Medical College, Cornell University, New York, NY 10065, USA
- Institute of Artificial Intelligence for Digital Health, Weill Cornell Medical College, Cornell University, New York, NY 10065, USA
| |
|
13
|
Choi G, Cha WC, Lee SU, Shin SY. Survey of Medical Applications of Federated Learning. Healthc Inform Res 2024; 30:3-15. [PMID: 38359845 PMCID: PMC10879826 DOI: 10.4258/hir.2024.30.1.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 01/23/2024] [Accepted: 01/24/2024] [Indexed: 02/17/2024] Open
Abstract
OBJECTIVES Medical artificial intelligence (AI) has recently attracted considerable attention. However, training medical AI models is challenging due to privacy-protection regulations. Among the proposed solutions, federated learning (FL) stands out. FL involves transmitting only model parameters without sharing the original data, making it particularly suitable for the medical field, where data privacy is paramount. This study reviews the application of FL in the medical domain. METHODS We conducted a literature search using the keywords "federated learning" in combination with "medical," "healthcare," or "clinical" on Google Scholar and PubMed. After reviewing titles and abstracts, 58 papers were selected for analysis. These FL studies were categorized based on the types of data used, the target disease, the use of open datasets, the local model of FL, and the neural network model. We also examined issues related to heterogeneity and security. RESULTS In the investigated FL studies, the most commonly used data type was image data, and the most studied target diseases were cancer and COVID-19. The majority of studies utilized open datasets. Furthermore, 72% of the FL articles addressed heterogeneity issues, while 50% discussed security concerns. CONCLUSIONS FL in the medical domain appears to be in its early stages, with most research using open data and focusing on specific data types and diseases for performance verification purposes. Nonetheless, medical FL research is anticipated to be increasingly applied and to become a vital component of multi-institutional research.
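The abstract above characterises FL as transmitting only model parameters rather than original data. A minimal sketch of one common aggregation rule, sample-size-weighted parameter averaging (often called federated averaging), illustrates the idea. It is not drawn from any of the reviewed studies; the function name `federated_average`, the site sizes, and the toy parameter vectors are hypothetical.

```python
from typing import List, Sequence


def federated_average(
    site_params: Sequence[Sequence[float]],
    site_sizes: Sequence[int],
) -> List[float]:
    """Average per-site parameter vectors, weighted by local sample count."""
    if not site_params or len(site_params) != len(site_sizes):
        raise ValueError("need at least one site and one sample count per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(site_params[0])
    if any(len(params) != dim for params in site_params):
        raise ValueError("all sites must send vectors of the same length")
    # Only parameters and counts reach this function; raw records stay local.
    return [
        sum(params[i] * n for params, n in zip(site_params, site_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical example: three hospitals send locally trained two-parameter models.
hospital_params = [[0.0, 1.0], [1.0, 3.0], [4.0, 5.0]]
hospital_sizes = [100, 100, 200]
global_params = federated_average(hospital_params, hospital_sizes)
print(global_params)  # [2.25, 3.5]
```

In a full training loop the averaged vector would be sent back to the sites for another round of local training; that loop, and any secure-aggregation or privacy layer, is omitted here.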
Affiliation(s)
- Geunho Choi
- Department of Digital Health, SAIHST, Sungkyunkwan University, Seoul, Korea
| | - Won Chul Cha
- Department of Digital Health, SAIHST, Sungkyunkwan University, Seoul, Korea
- Department of Emergency Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - Se Uk Lee
- Department of Emergency Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - Soo-Yong Shin
- Department of Digital Health, SAIHST, Sungkyunkwan University, Seoul, Korea
| |
|
14
|
Salluh JIF, Quintairos A, Dongelmans DA, Aryal D, Bagshaw S, Beane A, Burghi G, López MDPA, Finazzi S, Guidet B, Hashimoto S, Ichihara N, Litton E, Lone NI, Pari V, Sendagire C, Vijayaraghavan BKT, Haniffa R, Pisani L, Pilcher D. National ICU Registries as Enablers of Clinical Research and Quality Improvement. Crit Care Med 2024; 52:125-135. [PMID: 37698452 DOI: 10.1097/ccm.0000000000006050] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 09/13/2023]
Abstract
OBJECTIVES Clinical quality registries (CQRs) have been implemented worldwide by several medical specialties aiming to generate a better characterization of epidemiology, treatments, and outcomes of patients. National ICU registries were created almost 3 decades ago to improve the understanding of case-mix, resource use, and outcomes of critically ill patients. This narrative review describes the challenges, proposed solutions, and evidence generated by National ICU registries as facilitators for research and quality improvement. DATA SOURCES English language articles were identified in PubMed using phrases related to ICU registries, CQRs, outcomes, and case-mix. STUDY SELECTION Original research, review articles, letters, and commentaries were considered. DATA EXTRACTION Data from relevant literature were identified, reviewed, and integrated into a concise narrative review. DATA SYNTHESIS CQRs have been implemented worldwide by several medical specialties aiming to generate a better characterization of epidemiology, treatments, and outcomes of patients. National ICU registries were created almost 3 decades ago to improve the understanding of case-mix, resource use, and outcomes of critically ill patients. The initial experience in European countries and in Oceania ensured that through locally generated data, ICUs could assess their performances by using risk-adjusted measures and compare their results through fair and validated benchmarking metrics with other ICUs contributing to the CQR. The accomplishment of these initiatives, coupled with the increasing adoption of information technology, resulted in a broad geographic expansion of CQRs as well as their use in quality improvement studies, clinical trials, international comparisons, and benchmarking for ICUs.
CONCLUSIONS ICU registries have provided increased knowledge of case-mix and outcomes of ICU patients based on real-world data and contributed to improving care delivery through quality improvement initiatives and trials. Recent increases in adoption of new technologies (e.g., cloud-based structures, artificial intelligence, machine learning) will ensure a broader and better use of data for epidemiology, healthcare policies, quality improvement, and clinical trials.
Affiliation(s)
- Jorge I F Salluh
- D'Or Institute for Research and Education, Rio de Janeiro, Brazil
- Post-Graduation Program, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Amanda Quintairos
- D'Or Institute for Research and Education, Rio de Janeiro, Brazil
- Department of Critical and Intensive Care Medicine, Academic Hospital Fundación Santa Fe de Bogota, Bogota, Colombia
| | - Dave A Dongelmans
- Amsterdam UMC location University of Amsterdam, Department of Intensive Care Medicine, Amsterdam, The Netherlands
- National Intensive Care Evaluation (NICE) Foundation, Amsterdam, The Netherlands
| | - Diptesh Aryal
- National Coordinator, Nepal Intensive Care Research Foundation, Kathmandu, Nepal
| | - Sean Bagshaw
- Department of Medicine, Faculty of Medicine and Dentistry (Ling, Bagshaw), University of Alberta and Alberta Health Services, Edmonton, AB, Canada
- Division of Internal Medicine (Villeneuve), Department of Critical Care Medicine, Faculty of Medicine and Dentistry and School of Public Health, University of Alberta and Grey Nuns Hospitals, Edmonton, AB, Canada
| | - Abigail Beane
- Critical Care, Mahidol Oxford Tropical Medicine Research Unit, Bangkok, Thailand
- Nuffield Department of Clinical Medicine, University of Oxford, Oxford, United Kingdom
| | | | - Maria Del Pilar Arias López
- Argentine Society of Intensive Care (SATI). SATI-Q Program, Buenos Aires, Argentina
- Intermediate Care Unit, Hospital de Niños Ricardo Gutierrez, Buenos Aires, Argentina
| | - Stefano Finazzi
- Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Ranica, Italy
- Associazione GiViTI, c/o Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
| | - Bertrand Guidet
- Sorbonne Université, INSERM, Institut Pierre Louis d'Epidémiologie et de Santé Publique, AP-HP, Hôpital Saint-Antoine, service de réanimation, Paris, France
| | - Satoru Hashimoto
- Division of Intensive Care, Department of Anesthesiology and Intensive Care Medicine, Kyoto Prefectural University of Medicine, Kyoto, Japan
| | - Nao Ichihara
- Department of Healthcare Quality Assessment, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| | - Edward Litton
- Fiona Stanley Hospital, Perth, WA
- The University of Western Australia, Perth, WA
| | - Nazir I Lone
- Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Scottish Intensive Care Society Audit Group, United Kingdom
| | - Vrindha Pari
- Chennai Critical Care Consultants, Pvt Ltd, Chennai, India
| | - Cornelius Sendagire
- D'Or Institute for Research and Education, Rio de Janeiro, Brazil
- Anesthesia and Critical Care, Makerere University College of Health Sciences, Kampala, Uganda
| | | | - Rashan Haniffa
- Critical Care, Mahidol Oxford Tropical Medicine Research Unit, Bangkok, Thailand
- Crit Care Asia, Network for Improving Critical Care Systems and Training, Colombo, Sri Lanka
- Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, United Kingdom
| | - Luigi Pisani
- Critical Care, Mahidol Oxford Tropical Medicine Research Unit, Bangkok, Thailand
| | - David Pilcher
- University College Hospital, London, United Kingdom
- Department of Intensive Care, Alfred Health, Prahran, VIC, Australia
- The Australian and New Zealand Intensive Care Society (ANZICS) Centre for Outcome and Resource Evaluation, Camberwell, Australia
| |
|
15
|
Guckenberger M, Andratschke N, Chung C, Fuller D, Tanadini-Lang S, Jaffray DA. The Future of MR-Guided Radiation Therapy. Semin Radiat Oncol 2024; 34:135-144. [PMID: 38105088 DOI: 10.1016/j.semradonc.2023.10.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Magnetic resonance image-guided radiation therapy (MRIgRT) is a relatively new technology that has already shown outcome benefits but has not yet reached its clinical potential. The improved soft-tissue contrast provided by MR, coupled with the immediacy of image acquisition with respect to the treatment, enables expansion of on-table adaptive protocols, currently at a cost of increased treatment complexity, use of human resources, and longer treatment slot times, which translate to decreased throughput. Many approaches are being investigated to meet these challenges, including the development of artificial intelligence (AI) algorithms to accelerate and automate much of the workflow and improved technology that parallelizes workflow tasks, as well as improvements in image acquisition speed and quality. This article summarizes limitations of currently available integrated MRIgRT systems and gives an outlook on scientific developments to further expand the use of MRIgRT.
Affiliation(s)
- Matthias Guckenberger
- Department of Radiation Oncology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Nicolaus Andratschke
- Department of Radiation Oncology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Caroline Chung
- Division of Radiation Oncology, University of Texas MD Anderson Cancer Center, Houston, TX
| | - Dave Fuller
- Division of Radiation Oncology, University of Texas MD Anderson Cancer Center, Houston, TX
| | - Stephanie Tanadini-Lang
- Department of Radiation Oncology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - David A Jaffray
- Division of Radiation Oncology, University of Texas MD Anderson Cancer Center, Houston, TX
| |
|
16
|
Sharma S, Guleria K. A comprehensive review on federated learning based models for healthcare applications. Artif Intell Med 2023; 146:102691. [PMID: 38042608 DOI: 10.1016/j.artmed.2023.102691] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/23/2023] [Revised: 10/22/2023] [Accepted: 10/22/2023] [Indexed: 12/04/2023]
Abstract
A disease is an abnormal condition that negatively impacts the functioning of the human body. Pathology determines the causes behind the disease and identifies its development mechanism and functional consequences. Each disease has different identification methods, including X-ray scans for pneumonia, COVID-19, and lung cancer, whereas biopsy and CT scans can identify the presence of skin cancer and Alzheimer's disease, respectively. Early disease detection leads to effective treatment and avoids lasting complications. Deep learning has provided a vast number of applications in medical sectors, resulting in accurate and reliable early disease predictions. These models are utilized in the healthcare industry to provide supplementary assistance to doctors in identifying the presence of diseases. These models are mostly trained on secondary data sources, since healthcare institutions refrain from sharing patients' private data to ensure confidentiality; this limits the effectiveness of deep learning models, which require extensive training datasets to achieve optimal results. Federated learning handles the data in a way that does not compromise the privacy of a patient's data. In this work, a wide variety of disease detection models trained through federated learning have been rigorously reviewed. This meta-analysis provides an in-depth review of the federated learning architectures, federated learning types, hyperparameters, dataset utilization details, aggregation techniques, performance measures, and augmentation methods applied in the existing models during the development phase. The review also highlights various open challenges associated with the disease detection models trained through federated learning for future research.
Affiliation(s)
- Shagun Sharma
- Chitkara University Institute of Engineering & Technology, Chitkara University, Rajpura 140401, Punjab, India
| | - Kalpna Guleria
- Chitkara University Institute of Engineering & Technology, Chitkara University, Rajpura 140401, Punjab, India
| |
|
17
|
Li S, Liu P, Nascimento GG, Wang X, Leite FRM, Chakraborty B, Hong C, Ning Y, Xie F, Teo ZL, Ting DSW, Haddadi H, Ong MEH, Peres MA, Liu N. Federated and distributed learning applications for electronic health records and structured medical data: a scoping review. J Am Med Inform Assoc 2023; 30:2041-2049. [PMID: 37639629 PMCID: PMC10654866 DOI: 10.1093/jamia/ocad170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 07/19/2023] [Indexed: 08/31/2023] Open
Abstract
OBJECTIVES Federated learning (FL) has gained popularity in clinical research in recent years to facilitate privacy-preserving collaboration. Structured data, one of the most prevalent forms of clinical data, has experienced significant growth in volume concurrently, notably with the widespread adoption of electronic health records in clinical practice. This review examines FL applications on structured medical data, identifies contemporary limitations, and discusses potential innovations. MATERIALS AND METHODS We searched 5 databases, SCOPUS, MEDLINE, Web of Science, Embase, and CINAHL, to identify articles that applied FL to structured medical data and reported results following the PRISMA guidelines. Each selected publication was evaluated from 3 primary perspectives, including data quality, modeling strategies, and FL frameworks. RESULTS Out of the 1193 papers screened, 34 met the inclusion criteria, with each article consisting of one or more studies that used FL to handle structured clinical/medical data. Of these, 24 utilized data acquired from electronic health records, with clinical predictions and association studies being the most common clinical research tasks that FL was applied to. Only one article exclusively explored the vertical FL setting, while the remaining 33 explored the horizontal FL setting, with only 14 discussing comparisons between single-site (local) and FL (global) analysis. CONCLUSIONS The existing FL applications on structured medical data lack sufficient evaluations of clinically meaningful benefits, particularly when compared to single-site analyses. Therefore, it is crucial for future FL applications to prioritize clinical motivations and develop designs and methodologies that can effectively support and aid clinical practice and research.
Affiliation(s)
- Siqi Li
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
| | - Pinyan Liu
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
| | - Gustavo G Nascimento
- National Dental Research Institute Singapore, National Dental Centre Singapore, Singapore 168938, Singapore
- Oral Health Academic Clinical Programme, Duke-NUS Medical School, Singapore 169857, Singapore
| | - Xinru Wang
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
| | - Fabio Renato Manzolli Leite
- National Dental Research Institute Singapore, National Dental Centre Singapore, Singapore 168938, Singapore
- Oral Health Academic Clinical Programme, Duke-NUS Medical School, Singapore 169857, Singapore
| | - Bibhas Chakraborty
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
- Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore
- Department of Statistics and Data Science, National University of Singapore, Singapore 117546, Singapore
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, United States
| | - Chuan Hong
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, United States
| | - Yilin Ning
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
| | - Feng Xie
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
- Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore
| | - Zhen Ling Teo
- Singapore National Eye Centre, Singapore, Singapore Eye Research Institute, Singapore 168751, Singapore
| | - Daniel Shu Wei Ting
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
- Singapore National Eye Centre, Singapore, Singapore Eye Research Institute, Singapore 168751, Singapore
| | - Hamed Haddadi
- Department of Computing, Imperial College London, London SW7 2AZ, England, United Kingdom
| | - Marcus Eng Hock Ong
- Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore
- Department of Emergency Medicine, Singapore General Hospital, Singapore 169608, Singapore
| | - Marco Aurélio Peres
- National Dental Research Institute Singapore, National Dental Centre Singapore, Singapore 168938, Singapore
- Oral Health Academic Clinical Programme, Duke-NUS Medical School, Singapore 169857, Singapore
- Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore
| | - Nan Liu
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
- Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore
- Institute of Data Science, National University of Singapore, Singapore 117602, Singapore
| |
|
18
|
Maniar KM, Lassarén P, Rana A, Yao Y, Tewarie IA, Gerstl JVE, Recio Blanco CM, Power LH, Mammi M, Mattie H, Smith TR, Mekary RA. Traditional Machine Learning Methods versus Deep Learning for Meningioma Classification, Grading, Outcome Prediction, and Segmentation: A Systematic Review and Meta-Analysis. World Neurosurg 2023; 179:e119-e134. [PMID: 37574189 DOI: 10.1016/j.wneu.2023.08.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 08/06/2023] [Indexed: 08/15/2023]
Abstract
BACKGROUND Meningiomas are common intracranial tumors. Machine learning (ML) algorithms are emerging to improve accuracy in 4 primary domains: classification, grading, outcome prediction, and segmentation. Such algorithms include both traditional approaches that rely on hand-crafted features and deep learning (DL) techniques that utilize automatic feature extraction. The aim of this study was to evaluate the performance of published traditional ML versus DL algorithms in classification, grading, outcome prediction, and segmentation of meningiomas. METHODS A systematic review and meta-analysis were conducted. Major databases were searched through September 2021 for publications evaluating traditional ML versus DL models on meningioma management. Performance measures including pooled sensitivity, specificity, F1-score, area under the receiver-operating characteristic curve, positive and negative likelihood ratios (LR+, LR-) along with their respective 95% confidence intervals (95% CIs) were derived using random-effects models. RESULTS Five hundred thirty-four records were screened, and 43 articles were included, regarding classification (3 articles), grading (29), outcome prediction (7), and segmentation (6) of meningiomas. Of the 29 studies that reported on grading, 10 could be meta-analyzed with 2 DL models (sensitivity 0.89, 95% CI: 0.74-0.96; specificity 0.91, 95% CI: 0.45-0.99; LR+ 10.1, 95% CI: 1.33-137; LR- 0.12, 95% CI: 0.04-0.59) and 8 traditional ML (sensitivity 0.74, 95% CI: 0.62-0.83; specificity 0.93, 95% CI: 0.79-0.98; LR+ 10.5, 95% CI: 2.91-39.5; and LR- 0.28, 95% CI: 0.17-0.49). The insufficient performance metrics reported precluded further statistical analysis of other performance metrics. CONCLUSIONS ML on meningiomas is mostly carried out with traditional methods. For meningioma grading, traditional ML methods generally had a higher LR+, while DL models a lower LR-.
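The likelihood ratios reported above relate to sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below applies those definitions to the pooled point estimates quoted in the abstract. The function name `likelihood_ratios` is hypothetical, and because the published ratios were pooled with random-effects models, they need not match ratios recomputed from pooled sensitivity and specificity exactly.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-) from a sensitivity/specificity pair."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in [0, 1]")
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Pooled point estimates quoted in the abstract (meningioma grading).
dl_pos, dl_neg = likelihood_ratios(0.89, 0.91)  # deep learning
ml_pos, ml_neg = likelihood_ratios(0.74, 0.93)  # traditional ML
print(f"DL: LR+ {dl_pos:.2f}, LR- {dl_neg:.2f}")  # DL: LR+ 9.89, LR- 0.12
print(f"ML: LR+ {ml_pos:.2f}, LR- {ml_neg:.2f}")  # ML: LR+ 10.57, LR- 0.28
```

The recomputed values sit close to the reported LR+ of 10.1 and 10.5 and LR- of 0.12 and 0.28, which is consistent with the abstract's conclusion that traditional ML had the higher LR+ and DL the lower LR-.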
Affiliation(s)
- Krish M Maniar
- Department of Neurosurgery, Computational Neurosciences Outcomes Center (CNOC), Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, United States
- Philipp Lassarén
- Department of Neurosurgery, Computational Neurosciences Outcomes Center (CNOC), Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, United States; Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
- Aakanksha Rana
- Department of Neurosurgery, Computational Neurosciences Outcomes Center (CNOC), Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, United States; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Boston, Massachusetts, United States
- Yuxin Yao
- Department of Pharmaceutical Business and Administrative Sciences, School of Pharmacy, Massachusetts College of Pharmacy and Health Sciences University, Boston, Massachusetts, United States
- Ishaan A Tewarie
- Department of Neurosurgery, Computational Neurosciences Outcomes Center (CNOC), Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, United States; Department of Neurosurgery, Haaglanden Medical Center, The Hague, The Netherlands; Faculty of Medicine, Erasmus University Rotterdam/Erasmus Medical Center Rotterdam, Rotterdam, The Netherlands
- Jakob V E Gerstl
- Department of Neurosurgery, Computational Neurosciences Outcomes Center (CNOC), Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, United States
- Camila M Recio Blanco
- Department of Neurosurgery, Computational Neurosciences Outcomes Center (CNOC), Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, United States; Northeast National University, Corrientes, Argentina; Prisma Salud, Puerto San Julian, Santa Cruz, Argentina
- Liam H Power
- Department of Neurosurgery, Computational Neurosciences Outcomes Center (CNOC), Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, United States; School of Medicine, Tufts University, Boston, Massachusetts, United States
- Marco Mammi
- Neurosurgery Unit, S. Croce e Carle Hospital, Cuneo, Italy
- Heather Mattie
- Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, Massachusetts, United States
- Timothy R Smith
- Department of Neurosurgery, Computational Neurosciences Outcomes Center (CNOC), Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, United States; Department of Neurosurgery, Brigham and Women's Hospital, Harvard University, Boston, Massachusetts, United States
- Rania A Mekary
- Department of Neurosurgery, Computational Neurosciences Outcomes Center (CNOC), Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, United States; Department of Pharmaceutical Business and Administrative Sciences, School of Pharmacy, Massachusetts College of Pharmacy and Health Sciences University, Boston, Massachusetts, United States
|
19
|
Sandhu SS, Gorji HT, Tavakolian P, Tavakolian K, Akhbardeh A. Medical Imaging Applications of Federated Learning. Diagnostics (Basel) 2023; 13:3140. [PMID: 37835883 PMCID: PMC10572559 DOI: 10.3390/diagnostics13193140] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/16/2023] [Revised: 10/03/2023] [Accepted: 10/03/2023] [Indexed: 10/15/2023] Open
Abstract
Since its introduction in 2016, researchers have applied the idea of Federated Learning (FL) to several domains ranging from edge computing to banking. The technique's inherent security benefits, privacy-preserving capabilities, ease of scalability, and ability to transcend data biases have motivated researchers to use this tool on healthcare datasets. While several reviews exist detailing FL and its applications, this review focuses solely on the different applications of FL to medical imaging datasets, grouping applications by diseases, modality, and/or part of the body. This systematic literature review was conducted by querying and consolidating results from ArXiv, IEEE Xplore, and PubMed. Furthermore, we provide a detailed description of FL architecture, models, descriptions of the performance achieved by FL models, and how results compare with traditional Machine Learning (ML) models. Additionally, we discuss the security benefits, highlighting two primary forms of privacy-preserving techniques, including homomorphic encryption and differential privacy. Finally, we provide some background information and context regarding where the contributions lie. The background information is organized into the following categories: architecture/setup type, data-related topics, security, and learning types. While progress has been made within the field of FL and medical imaging, much room for improvement and understanding remains, with an emphasis on security and data issues remaining the primary concerns for researchers. Therefore, improvements are constantly pushing the field forward. Finally, we highlighted the challenges in deploying FL in medical imaging applications and provided recommendations for future directions.
Affiliation(s)
- Sukhveer Singh Sandhu
- Biomedical Engineering Program, University of North Dakota, Grand Forks, ND 58202, USA
- Hamed Taheri Gorji
- Biomedical Engineering Program, University of North Dakota, Grand Forks, ND 58202, USA
- SafetySpect Inc., 4200 James Ray Dr., Grand Forks, ND 58202, USA
- Pantea Tavakolian
- Biomedical Engineering Program, University of North Dakota, Grand Forks, ND 58202, USA
- Kouhyar Tavakolian
- Biomedical Engineering Program, University of North Dakota, Grand Forks, ND 58202, USA
|
20
|
Rehman MHU, Hugo Lopez Pinaya W, Nachev P, Teo JT, Ourselin S, Cardoso MJ. Federated learning for medical imaging radiology. Br J Radiol 2023; 96:20220890. [PMID: 38011227 PMCID: PMC10546441 DOI: 10.1259/bjr.20220890] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/16/2022] [Revised: 07/31/2023] [Accepted: 08/02/2023] [Indexed: 11/29/2023] Open
Abstract
Federated learning (FL) is gaining wide acceptance across the medical AI domains. FL promises to provide a fairly acceptable clinical-grade accuracy, privacy, and generalisability of machine learning models across multiple institutions. However, the research on FL for medical imaging AI is still in its early stages. This paper presents a review of recent research to outline the difference between state-of-the-art [SOTA] (published literature) and state-of-the-practice [SOTP] (applied research in realistic clinical environments). Furthermore, the review outlines the future research directions considering various factors such as data, learning models, system design, governance, and human-in-loop to translate the SOTA into SOTP and effectively collaborate across multiple institutions.
Affiliation(s)
- Parashkev Nachev
- Institute of Neurology, University College London, London, United Kingdom
- James T. Teo
- King’s College Hospital, NHS Foundation Trust, London, United Kingdom
|
21
|
Li S, Ning Y, Ong MEH, Chakraborty B, Hong C, Xie F, Yuan H, Liu M, Buckland DM, Chen Y, Liu N. FedScore: A privacy-preserving framework for federated scoring system development. J Biomed Inform 2023; 146:104485. [PMID: 37660960 DOI: 10.1016/j.jbi.2023.104485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Revised: 08/08/2023] [Accepted: 08/31/2023] [Indexed: 09/05/2023]
Abstract
OBJECTIVE We propose FedScore, a privacy-preserving federated learning framework for scoring system generation across multiple sites to facilitate cross-institutional collaborations. MATERIALS AND METHODS The FedScore framework includes five modules: federated variable ranking, federated variable transformation, federated score derivation, federated model selection and federated model evaluation. To illustrate usage and assess FedScore's performance, we built a hypothetical global scoring system for mortality prediction within 30 days after a visit to an emergency department using 10 simulated sites divided from a tertiary hospital in Singapore. We employed a pre-existing score generator to construct 10 local scoring systems independently at each site and we also developed a scoring system using centralized data for comparison. RESULTS We compared the acquired FedScore model's performance with that of other scoring models using the receiver operating characteristic (ROC) analysis. The FedScore model achieved an average area under the curve (AUC) value of 0.763 across all sites, with a standard deviation (SD) of 0.020. We also calculated the average AUC values and SDs for each local model; the FedScore model showed promising accuracy and stability, with a high average AUC value that was closest to that of the pooled model and an SD lower than that of most local models. CONCLUSION This study demonstrates that FedScore is a privacy-preserving scoring system generator with potentially good generalizability.
Affiliation(s)
- Siqi Li
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Yilin Ning
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Marcus Eng Hock Ong
- Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore; Health Services Research Centre, Singapore Health Services, Singapore, Singapore; Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore
- Bibhas Chakraborty
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore; Department of Statistics and Data Science, National University of Singapore, Singapore, Singapore; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
- Chuan Hong
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
- Feng Xie
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Han Yuan
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Mingxuan Liu
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Daniel M Buckland
- Department of Emergency Medicine, Duke University School of Medicine, Durham, NC, USA
- Yong Chen
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Nan Liu
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore; Institute of Data Science, National University of Singapore, Singapore, Singapore
|
22
|
Li W, Kim M, Zhang K, Chen H, Jiang X, Harmanci A. COLLAGENE enables privacy-aware federated and collaborative genomic data analysis. Genome Biol 2023; 24:204. [PMID: 37697426 PMCID: PMC10496350 DOI: 10.1186/s13059-023-03039-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Accepted: 08/16/2023] [Indexed: 09/13/2023] Open
Abstract
Growing regulatory requirements set barriers around genetic data sharing and collaborations. Moreover, existing privacy-aware paradigms are challenging to deploy in collaborative settings. We present COLLAGENE, a tool base for building secure collaborative genomic data analysis methods. COLLAGENE protects data using shared-key homomorphic encryption and combines encryption with multiparty strategies for efficient privacy-aware collaborative method development. COLLAGENE provides ready-to-run tools for encryption/decryption, matrix processing, and network transfers, which can be immediately integrated into existing pipelines. We demonstrate the usage of COLLAGENE by building a practical federated GWAS protocol for binary phenotypes and a secure meta-analysis protocol. COLLAGENE is available at https://zenodo.org/record/8125935.
Affiliation(s)
- Wentao Li
- Center for Secure Artificial Intelligence For hEalthcare (SAFE), D. Bradley McWilliams School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
- Miran Kim
- Department of Mathematics, Department of Computer Science, Hanyang University, Seoul, 04763, Republic of Korea
- Research Institute for Convergence of Basic Science, Hanyang University, Seoul, 04763, Republic of Korea
- Bio-BigData Center, Hanyang Institute of Bioscience and Biotechnology, Hanyang University, Seoul, 04763, Republic of Korea
- Kai Zhang
- Center for Secure Artificial Intelligence For hEalthcare (SAFE), D. Bradley McWilliams School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
- Han Chen
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Center for Precision Health, D. Bradley McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Xiaoqian Jiang
- Center for Secure Artificial Intelligence For hEalthcare (SAFE), D. Bradley McWilliams School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
- Arif Harmanci
- Center for Secure Artificial Intelligence For hEalthcare (SAFE), D. Bradley McWilliams School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
- Center for Precision Health, D. Bradley McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
|
23
|
Diniz JM, Vasconcelos H, Souza J, Rb-Silva R, Ameijeiras-Rodriguez C, Freitas A. Comparing Decentralized Learning Methods for Health Data Models to Nondecentralized Alternatives: Protocol for a Systematic Review. JMIR Res Protoc 2023; 12:e45823. [PMID: 37335606 PMCID: PMC10337426 DOI: 10.2196/45823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 04/27/2023] [Accepted: 04/28/2023] [Indexed: 06/21/2023] Open
Abstract
BACKGROUND Considering the soaring health-related costs directed toward a growing, aging, and comorbid population, the health sector needs effective data-driven interventions while managing rising care costs. While health interventions using data mining have become more robust and adopted, they often demand high-quality big data. However, growing privacy concerns have hindered large-scale data sharing. In parallel, recently introduced legal instruments require complex implementations, especially when it comes to biomedical data. New privacy-preserving technologies, such as decentralized learning, make it possible to create health models without mobilizing data sets by using distributed computation principles. Several multinational partnerships, including a recent agreement between the United States and the European Union, are adopting these techniques for next-generation data science. While these approaches are promising, there is no clear and robust evidence synthesis of health care applications. OBJECTIVE The main aim is to compare the performance among health data models (eg, automated diagnosis and mortality prediction) developed using decentralized learning approaches (eg, federated and blockchain) to those using centralized or local methods. Secondary aims are comparing the privacy compromise and resource use among model architectures. METHODS We will conduct a systematic review using the first-ever registered research protocol for this topic following a robust search methodology, including several biomedical and computational databases. This work will compare health data models differing in development architecture, grouping them according to their clinical applications. For reporting purposes, a PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020 flow diagram will be presented. 
CHARMS (Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies)-based forms will be used for data extraction and to assess the risk of bias, alongside PROBAST (Prediction Model Risk of Bias Assessment Tool). All effect measures in the original studies will be reported. RESULTS The queries and data extractions are expected to start on February 28, 2023, and end by July 31, 2023. The research protocol was registered with PROSPERO, under the number 393126, on February 3, 2023. With this protocol, we detail how we will conduct the systematic review. With that study, we aim to summarize the progress and findings from state-of-the-art decentralized learning models in health care in comparison to their local and centralized counterparts. Results are expected to clarify the consensuses and heterogeneities reported and help guide the research and development of new robust and sustainable applications to address the health data privacy problem, with applicability in real-world settings. CONCLUSIONS We expect to clearly present the status quo of these privacy-preserving technologies in health care. With this robust synthesis of the currently available scientific evidence, the review will inform health technology assessment and evidence-based decisions, from health professionals, data scientists, and policy makers alike. Importantly, it should also guide the development and application of new tools in service of patients' privacy and future research. TRIAL REGISTRATION PROSPERO 393126; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=393126. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) PRR1-10.2196/45823.
Affiliation(s)
- José Miguel Diniz
- CINTESIS-Centre for Health Technology and Services Research, Faculty of Medicine, University of Porto, Porto, Portugal
- PhD Program in Health Data Science, Faculty of Medicine, University of Porto, Porto, Portugal
- Henrique Vasconcelos
- CINTESIS-Centre for Health Technology and Services Research, Faculty of Medicine, University of Porto, Porto, Portugal
- Júlio Souza
- CINTESIS-Centre for Health Technology and Services Research, Faculty of Medicine, University of Porto, Porto, Portugal
- MEDCIDS-Department of Community Medicine, Information and Health Decision Sciences, Faculty of Medicine, University of Porto, Porto, Portugal
- Rita Rb-Silva
- MEDCIDS-Department of Community Medicine, Information and Health Decision Sciences, Faculty of Medicine, University of Porto, Porto, Portugal
- Carolina Ameijeiras-Rodriguez
- MEDCIDS-Department of Community Medicine, Information and Health Decision Sciences, Faculty of Medicine, University of Porto, Porto, Portugal
- Alberto Freitas
- CINTESIS-Centre for Health Technology and Services Research, Faculty of Medicine, University of Porto, Porto, Portugal
- MEDCIDS-Department of Community Medicine, Information and Health Decision Sciences, Faculty of Medicine, University of Porto, Porto, Portugal
|
24
|
Tsai HF, Podder S, Chen PY. Microsystem Advances through Integration with Artificial Intelligence. MICROMACHINES 2023; 14:826. [PMID: 37421059 PMCID: PMC10141994 DOI: 10.3390/mi14040826] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/09/2023] [Revised: 04/04/2023] [Accepted: 04/06/2023] [Indexed: 07/09/2023]
Abstract
Microfluidics is a rapidly growing discipline that involves studying and manipulating fluids at reduced length scale and volume, typically on the scale of micro- or nanoliters. Under the reduced length scale and larger surface-to-volume ratio, advantages of low reagent consumption, faster reaction kinetics, and more compact systems are evident in microfluidics. However, miniaturization of microfluidic chips and systems introduces challenges of stricter tolerances in designing and controlling them for interdisciplinary applications. Recent advances in artificial intelligence (AI) have brought innovation to microfluidics from design, simulation, automation, and optimization to bioanalysis and data analytics. In microfluidics, the Navier-Stokes equations, which are partial differential equations describing viscous fluid motion that in complete form are known to not have a general analytical solution, can be simplified and have fair performance through numerical approximation due to low inertia and laminar flow. Approximation using neural networks trained by rules of physical knowledge introduces a new possibility to predict the physicochemical nature. The combination of microfluidics and automation can produce large amounts of data, where features and patterns that are difficult to discern by a human can be extracted by machine learning. Therefore, integration with AI introduces the potential to revolutionize the microfluidic workflow by enabling the precision control and automation of data analysis. Deployment of smart microfluidics may be tremendously beneficial in various applications in the future, including high-throughput drug discovery, rapid point-of-care-testing (POCT), and personalized medicine. In this review, we summarize key microfluidic advances integrated with AI and discuss the outlook and possibilities of combining AI and microfluidics.
Affiliation(s)
- Hsieh-Fu Tsai
- Department of Biomedical Engineering, Chang Gung University, Taoyuan City 333, Taiwan
- Department of Neurosurgery, Chang Gung Memorial Hospital, Keelung, Keelung City 204, Taiwan
- Center for Biomedical Engineering, Chang Gung University, Taoyuan City 333, Taiwan
- Soumyajit Podder
- Department of Biomedical Engineering, Chang Gung University, Taoyuan City 333, Taiwan
- Pin-Yuan Chen
- Department of Biomedical Engineering, Chang Gung University, Taoyuan City 333, Taiwan
- Department of Neurosurgery, Chang Gung Memorial Hospital, Keelung, Keelung City 204, Taiwan
|
25
|
Rajendran S, Xu Z, Pan W, Ghosh A, Wang F. Data heterogeneity in federated learning with Electronic Health Records: Case studies of risk prediction for acute kidney injury and sepsis diseases in critical care. PLOS DIGITAL HEALTH 2023; 2:e0000117. [PMID: 36920974 PMCID: PMC10016691 DOI: 10.1371/journal.pdig.0000117] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/01/2022] [Accepted: 02/10/2023] [Indexed: 03/16/2023]
Abstract
With the wider availability of healthcare data such as Electronic Health Records (EHR), more and more data-driven approaches have been proposed to improve the quality-of-care delivery. Predictive modeling, which aims at building computational models for predicting clinical risk, is a popular research topic in healthcare analytics. However, concerns about privacy of healthcare data may hinder the development of effective predictive models that are generalizable because this often requires rich diverse data from multiple clinical institutions. Recently, federated learning (FL) has demonstrated promise in addressing this concern. However, data heterogeneity from different local participating sites may affect prediction performance of federated models. Because acute kidney injury (AKI) and sepsis are highly prevalent among patients admitted to intensive care units (ICU), the early prediction of these conditions based on AI is an important topic in critical care medicine. In this study, we take AKI and sepsis onset risk prediction in ICU as two examples to explore the impact of data heterogeneity in the FL framework as well as compare performances across frameworks. We built predictive models based on local, pooled, and FL frameworks using EHR data across multiple hospitals. The local framework only used data from each site itself. The pooled framework combined data from all sites. In the FL framework, each local site did not have access to other sites' data. A model was updated locally, and its parameters were shared with a central aggregator, which used them to update the federated model's parameters; the updated parameters were then shared with each site. We found that models built within an FL framework outperformed local counterparts. Then, we analyzed variable importance discrepancies across sites and frameworks. Finally, we explored potential sources of the heterogeneity within the EHR data.
The different distributions of demographic profiles, medication use, and site information contributed to data heterogeneity.
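The federated round described in this abstract (local update, parameters sent to a central aggregator, aggregated parameters sent back to each site) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not name the aggregation rule, so sample-size-weighted averaging in the style of FedAvg is assumed here, and the names `local_update` and `aggregate`, the one-parameter linear model, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple

# (features x, targets y) held privately by one site
Site = Tuple[List[float], List[float]]


def local_update(w: float, site: Site, lr: float = 0.05, steps: int = 20) -> float:
    """Run a few gradient steps for the model y = w * x on one site's own data."""
    xs, ys = site
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


def aggregate(weights: Sequence[float], sizes: Sequence[int]) -> float:
    """Central aggregator: average site parameters weighted by sample count (assumed rule)."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(weights, sizes)) / total


# Toy data: three hypothetical sites whose data roughly follow y = 2x.
sites: List[Site] = [
    ([1.0, 2.0], [2.1, 3.9]),
    ([1.0, 3.0, 4.0], [1.8, 6.3, 8.1]),
    ([2.0], [4.2]),
]

global_w = 0.0
for _ in range(5):  # communication rounds
    # Each site starts from the current global parameter and trains locally;
    # only the parameter, never the raw data, reaches the aggregator.
    local_ws = [local_update(global_w, s) for s in sites]
    global_w = aggregate(local_ws, [len(s[0]) for s in sites])

print(round(global_w, 2))
```

Weighting by sample count is one common choice; with heterogeneous sites, as studied in the paper, other aggregation rules may behave differently.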
Affiliation(s)
- Suraj Rajendran
- Tri-Institutional Computational Biology & Medicine Program, Cornell University, New York, New York, United States of America
- Zhenxing Xu
- Division of Health Informatics, Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, United States of America
- Weishen Pan
- Division of Health Informatics, Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, United States of America
- Arnab Ghosh
- Departments of Medicine, Weill Cornell Medical College, Cornell University, New York, New York, United States of America
- Fei Wang
- Division of Health Informatics, Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, United States of America
|
26
|
Federated machine learning in data-protection-compliant research. NAT MACH INTELL 2023. [DOI: 10.1038/s42256-022-00601-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
|
27
|
Walker SB, Badke CM, Carroll MS, Honegger KS, Fawcett A, Weese-Mayer DE, Sanchez-Pinto LN. Novel approaches to capturing and using continuous cardiorespiratory physiological data in hospitalized children. Pediatr Res 2023; 93:396-404. [PMID: 36329224 DOI: 10.1038/s41390-022-02359-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Revised: 08/16/2022] [Accepted: 10/11/2022] [Indexed: 11/06/2022]
Abstract
Continuous cardiorespiratory physiological monitoring is a cornerstone of care in hospitalized children. The data generated by monitoring devices coupled with machine learning could transform the way we provide care. This scoping review summarizes existing evidence on novel approaches to continuous cardiorespiratory monitoring in hospitalized children. We aimed to identify opportunities for the development of monitoring technology and the use of machine learning to analyze continuous physiological data to improve the outcomes of hospitalized children. We included original research articles published on or after January 1, 2001, involving novel approaches to collect and use continuous cardiorespiratory physiological data in hospitalized children. OVID Medline, PubMed, and Embase databases were searched. We screened 2909 articles and performed full-text extraction of 105 articles. We identified 58 articles describing novel devices or approaches, which were generally small and single-center. In addition, we identified 47 articles that described the use of continuous physiological data in prediction models, but only 7 integrated multidimensional data (e.g., demographics, laboratory results). We identified three areas for development: (1) further validation of promising novel devices; (2) more studies of models integrating multidimensional data with continuous cardiorespiratory data; and (3) further dissemination, implementation, and validation of prediction models using continuous cardiorespiratory data. IMPACT: We performed a comprehensive scoping review of novel approaches to capture and use continuous cardiorespiratory physiological data for monitoring, diagnosis, providing care, and predicting events in hospitalized infants and children, from novel devices to machine learning-based prediction models. 
We identified three key areas for future development: (1) further validation of promising novel devices; (2) more studies of models integrating multidimensional data with continuous cardiorespiratory data; and (3) further dissemination, implementation, and validation of prediction models using cardiorespiratory data.
Affiliation(s)
- Sarah B Walker
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, IL, USA; Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
- Colleen M Badke
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, IL, USA; Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
- Michael S Carroll
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, IL, USA; Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
- Kyle S Honegger
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, IL, USA; Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
- Andrea Fawcett
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, IL, USA; Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
- Debra E Weese-Mayer
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, IL, USA; Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
- L Nelson Sanchez-Pinto
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, IL, USA; Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
|
28
|
Seastedt KP, Schwab P, O’Brien Z, Wakida E, Herrera K, Marcelo PGF, Agha-Mir-Salim L, Frigola XB, Ndulue EB, Marcelo A, Celi LA. Global healthcare fairness: We should be sharing more, not less, data. PLOS DIGITAL HEALTH 2022; 1:e0000102. [PMID: 36812599 PMCID: PMC9931202 DOI: 10.1371/journal.pdig.0000102] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Indexed: 04/16/2023]
Abstract
The availability of large, deidentified health datasets has enabled significant innovation in using machine learning (ML) to better understand patients and their diseases. However, questions remain regarding the true privacy of this data, patient control over their data, and how we regulate data sharing in a way that does not encumber progress or further potentiate biases for underrepresented populations. After reviewing the literature on potential reidentifications of patients in publicly available datasets, we argue that the cost of slowing ML progress, measured in terms of access to future medical innovations and clinical software, is too great to limit sharing data through large publicly available databases for concerns of imperfect data anonymization. This cost is especially great for developing countries where the barriers preventing inclusion in such databases will continue to rise, further excluding these populations and increasing existing biases that favor high-income countries. Preventing artificial intelligence's progress towards precision medicine and sliding back to clinical practice dogma may pose a larger threat than concerns of potential patient reidentification within publicly available datasets. While the risk to patient privacy should be minimized, we believe this risk will never be zero, and society has to determine an acceptable risk threshold below which data sharing can occur, for the benefit of a global medical knowledge system.
Affiliation(s)
- Kenneth P. Seastedt
- Beth Israel Deaconess Medical Center, Department of Surgery, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Patrick Schwab
- GlaxoSmithKline, Artificial Intelligence & Machine Learning, Zug, Switzerland
| | - Zach O’Brien
- Australian and New Zealand Intensive Care Research Centre (ANZIC-RC), Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
| | - Edith Wakida
- Mbarara University of Science and Technology, Mbarara, Uganda
| | - Karen Herrera
- Quality and Patient Safety, Hospital Militar, Managua, Nicaragua
| | - Portia Grace F. Marcelo
- Department of Family & Community Medicine, University of the Philippines, Manila, Philippines
| | - Louis Agha-Mir-Salim
- Institute of Medical Informatics, Charité—Universitätsmedizin Berlin (corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health), Berlin, Germany
- Laboratory for Computational Physiology, Harvard-MIT Division of Health Sciences & Technology, Cambridge, Massachusetts, United States of America
| | - Xavier Borrat Frigola
- Laboratory for Computational Physiology, Harvard-MIT Division of Health Sciences & Technology, Cambridge, Massachusetts, United States of America
- Anesthesiology and Critical Care Department, Hospital Clinic de Barcelona, Barcelona, Spain
| | - Emily Boardman Ndulue
- Department of Journalism, Northeastern University, Boston, Massachusetts, United States of America
| | - Alvin Marcelo
- Department of Surgery, University of the Philippines, Manila, Philippines
| | - Leo Anthony Celi
- Laboratory for Computational Physiology, Harvard-MIT Division of Health Sciences & Technology, Cambridge, Massachusetts, United States of America
- Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
| |
|
29
|
Gottlieb ER, Samuel M, Bonventre JV, Celi LA, Mattie H. Machine Learning for Acute Kidney Injury Prediction in the Intensive Care Unit. Adv Chronic Kidney Dis 2022; 29:431-438. [PMID: 36253026 PMCID: PMC9586459 DOI: 10.1053/j.ackd.2022.06.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/14/2022] [Revised: 06/01/2022] [Accepted: 06/22/2022] [Indexed: 01/25/2023]
Abstract
Machine learning is the field of artificial intelligence in which computers are trained to make predictions or to identify patterns in data through complex mathematical algorithms. It has great potential in critical care to predict outcomes, such as acute kidney injury, and can be used for prognosis and to suggest management strategies. Machine learning can also be used as a research tool to advance our clinical and biochemical understanding of acute kidney injury. In this review, we introduce basic concepts in machine learning and review recent research in each of these domains.
Affiliation(s)
- Eric R Gottlieb
- Renal Section, Brigham and Women's Hospital, Boston, MA; Harvard Medical School, Boston, MA; Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA
| | | | - Joseph V Bonventre
- Renal Section, Brigham and Women's Hospital, Boston, MA; Harvard Medical School, Boston, MA
| | - Leo A Celi
- Harvard Medical School, Boston, MA; Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA; MIT Critical Data, Cambridge, MA; Harvard T.H. Chan School of Public Health, Boston, MA; Beth Israel Deaconess Medical Center, Boston, MA
| | | |
|