1
Tajabadi M, Martin R, Heider D. Privacy-preserving decentralized learning methods for biomedical applications. Comput Struct Biotechnol J 2024; 23:3281-3287. [PMID: 39296807 PMCID: PMC11408144 DOI: 10.1016/j.csbj.2024.08.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2024] [Revised: 08/26/2024] [Accepted: 08/26/2024] [Indexed: 09/21/2024] [Open access]
Abstract
In recent years, decentralized machine learning has emerged as a significant advancement in biomedical applications, offering robust solutions for data privacy, security, and collaboration across diverse healthcare environments. In this review, we examine various decentralized learning methodologies, including federated learning, split learning, swarm learning, gossip learning, edge learning, and some of their applications in the biomedical field. We delve into the underlying principles, network topologies, and communication strategies of each approach, highlighting their advantages and limitations. Ultimately, the selection of a suitable method should be based on specific needs, infrastructures, and computational capabilities.
Affiliation(s)
- Mohammad Tajabadi
- Institute of Computer Science, Heinrich-Heine-University Duesseldorf, Graf-Adolf-Str. 63, Duesseldorf, 40215, North Rhine-Westphalia, Germany
- Center for Digital Medicine, Heinrich-Heine-University Duesseldorf, Moorenstr. 5, Duesseldorf, 40215, North Rhine-Westphalia, Germany
- Roman Martin
- Institute of Computer Science, Heinrich-Heine-University Duesseldorf, Graf-Adolf-Str. 63, Duesseldorf, 40215, North Rhine-Westphalia, Germany
- Center for Digital Medicine, Heinrich-Heine-University Duesseldorf, Moorenstr. 5, Duesseldorf, 40215, North Rhine-Westphalia, Germany
- Dominik Heider
- Institute of Computer Science, Heinrich-Heine-University Duesseldorf, Graf-Adolf-Str. 63, Duesseldorf, 40215, North Rhine-Westphalia, Germany
- Center for Digital Medicine, Heinrich-Heine-University Duesseldorf, Moorenstr. 5, Duesseldorf, 40215, North Rhine-Westphalia, Germany
2
Hausleitner C, Mueller H, Holzinger A, Pfeifer B. Collaborative weighting in federated graph neural networks for disease classification with the human-in-the-loop. Sci Rep 2024; 14:21839. [PMID: 39294334 PMCID: PMC11410954 DOI: 10.1038/s41598-024-72748-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2024] [Accepted: 09/10/2024] [Indexed: 09/20/2024] [Open access]
Abstract
The authors introduce a novel framework that integrates federated learning with Graph Neural Networks (GNNs) to classify diseases, incorporating Human-in-the-Loop methodologies. The framework employs collaborative voting mechanisms on subgraphs within a Protein-Protein Interaction (PPI) network, situated in a federated ensemble-based deep learning context. This approach marks a significant stride in the development of explainable and privacy-aware Artificial Intelligence and contributes to the progression of personalized digital medicine in a responsible and transparent manner.
Affiliation(s)
- Christian Hausleitner
- Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, 8036, Graz, Austria
- Heimo Mueller
- Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, 8036, Graz, Austria
- Andreas Holzinger
- Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, 8036, Graz, Austria
- Human-Centered AI Lab, Institute of Forest Engineering, Department of Forest and Soil Sciences, University of Natural Resources and Life Sciences Vienna, 1190, Vienna, Austria
- Alberta Machine Intelligence Institute, Edmonton, T6G 2R3, Canada
- Bastian Pfeifer
- Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, 8036, Graz, Austria
3
Chereda H, Leha A, Beißbarth T. Stable feature selection utilizing Graph Convolutional Neural Network and Layer-wise Relevance Propagation for biomarker discovery in breast cancer. Artif Intell Med 2024; 151:102840. [PMID: 38658129 DOI: 10.1016/j.artmed.2024.102840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 03/05/2024] [Accepted: 03/10/2024] [Indexed: 04/26/2024]
Abstract
High-throughput technologies are becoming increasingly important in discovering prognostic biomarkers and in identifying novel drug targets. With Mammaprint, Oncotype DX, and many other prognostic molecular signatures, breast cancer is one of the paradigmatic examples of the utility of high-throughput data for delivering prognostic biomarkers that can be represented as a rather short gene list. Such gene lists can be obtained as a set of features (genes) that are important for the decisions of a Machine Learning (ML) method applied to high-dimensional gene expression data. Several studies have identified predictive gene lists for patient prognosis in breast cancer, but these lists are unstable and have only a few genes in common. Instability of feature selection impedes biological interpretability: genes that are relevant for cancer pathology should be members of any predictive gene list obtained for the same clinical type of patients. Stability and interpretability of selected features can be improved by including information on molecular networks in ML methods. Graph Convolutional Neural Network (GCNN) is a contemporary deep learning approach applicable to gene expression data structured by a prior knowledge molecular network. Layer-wise Relevance Propagation (LRP) and SHapley Additive exPlanations (SHAP) are methods to explain individual decisions of deep learning models. We used both GCNN+LRP and GCNN+SHAP techniques to construct feature sets by aggregating individual explanations. We suggest a methodology to systematically and quantitatively analyze the stability, the impact on the classification performance, and the interpretability of the selected feature sets. We used this methodology to compare GCNN+LRP to GCNN+SHAP and to more classical ML-based feature selection approaches. Utilizing a large breast cancer gene expression dataset, we show that, while feature selection with SHAP is useful in applications where selected features have to be impactful for classification performance, among all studied methods GCNN+LRP delivers the most stable (reproducible) and interpretable gene lists.
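The abstract does not say which stability measure the authors use. As a rough illustration of what "few genes in common" means quantitatively, the sketch below assumes one common choice, the mean pairwise Jaccard index over gene lists obtained from different runs. The function names and the example gene lists are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene lists: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty lists are treated as identical (assumption of this sketch).
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_stability(gene_lists):
    """Average Jaccard index over all pairs of gene lists.

    This is an assumed stability measure, not necessarily the one
    used in the paper.
    """
    pairs = list(combinations(gene_lists, 2))
    if not pairs:
        raise ValueError("need at least two gene lists")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical gene lists, e.g. from three resampled training runs.
lists = [
    ["ESR1", "ERBB2", "MKI67"],
    ["ESR1", "ERBB2", "TP53"],
    ["ESR1", "MKI67", "TP53"],
]
stability = mean_pairwise_stability(lists)
print(f"mean pairwise Jaccard stability: {stability:.2f}")
```

A value near 1 would indicate that repeated runs select almost the same genes, while a value near 0 corresponds to the instability the abstract describes.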
Affiliation(s)
- Hryhorii Chereda
- Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany
- Andreas Leha
- Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany; Medical Statistics, University Medical Center Göttingen, Humboldtallee 32, Göttingen, 37073, Germany; Scientific Core Facility Medical Biometry and Statistical Bioinformatics, University Medical Center Göttingen, Humboldtallee 32, Göttingen, 37073, Germany
- Tim Beißbarth
- Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany; Campus-Institute Data Science (CIDAS), University of Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany
4
Metsch JM, Saranti A, Angerschmid A, Pfeifer B, Klemt V, Holzinger A, Hauschild AC. CLARUS: An interactive explainable AI platform for manual counterfactuals in graph neural networks. J Biomed Inform 2024; 150:104600. [PMID: 38301750 DOI: 10.1016/j.jbi.2024.104600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 01/22/2024] [Accepted: 01/22/2024] [Indexed: 02/03/2024]
Abstract
BACKGROUND: Lack of trust in artificial intelligence (AI) models in medicine is still the key obstacle to the use of AI in clinical decision support systems (CDSS). Although AI models already perform excellently in systems medicine, their black-box nature means that patient-specific decisions are incomprehensible to the physician. Explainable AI (XAI) algorithms aim to "explain" to a human domain expert which input features influenced a specific recommendation. However, in the clinical domain, these explanations must lead to some degree of causal understanding by a clinician. RESULTS: We developed the CLARUS platform, aiming to promote human understanding of graph neural network (GNN) predictions. CLARUS enables the visualisation of patient-specific networks, as well as relevance values for genes and interactions computed by XAI methods such as GNNExplainer. This enables domain experts to gain deeper insights into the network and, more importantly, to interactively alter the patient-specific network based on the acquired understanding and initiate re-prediction or retraining. This interactivity allows us to ask manual counterfactual questions and analyse the effects on the GNN prediction. CONCLUSION: We present the first interactive XAI platform prototype, CLARUS, that allows not only the evaluation of specific human counterfactual questions based on user-defined alterations of patient networks and a re-prediction of the clinical outcome, but also a retraining of the entire GNN after changing the underlying graph structures. The platform is currently hosted by the GWDG on https://rshiny.gwdg.de/apps/clarus/.
Affiliation(s)
- Anna Saranti
- Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, Austria; Human-Centered AI Lab, University of Natural Resources and Life Sciences, Vienna, Austria
- Alessa Angerschmid
- Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, Austria; Human-Centered AI Lab, University of Natural Resources and Life Sciences, Vienna, Austria
- Bastian Pfeifer
- Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, Austria
- Vanessa Klemt
- Biomedical Datascience lab, Philipps University Marburg, Germany
- Andreas Holzinger
- Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, Austria; Human-Centered AI Lab, University of Natural Resources and Life Sciences, Vienna, Austria
5
Montaha S, Azam S, Bhuiyan MRI, Chowa SS, Mukta MSH, Jonkman M. Malignancy pattern analysis of breast ultrasound images using clinical features and a graph convolutional network. Digit Health 2024; 10:20552076241251660. [PMID: 38817843 PMCID: PMC11138200 DOI: 10.1177/20552076241251660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 04/12/2024] [Indexed: 06/01/2024] [Open access]
Abstract
Objective: Early diagnosis of breast cancer can lead to effective treatment, possibly increase long-term survival rates, and improve quality of life. The objective of this study is to present an automated analysis and classification system for breast cancer using clinical markers such as tumor shape, orientation, margin, and surrounding tissue. The novelty of the study lies in considering medical features based on the diagnosis of radiologists. Methods: Using clinical markers, a graph is generated in which each feature is represented by a node and the connection between features is represented by an edge derived through Pearson's correlation method. A graph convolutional network (GCN) model is proposed to classify breast tumors as benign or malignant using the graph data. Several statistical tests are performed to assess the importance of the proposed features. The performance of the proposed GCN model is improved by experimenting with different layer configurations and hyper-parameter settings. Results: The proposed model achieves a test accuracy of 98.73%. Its performance is compared with a graph attention network, a one-dimensional convolutional neural network, five transfer learning models, ten machine learning models, and three ensemble learning models. The model was further assessed on three supplementary breast cancer ultrasound image datasets, where the accuracies are 91.03%, 94.37%, and 89.62% for Dataset A, Dataset B, and Dataset C (combining Dataset A and Dataset B), respectively. Overfitting issues are assessed through k-fold cross-validation. Conclusion: Several variants are utilized to present a more rigorous and fair evaluation of our work, especially the importance of extracting clinically relevant features. Moreover, a GCN model using graph data can be a promising solution for an automated feature-based breast image classification system.
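The graph construction described in the Methods can be sketched roughly as follows. This is a minimal illustration, assuming that an edge is drawn between two features when the absolute Pearson correlation reaches a cutoff. The abstract does not state how edges are derived from the correlations, so the threshold of 0.5, the function names, and the toy feature values are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        # Constant feature: correlation is undefined; treated as 0 here.
        return 0.0
    return cov / (sx * sy)


def correlation_edges(columns, threshold=0.5):
    """Build an edge list over features (nodes).

    columns maps a feature name to its values across samples. An edge is
    added when |r| >= threshold (the threshold is an assumed rule).
    """
    names = list(columns)
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(columns[names[i]], columns[names[j]])
            if abs(r) >= threshold:
                edges.append((names[i], names[j], r))
    return edges


# Hypothetical numeric encodings of three clinical markers for six tumors.
columns = {
    "shape": [1, 2, 3, 4, 5, 6],
    "margin": [3, 5, 7, 9, 11, 13],
    "orientation": [1, -1, 1, -1, 1, -1],
}
edges = correlation_edges(columns)
for a, b, r in edges:
    print(f"{a} -- {b} (r = {r:.2f})")
```

The resulting edge list, together with per-node feature values, is the kind of input a GCN library would expect; the model itself is outside the scope of this sketch.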
Affiliation(s)
- Sidratul Montaha
- Department of Computer Science, University of Calgary, Calgary, Canada
- Sami Azam
- Faculty of Science and Technology, Charles Darwin University, Casuarina, Australia
- Sadia Sultana Chowa
- Faculty of Science and Technology, Charles Darwin University, Casuarina, Australia
- Mirjam Jonkman
- Faculty of Science and Technology, Charles Darwin University, Casuarina, Australia