1
|
Zhang H, Liu J, Liu W, Chen H, Yu Z, Yuan Y, Wang P, Qin J. MHD-Net: Memory-Aware Hetero-Modal Distillation Network for Thymic Epithelial Tumor Typing With Missing Pathology Modality. IEEE J Biomed Health Inform 2024; 28:3003-3014. [PMID: 38470599 DOI: 10.1109/jbhi.2024.3376462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/14/2024]
Abstract
Fusing multi-modal radiology and pathology data with complementary information can improve the accuracy of tumor typing. However, collecting pathology data is difficult since it is high-cost and sometimes only obtainable after the surgery, which limits the application of multi-modal methods in diagnosis. To address this problem, we propose comprehensively learning multi-modal radiology-pathology data in training, and only using uni-modal radiology data in testing. Concretely, a Memory-aware Hetero-modal Distillation Network (MHD-Net) is proposed, which can distill well-learned multi-modal knowledge with the assistance of memory from the teacher to the student. In the teacher, to tackle the challenge in hetero-modal feature fusion, we propose a novel spatial-differentiated hetero-modal fusion module (SHFM) that models spatial-specific tumor information correlations across modalities. As only radiology data is accessible to the student, we store pathology features in the proposed contrast-boosted typing memory module (CTMM) that achieves type-wise memory updating and stage-wise contrastive memory boosting to ensure the effectiveness and generalization of memory items. In the student, to improve the cross-modal distillation, we propose a multi-stage memory-aware distillation (MMD) scheme that reads memory-aware pathology features from CTMM to remedy missing modal-specific information. Furthermore, we construct a Radiology-Pathology Thymic Epithelial Tumor (RPTET) dataset containing paired CT and WSI images with annotations. Experiments on the RPTET and CPTAC-LUAD datasets demonstrate that MHD-Net significantly improves tumor typing and outperforms existing multi-modal methods on missing modality situations.
|
2
|
Liu X. Incomplete Multiple Kernel Alignment Maximization for Clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 2024; 46:1412-1424. [PMID: 34596533 DOI: 10.1109/tpami.2021.3116948] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 06/13/2023]
Abstract
The multiple kernel alignment (MKA) maximization criterion has been widely applied to multiple kernel clustering (MKC), and many variants have recently been developed. Although these variants demonstrate superior clustering performance in various applications, none of them can effectively handle incomplete MKC, where some or all of the pre-specified base kernel matrices are incomplete. To address this issue, we propose to integrate the imputation of incomplete kernel matrices and MKA maximization for clustering into a unified learning framework. The clustering of MKA maximization guides the imputation of incomplete kernel elements, and the completed kernel matrices are in turn combined to conduct the subsequent MKC. These two procedures are alternately performed until convergence. In this way, the imputation and MKC processes are seamlessly connected, with the aim of achieving better clustering performance. Besides theoretically analyzing the clustering generalization error bound, we empirically evaluate the clustering performance on several multiple kernel learning (MKL) benchmark datasets, and the results indicate the superiority of our algorithm over existing state-of-the-art counterparts. Our codes and data are publicly available at https://xinwangliu.github.io/.
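As a rough illustration of the quantity that alignment-based clustering methods maximize, the sketch below computes the standard (uncentered) kernel alignment between a base kernel and an "ideal" kernel built from cluster labels. The function names and toy data are assumptions made for illustration; the sketch does not reproduce the imputation or optimization procedure described in the abstract.

```python
import math


def frobenius_inner(a, b):
    """Frobenius inner product <A, B>_F of two equally sized matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def kernel_alignment(k1, k2):
    """Uncentered alignment <K1, K2>_F / (||K1||_F * ||K2||_F)."""
    denom = math.sqrt(frobenius_inner(k1, k1) * frobenius_inner(k2, k2))
    return frobenius_inner(k1, k2) / denom


def ideal_kernel(labels):
    """Ideal kernel: 1 where two samples share a cluster label, else 0."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]


def linear_kernel(points):
    """Linear kernel matrix of pairwise dot products (toy base kernel)."""
    return [[sum(p * q for p, q in zip(x, y)) for y in points] for x in points]


# Hypothetical toy data: two well-separated groups of two points each.
points = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
base = linear_kernel(points)

good = kernel_alignment(base, ideal_kernel([0, 0, 1, 1]))  # matches the groups
bad = kernel_alignment(base, ideal_kernel([0, 1, 0, 1]))   # splits each group
print(good, bad)
```

A cluster assignment that agrees with the structure of the kernel yields a higher alignment than one that cuts across it, which is why alignment can serve as a clustering objective.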
|
3
|
Chen Y, Pan Y, Xia Y, Yuan Y. Disentangle First, Then Distill: A Unified Framework for Missing Modality Imputation and Alzheimer's Disease Diagnosis. IEEE Transactions on Medical Imaging 2023; 42:3566-3578. [PMID: 37450359 DOI: 10.1109/tmi.2023.3295489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2023]
Abstract
Multi-modality medical data provide complementary information, and hence have been widely explored for computer-aided AD diagnosis. However, the research is hindered by the unavoidable missing-data problem, i.e., one data modality was not acquired for some subjects due to various reasons. Although the missing data can be imputed using generative models, the imputation process may introduce unrealistic information to the classification process, leading to poor performance. In this paper, we propose the Disentangle First, Then Distill (DFTD) framework for AD diagnosis using incomplete multi-modality medical images. First, we design a region-aware disentanglement module to disentangle each image into an inter-modality relevant representation and an intra-modality specific representation with emphasis on disease-related regions. To progressively integrate multi-modality knowledge, we then construct an imputation-induced distillation module, in which a lateral inter-modality transition unit is created to impute the representation of the missing modality. The proposed DFTD framework has been evaluated against six existing methods on an ADNI dataset with 1248 subjects. The results show that our method has superior performance in both AD-CN classification and MCI-to-AD prediction tasks, substantially outperforming all competing methods.
|
4
|
Chen Y, Guo X, Pan Y, Xia Y, Yuan Y. Dynamic feature splicing for few-shot rare disease diagnosis. Med Image Anal 2023; 90:102959. [PMID: 37757644 DOI: 10.1016/j.media.2023.102959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Revised: 09/03/2023] [Accepted: 09/05/2023] [Indexed: 09/29/2023]
Abstract
Annotated images for rare disease diagnosis are extremely hard to collect. Therefore, identifying rare diseases under a few-shot learning (FSL) setting is significant. Existing FSL methods transfer useful and global knowledge from base classes with abundant training samples to enrich features of novel classes with few training samples, but still face difficulties when applied to medical images due to the complex lesion characteristics and large intra-class variance. In this paper, we propose a dynamic feature splicing (DNFS) framework for few-shot rare disease diagnosis. Under DNFS, both low-level features (i.e., the output of three convolutional blocks) and high-level features (i.e., the output of the last fully connected layer) of novel classes are dynamically enriched. We construct the position coherent DNFS (P-DNFS) module to perform low-level feature splicing, where a lesion-oriented Transformer is designed to detect lesion regions. Novel-class channels are then replaced by similar base-class channels within the detected lesion regions to achieve disease-related feature enrichment. We also devise a semantic coherent DNFS (S-DNFS) module to perform high-level feature splicing. It explores cross-image channel relations and selects base-class channels with semantic consistency for explicit knowledge transfer. Both low-level and high-level feature splicing are performed dynamically and iteratively. Consequently, abundant spliced features are generated for disease diagnosis, leading to a more accurate decision boundary and improved diagnosis performance. Extensive experiments have been conducted on three medical image classification datasets. Our results suggest that the proposed DNFS achieves superior performance against state-of-the-art approaches.
Affiliation(s)
- Yuanyuan Chen: National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China.
- Xiaoqing Guo: Department of Engineering Science, University of Oxford, Oxford, UK.
- Yongsheng Pan: National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China.
- Yong Xia: National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China.
- Yixuan Yuan: Department of Electronic Engineering, Chinese University of Hong Kong, Hong Kong Special Administrative Region of China; CUHK Shenzhen Research Institute, Shenzhen 518172, China.
|
5
|
Gu Y, Otake Y, Uemura K, Soufi M, Takao M, Talbot H, Okada S, Sugano N, Sato Y. Bone mineral density estimation from a plain X-ray image by learning decomposition into projections of bone-segmented computed tomography. Med Image Anal 2023; 90:102970. [PMID: 37774535 DOI: 10.1016/j.media.2023.102970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Revised: 07/25/2023] [Accepted: 09/11/2023] [Indexed: 10/01/2023]
Abstract
Osteoporosis is a prevalent bone disease that causes fractures in fragile bones, leading to a decline in daily living activities. Dual-energy X-ray absorptiometry (DXA) and quantitative computed tomography (QCT) are highly accurate for diagnosing osteoporosis; however, these modalities require special equipment and scan protocols. To frequently monitor bone health, low-cost, low-dose, and ubiquitously available diagnostic methods are highly anticipated. In this study, we aim to perform bone mineral density (BMD) estimation from a plain X-ray image for opportunistic screening, which is potentially useful for early diagnosis. Existing methods have used multi-stage approaches consisting of extraction of the region of interest and simple regression to estimate BMD, which require a large amount of training data. Therefore, we propose an efficient method that learns decomposition into projections of bone-segmented QCT for BMD estimation under limited datasets. The proposed method achieved high accuracy in BMD estimation, where Pearson correlation coefficients of 0.880 and 0.920 were observed for DXA-measured BMD and QCT-measured BMD estimation tasks, respectively, and the root mean square of the coefficient of variation values were 3.27 to 3.79% for four measurements with different poses. Furthermore, we conducted extensive validation experiments, including multi-pose, uncalibrated-CT, and compression experiments toward actual application in routine clinical practice.
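The two kinds of metric reported above can be computed as in the following minimal sketch. It assumes that the root mean square of the coefficient of variation (RMS-CV) is the square root of the mean squared per-subject CV, with each CV taken as the sample standard deviation divided by the mean of that subject's repeated estimates; the abstract does not spell out this definition. The function names and numbers are made up for illustration and are not the study's data.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rms_cv(repeated):
    """RMS of per-subject CVs; `repeated` holds one list of estimates per subject."""
    cvs = [stdev(values) / mean(values) for values in repeated]
    return math.sqrt(mean([cv * cv for cv in cvs]))


# Hypothetical reference BMD values and model estimates (g/cm^2).
reference = [0.62, 0.75, 0.81, 0.93, 1.04]
estimated = [0.60, 0.78, 0.80, 0.95, 1.01]
print("PCC:", pearson(reference, estimated))

# Hypothetical estimates for two subjects, each imaged in four poses.
poses = [[0.80, 0.82, 0.79, 0.81], [0.95, 0.93, 0.97, 0.96]]
print("RMS-CV (%):", 100 * rms_cv(poses))
```

The correlation measures agreement with the reference measurement across subjects, while RMS-CV measures how stable the estimate is when the same subject is imaged repeatedly.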
Affiliation(s)
- Yi Gu: Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan; CentraleSupélec, Université Paris-Saclay, Inria, Gif-sur-Yvette 91190, France.
- Yoshito Otake: Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan.
- Keisuke Uemura: Department of Orthopaedic Medical Engineering, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan.
- Mazen Soufi: Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan.
- Masaki Takao: Department of Bone and Joint Surgery, Ehime University Graduate School of Medicine, Toon, Ehime 791-0295, Japan.
- Hugues Talbot: CentraleSupélec, Université Paris-Saclay, Inria, Gif-sur-Yvette 91190, France.
- Seiji Okada: Department of Orthopaedics, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan.
- Nobuhiko Sugano: Department of Orthopaedic Medical Engineering, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan.
- Yoshinobu Sato: Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan.
|
6
|
Kwak MG, Su Y, Chen K, Weidman D, Wu T, Lure F, Li J. A Mutual Knowledge Distillation-Empowered AI Framework for Early Detection of Alzheimer's Disease Using Incomplete Multi-Modal Images. medRxiv: The Preprint Server for Health Sciences 2023:2023.08.24.23294574. [PMID: 37662267 PMCID: PMC10473798 DOI: 10.1101/2023.08.24.23294574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2023]
Abstract
Early detection of Alzheimer's Disease (AD) is crucial to ensure timely interventions and optimize treatment outcomes for patients. While integrating multi-modal neuroimages, such as MRI and PET, has shown great promise, limited research has been done to effectively handle incomplete multi-modal image datasets in the integration. To this end, we propose a deep learning-based framework that employs Mutual Knowledge Distillation (MKD) to jointly model different sub-cohorts based on their respective available image modalities. In MKD, the model with more modalities (e.g., MRI and PET) is considered a teacher while the model with fewer modalities (e.g., only MRI) is considered a student. Our proposed MKD framework includes three key components: First, we design a teacher model that is student-oriented, namely the Student-oriented Multi-modal Teacher (SMT), through multi-modal information disentanglement. Second, we train the student model by not only minimizing its classification errors but also learning from the SMT teacher. Third, we update the teacher model by transfer learning from the student's feature extractor because the student model is trained with more samples. Evaluations on Alzheimer's Disease Neuroimaging Initiative (ADNI) datasets highlight the effectiveness of our method. Our work demonstrates the potential of using AI for addressing the challenges of incomplete multi-modal neuroimage datasets, opening new avenues for advancing early AD detection and treatment strategies.
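The second component, in which the student minimizes its own classification error while also learning from the teacher, can be illustrated with a generic knowledge-distillation objective. The sketch below swaps in the widely used temperature-scaled formulation (cross-entropy plus a KL term between softened teacher and student outputs); the abstract does not specify the actual MKD losses, so the weighting `alpha`, the `temperature`, and all function names are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class index."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def student_loss(student_logits, teacher_logits, label, alpha=0.5, temperature=2.0):
    """Weighted sum of the hard-label loss and the softened teacher-matching loss."""
    hard = cross_entropy(softmax(student_logits), label)
    soft = kl_divergence(
        softmax(teacher_logits, temperature),
        softmax(student_logits, temperature),
    )
    # The T^2 factor keeps the soft term's gradient scale comparable across temperatures.
    return (1.0 - alpha) * hard + alpha * temperature ** 2 * soft


# Hypothetical two-class logits: the student sees one modality, the teacher sees two.
print(student_loss([1.2, 0.3], [2.5, -0.5], label=0))
```

When the student already reproduces the teacher's outputs, the KL term vanishes and only the weighted classification loss remains.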
Affiliation(s)
- Min Gu Kwak: School of Industrial and Systems Engineering, Georgia Institute of Technology, GA.
- Yi Su: Banner Alzheimer's Institute, AZ.
- Teresa Wu: School of Computing, Informatics and Decision Systems Engineering, Arizona State University, AZ.
- Jing Li: School of Industrial and Systems Engineering, Georgia Institute of Technology, GA.
|
7
|
Steyaert S, Pizurica M, Nagaraj D, Khandelwal P, Hernandez-Boussard T, Gentles AJ, Gevaert O. Multimodal data fusion for cancer biomarker discovery with deep learning. Nat Mach Intell 2023; 5:351-362. [PMID: 37693852 PMCID: PMC10484010 DOI: 10.1038/s42256-023-00633-5] [Citation(s) in RCA: 27] [Impact Index Per Article: 27.0] [Received: 07/28/2022] [Accepted: 02/17/2023] [Indexed: 09/12/2023]
Abstract
Technological advances now make it possible to study a patient from multiple angles with high-dimensional, high-throughput multi-scale biomedical data. In oncology, massive amounts of data are being generated ranging from molecular, histopathology, radiology to clinical records. The introduction of deep learning has significantly advanced the analysis of biomedical data. However, most approaches focus on single data modalities leading to slow progress in methods to integrate complementary data types. Development of effective multimodal fusion approaches is becoming increasingly important as a single modality might not be consistent and sufficient to capture the heterogeneity of complex diseases to tailor medical care and improve personalised medicine. Many initiatives now focus on integrating these disparate modalities to unravel the biological processes involved in multifactorial diseases such as cancer. However, many obstacles remain, including lack of usable data as well as methods for clinical validation and interpretation. Here, we cover these current challenges and reflect on opportunities through deep learning to tackle data sparsity and scarcity, multimodal interpretability, and standardisation of datasets.
Affiliation(s)
- Sandra Steyaert: Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University.
- Marija Pizurica: Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University.
- Tina Hernandez-Boussard: Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University; Department of Biomedical Data Science, Stanford University.
- Andrew J Gentles: Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University; Department of Biomedical Data Science, Stanford University.
- Olivier Gevaert: Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University; Department of Biomedical Data Science, Stanford University.
|
8
|
El-Sappagh S, Alonso-Moral JM, Abuhmed T, Ali F, Bugarín-Diz A. Trustworthy artificial intelligence in Alzheimer’s disease: state of the art, opportunities, and challenges. Artif Intell Rev 2023. [DOI: 10.1007/s10462-023-10415-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/28/2023]
|
9
|
Gong W, Bai S, Zheng YQ, Smith SM, Beckmann CF. Supervised Phenotype Discovery From Multimodal Brain Imaging. IEEE Transactions on Medical Imaging 2023; 42:834-849. [PMID: 36318559 DOI: 10.1109/tmi.2022.3218720] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/16/2023]
Abstract
Data-driven discovery of image-derived phenotypes (IDPs) from large-scale multimodal brain imaging data has enormous potential for neuroscientific and clinical research by linking IDPs to subjects' demographic, behavioural, clinical and cognitive measures (i.e., non-imaging derived phenotypes or nIDPs). However, current approaches are primarily based on unsupervised approaches, without the use of information in nIDPs. In this paper, we proposed a semi-supervised, multimodal, and multi-task fusion approach, termed SuperBigFLICA, for IDP discovery, which simultaneously integrates information from multiple imaging modalities as well as multiple nIDPs. SuperBigFLICA is computationally efficient and largely avoids the need for parameter tuning. Using the UK Biobank brain imaging dataset with around 40,000 subjects and 47 modalities, along with more than 17,000 nIDPs, we showed that SuperBigFLICA enhances the prediction power of nIDPs, benchmarked against IDPs derived by conventional expert-knowledge and unsupervised-learning approaches (with average nIDP prediction accuracy improvements of up to 46%). It also enables the learning of generic imaging features that can predict new nIDPs. Further empirical analysis of the SuperBigFLICA algorithm demonstrates its robustness in different prediction tasks and the ability to derive biologically meaningful IDPs in predicting health outcomes and cognitive nIDPs, such as fluid intelligence and hypertension.
|
10
|
Sun Y, Li Y, Zhang F, Zhao H, Liu H, Wang N, Li H. A deep network using coarse clinical prior for myopic maculopathy grading. Comput Biol Med 2023; 154:106556. [PMID: 36682177 DOI: 10.1016/j.compbiomed.2023.106556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Revised: 12/19/2022] [Accepted: 01/11/2023] [Indexed: 01/15/2023]
Abstract
Pathological myopia (PM) is a globally prevalent eye disease and one of the main causes of blindness. In long-term clinical observation, myopic maculopathy is a main criterion for diagnosing PM severity. Grading myopic maculopathy can provide a prediction of PM severity and progression, so that treatment can be given in time to prevent myopia-related blindness. In this paper, we propose a feature fusion framework that uses the tessellated fundus and the brightest region in fundus images as prior knowledge. The proposed framework consists of a prior knowledge extraction module and a feature fusion module. The prior knowledge extraction module uses traditional image processing methods to extract prior knowledge that indicates coarse lesion positions in fundus images. The priors, i.e., the tessellated fundus and the brightest region in fundus images, are then integrated into the deep learning network by the feature fusion module as global and local constraints, respectively. In addition, a rank loss is designed to increase the continuity of the classification score. We collect a private color fundus dataset from Beijing TongRen Hospital containing 714 clinical images. The dataset contains all 5 grades of myopic maculopathy, labeled by experienced ophthalmologists. Our framework achieves a five-grade accuracy of 0.8921 on our private dataset. The Pathological Myopia (PALM) dataset is used for comparison with other related algorithms. Our framework is trained with 400 images and achieves an AUC of 0.9981 for two-class grading. The results show that our framework can achieve good performance for myopic maculopathy grading.
Affiliation(s)
- Yun Sun: Beijing Institute of Technology, No. 5, Zhong Guan Cun South Street, Beijing, 100081, China.
- Yu Li: Beijing Tongren Hospital, Capital Medical University, No. 2, Chongwenmennei Street, Beijing, 100730, China.
- Fengju Zhang: Beijing Tongren Hospital, Capital Medical University, No. 2, Chongwenmennei Street, Beijing, 100730, China.
- He Zhao: Beijing Institute of Technology, No. 5, Zhong Guan Cun South Street, Beijing, 100081, China.
- Hanruo Liu: Beijing Institute of Technology, No. 5, Zhong Guan Cun South Street, Beijing, 100081, China; Beijing Tongren Hospital, Capital Medical University, No. 2, Chongwenmennei Street, Beijing, 100730, China.
- Ningli Wang: Beijing Tongren Hospital, Capital Medical University, No. 2, Chongwenmennei Street, Beijing, 100730, China.
- Huiqi Li: Beijing Institute of Technology, No. 5, Zhong Guan Cun South Street, Beijing, 100081, China.
|
11
|
Liu F, Yuan S, Li W, Xu Q, Sheng B. Patch-based deep multi-modal learning framework for Alzheimer’s disease diagnosis using multi-view neuroimaging. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
12
|
Xiao Z, Zhang X, Liu Y, Geng L, Wu J, Wang W, Zhang F. RNN-combined graph convolutional network with multi-feature fusion for tuberculosis cavity segmentation. Signal, Image and Video Processing 2023; 17:2297-2303. [PMID: 36624826 PMCID: PMC9813881 DOI: 10.1007/s11760-022-02446-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 11/16/2022] [Accepted: 12/10/2022] [Indexed: 05/20/2023]
Abstract
Tuberculosis is a common infectious disease worldwide. Tuberculosis cavities are a common and important imaging sign of tuberculosis. Accurate segmentation of tuberculosis cavities has practical significance for indicating the activity of lesions and guiding clinical treatment. However, this task faces challenges such as blurred boundaries, irregular shapes, varying lesion locations and sizes, and structures on computed tomography (CT) that resemble other lung diseases or tissues. To overcome these problems, we propose a novel RNN-combined graph convolutional network (R2GCN) method, which integrates bidirectional recurrent network (BRN) and graph convolution network (GCN) modules. First, feature extraction is performed on the input image by VGG-16 or ResNet-50 to obtain the feature map. The feature map is then used as the input of the two modules. On the one hand, we adopt the BRN to retrieve contextual information from the feature map. On the other hand, we take the vector at each location in the feature map as an input node and use the GCN to extract node topology information. Finally, the two types of features obtained are fused together. Our strategy can not only make full use of node correlations and differences, but also obtain more precise segmentation boundaries. Extensive experiments on CT images of patients with cavitary tuberculosis show that our proposed method achieves higher segmentation accuracy than the compared segmentation methods. Our method can be used for the diagnosis of tuberculosis cavities and the evaluation of their treatment.
Affiliation(s)
- Zhitao Xiao: School of Life Sciences, Tiangong University, Tianjin, 300387 China; Tianjin Key Laboratory of Optoelectronic Detection Technology and Systems, Tianjin, 300387 China.
- Xiaomeng Zhang: School of Artificial Intelligence, Tiangong University, Tianjin, 300387 China.
- Yanbei Liu: School of Life Sciences, Tiangong University, Tianjin, 300387 China.
- Lei Geng: School of Life Sciences, Tiangong University, Tianjin, 300387 China.
- Jun Wu: School of Electronic and Information Engineering, Tiangong University, Tianjin, 300387 China.
- Wen Wang: School of Life Sciences, Tiangong University, Tianjin, 300387 China.
- Fang Zhang: School of Life Sciences, Tiangong University, Tianjin, 300387 China.
|
13
|
Zhang S, Zhang J, Tian B, Lukasiewicz T, Xu Z. Multi-modal contrastive mutual learning and pseudo-label re-learning for semi-supervised medical image segmentation. Med Image Anal 2023; 83:102656. [PMID: 36327656 DOI: 10.1016/j.media.2022.102656] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 01/11/2022] [Revised: 10/04/2022] [Accepted: 10/12/2022] [Indexed: 12/12/2022]
Abstract
Semi-supervised learning has great potential in medical image segmentation tasks with few labeled data, but most existing methods consider only single-modal data. The complementary characteristics of multi-modal data can improve the performance of semi-supervised segmentation for each image modality. However, a shortcoming of most existing multi-modal solutions is that, because the processing models for the different modalities are highly coupled, multi-modal data are required not only in training but also at inference, which limits their use in clinical practice. Consequently, we propose a semi-supervised contrastive mutual learning (Semi-CML) segmentation framework, where a novel area-similarity contrastive (ASC) loss leverages the cross-modal information and prediction consistency between different modalities to conduct contrastive mutual learning. Although Semi-CML can improve the segmentation performance of both modalities simultaneously, there is a performance gap between the two modalities, i.e., there exists a modality whose segmentation performance is usually better than that of the other. Therefore, we further develop a soft pseudo-label re-learning (PReL) scheme to remedy this gap. We conducted experiments on two public multi-modal datasets. The results show that Semi-CML with PReL greatly outperforms the state-of-the-art semi-supervised segmentation methods and achieves performance similar to (and sometimes even better than) that of fully supervised segmentation methods trained with 100% labeled data, while reducing the cost of data annotation by 90%. We also conducted ablation studies to evaluate the effectiveness of the ASC loss and the PReL module.
Affiliation(s)
- Shuo Zhang: State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China.
- Jiaojiao Zhang: State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China.
- Biao Tian: State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China.
- Zhenghua Xu: State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China.
|
14
|
Xu C, Liu H, Guan Z, Wu X, Tan J, Ling B. Adversarial Incomplete Multiview Subspace Clustering Networks. IEEE Transactions on Cybernetics 2022; 52:10490-10503. [PMID: 33750730 DOI: 10.1109/tcyb.2021.3062830] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]
Abstract
Multiview clustering aims to leverage information from multiple views to improve the clustering performance. Most previous works assumed that each view has complete data. However, in real-world datasets, it is often the case that a view may contain some missing data, resulting in the problem of incomplete multiview clustering (IMC). Previous approaches to this problem have at least one of the following drawbacks: 1) employing shallow models, which cannot handle the dependence and discrepancy among different views well; 2) ignoring the hidden information of the missing data; and 3) being dedicated to the two-view case. To eliminate all these drawbacks, in this work, we present the adversarial IMC (AIMC) framework. In particular, AIMC seeks the common latent representation of multiview data for reconstructing raw data and inferring missing data. The elementwise reconstruction and the generative adversarial network are integrated to evaluate the reconstruction. They aim to capture the overall structure and get a deeper semantic understanding, respectively. Moreover, the clustering loss is designed to obtain a better clustering structure. We explore two variants of AIMC, namely: 1) autoencoder-based AIMC (AAIMC) and 2) generalized AIMC (GAIMC), with different strategies to obtain the multiview common representation. Experiments conducted on six real-world datasets show that AAIMC and GAIMC perform well and outperform the baseline methods.
|
15
|
Zhang Y, Zhang H, Xiao L, Bai Y, Calhoun VD, Wang YP. Multi-Modal Imaging Genetics Data Fusion via a Hypergraph-Based Manifold Regularization: Application to Schizophrenia Study. IEEE Transactions on Medical Imaging 2022; 41:2263-2272. [PMID: 35320094 PMCID: PMC9661879 DOI: 10.1109/tmi.2022.3161828] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/14/2023]
Abstract
Recent studies show that multi-modal data fusion techniques combine information from diverse sources for comprehensive diagnosis and prognosis of complex brain disorders, often resulting in improved accuracy compared to single-modality approaches. However, many existing data fusion methods extract features from homogeneous networks, ignoring heterogeneous structural information among multiple modalities. To this end, we propose a Hypergraph-based Multi-modal data Fusion algorithm, namely HMF. Specifically, we first generate a hypergraph similarity matrix to represent the high-order relationships among subjects, and then enforce a regularization term based upon both the inter- and intra-modality relationships of the subjects. Finally, we apply HMF to integrate imaging and genetics datasets. Validation of the proposed method is performed on both synthetic data and real samples from a schizophrenia study. Results show that our algorithm outperforms several competing methods, and reveals significant interactions among risk genes, environmental factors and abnormal brain regions.
|
16
|
Xu L, Wu H, He C, Wang J, Zhang C, Nie F, Chen L. Multi-modal sequence learning for Alzheimer’s disease progression prediction with incomplete variable-length longitudinal data. Med Image Anal 2022; 82:102643. [DOI: 10.1016/j.media.2022.102643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2021] [Revised: 08/27/2022] [Accepted: 09/23/2022] [Indexed: 11/28/2022]
|
17
|
Zhang H, Chen X, Zhang E, Wang L. Incomplete Multi-view Learning via Consensus Graph Completion. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10973-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|