1. Wang P, Zhang H, Zhu M, Jiang X, Qin J, Yuan Y. MGIML: Cancer Grading With Incomplete Radiology-Pathology Data via Memory Learning and Gradient Homogenization. IEEE Trans Med Imaging 2024; 43:2113-2124. PMID: 38231819. DOI: 10.1109/tmi.2024.3355142.
Abstract
Taking advantage of multi-modal radiology-pathology data with complementary clinical information for cancer grading helps doctors improve diagnostic efficiency and accuracy. However, radiology and pathology data have distinct acquisition difficulties and costs, so incomplete-modality data are common in applications. In this work, we propose a Memory- and Gradient-guided Incomplete Multi-modal Learning (MGIML) framework for cancer grading with incomplete radiology-pathology data. First, to remedy missing-modality information, we propose a Memory-driven Hetero-modality Complement (MH-Complete) scheme, which constructs modality-specific memory banks, constrained by a coarse-grained memory boosting (CMB) loss, to record generic radiology and pathology feature patterns, and develops a cross-modal memory reading strategy, enhanced by a fine-grained memory consistency (FMC) loss, to retrieve missing-modality information from the well-stored memories. Second, as gradient conflicts exist between missing-modality situations, we propose a Rotation-driven Gradient Homogenization (RG-Homogenize) scheme, which estimates instance-specific rotation matrices to smoothly change feature-level gradient directions and computes confidence-guided homogenization weights to dynamically balance gradient magnitudes. By simultaneously mitigating gradient direction and magnitude conflicts, this scheme avoids the negative-transfer and optimization-imbalance problems. Extensive experiments on the CPTAC-UCEC and CPTAC-PDA datasets show that the proposed MGIML framework performs favorably against state-of-the-art multi-modal methods in missing-modality situations.
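As a rough illustration of the cross-modal memory-reading idea (our sketch, not the authors' code; the module name, slot count, and tensor shapes are invented), features from the available modality can query a learned memory bank to retrieve a surrogate for the missing modality:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryRead(nn.Module):
    """Bank of learned feature-pattern slots queried by the present modality."""
    def __init__(self, num_slots: int = 64, dim: int = 256):
        super().__init__()
        # Modality-specific memory bank of generic feature patterns.
        self.memory = nn.Parameter(torch.randn(num_slots, dim) * 0.02)

    def forward(self, query: torch.Tensor) -> torch.Tensor:
        # query: (batch, dim) features from the modality that IS present.
        attn = F.softmax(query @ self.memory.t(), dim=-1)  # (batch, slots)
        return attn @ self.memory                          # (batch, dim) surrogate

read_pathology = MemoryRead()
radiology_feat = torch.randn(8, 256)            # toy batch
pathology_hat = read_pathology(radiology_feat)  # stands in for the missing modality
```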
2. Liu S, Wang H, Li S, Zhang C. Mixture-of-experts and semantic-guided network for brain tumor segmentation with missing MRI modalities. Med Biol Eng Comput 2024. PMID: 38789839. DOI: 10.1007/s11517-024-03130-y.
Abstract
Accurate brain tumor segmentation with multi-modal MRI images is crucial, but missing modalities in clinical practice often reduce accuracy. The aim of this study is to propose a mixture-of-experts and semantic-guided network to tackle missing modalities in brain tumor segmentation. We introduce a transformer-based encoder with novel mixture-of-experts blocks; in each block, four modality experts perform modality-specific feature learning, and learnable modality embeddings are employed to alleviate the negative effect of missing modalities. We also introduce a decoder guided by semantic information, designed to pay greater attention to the various tumor regions. Finally, we conduct extensive comparison and ablation experiments on the BraTS2018 dataset to validate the performance of the proposed model. The model accurately segments brain tumor sub-regions even with missing modalities, achieving average Dice scores of 0.81 for the whole tumor, 0.66 for the tumor core, and 0.52 for the enhancing tumor across the 15 modality combinations, with top or near-top results in most cases and a lower computational cost. Our mixture-of-experts and semantic-guided network achieves accurate and reliable brain tumor segmentation with missing modalities, indicating significant potential for clinical applications. Our source code is available at https://github.com/MaggieLSY/MESG-Net.
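A minimal sketch of the mixture-of-experts idea with learnable placeholders for missing modalities (illustrative only; the expert architecture, dimensions, and fusion rule here are our assumptions, not MESG-Net's):

```python
import torch
import torch.nn as nn

class ModalityMoE(nn.Module):
    """Four modality experts; a learnable embedding replaces a missing input."""
    def __init__(self, dim: int = 128, num_modalities: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in range(num_modalities)
        )
        # One learnable placeholder token per modality.
        self.missing_embed = nn.Parameter(torch.zeros(num_modalities, dim))

    def forward(self, feats: list, present: list) -> torch.Tensor:
        # feats[i]: (batch, dim) tensor, or None when modality i is missing.
        outs = []
        for i, expert in enumerate(self.experts):
            if present[i]:
                outs.append(expert(feats[i]))
            else:
                b = next(f for f in feats if f is not None).shape[0]
                outs.append(self.missing_embed[i].expand(b, -1))
        return torch.stack(outs, dim=1).mean(dim=1)  # simple mean fusion

moe = ModalityMoE()
x = [torch.randn(2, 128), None, torch.randn(2, 128), torch.randn(2, 128)]
fused = moe(x, present=[True, False, True, True])
```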
Affiliation(s)
- Siyu Liu
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, 200032, China
- Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai, 200032, China
- Haoran Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, 200032, China
- Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai, 200032, China
- Shiman Li
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, 200032, China
- Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai, 200032, China
- Chenxi Zhang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, 200032, China.
- Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai, 200032, China.
3. Chen Q, Zhang J, Meng R, Zhou L, Li Z, Feng Q, Shen D. Modality-Specific Information Disentanglement From Multi-Parametric MRI for Breast Tumor Segmentation and Computer-Aided Diagnosis. IEEE Trans Med Imaging 2024; 43:1958-1971. PMID: 38206779. DOI: 10.1109/tmi.2024.3352648.
Abstract
Breast cancer is becoming a significant global health challenge, with millions of fatalities annually. Magnetic Resonance Imaging (MRI) can provide various sequences for characterizing tumor morphology and internal patterns, making it an effective tool for the detection and diagnosis of breast tumors. However, previous deep-learning-based tumor segmentation methods for multi-parametric MRI still have limitations in exploring inter-modality information and in focusing on task-informative modalities. To address these shortcomings, we propose a Modality-Specific Information Disentanglement (MoSID) framework that extracts both inter- and intra-modality attention maps as prior knowledge for guiding tumor segmentation. Specifically, by disentangling modality-specific information, the MoSID framework provides complementary clues for the segmentation task, generating modality-specific attention maps to guide modality selection and inter-modality evaluation. Our experiments on two 3D breast datasets and one 2D prostate dataset demonstrate that the MoSID framework outperforms other state-of-the-art multi-modality segmentation methods, even when modalities are missing. Based on the segmented lesions, we further train a classifier to predict patients' response to radiotherapy. The prediction accuracy is comparable to that obtained using manually segmented tumors for treatment-outcome prediction, indicating the robustness and effectiveness of the proposed segmentation method. The code is available at https://github.com/Qianqian-Chen/MoSID.
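To make attention-map-guided fusion concrete, here is a minimal sketch (our illustration under assumed shapes, not the MoSID implementation), where each modality gets a sigmoid spatial attention map that weights its features before fusion:

```python
import torch
import torch.nn as nn

class ModalityAttentionFusion(nn.Module):
    """Per-modality spatial attention maps used to weight fused features."""
    def __init__(self, channels: int = 32, num_modalities: int = 3):
        super().__init__()
        self.attn_heads = nn.ModuleList(
            nn.Sequential(nn.Conv3d(channels, 1, kernel_size=1), nn.Sigmoid())
            for _ in range(num_modalities)
        )

    def forward(self, feats):  # feats: list of (B, C, D, H, W) tensors
        maps = [head(f) for head, f in zip(self.attn_heads, feats)]  # (B,1,D,H,W)
        weighted = [f * m for f, m in zip(feats, maps)]
        return sum(weighted) / len(weighted), maps

fusion = ModalityAttentionFusion()
feats = [torch.randn(1, 32, 8, 16, 16) for _ in range(3)]
fused, attn_maps = fusion(feats)
```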
4. Zhang H, Liu J, Liu W, Chen H, Yu Z, Yuan Y, Wang P, Qin J. MHD-Net: Memory-Aware Hetero-Modal Distillation Network for Thymic Epithelial Tumor Typing With Missing Pathology Modality. IEEE J Biomed Health Inform 2024; 28:3003-3014. PMID: 38470599. DOI: 10.1109/jbhi.2024.3376462.
Abstract
Fusing multi-modal radiology and pathology data with complementary information can improve the accuracy of tumor typing. However, collecting pathology data is difficult, since it is costly and sometimes obtainable only after surgery, which limits the application of multi-modal methods in diagnosis. To address this problem, we propose comprehensively learning from multi-modal radiology-pathology data during training while using only uni-modal radiology data at test time. Concretely, we propose a Memory-aware Hetero-modal Distillation Network (MHD-Net) that distills well-learned multi-modal knowledge from the teacher to the student with the assistance of memory. In the teacher, to tackle the challenge of hetero-modal feature fusion, we propose a novel spatial-differentiated hetero-modal fusion module (SHFM) that models spatial-specific tumor information correlations across modalities. As only radiology data is accessible to the student, we store pathology features in the proposed contrast-boosted typing memory module (CTMM), which performs type-wise memory updating and stage-wise contrastive memory boosting to ensure the effectiveness and generalization of memory items. In the student, to improve cross-modal distillation, we propose a multi-stage memory-aware distillation (MMD) scheme that reads memory-aware pathology features from the CTMM to remedy missing modality-specific information. Furthermore, we construct a Radiology-Pathology Thymic Epithelial Tumor (RPTET) dataset containing paired CT and whole-slide images (WSIs) with annotations. Experiments on the RPTET and CPTAC-LUAD datasets demonstrate that MHD-Net significantly improves tumor typing and outperforms existing multi-modal methods in missing-modality situations.
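A generic teacher-student distillation loss of the kind described, combining soft-label matching on logits with feature matching (a standard knowledge-distillation sketch, not MHD-Net's specific MMD scheme; the temperature and weighting are placeholders):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits,
                      student_feat, teacher_feat,
                      tau: float = 2.0, alpha: float = 0.5):
    """Soft-label KD on logits plus feature matching between branches."""
    kd = F.kl_div(
        F.log_softmax(student_logits / tau, dim=-1),
        F.softmax(teacher_logits / tau, dim=-1),
        reduction="batchmean",
    ) * tau * tau
    feat = F.mse_loss(student_feat, teacher_feat)
    return alpha * kd + (1 - alpha) * feat

s_logits, t_logits = torch.randn(4, 3), torch.randn(4, 3)
s_feat, t_feat = torch.randn(4, 256), torch.randn(4, 256)
loss = distillation_loss(s_logits, t_logits, s_feat, t_feat)
```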
5. Hu X, Wang L, Wang L, Chen Q, Zheng L, Zhu Y. Glioma segmentation based on dense contrastive learning and multimodal features recalibration. Phys Med Biol 2024; 69:095016. PMID: 38537288. DOI: 10.1088/1361-6560/ad387f.
Abstract
Accurate segmentation of the different regions of gliomas from multimodal magnetic resonance (MR) images is crucial for glioma grading and precise diagnosis, but many existing segmentation methods struggle to effectively utilize multimodal MR image information to accurately recognize lesion regions of small size, low contrast, and irregular shape. To address this issue, this work proposes a novel 3D glioma segmentation model, DCL-MANet, with an architecture of multiple encoders and a single decoder. Each encoder extracts MR image features of a given modality. To overcome the entanglement of multimodal semantic features, a dense contrastive learning (DCL) strategy is presented to extract modality-specific and common features. A feature recalibration block (RFB) based on modality-wise attention is then used to recalibrate the semantic features of each modality, enabling the model to focus on the features that are beneficial for glioma segmentation. These recalibrated features are input into the decoder to obtain the segmentation results. To verify the superiority of the proposed method, we compare it with several state-of-the-art (SOTA) methods in terms of Dice, average symmetric surface distance (ASSD), HD95, and volumetric similarity (Vs). The average Dice, ASSD, HD95, and Vs of DCL-MANet on all tumor regions are improved by at least 0.66%, 3.47%, 8.94%, and 1.07%, respectively; for the small enhancing tumor (ET) region, the corresponding improvements can be up to 0.37%, 7.83%, 11.32%, and 1.35%. In addition, ablation results demonstrate the effectiveness of the proposed DCL and RFB, and combining them significantly increases Dice (1.59%) and Vs (1.54%) while decreasing ASSD (40.51%) and HD95 (45.16%) on the ET region. The proposed DCL-MANet can disentangle multimodal features and enhance the semantics of modality-dependent features, providing a potential means to accurately segment small lesion regions in gliomas.
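The common-feature alignment behind contrastive disentanglement can be sketched with an InfoNCE-style loss (our simplification; the paper's DCL is denser and operates on richer feature sets), pulling together the common features of the same case across two modalities while pushing apart other cases:

```python
import torch
import torch.nn.functional as F

def contrastive_common_loss(common_a, common_b, temperature: float = 0.1):
    """InfoNCE over a batch: common features of the same case across two
    modalities are positives; all other cases in the batch are negatives."""
    a = F.normalize(common_a, dim=-1)        # (B, D)
    b = F.normalize(common_b, dim=-1)        # (B, D)
    logits = a @ b.t() / temperature         # (B, B) similarity matrix
    targets = torch.arange(a.shape[0])       # diagonal entries are positives
    return F.cross_entropy(logits, targets)

loss = contrastive_common_loss(torch.randn(16, 64), torch.randn(16, 64))
```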
Affiliation(s)
- Xubin Hu
- Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, People's Republic of China
- Lihui Wang
- Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, People's Republic of China
- Li Wang
- Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, People's Republic of China
- Qijian Chen
- Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, People's Republic of China
- Licheng Zheng
- Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, People's Republic of China
- Yuemin Zhu
- University Lyon, INSA Lyon, CNRS, Inserm, IRP Metislab CREATIS UMR5220, U1206, Lyon F-69621, France
6. Qiu L, Zhao L, Zhao W, Zhao J. Dual-space disentangled-multimodal network (DDM-net) for glioma diagnosis and prognosis with incomplete pathology and genomic data. Phys Med Biol 2024; 69:085028. PMID: 38595094. DOI: 10.1088/1361-6560/ad37ec.
Abstract
Objective. Effective fusion of histology slides and molecular profiles from genomic data has shown great potential in the diagnosis and prognosis of gliomas. However, it remains challenging to explicitly utilize the consistent and complementary information among different modalities and to create comprehensive representations of patients. Additionally, existing research mainly focuses on complete multi-modality data and usually fails to construct robust models for incomplete samples. Approach. In this paper, we propose a dual-space disentangled-multimodal network (DDM-net) for glioma diagnosis and prognosis. DDM-net disentangles the latent features generated by two separate variational autoencoders (VAEs) into common and specific components through a dual-space disentanglement approach, facilitating the construction of comprehensive patient representations. More importantly, DDM-net imputes the unavailable modality in the latent feature space, making it robust to incomplete samples. Main results. We evaluated our approach on the TCGA-GBMLGG dataset for glioma grading and survival analysis tasks. Experimental results demonstrate that the proposed method achieves superior performance compared to state-of-the-art methods, with a competitive AUC of 0.952 and a C-index of 0.768. Significance. The proposed model may help the clinical understanding of gliomas and can serve as an effective fusion model for multimodal data. Additionally, it is capable of handling incomplete samples, making it less constrained by clinical limitations.
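Latent-space imputation of the kind described can be sketched as a small mapping network between the latent spaces of two VAEs (illustrative only; DDM-net's actual imputation pathway and dimensions are assumptions here):

```python
import torch
import torch.nn as nn

class LatentImputer(nn.Module):
    """Map the available modality's latent code into the missing modality's
    latent space with a small MLP; the missing branch's decoder then proceeds
    as if that modality had been observed."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.map = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, z_available: torch.Tensor) -> torch.Tensor:
        return self.map(z_available)

imputer = LatentImputer()
z_path = torch.randn(4, 64)    # latent of the observed pathology branch
z_gene_hat = imputer(z_path)   # imputed latent for the missing genomics branch
```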
Affiliation(s)
- Lu Qiu
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
- Lu Zhao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
- Wangyuan Zhao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
- Jun Zhao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
7. Sun Y, Wang C. Brain tumor detection based on a novel and high-quality prediction of the tumor pixel distributions. Comput Biol Med 2024; 172:108196. PMID: 38493601. DOI: 10.1016/j.compbiomed.2024.108196.
Abstract
The work presented in this paper is in the area of brain tumor detection. We propose a fast detection system for 3D MRI scans of the Flair modality. It performs two functions: predicting the gray-level distribution and location distribution of the pixels in tumor regions, and generating tumor masks with pixel-wise precision. To facilitate 3D data analysis and processing, we introduce a 2D histogram representation encompassing the gray-level distribution and pixel-location distribution of a 3D object. In the proposed system, specific 2D histograms highlighting tumor-related features are established by exploiting the left-right asymmetry of the brain structure. A modulation function, generated from the input data of each patient case, is applied to the 2D histograms to transform them into coarsely or finely predicted distributions of tumor pixels. The prediction result helps to identify and remove tumor-free slices. The prediction and removal operations are performed on the axial, coronal, and sagittal slice series of a brain image, reducing it to a 3D minimum bounding box of its tumor region. The bounding box is utilized to finalize the prediction and generate a 3D tumor mask. The proposed system has been tested extensively with the data of more than 1200 patient cases from the BraTS2018-2021 datasets. The test results demonstrate that the predicted 2D histograms closely resemble the true ones. The system also delivers very good tumor detection results, comparable to those of state-of-the-art CNN systems with mono-modality inputs; they are reproducible and obtained at an extremely low computation cost, without the need for training.
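A minimal NumPy sketch of a 2D (gray level x slice location) histogram and a left-right asymmetry map (our toy version; the paper's histograms and modulation function are more elaborate):

```python
import numpy as np

def gray_location_histogram(vol: np.ndarray, n_gray: int = 64,
                            vmin: float = 0.0, vmax: float = 1.0) -> np.ndarray:
    """2D histogram over (gray level, axial slice index) for a 3D volume."""
    z = np.broadcast_to(np.arange(vol.shape[0])[:, None, None], vol.shape)
    hist, _, _ = np.histogram2d(
        vol.ravel(), z.ravel(),
        bins=[n_gray, vol.shape[0]],
        range=[[vmin, vmax], [0, vol.shape[0]]],
    )
    return hist  # shape: (n_gray, n_slices)

vol = np.random.rand(32, 128, 128)                       # toy Flair volume (D, H, W)
left, right = vol[:, :, :64], vol[:, :, 64:][..., ::-1]  # mirror the right half
asym = np.abs(gray_location_histogram(left) - gray_location_histogram(right))
```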
Affiliation(s)
- Yanming Sun
- Department of Electrical and Computer Engineering, Concordia University, 1455 De Maisonneuve Blvd. W, Montreal, Quebec, Canada, H3G 1M8
- Chunyan Wang
- Department of Electrical and Computer Engineering, Concordia University, 1455 De Maisonneuve Blvd. W, Montreal, Quebec, Canada, H3G 1M8.
8. Zhang D, Wang C, Chen T, Chen W, Shen Y. Scalable Swin Transformer network for brain tumor segmentation from incomplete MRI modalities. Artif Intell Med 2024; 149:102788. PMID: 38462288. DOI: 10.1016/j.artmed.2024.102788.
Abstract
BACKGROUND: Deep learning methods have shown great potential in processing multi-modal Magnetic Resonance Imaging (MRI) data, enabling improved accuracy in brain tumor segmentation. However, the performance of these methods can suffer when dealing with incomplete modalities, a common issue in clinical practice. Existing solutions, such as missing-modality synthesis, knowledge distillation, and architecture-based methods, suffer from drawbacks such as long training times, high model complexity, and poor scalability. METHOD: This paper proposes IMS2Trans, a novel lightweight, scalable Swin Transformer network that utilizes a single encoder to extract latent feature maps from all available modalities. This unified feature extraction process enables efficient information sharing and fusion among the modalities, maintaining segmentation performance even in the presence of missing modalities. RESULTS: Two datasets containing incomplete modalities for brain tumor segmentation, BraTS 2018 and BraTS 2020, are evaluated against popular benchmarks. On the BraTS 2018 dataset, our model achieved higher average Dice similarity coefficient (DSC) scores for the whole tumor, tumor core, and enhancing tumor regions (86.57, 75.67, and 58.28, respectively) than a state-of-the-art model, mmFormer (86.45, 75.51, and 57.79, respectively). Similarly, on the BraTS 2020 dataset, our model scored higher DSCs in these three regions (87.33, 79.09, and 62.11, respectively) compared to mmFormer (86.17, 78.34, and 60.36, respectively). A Wilcoxon test on the experimental results confirmed that the improvement is statistically significant. Moreover, our model exhibits significantly reduced complexity, with only 4.47 M parameters, 121.89 G FLOPs, and a model size of 77.13 MB, whereas mmFormer comprises 34.96 M parameters, 265.79 G FLOPs, and a model size of 559.74 MB. These figures indicate that our model, despite being lightweight with far fewer parameters, still achieves better performance than a state-of-the-art model. CONCLUSION: By leveraging a single encoder to process the available modalities, IMS2Trans offers notable scalability advantages over methods that rely on multiple encoders. This streamlined approach eliminates the need to maintain separate encoders for each modality, resulting in a lightweight and scalable network architecture. The source code of IMS2Trans and the associated weights are publicly available at https://github.com/hudscomdz/IMS2Trans.
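The single-shared-encoder idea can be sketched as follows (illustrative; the real IMS2Trans uses a Swin Transformer encoder and a full decoder, not the toy convolution and mean fusion here):

```python
import torch
import torch.nn as nn

class SharedEncoderFusion(nn.Module):
    """One encoder shared by all modalities; features of whatever subset is
    available are averaged, so any of the 15 modality combinations is handled
    without per-modality encoders."""
    def __init__(self, dim: int = 96):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv3d(1, dim, 3, padding=1), nn.ReLU())

    def forward(self, modalities):  # list of (B, 1, D, H, W), variable length
        feats = torch.stack([self.encoder(m) for m in modalities])
        return feats.mean(dim=0)    # fused (B, dim, D, H, W)

net = SharedEncoderFusion()
available = [torch.randn(1, 1, 16, 32, 32) for _ in range(2)]  # 2 of 4 present
fused = net(available)
```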
Affiliation(s)
- Dongsong Zhang
- School of Big Data and Artificial Intelligence, Xinyang College, Xinyang, 464000, Henan, China; School of Computing and Engineering, University of Huddersfield, Huddersfield, HD13DH, UK
- Changjian Wang
- National Key Laboratory of Parallel and Distributed Computing, Changsha, 410073, Hunan, China
- Tianhua Chen
- School of Computing and Engineering, University of Huddersfield, Huddersfield, HD13DH, UK
- Weidao Chen
- Beijing Infervision Technology Co., Ltd., Beijing, 100020, China
- Yiqing Shen
- Department of Computer Science, Johns Hopkins University, Baltimore, 21218, MD, USA.
9. Zang P, Hormel TT, Wang J, Guo Y, Bailey ST, Flaxel CJ, Huang D, Hwang TS, Jia Y. Interpretable Diabetic Retinopathy Diagnosis Based on Biomarker Activation Map. IEEE Trans Biomed Eng 2024; 71:14-25. PMID: 37405891. PMCID: PMC10796196. DOI: 10.1109/tbme.2023.3290541.
Abstract
OBJECTIVE: Deep learning classifiers provide the most accurate means of automatically diagnosing diabetic retinopathy (DR) based on optical coherence tomography (OCT) and its angiography (OCTA). The power of these models is attributable in part to the inclusion of hidden layers that provide the complexity required to achieve a desired task. However, hidden layers also render algorithm outputs difficult to interpret. Here we introduce a novel biomarker activation map (BAM) framework based on generative adversarial learning that allows clinicians to verify and understand the classifier's decision-making. METHODS: A data set of 456 macular scans was graded as non-referable or referable DR based on current clinical standards. A DR classifier used to evaluate our BAM was first trained on this data set. The BAM generation framework was designed by combining two U-shaped generators to provide meaningful interpretability for this classifier. The main generator was trained to take referable scans as input and produce an output that the classifier would grade as non-referable. The BAM is then constructed as the difference image between the output and input of the main generator. To ensure that the BAM only highlights classifier-utilized biomarkers, an assistant generator was trained to do the opposite: produce scans from non-referable inputs that the classifier would grade as referable. RESULTS: The generated BAMs highlighted known pathologic features, including nonperfusion area and retinal fluid. CONCLUSION/SIGNIFICANCE: A fully interpretable classifier based on these highlights could help clinicians better utilize and verify automated DR diagnosis.
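The BAM construction itself reduces to a difference image between the generator's output and its input; a minimal sketch (the generator here is a trivial stand-in, not the trained U-shaped network):

```python
import torch
import torch.nn as nn

# Stand-in generator; the paper trains a U-shaped network that maps referable
# scans to non-referable-looking ones.
generator = nn.Conv2d(1, 1, kernel_size=3, padding=1)

def biomarker_activation_map(scan: torch.Tensor) -> torch.Tensor:
    """BAM = difference between the 'made non-referable' output and the input,
    so only classifier-relevant structures remain highlighted."""
    with torch.no_grad():
        return (generator(scan) - scan).abs()

bam = biomarker_activation_map(torch.randn(1, 1, 304, 304))  # toy OCTA scan
```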
Affiliation(s)
- Pengxiao Zang
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239 USA
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR 97239 USA
- Tristan T. Hormel
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239 USA
- Jie Wang
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239 USA
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR 97239 USA
- Yukun Guo
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239 USA
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR 97239 USA
- Steven T. Bailey
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239 USA
- Christina J. Flaxel
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239 USA
- David Huang
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239 USA
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR 97239 USA
- Thomas S. Hwang
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239 USA
- Yali Jia
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239 USA
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR 97239 USA
10. Chen Y, Pan Y, Xia Y, Yuan Y. Disentangle First, Then Distill: A Unified Framework for Missing Modality Imputation and Alzheimer's Disease Diagnosis. IEEE Trans Med Imaging 2023; 42:3566-3578. PMID: 37450359. DOI: 10.1109/tmi.2023.3295489.
Abstract
Multi-modality medical data provide complementary information and hence have been widely explored for computer-aided Alzheimer's disease (AD) diagnosis. However, the research is hindered by the unavoidable missing-data problem, i.e., one data modality was not acquired for some subjects for various reasons. Although the missing data can be imputed using generative models, the imputation process may introduce unrealistic information into the classification process, leading to poor performance. In this paper, we propose the Disentangle First, Then Distill (DFTD) framework for AD diagnosis using incomplete multi-modality medical images. First, we design a region-aware disentanglement module to disentangle each image into an inter-modality relevant representation and an intra-modality specific representation, with emphasis on disease-related regions. To progressively integrate multi-modality knowledge, we then construct an imputation-induced distillation module, in which a lateral inter-modality transition unit is created to impute the representation of the missing modality. The proposed DFTD framework has been evaluated against six existing methods on an ADNI dataset with 1248 subjects. The results show that our method has superior performance in both AD-CN classification and MCI-to-AD prediction tasks, substantially outperforming all competing methods.
11. Yang H, Sun J, Xu Z. Learning Unified Hyper-Network for Multi-Modal MR Image Synthesis and Tumor Segmentation With Missing Modalities. IEEE Trans Med Imaging 2023; 42:3678-3689. PMID: 37540616. DOI: 10.1109/tmi.2023.3301934.
Abstract
Accurate segmentation of brain tumors is of critical importance in clinical assessment and treatment planning, and it requires multiple MR modalities providing complementary information. However, due to practical limits, one or more modalities may be missing in real scenarios. To tackle this problem, existing methods need to train multiple networks, or a unified but fixed network, for the various possible missing-modality cases, which leads to high computational burdens or sub-optimal performance. In this paper, we propose a unified and adaptive multi-modal MR image synthesis method and further apply it to tumor segmentation with missing modalities. Based on the decomposition of multi-modal MR images into common and modality-specific features, we design a shared hyper-encoder for embedding each available modality into the feature space, a graph-attention-based fusion block to aggregate the features of the available modalities into fused features, and a shared hyper-decoder for image reconstruction. We also propose an adversarial common-feature constraint to enforce that the fused features lie in a common space. For missing-modality segmentation, we first conduct feature-level and image-level completion using our synthesis method and then segment the tumors based on the completed MR images together with the extracted common features. Moreover, we design a hypernet-based modulation module to adaptively utilize the real and synthetic modalities. Experimental results suggest that our method can not only synthesize reasonable multi-modal MR images, but also achieve state-of-the-art performance on brain tumor segmentation with missing modalities.
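Graph-attention-based fusion over the available modalities can be sketched as a single-head GAT-style block treating each modality's features as a graph node (our reduction; the names, shapes, and final pooling are assumptions, not the paper's block):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityGraphFusion(nn.Module):
    """Each available modality feature is a node; a learned attention head
    scores all node pairs and passes messages (a much-reduced GAT-style block)."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.score = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, nodes: torch.Tensor) -> torch.Tensor:
        # nodes: (num_available_modalities, B, dim)
        h = self.proj(nodes)
        n = h.shape[0]
        pairs = torch.cat(
            [h.unsqueeze(1).expand(-1, n, -1, -1),
             h.unsqueeze(0).expand(n, -1, -1, -1)], dim=-1)    # (n, n, B, 2*dim)
        attn = F.softmax(self.score(pairs).squeeze(-1), dim=1) # (n, n, B)
        fused = torch.einsum("ijb,jbd->ibd", attn, h)          # message passing
        return fused.mean(dim=0)                               # pooled (B, dim)

fusion = ModalityGraphFusion()
out = fusion(torch.randn(3, 4, 128))  # 3 available modalities, batch of 4
```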
12. Diao Y, Li F, Li Z. Joint learning-based feature reconstruction and enhanced network for incomplete multi-modal brain tumor segmentation. Comput Biol Med 2023; 163:107234. PMID: 37450967. DOI: 10.1016/j.compbiomed.2023.107234.
Abstract
Multimodal Magnetic Resonance Imaging (MRI) can provide valuable complementary information and substantially enhance the performance of brain tumor segmentation. However, it is common for certain modalities to be absent or missing during clinical diagnosis, which can significantly impair segmentation techniques that rely on complete modalities. Current advanced methods attempt to address this challenge by developing shared feature representations via modal fusion to handle different missing-modality situations. Considering the importance of missing-modality information in multimodal segmentation, this paper utilizes a feature reconstruction method to recover the missing information and proposes a joint learning-based feature reconstruction and enhancement method for incomplete-modality brain tumor segmentation. The method leverages an information learning mechanism to transfer information from the complete modality to a single modality, enabling it to obtain complete brain tumor information even without the support of other modalities. Additionally, the method incorporates a module for reconstructing missing-modality features, which recovers fused features of the absent modality by utilizing the abundant potential information obtained from the available modalities. Furthermore, a feature enhancement mechanism improves the shared feature representation by utilizing the information obtained from the reconstructed missing modalities. These processes enable the method to obtain more comprehensive brain tumor information under various missing-modality circumstances, thereby enhancing the model's robustness. The performance of the proposed model was evaluated on BraTS datasets and compared with other deep learning algorithms using Dice similarity scores. On the BraTS2018 dataset, the proposed algorithm achieved Dice similarity scores of 86.28%, 77.02%, and 59.64% for whole tumors, tumor cores, and enhancing tumors, respectively. These results demonstrate the superiority of our framework over state-of-the-art methods in missing-modality situations.
Affiliation(s)
- Yueqin Diao
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China.
- Fan Li
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China.
- Zhiyuan Li
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China.
13. Zhao C, Wu D, He J, Dai C. A Visual Positioning Method of UAV in a Large-Scale Outdoor Environment. Sensors (Basel) 2023; 23:6941. PMID: 37571724. PMCID: PMC10422297. DOI: 10.3390/s23156941.
Abstract
Visual positioning is a basic component of UAV operation. Structure-based methods, widely applied in the literature, rely on local feature matching between a query image that needs to be localized and a reference image with a known pose and feature points. However, existing methods still struggle with illumination and seasonal changes. In outdoor regions, feature points and descriptors can be very similar, so the number of mismatches increases rapidly and visual positioning becomes unreliable. Moreover, as the database grows, image retrieval and feature matching become time-consuming. Therefore, in this paper, we propose a novel hierarchical visual positioning method comprising map construction, landmark matching, and pose calculation. First, we combine brain-inspired mechanisms and landmarks to construct a cognitive map, which makes image retrieval efficient. Second, a graph neural network is utilized to learn the inner relations of the feature points; to improve matching accuracy, the network uses semantic confidence in the matching-score calculation, and the system can eliminate mismatches by analyzing all matching results within the same landmark. Finally, we calculate the pose with a PnP solver. We evaluate both the matching algorithm and the visual positioning method experimentally on simulation datasets, where the matching algorithm performs better in some scenes. The results demonstrate that retrieval time can be shortened by two-thirds, with an average positioning error of 10.8 m.
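The final pose-calculation step with a PnP solver can be sketched with OpenCV (toy correspondences; in the real pipeline the 2D-3D matches come from the cognitive-map landmark matching, and the camera intrinsics are calibrated rather than invented as here):

```python
import numpy as np
import cv2

# Toy 3D landmark coordinates and matched 2D detections in the query image.
object_pts = np.random.rand(12, 3).astype(np.float32) * 50
image_pts = np.random.rand(12, 2).astype(np.float32) * 640
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float32)

# RANSAC-based PnP rejects remaining mismatches while estimating the pose.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_pts, image_pts, K, distCoeffs=None)
if ok:
    R, _ = cv2.Rodrigues(rvec)        # rotation matrix from axis-angle
    cam_pos = (-R.T @ tvec).ravel()   # camera (UAV) position in the world frame
```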
Affiliation(s)
- Chenhao Zhao
- Graduate School, Air Force Engineering University, Xi’an 710077, China;
- Dewei Wu
- School of Information and Navigation, Air Force Engineering University, Xi’an 710077, China
- Jing He
- School of Information and Navigation, Air Force Engineering University, Xi’an 710077, China
- Chuanjin Dai
- School of Information and Navigation, Air Force Engineering University, Xi’an 710077, China
14. Yu Z, Han X, Zhang S, Feng J, Peng T, Zhang XY. MouseGAN++: Unsupervised Disentanglement and Contrastive Representation for Multiple MRI Modalities Synthesis and Structural Segmentation of Mouse Brain. IEEE Trans Med Imaging 2023; 42:1197-1209. PMID: 36449589. DOI: 10.1109/tmi.2022.3225528.
Abstract
Segmenting the fine structure of the mouse brain on magnetic resonance (MR) images is critical for delineating morphological regions, analyzing brain function, and understanding their relationships. Compared to a single MRI modality, multimodal MRI data provide complementary tissue features that can be exploited by deep learning models, resulting in better segmentation results. However, multimodal mouse brain MRI data are often lacking, making automatic segmentation of fine mouse brain structures a very challenging task. To address this issue, it is necessary to fuse multimodal MRI data to produce distinguishable contrasts in different brain structures. Hence, we propose a novel disentangled and contrastive GAN-based framework, named MouseGAN++, to synthesize multiple MR modalities from single ones in a structure-preserving manner, thus improving segmentation performance by imputing missing modalities and fusing multi-modality information. Our results demonstrate that the translation performance of our method outperforms state-of-the-art methods. Using the subsequently learned modality-invariant information as well as the modality-translated images, MouseGAN++ can segment fine brain structures with average Dice coefficients of 90.0% (T2w) and 87.9% (T1w), respectively, achieving around +10% performance improvement compared to state-of-the-art algorithms. Our results demonstrate that MouseGAN++, as a simultaneous image synthesis and segmentation method, can fuse cross-modality information in an unpaired manner and yields more robust performance in the absence of multimodal data. We release our method as a mouse brain structural segmentation tool for free academic use at https://github.com/yu02019.
15. Tian W, Li D, Lv M, Huang P. Axial Attention Convolutional Neural Network for Brain Tumor Segmentation with Multi-Modality MRI Scans. Brain Sci 2022; 13:12. PMID: 36671994. PMCID: PMC9856007. DOI: 10.3390/brainsci13010012.
Abstract
Accurately identifying tumors from MRI scans is of the utmost importance for clinical diagnostics and for planning brain tumor treatment. However, manual segmentation is a challenging and time-consuming process in practice and exhibits a high degree of variability between doctors. Therefore, an axial attention brain tumor segmentation network (AABTS-Net) is established in this paper to automatically segment tumor subregions from multi-modality MRIs. The axial attention mechanism is employed to capture richer semantic information; it provides local-global contextual information by incorporating local and global feature representations while simplifying the computational complexity. A deep supervision mechanism is employed to avoid vanishing gradients and guide the AABTS-Net to generate better feature representations, and a hybrid loss handles the class imbalance of the dataset. We conduct comprehensive experiments on the BraTS 2019 and 2020 datasets. The proposed AABTS-Net shows greater robustness and accuracy, signifying that the model can be employed in clinical practice and provides a new avenue for medical image segmentation systems.
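Axial attention factorizes 2D self-attention into two 1D passes, one along each spatial axis; a minimal PyTorch sketch (not the AABTS-Net block; the dimensions and head counts are placeholders):

```python
import torch
import torch.nn as nn

class AxialAttention2D(nn.Module):
    """Self-attention along one spatial axis at a time, so the cost is
    O(H*W*(H+W)) instead of O((H*W)^2) for full 2D attention."""
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Attend along the width: each row is an independent sequence.
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(b, h, w, c)
        # Attend along the height: each column is an independent sequence.
        cols = x.permute(0, 2, 1, 3).reshape(b * w, h, c)
        cols, _ = self.col_attn(cols, cols, cols)
        return cols.reshape(b, w, h, c).permute(0, 3, 2, 1)  # back to (B, C, H, W)

attn = AxialAttention2D()
y = attn(torch.randn(2, 64, 32, 32))
```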
Affiliation(s)
- Weiwei Tian
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan 250358, China
| | - Dengwang Li
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan 250358, China
- Mengyu Lv
- School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Pu Huang
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan 250358, China
16. DGRUnit: Dual graph reasoning unit for brain tumor segmentation. Comput Biol Med 2022; 149:106079. PMID: 36108413. DOI: 10.1016/j.compbiomed.2022.106079.
Abstract
Thanks to the rapid growth of deep learning, many fully automatic segmentation models have been created to tackle the difficulty of brain tumor segmentation. However, few approaches focus on the long-range relationships and contextual interdependence in multimodal Magnetic Resonance (MR) images. In this paper, we propose a novel approach for brain tumor segmentation called the dual graph reasoning unit (DGRUnit). Our proposed method includes two parallel graph reasoning modules: a spatial reasoning module and a channel reasoning module. The spatial reasoning module models the long-range spatial dependencies between distinct regions of an image using a graph convolutional network (GCN). The channel reasoning module uses a graph attention network (GAT) to model the rich contextual interdependencies between different channels with similar semantic representations. Our experimental results clearly demonstrate the superior performance of the proposed DGRUnit. The ablation study shows the flexibility and generalizability of our model, which can be easily integrated into a wide range of neural networks to further improve them. Compared with several state-of-the-art methods, the proposed approach yields significant improvements in both visual inspection and quantitative metrics for brain tumor segmentation tasks.
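The spatial branch can be sketched as a GloRe-style graph reasoning unit: project pixels onto a few graph nodes, apply one graph-convolution step, and project back (our simplified stand-in, not the paper's unit; the GAT channel branch is omitted here):

```python
import torch
import torch.nn as nn

class SpatialGraphReasoning(nn.Module):
    """Project pixels onto a small set of graph nodes, run one GCN step over a
    learnable adjacency, and back-project with a residual connection."""
    def __init__(self, channels: int = 64, nodes: int = 16):
        super().__init__()
        self.assign = nn.Conv2d(channels, nodes, kernel_size=1)  # pixel->node map
        self.gcn = nn.Linear(channels, channels)                 # node update
        self.adj = nn.Parameter(torch.eye(nodes))                # learnable graph

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        a = self.assign(x).flatten(2).softmax(dim=-1)   # (B, nodes, H*W)
        nodes = a @ x.flatten(2).transpose(1, 2)        # (B, nodes, C)
        nodes = torch.relu(self.gcn(self.adj @ nodes))  # one reasoning step
        out = a.transpose(1, 2) @ nodes                 # (B, H*W, C) back-project
        return x + out.transpose(1, 2).reshape(b, c, h, w)

unit = SpatialGraphReasoning()
y = unit(torch.randn(2, 64, 24, 24))
```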