1
Su J, Luo Z, Wang C, Lian S, Lin X, Li S. Reconstruct incomplete relation for incomplete modality brain tumor segmentation. Neural Netw 2024; 180:106657. [PMID: 39186839 DOI: 10.1016/j.neunet.2024.106657]
Abstract
Different brain tumor magnetic resonance imaging (MRI) modalities provide diverse tumor-specific information. Previous works have enhanced brain tumor segmentation performance by integrating multiple MRI modalities. However, multi-modal MRI data are often unavailable in clinical practice. An incomplete modality leads to missing tumor-specific information, which degrades the performance of existing models. Various strategies have been proposed to transfer knowledge from a full modality network (teacher) to an incomplete modality one (student) to address this issue. However, they neglect the fact that brain tumor segmentation is a structural prediction problem that requires voxel semantic relations. In this paper, we propose a Reconstruct Incomplete Relation Network (RIRN) that transfers voxel semantic relational knowledge from the teacher to the student. Specifically, we propose two types of voxel relations to incorporate structural knowledge: Class-relative relations (CRR) and Class-agnostic relations (CAR). The CRR groups voxels into different tumor regions and constructs a relation between them. The CAR builds a global relation between all voxel features, complementing the local inter-region relation. Moreover, we use adversarial learning to align the holistic structural prediction between the teacher and the student. Extensive experimentation on both the BraTS 2018 and BraTS 2020 datasets establishes that our method outperforms all state-of-the-art approaches.
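The abstract does not give the relation losses at code level; as a hedged sketch, a class-agnostic relation (CAR) style loss can be written as an MSE between teacher and student voxel-feature similarity matrices. Function names and shapes here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def relation_matrix(feats):
    # Cosine-similarity matrix between voxel feature vectors (rows of feats).
    normed = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)
    return normed @ normed.T

def relation_distillation_loss(teacher_feats, student_feats):
    # MSE between teacher and student relation matrices: the student is pushed
    # to reproduce the teacher's pairwise voxel semantic relations.
    return float(np.mean((relation_matrix(teacher_feats)
                          - relation_matrix(student_feats)) ** 2))
```

The class-relative relation (CRR) described in the paper would additionally group voxels by predicted tumor region before building the relation; the sketch above only captures the global, class-agnostic case.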
Affiliation(s)
- Jiawei Su
- School of Computer Engineering, Jimei University, Xiamen, China; The Department of Artificial Intelligence, Xiamen University, Fujian, China
- Zhiming Luo
- The Department of Artificial Intelligence, Xiamen University, Fujian, China
- Chengji Wang
- The School of Computer Science, Central China Normal University, Wuhan, China
- Sheng Lian
- The College of Computer and Data Science, Fuzhou University, Fujian, China
- Xuejuan Lin
- The Department of Traditional Chinese Medicine, Fujian University of Traditional Chinese Medicine, Fujian, China
- Shaozi Li
- The Department of Artificial Intelligence, Xiamen University, Fujian, China
2
Tang Y, Lyu T, Jin H, Du Q, Wang J, Li Y, Li M, Chen Y, Zheng J. Domain adaptive noise reduction with iterative knowledge transfer and style generalization learning. Med Image Anal 2024; 98:103327. [PMID: 39191093 DOI: 10.1016/j.media.2024.103327]
Abstract
Low-dose computed tomography (LDCT) denoising tasks face significant challenges in practical imaging scenarios. Supervised methods encounter difficulties in real-world scenarios as there are no paired data for training. Moreover, when applied to datasets with varying noise patterns, these methods may experience decreased performance owing to the domain gap. Conversely, unsupervised methods do not require paired data and can be directly trained on real-world data. However, they often exhibit inferior performance compared to supervised methods. To address this issue, it is necessary to leverage the strengths of these supervised and unsupervised methods. In this paper, we propose a novel domain adaptive noise reduction framework (DANRF), which integrates both knowledge transfer and style generalization learning to effectively tackle the domain gap problem. Specifically, an iterative knowledge transfer method with knowledge distillation is selected to train the target model using unlabeled target data and a pre-trained source model trained with paired simulation data. Meanwhile, we introduce the mean teacher mechanism to update the source model, enabling it to adapt to the target domain. Furthermore, an iterative style generalization learning process is also designed to enrich the style diversity of the training dataset. We evaluate the performance of our approach through experiments conducted on multi-source datasets. The results demonstrate the feasibility and effectiveness of our proposed DANRF model in multi-source LDCT image processing tasks. Given its hybrid nature, which combines the advantages of supervised and unsupervised learning, and its ability to bridge domain gaps, our approach is well-suited for improving practical low-dose CT imaging in clinical settings. Code for our proposed approach is publicly available at https://github.com/tyfeiii/DANRF.
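The mean teacher mechanism mentioned above is conventionally an exponential moving average (EMA) of the student's weights; a minimal sketch, with parameter-dictionary names as assumptions:

```python
import numpy as np

def mean_teacher_update(teacher, student, alpha=0.99):
    # EMA update: teacher <- alpha * teacher + (1 - alpha) * student.
    # teacher/student are dicts mapping parameter names to arrays.
    return {name: alpha * teacher[name] + (1.0 - alpha) * student[name]
            for name in teacher}
```

The update is applied after each training step; a larger `alpha` makes the teacher evolve more slowly and smooths the knowledge it transfers.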
Affiliation(s)
- Yufei Tang
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Tianling Lyu
- Research Center of Augmented Intelligence, Zhejiang Lab, Hangzhou, 310000, China
- Haoyang Jin
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Qiang Du
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Jiping Wang
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Yunxiang Li
- Nanovision Technology Co., Ltd., Beiqing Road, Haidian District, Beijing, 100094, China
- Ming Li
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Yang Chen
- Laboratory of Image Science and Technology, School of Computer Science and Engineering, Southeast University, Nanjing, 210096, China
- Jian Zheng
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China; Shandong Laboratory of Advanced Biomaterials and Medical Devices in Weihai, Weihai, 264200, China
3
Wang Z, Zhu H, Liu J, Chen N, Huang B, Lu W, Wang Y. Hybrid offline and self-knowledge distillation for acute ischemic stroke lesion segmentation from non-contrast CT scans. Comput Biol Med 2024; 183:109312. [PMID: 39486307 DOI: 10.1016/j.compbiomed.2024.109312]
Abstract
Diagnosing and treating Acute Ischemic Stroke (AIS) within 0-24 h of onset is critical for patient recovery. While Diffusion-Weighted Imaging (DWI) and Computed Tomography Perfusion (CTP) are effective for early infarction identification, Non-Contrast CT (NCCT) remains the first-line imaging modality in emergency settings due to its efficiency and cost-effectiveness. In this work, to enhance lesion segmentation in NCCT using multi-modal information, we propose OS-AISeg, which integrates Offline knowledge distillation with Self-knowledge distillation to realize AIS segmentation. Initially, we trained a multi-modality teacher network by introducing uncertainty through Subjective Logic (SL) theory to reduce prediction errors and stabilize the training process. Subsequently, during student network training, we integrate confidence region knowledge guided by uncertainty weights and feature structure information guided by brain asymmetry. The former facilitates the acquisition of effective contextual information from paired predictions, while the latter leverages asymmetric activation maps to extract high-level structural content from multi-modality images. In self-knowledge distillation, we enhance the student network's learning of consistent global feature distributions by introducing mirrored NCCT images, thereby aiding the network in extracting knowledge directly from the modality. OS-AISeg was evaluated through five-fold cross-validation on two publicly available datasets, achieving a Dice value of 0.6196 on AISD and 0.4841 on ISLES2018. Additionally, experiments were also conducted on an external dataset, BraTS2019, as well as on a private stroke dataset named GLis. Strong correlations were observed between segmented Early Infarct (EI) and ground truth in volume analysis, validating the effectiveness of the proposed method in AIS diagnosis. The code for this project is available at https://github.com/Uni-Summer/OS-AISeg.
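The mirrored-NCCT idea above, exploiting brain left-right asymmetry, amounts to a consistency penalty between the prediction on an image and the un-flipped prediction on its mirror. A hedged sketch follows; the callable `model` and the flip axis are assumptions for illustration, not the authors' exact loss.

```python
import numpy as np

def mirror_consistency_loss(model, image, axis=1):
    # Penalize disagreement between model(image) and the prediction obtained
    # by mirroring the input, predicting, and mirroring the output back.
    pred = model(image)
    mirrored = np.flip(model(np.flip(image, axis=axis)), axis=axis)
    return float(np.mean((pred - mirrored) ** 2))
```

A perfectly flip-equivariant model incurs zero loss; asymmetric activations (e.g., around a unilateral infarct) raise it, which is the signal the self-distillation step exploits.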
Affiliation(s)
- Ziying Wang
- School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Hongqing Zhu
- School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Jiahao Liu
- School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Ning Chen
- School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Bingcang Huang
- Department of Radiology, Gongli Hospital of Shanghai Pudong New Area, Shanghai 200135, China
- Weiping Lu
- Department of Radiology, Gongli Hospital of Shanghai Pudong New Area, Shanghai 200135, China
- Ying Wang
- Shanghai Health Commission Key Lab of Artificial Intelligence (AI)-Based Management of Inflammation and Chronic Diseases, Sino-French Cooperative Central Lab, Gongli Hospital of Shanghai Pudong New Area, Shanghai 200135, China
4
Guetarni B, Windal F, Benhabiles H, Petit M, Dubois R, Leteurtre E, Collard D. A Vision Transformer-Based Framework for Knowledge Transfer From Multi-Modal to Mono-Modal Lymphoma Subtyping Models. IEEE J Biomed Health Inform 2024; 28:5562-5572. [PMID: 38819973 DOI: 10.1109/jbhi.2024.3407878]
Abstract
Determining lymphoma subtypes is a crucial step toward better-targeted patient treatment that can potentially increase survival chances. In this context, the existing gold-standard diagnosis method, which relies on gene-expression technology, is highly expensive and time-consuming, making it less accessible. Although alternative diagnosis methods based on IHC (immunohistochemistry) technologies exist (recommended by the WHO), they suffer from similar limitations and are less accurate. Whole Slide Image (WSI) analysis using deep learning models has shown promising potential for cancer diagnosis and could offer cost-effective, faster alternatives to existing methods. In this work, we propose a vision-transformer-based framework for distinguishing DLBCL (Diffuse Large B-Cell Lymphoma) cancer subtypes from high-resolution WSIs. To this end, we introduce a multi-modal architecture to train a classifier model from various WSI modalities. We then leverage this model through a knowledge distillation process to efficiently guide the learning of a mono-modal classifier. Our experimental study, conducted on a lymphoma dataset of 157 patients, shows the promising performance of our mono-modal classification model, which outperforms six recent state-of-the-art methods. In addition, the power-law curve estimated on our experimental data suggests that, with training data from a reasonable number of additional patients, our model could achieve diagnostic accuracy competitive with IHC technologies. Furthermore, the efficiency of our framework is confirmed through an additional experimental study on an external breast cancer dataset (BCI dataset).
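The abstract does not specify the distillation loss; the common formulation for guiding a mono-modal student with a multi-modal teacher is a temperature-scaled KL divergence between their logits (Hinton-style KD). The sketch below is a generic illustration under that assumption, not the authors' code.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Numerically stable softmax with temperature T.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(teacher_logits, student_logits, T=4.0):
    # Temperature-scaled KL(teacher || student), scaled by T^2 as is conventional
    # so gradient magnitudes stay comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return float(np.mean(kl) * T * T)
```

In practice this term is weighted against a standard cross-entropy on the ground-truth subtype labels.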
5
Bonada M, Rossi LF, Carone G, Panico F, Cofano F, Fiaschi P, Garbossa D, Di Meco F, Bianconi A. Deep Learning for MRI Segmentation and Molecular Subtyping in Glioblastoma: Critical Aspects from an Emerging Field. Biomedicines 2024; 12:1878. [PMID: 39200342 PMCID: PMC11352020 DOI: 10.3390/biomedicines12081878]
Abstract
Deep learning (DL) has been applied to glioblastoma (GBM) magnetic resonance imaging (MRI) assessment for tumor segmentation and for inference of molecular, diagnostic, and prognostic information. We comprehensively reviewed the currently available DL applications, critically examining the limitations that hinder their broader adoption in clinical practice and molecular research. Technical limitations to the routine application of DL include the qualitative heterogeneity of MRI, related to differences in machinery and protocols, and the absence of informative sequences, which may be compensated for by artificial image synthesis. Moreover, algorithms should be trained on large amounts of data, taking advantage of the available MRI benchmarks. The segmentation of postoperative imaging should also be further addressed to limit the inaccuracies previously observed for this task. Molecular information has been promisingly integrated into the most recent DL tools, providing useful prognostic and therapeutic information. Finally, ethical concerns should be carefully addressed and standardized to allow for data protection. DL has provided reliable results for GBM assessment concerning MRI analysis and segmentation, but routine clinical application is still limited. The current limitations could be prospectively addressed by giving particular attention to data collection, introducing new technical advancements, and carefully regulating ethical issues.
Affiliation(s)
- Marta Bonada
- Neurosurgery Unit, Department of Neuroscience, University of Turin, Via Cherasco 15, 10126 Turin, Italy
- Department of Neurosurgery, Fondazione IRCCS Istituto Neurologico Carlo Besta, Via Celoria 11, 20133 Milan, Italy
- Luca Francesco Rossi
- Department of Informatics, Polytechnic University of Turin, Corso Castelfidardo 39, 10129 Turin, Italy
- Giovanni Carone
- Department of Neurosurgery, Fondazione IRCCS Istituto Neurologico Carlo Besta, Via Celoria 11, 20133 Milan, Italy
- Flavio Panico
- Neurosurgery Unit, Department of Neuroscience, University of Turin, Via Cherasco 15, 10126 Turin, Italy
- Fabio Cofano
- Neurosurgery Unit, Department of Neuroscience, University of Turin, Via Cherasco 15, 10126 Turin, Italy
- Pietro Fiaschi
- Division of Neurosurgery, Ospedale Policlinico San Martino, IRCCS for Oncology and Neurosciences, Largo Rosanna Benzi 10, 16132 Genoa, Italy
- Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics and Maternal and Child Health, University of Genoa, Largo Rosanna Benzi 10, 16132 Genoa, Italy
- Diego Garbossa
- Neurosurgery Unit, Department of Neuroscience, University of Turin, Via Cherasco 15, 10126 Turin, Italy
- Francesco Di Meco
- Department of Neurosurgery, Fondazione IRCCS Istituto Neurologico Carlo Besta, Via Celoria 11, 20133 Milan, Italy
- Andrea Bianconi
- Neurosurgery Unit, Department of Neuroscience, University of Turin, Via Cherasco 15, 10126 Turin, Italy
- Division of Neurosurgery, Ospedale Policlinico San Martino, IRCCS for Oncology and Neurosciences, Largo Rosanna Benzi 10, 16132 Genoa, Italy
6
Tu R, Zhang D, Li C, Xiao L, Zhang Y, Cai X, Si W. Multimodal MRI segmentation of key structures for microvascular decompression via knowledge-driven mutual distillation and topological constraints. Int J Comput Assist Radiol Surg 2024; 19:1329-1338. [PMID: 38739324 DOI: 10.1007/s11548-024-03159-2]
Abstract
PURPOSE: Microvascular decompression (MVD) is a widely used neurosurgical intervention for the treatment of cranial nerve compression. Segmentation of MVD-related structures, including the brainstem, nerves, arteries, and veins, is critical for preoperative planning and intraoperative decision-making. Automatically segmenting MVD-related structures remains challenging for current methods due to the limited information available from a single modality and the complex topology of vessels and nerves. METHODS: Because MVD-related structures are hard to distinguish, especially nerves and vessels with similar topology, we design a multimodal segmentation network with a shared-encoder, dual-decoder structure and propose a clinical-knowledge-driven distillation scheme that allows reliable knowledge to be transferred from each decoder to the other. In addition, we introduce a class-wise contrastive module that learns discriminative representations by maximizing the distance among classes across modalities. A projected topological loss based on persistent homology is then proposed to constrain topological continuity. RESULTS: We evaluate our method on an in-house dataset consisting of 100 paired HR-T2WI and 3D TOF-MRA volumes. Experiments indicate that our model outperforms the state of the art in DSC by 1.9% for arteries, 3.3% for veins, and 0.5% for nerves. Visualization results show that our method attains improved continuity and less breakage, consistent with intraoperative images. CONCLUSION: Our method comprehensively extracts distinct features from multimodal data to segment the MVD-related key structures while preserving topological continuity, allowing surgeons to precisely perceive the patient-specific target anatomy and substantially reducing their workload in the preoperative planning stage. Our resources will be publicly available at https://github.com/JaronTu/Multimodal_MVD_Seg .
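The class-wise contrastive idea, pulling same-class representations together across modalities while pushing different classes apart, can be sketched on class centroids. This is a hedged toy analogue with an assumed margin hinge, not the paper's module.

```python
import numpy as np

def classwise_contrastive_loss(emb_a, emb_b, labels, margin=0.5):
    # emb_a, emb_b: [N, C] embeddings of the same voxels from two modalities.
    # Pull same-class centroids together; push different-class centroids apart
    # until they are at least `margin` away.
    classes = np.unique(labels)
    cent_a = np.stack([emb_a[labels == c].mean(axis=0) for c in classes])
    cent_b = np.stack([emb_b[labels == c].mean(axis=0) for c in classes])
    loss = 0.0
    for i in range(len(classes)):
        for j in range(len(classes)):
            d = np.linalg.norm(cent_a[i] - cent_b[j])
            loss += d ** 2 if i == j else max(0.0, margin - d) ** 2
    return loss / len(classes) ** 2
```

Aligned embeddings with well-separated classes drive this loss to zero; the persistent-homology topological loss mentioned above is a separate term and is not sketched here.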
Affiliation(s)
- Renzhe Tu
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen University Town, 1068 Xueyuan Avenue, Shenzhen, 518055, China
- University of Chinese Academy of Sciences, No.1 Yanqihu East Rd, Beijing, 101408, China
- Doudou Zhang
- The Second School of Clinical Medicine, Southern Medical University, No.1023, South Shatai Road, Guangzhou, 510515, China
- Department of Neurosurgery, Guangdong Second Provincial General Hospital, 466 Xingang Middle Road, Guangzhou, 510317, China
- Department of Neurosurgery, the First Affiliated Hospital of Shenzhen University, Health Science Center, Shenzhen Second People's Hospital, Sungang West Road 3002, Shenzhen, 518035, China
- Caizi Li
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen University Town, 1068 Xueyuan Avenue, Shenzhen, 518055, China
- Linxia Xiao
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen University Town, 1068 Xueyuan Avenue, Shenzhen, 518055, China
- Yong Zhang
- The Second School of Clinical Medicine, Southern Medical University, No.1023, South Shatai Road, Guangzhou, 510515, China
- Department of Neurosurgery, Guangdong Second Provincial General Hospital, 466 Xingang Middle Road, Guangzhou, 510317, China
- Xiaodong Cai
- Department of Neurosurgery, the First Affiliated Hospital of Shenzhen University, Health Science Center, Shenzhen Second People's Hospital, Sungang West Road 3002, Shenzhen, 518035, China
- Weixin Si
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen University Town, 1068 Xueyuan Avenue, Shenzhen, 518055, China
7
Luan S, Ji Y, Liu Y, Zhu L, Zhou H, Ouyang J, Yang X, Zhao H, Zhu B. Real-Time Reconstruction of HIFU Focal Temperature Field Based on Deep Learning. BME Front 2024; 5:0037. [PMID: 38515637 PMCID: PMC10956737 DOI: 10.34133/bmef.0037]
Abstract
Objective and Impact Statement: High-intensity focused ultrasound (HIFU) therapy is a promising noninvasive method that induces coagulative necrosis in diseased tissues through thermal and cavitation effects while avoiding damage to surrounding normal tissues. Introduction: Accurate, real-time acquisition of the focal-region temperature field during HIFU treatment markedly enhances therapeutic efficacy, holding paramount scientific and practical value in clinical cancer therapy. Methods: We first designed and assembled an integrated HIFU system incorporating diagnostic, therapeutic, and temperature-measurement functionalities to collect ultrasound echo signals and temperature variations during HIFU therapy. We then introduced a novel multimodal teacher-student approach, which uses shared self-expressive coefficients and a deep canonical correlation analysis layer to aggregate the data from each modality, and then transfers knowledge from the teacher model to the student model through knowledge distillation strategies. Results: By investigating the relationship between ultrasound echo signals and temperatures in phantoms, in vitro, and in vivo, we achieved real-time reconstruction of the 2D temperature field in the HIFU focal region with a maximum temperature error of less than 2.5 °C. Conclusion: Our method effectively monitors the distribution of the HIFU temperature field in real time, providing scientifically precise predictive schemes for HIFU therapy, laying a theoretical foundation for subsequent personalized treatment-dose planning, and providing efficient guidance for noninvasive, nonionizing cancer treatment.
Affiliation(s)
- Shunyao Luan
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
- Yongshuo Ji
- HIFU Center of Oncology Department, Huadong Hospital Affiliated to Fudan University, Shanghai, China
- Yumei Liu
- HIFU Center of Oncology Department, Huadong Hospital Affiliated to Fudan University, Shanghai, China
- Linling Zhu
- HIFU Center of Oncology Department, Huadong Hospital Affiliated to Fudan University, Shanghai, China
- Haoyu Zhou
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
- Jun Ouyang
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
- Xiaofei Yang
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
- Hong Zhao
- HIFU Center of Oncology Department, Huadong Hospital Affiliated to Fudan University, Shanghai, China
- Benpeng Zhu
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
8
Ying M, Wang Y, Yang K, Wang H, Liu X. A deep learning knowledge distillation framework using knee MRI and arthroscopy data for meniscus tear detection. Front Bioeng Biotechnol 2024; 11:1326706. [PMID: 38292305 PMCID: PMC10825958 DOI: 10.3389/fbioe.2023.1326706]
Abstract
Purpose: To construct a deep learning knowledge distillation framework exploring the use of MRI alone or combined with distilled arthroscopy information for meniscus tear detection. Methods: A database of 199 paired knee arthroscopy-MRI exams was used to develop a multimodal teacher network and an MRI-based student network, both built on residual neural network architectures. We propose a knowledge distillation framework comprising the multimodal teacher network T and the monomodal student network S. We optimized mean squared error (MSE) and cross-entropy (CE) loss functions to enable the student network S to learn arthroscopic information from the teacher network T, ultimately yielding a distilled student network S_T. A coronal proton density (PD)-weighted fat-suppressed MRI sequence was used in this study. Fivefold cross-validation was employed, and accuracy, sensitivity, specificity, F1-score, receiver operating characteristic (ROC) curves, and area under the ROC curve (AUC) were used to evaluate medial and lateral meniscal tear detection for the undistilled student model S, the distilled student model S_T, and the teacher model T. Results: The AUCs of the undistilled student model S, the distilled student model S_T, and the teacher model T for medial meniscus (MM) and lateral meniscus (LM) tear detection were 0.773/0.672, 0.792/0.751, and 0.834/0.746, respectively. The distilled student model S_T had higher AUCs than the undistilled model S. After knowledge distillation, the distilled student model demonstrated promising results, with accuracy (0.764/0.734), sensitivity (0.838/0.661), and F1-score (0.680/0.754) for medial and lateral tear detection better than the undistilled model's accuracy (0.734/0.648), sensitivity (0.733/0.607), and F1-score (0.620/0.673).
Conclusion: Through the knowledge distillation framework, the MRI-based student model S benefited from the multimodal teacher model T and achieved improved meniscus tear detection performance.
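The MSE-plus-CE objective described above can be sketched as a combined loss: cross-entropy against the tear labels plus MSE matching of student features to the teacher's. Shapes, the weighting `lam`, and function names are illustrative assumptions, not the authors' exact losses.

```python
import numpy as np

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the correct class (numerically stable).
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -float(np.mean(log_probs[np.arange(len(labels)), labels]))

def distillation_objective(student_logits, labels, student_feat, teacher_feat,
                           lam=0.5):
    # CE against hard labels plus MSE pulling student features toward the
    # multimodal teacher's features.
    mse = float(np.mean((student_feat - teacher_feat) ** 2))
    return cross_entropy(student_logits, labels) + lam * mse
```

The MSE term is what carries the distilled arthroscopic information into the MRI-only student.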
Affiliation(s)
- Mengjie Ying
- Department of Orthopedics, Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yufan Wang
- Engineering Research Center for Digital Medicine of the Ministry of Education, Shanghai, China
- School of Biomedical Engineering and Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, China
- Kai Yang
- Department of Radiology, Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Haoyuan Wang
- Department of Orthopedics, Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Xudong Liu
- Department of Orthopedics, Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
9
Zhang Z, Verburgt J, Kagaya Y, Christoffer C, Kihara D. Improved Peptide Docking with Privileged Knowledge Distillation using Deep Learning. bioRxiv 2023:2023.12.01.569671. [PMID: 38106114 PMCID: PMC10723353 DOI: 10.1101/2023.12.01.569671]
Abstract
Protein-peptide interactions play a key role in biological processes. Understanding the interactions within a receptor-peptide complex can help in discovering and altering their biological functions. Various computational methods for modeling the structures of receptor-peptide complexes have been developed. Recently, accurate structure prediction enabled by deep learning methods has significantly advanced the field of structural biology. AlphaFold (AF) is among the top-performing structure prediction methods and models single-chain targets with high accuracy. Shortly after the release of AlphaFold, AlphaFold-Multimer (AFM) was developed in a similar fashion for the prediction of protein complex structures. AFM has achieved competitive performance in modeling protein-peptide interactions compared to previous computational methods; however, further improvement is still needed. Here, we present DistPepFold, which improves protein-peptide complex docking using an AFM-based architecture through a privileged knowledge distillation approach. DistPepFold leverages a teacher model that uses native interaction information during training and transfers its knowledge to a student model through a teacher-student distillation process. We evaluated DistPepFold's docking performance on two protein-peptide complex datasets and showed that DistPepFold outperforms AFM. Furthermore, we demonstrate that the student model learned from the teacher model to make structural improvements over AFM predictions.
Affiliation(s)
- Zicong Zhang
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Yuki Kagaya
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
10
Chen Y, Pan Y, Xia Y, Yuan Y. Disentangle First, Then Distill: A Unified Framework for Missing Modality Imputation and Alzheimer's Disease Diagnosis. IEEE Trans Med Imaging 2023; 42:3566-3578. [PMID: 37450359 DOI: 10.1109/tmi.2023.3295489]
Abstract
Multi-modality medical data provide complementary information and hence have been widely explored for computer-aided Alzheimer's disease (AD) diagnosis. However, the research is hindered by the unavoidable missing-data problem: one data modality may not have been acquired on some subjects for various reasons. Although the missing data can be imputed using generative models, the imputation process may introduce unrealistic information into the classification process, leading to poor performance. In this paper, we propose the Disentangle First, Then Distill (DFTD) framework for AD diagnosis using incomplete multi-modality medical images. First, we design a region-aware disentanglement module that disentangles each image into an inter-modality relevant representation and an intra-modality specific representation, with emphasis on disease-related regions. To progressively integrate multi-modality knowledge, we then construct an imputation-induced distillation module, in which a lateral inter-modality transition unit is created to impute the representation of the missing modality. The proposed DFTD framework has been evaluated against six existing methods on an ADNI dataset with 1248 subjects. The results show that our method has superior performance in both AD-CN classification and MCI-to-AD prediction tasks, substantially outperforming all competing methods.
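The inter-modality transition unit that imputes the missing modality's representation is described only at a high level; as a hedged toy analogue, one can fit a linear map from the available modality's representations to the missing one and apply it at inference time. Least squares stands in here for the learned transition unit; names and shapes are assumptions.

```python
import numpy as np

def fit_transition(rep_src, rep_tgt):
    # Least-squares linear map W such that rep_src @ W approximates rep_tgt,
    # fitted on subjects where both modalities are available.
    W, *_ = np.linalg.lstsq(rep_src, rep_tgt, rcond=None)
    return W

def impute_missing(rep_src, W):
    # Impute the missing modality's representation from the available one.
    return rep_src @ W
```

Operating on learned representations rather than raw images is what lets the framework avoid injecting unrealistic synthesized image content into the classifier.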
11
Chen B, Liu Z, Lu J, Li Z, Kuang K, Yang J, Wang Z, Sun Y, Du B, Qi L, Li M. Deep learning parametric response mapping from inspiratory chest CT scans: a new approach for small airway disease screening. Respir Res 2023; 24:299. [PMID: 38017476] [PMCID: PMC10683250] [DOI: 10.1186/s12931-023-02611-2]
Abstract
OBJECTIVES Parametric response mapping (PRM) enables the evaluation of small airway disease (SAD) at the voxel level, but requires both inspiratory and expiratory chest CT scans. We hypothesize that deep learning PRM from inspiratory chest CT scans alone can effectively evaluate SAD in individuals with normal spirometry. METHODS We included 537 participants with normal spirometry and a history of smoking or secondhand smoke exposure, and divided them into training, tuning, and test sets. A cascaded generative adversarial network generated expiratory CT from inspiratory CT, followed by a UNet-like network predicting PRM using the real inspiratory CT and the generated expiratory CT. Prediction performance was evaluated using SSIM, RMSE, and Dice coefficients. Pearson correlation assessed agreement between predicted and ground-truth PRM. ROC curves evaluated the performance of predicted PRMfSAD (the volume percentage of functional small airway disease, fSAD) in stratifying SAD. RESULTS Our method generated expiratory CT of good quality (SSIM 0.86, RMSE 80.13 HU). The predicted PRM Dice coefficients for normal lung, emphysema, and fSAD regions were 0.85, 0.63, and 0.51, respectively. The volume percentages of emphysema and fSAD showed good correlation between predicted and ground-truth PRM (|r| = 0.97 and 0.64, respectively, p < 0.05). Predicted PRMfSAD stratified SAD well against ground-truth PRMfSAD at thresholds of 15%, 20%, and 25% (AUCs of 0.84, 0.78, and 0.84, respectively, p < 0.001). CONCLUSION Our deep learning method generates high-quality PRM from inspiratory chest CT alone and effectively stratifies SAD in individuals with normal spirometry.
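The Dice coefficients reported above measure voxel-wise overlap between predicted and ground-truth PRM region masks (normal lung, emphysema, fSAD). A minimal sketch of the metric itself, assuming binary region masks (the function name is ours, not the paper's code):

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice overlap between two binary masks, e.g. predicted vs. ground-truth
    fSAD voxels. Returns 1.0 for two empty masks by convention."""
    pred = np.asarray(pred, bool)
    target = np.asarray(target, bool)
    inter = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    return 1.0 if denom == 0 else 2.0 * inter / denom
```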
Affiliation(s)
- Bin Chen
- Department of Radiology, Huadong Hospital Affiliated to Fudan University, 221, Yanan West Road, Jingan Temple Street, Jingan District, Shanghai, China
- Zhang Guozhen Small Pulmonary Nodules Diagnosis and Treatment Center, Shanghai, China

- Ziyi Liu
- School of Computer Science, Wuhan University, LuoJiaShan, WuChang District, Wuhan, Hubei, China
- Artificial Intelligence Institute of Wuhan University, Wuhan, Hubei, China
- Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan, Hubei, China

- Jinjuan Lu
- Department of Radiology, Shanghai Geriatric Medical Center, Shanghai, China

- Zhihao Li
- School of Computer Science, Wuhan University, LuoJiaShan, WuChang District, Wuhan, Hubei, China
- Artificial Intelligence Institute of Wuhan University, Wuhan, Hubei, China
- Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan, Hubei, China

- Kaiming Kuang
- Dianei Technology, Shanghai, China
- University of California San Diego, La Jolla, USA

- Jiancheng Yang
- Dianei Technology, Shanghai, China
- Computer Vision Laboratory, Swiss Federal Institute of Technology Lausanne (EPFL), Lausanne, Switzerland

- Zengmao Wang
- School of Computer Science, Wuhan University, LuoJiaShan, WuChang District, Wuhan, Hubei, China
- Artificial Intelligence Institute of Wuhan University, Wuhan, Hubei, China
- Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan, Hubei, China

- Yingli Sun
- Department of Radiology, Huadong Hospital Affiliated to Fudan University, 221, Yanan West Road, Jingan Temple Street, Jingan District, Shanghai, China
- Zhang Guozhen Small Pulmonary Nodules Diagnosis and Treatment Center, Shanghai, China

- Bo Du
- School of Computer Science, Wuhan University, LuoJiaShan, WuChang District, Wuhan, Hubei, China
- Artificial Intelligence Institute of Wuhan University, Wuhan, Hubei, China
- Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan, Hubei, China

- Lin Qi
- Department of Radiology, Huadong Hospital Affiliated to Fudan University, 221, Yanan West Road, Jingan Temple Street, Jingan District, Shanghai, China
- Zhang Guozhen Small Pulmonary Nodules Diagnosis and Treatment Center, Shanghai, China

- Ming Li
- Department of Radiology, Huadong Hospital Affiliated to Fudan University, 221, Yanan West Road, Jingan Temple Street, Jingan District, Shanghai, China
- Zhang Guozhen Small Pulmonary Nodules Diagnosis and Treatment Center, Shanghai, China
12
Choi Y, Al-Masni MA, Jung KJ, Yoo RE, Lee SY, Kim DH. A single stage knowledge distillation network for brain tumor segmentation on limited MR image modalities. Comput Methods Programs Biomed 2023; 240:107644. [PMID: 37307766] [DOI: 10.1016/j.cmpb.2023.107644]
Abstract
BACKGROUND AND OBJECTIVE Precisely segmenting brain tumors using multimodal Magnetic Resonance Imaging (MRI) is an essential task for early diagnosis, disease monitoring, and surgical planning. Unfortunately, the complete set of four image modalities in the well-known BraTS benchmark dataset: T1, T2, Fluid-Attenuated Inversion Recovery (FLAIR), and T1 Contrast-Enhanced (T1CE), is not regularly acquired in clinical practice due to the high cost and long acquisition time. Rather, it is common to utilize limited image modalities for brain tumor segmentation. METHODS In this paper, we propose a single-stage knowledge distillation algorithm that derives information from the missing modalities for better segmentation of brain tumors. Unlike previous works that adopted a two-stage framework to distill the knowledge from a pre-trained network into a student network trained on limited image modalities, we train both models simultaneously using a single-stage knowledge distillation algorithm. We transfer the information by reducing redundancy from a teacher network trained on full image modalities to the student network using a Barlow Twins loss at the latent-space level. To distill knowledge at the pixel level, we further employ deep supervision, training the backbone networks of both the teacher and student paths using a Cross-Entropy loss. RESULTS We demonstrate that the proposed single-stage knowledge distillation approach improves the performance of the student network in each tumor category, with overall Dice scores of 91.11% for Tumor Core, 89.70% for Enhancing Tumor, and 92.20% for Whole Tumor when using only the FLAIR and T1CE images, outperforming state-of-the-art segmentation methods. CONCLUSIONS These outcomes demonstrate the feasibility of exploiting knowledge distillation for segmenting brain tumors with limited image modalities, bringing it closer to clinical practice.
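The latent-space redundancy reduction mentioned above uses the Barlow Twins objective. Below is a minimal NumPy sketch of that loss between teacher and student feature batches; the normalization details and the weighting `lam` are generic assumptions, not the paper's exact implementation.

```python
import numpy as np

def barlow_twins_loss(z_teacher, z_student, lam=5e-3):
    """Barlow Twins redundancy-reduction loss between two (N, D) feature batches.

    Pulls the diagonal of the cross-correlation matrix toward 1 (invariance)
    and the off-diagonal entries toward 0 (redundancy reduction)."""
    n = z_teacher.shape[0]
    # Standardize each latent dimension across the batch.
    za = (z_teacher - z_teacher.mean(0)) / (z_teacher.std(0) + 1e-8)
    zb = (z_student - z_student.mean(0)) / (z_student.std(0) + 1e-8)
    c = za.T @ zb / n                                        # (D, D) cross-correlation
    on_diag = ((np.diagonal(c) - 1.0) ** 2).sum()            # diagonal -> 1
    off_diag = (c ** 2).sum() - (np.diagonal(c) ** 2).sum()  # off-diagonal -> 0
    return float(on_diag + lam * off_diag)
```

With identical, decorrelated features on both branches the loss is (numerically) zero, which is the state the distillation drives the student toward.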
Affiliation(s)
- Yoonseok Choi
- Department of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul 03722, Republic of Korea

- Mohammed A Al-Masni
- Department of Artificial Intelligence, College of Software & Convergence Technology, Daeyang AI Center, Sejong University, Seoul 05006, Republic of Korea

- Kyu-Jin Jung
- Department of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul 03722, Republic of Korea

- Roh-Eul Yoo
- Department of Radiology, Seoul National University Hospital, 101 Daehak-ro Jongno-gu, Seoul 03080, Republic of Korea
- Department of Radiology, Seoul National University College of Medicine, 103 Daehak-ro Jongno-gu, Seoul 03080, Republic of Korea

- Seong-Yeong Lee
- Department of Radiology, Seoul National University Hospital, 101 Daehak-ro Jongno-gu, Seoul 03080, Republic of Korea

- Dong-Hyun Kim
- Department of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul 03722, Republic of Korea
13
Zhao Q, Zhong L, Xiao J, Zhang J, Chen Y, Liao W, Zhang S, Wang G. Efficient Multi-Organ Segmentation From 3D Abdominal CT Images With Lightweight Network and Knowledge Distillation. IEEE Trans Med Imaging 2023; 42:2513-2523. [PMID: 37030798] [DOI: 10.1109/tmi.2023.3262680]
Abstract
Accurate segmentation of multiple abdominal organs from Computed Tomography (CT) images plays an important role in computer-aided diagnosis, treatment planning, and follow-up. Currently, 3D Convolutional Neural Networks (CNNs) have achieved promising performance on automatic medical image segmentation tasks. However, most existing 3D CNNs have large numbers of parameters and huge floating-point operations (FLOPs), and 3D CT volumes are large, leading to high computational cost, which limits their clinical application. To tackle this issue, we propose a novel framework based on a lightweight network and Knowledge Distillation (KD) for delineating multiple organs from 3D CT volumes. We first propose a novel lightweight medical image segmentation network named LCOV-Net to reduce the model size, and then introduce two knowledge distillation modules (i.e., Class-Affinity KD and Multi-Scale KD) to effectively distill knowledge from a heavy-weight teacher model and improve LCOV-Net's segmentation accuracy. Experiments on two public abdominal CT datasets for multi-organ segmentation showed that: 1) our LCOV-Net outperformed existing lightweight 3D segmentation models in both computational cost and accuracy; 2) the proposed KD strategy effectively improved the performance of the lightweight network and outperformed existing KD methods; 3) combining the proposed LCOV-Net and KD strategy, our framework achieved better performance than the state-of-the-art 3D nnU-Net with only one-fifth of its parameters. The code is available at https://github.com/HiLab-git/LCOVNet-and-KD.
14
Wu J, Guo D, Wang L, Yang S, Zheng Y, Shapey J, Vercauteren T, Bisdas S, Bradford R, Saeed S, Kitchen N, Ourselin S, Zhang S, Wang G. TISS-net: Brain tumor image synthesis and segmentation using cascaded dual-task networks and error-prediction consistency. Neurocomputing 2023; 544. [PMID: 37528990] [PMCID: PMC10243514] [DOI: 10.1016/j.neucom.2023.126295]
Abstract
Accurate segmentation of brain tumors from medical images is important for diagnosis and treatment planning, and it often requires multi-modal or contrast-enhanced images. However, in practice some modalities of a patient may be absent. Synthesizing the missing modality has the potential to fill this gap and achieve high segmentation performance. Existing methods often treat the synthesis and segmentation tasks separately, or consider them jointly but without effective regularization of the complex joint model, leading to limited performance. We propose a novel brain Tumor Image Synthesis and Segmentation network (TISS-Net) that obtains the synthesized target modality and the segmentation of brain tumors end-to-end with high performance. First, we propose a dual-task-regularized generator that simultaneously obtains a synthesized target modality and a coarse segmentation, leveraging a tumor-aware synthesis loss with perceptibility regularization to minimize the high-level semantic domain gap between synthesized and real target modalities. Based on the synthesized image and the coarse segmentation, we further propose a dual-task segmentor that simultaneously predicts a refined segmentation and the error in the coarse segmentation, with a consistency between these two predictions introduced for regularization. Our TISS-Net was validated on two applications: synthesizing FLAIR images for whole glioma segmentation, and synthesizing contrast-enhanced T1 images for Vestibular Schwannoma segmentation. Experimental results showed that TISS-Net largely improved segmentation accuracy compared with direct segmentation from the available modalities, and it outperformed state-of-the-art image-synthesis-based segmentation methods.
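One plausible reading of the error-prediction consistency described above is a penalty on the mismatch between the predicted error map and the discrepancy actually observed between the coarse and refined segmentations. The NumPy sketch below is our own hedged formulation under that assumption, not the paper's published loss.

```python
import numpy as np

def error_consistency_loss(coarse_seg, refined_seg, predicted_error):
    """Mean squared error between the segmentor's predicted error map and the
    observed coarse-vs-refined discrepancy (all arrays are probability maps
    in [0, 1] of the same shape)."""
    observed_error = np.abs(refined_seg - coarse_seg)
    return float(((predicted_error - observed_error) ** 2).mean())
```

The loss is zero exactly when the predicted error map matches the observed discrepancy, which is the self-consistency the regularizer encourages.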
Affiliation(s)
- Jianghao Wu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China

- Dong Guo
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China

- Lu Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China

- Shuojue Yang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China

- Yuanjie Zheng
- School of Information Science and Engineering, Shandong Normal University, Jinan, China

- Jonathan Shapey
- School of Biomedical Engineering & Imaging Sciences, King’s College London, London, UK
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, UK
- Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, UK

- Tom Vercauteren
- School of Biomedical Engineering & Imaging Sciences, King’s College London, London, UK

- Sotirios Bisdas
- Department of Neuroradiology, National Hospital for Neurology and Neurosurgery, London, UK

- Robert Bradford
- Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, UK

- Shakeel Saeed
- Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, UK

- Neil Kitchen
- Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, UK

- Sebastien Ourselin
- School of Biomedical Engineering & Imaging Sciences, King’s College London, London, UK

- Shaoting Zhang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- SenseTime Research, Shanghai, China

- Guotai Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Shanghai Artificial Intelligence Laboratory, Shanghai, China
15
Yang H, Zhou T, Zhou Y, Zhang Y, Fu H. Flexible Fusion Network for Multi-Modal Brain Tumor Segmentation. IEEE J Biomed Health Inform 2023; 27:3349-3359. [PMID: 37126623] [DOI: 10.1109/jbhi.2023.3271808]
Abstract
Automated brain tumor segmentation is crucial for aiding brain disease diagnosis and evaluating disease progression. Currently, magnetic resonance imaging (MRI) is a routinely adopted approach in the field of brain tumor segmentation that can provide images of different modalities. It is critical to leverage multi-modal images to boost brain tumor segmentation performance. Existing works commonly concentrate on generating a shared representation by fusing multi-modal data, while few methods take modality-specific characteristics into account. Besides, efficiently fusing arbitrary numbers of modalities remains a difficult task. In this study, we present a flexible fusion network (termed F2Net) for multi-modal brain tumor segmentation, which can flexibly fuse arbitrary numbers of multi-modal inputs to exploit complementary information while maintaining the specific characteristics of each modality. F2Net is based on an encoder-decoder structure, which utilizes two Transformer-based feature learning streams and a cross-modal shared learning network to extract individual and shared feature representations. To effectively integrate the knowledge from the multi-modality data, we propose a cross-modal feature-enhanced module (CFM) and a multi-modal collaboration module (MCM), which fuse the multi-modal features into the shared learning network and incorporate the features from the encoders into the shared decoder, respectively. Extensive experimental results on multiple benchmark datasets demonstrate the effectiveness of F2Net over other state-of-the-art segmentation methods.
16
Gao H, Miao Q, Ma D, Liua R. Deep Mutual Learning for Brain Tumor Segmentation with the Fusion Network. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.11.038]
17
Noothout JMH, Lessmann N, van Eede MC, van Harten LD, Sogancioglu E, Heslinga FG, Veta M, van Ginneken B, Išgum I. Knowledge distillation with ensembles of convolutional neural networks for medical image segmentation. J Med Imaging (Bellingham) 2022; 9:052407. [PMID: 35692896] [PMCID: PMC9142841] [DOI: 10.1117/1.jmi.9.5.052407]
Abstract
Purpose: Ensembles of convolutional neural networks (CNNs) often outperform a single CNN in medical image segmentation tasks, but inference is computationally more expensive, which makes ensembles unattractive for some applications. We compared the performance of differently constructed ensembles with the performance of CNNs derived from these ensembles using knowledge distillation, a technique for reducing the footprint of large models such as ensembles. Approach: We investigated two types of ensembles: diverse ensembles of networks with three different architectures and two different loss functions, and uniform ensembles of networks with the same architecture but initialized with different random seeds. For each ensemble, a single student network was additionally trained to mimic the class probabilities predicted by the teacher model, the ensemble. We evaluated the performance of each network, the ensembles, and the corresponding distilled networks on three publicly available datasets: chest computed tomography scans with four annotated organs of interest, brain magnetic resonance imaging (MRI) with six annotated brain structures, and cardiac cine-MRI with three annotated heart structures. Results: Both uniform and diverse ensembles obtained better results than any individual network in the ensemble. Furthermore, knowledge distillation yielded a single network that was smaller and faster without compromising performance compared with the ensemble it learned from. The distilled networks significantly outperformed the same network trained with reference segmentations instead of knowledge distillation. Conclusion: Knowledge distillation can compress segmentation ensembles of uniform or diverse composition into a single CNN while maintaining the performance of the ensemble.
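At its core, the student-mimics-ensemble setup described above trains the student against the averaged class probabilities of the ensemble members. A minimal NumPy sketch under that reading (the function names are ours; gradients and batching are omitted):

```python
import numpy as np

def softmax(logits):
    # Softmax over the last axis, numerically stabilized.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_soft_targets(member_logits):
    """Average the per-class probabilities of the ensemble members; this
    averaged distribution plays the role of the teacher's soft targets."""
    return np.mean([softmax(l) for l in member_logits], axis=0)

def distill_ce(student_logits, soft_targets):
    """Cross-entropy of the student's prediction against the ensemble's
    soft targets, averaged over samples."""
    p_s = softmax(student_logits)
    return float(-(soft_targets * np.log(p_s + 1e-12)).sum(axis=-1).mean())
```

A student whose logits agree with the ensemble's consensus incurs a lower distillation loss than one that contradicts it.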
Affiliation(s)
- Julia M. H. Noothout
- Amsterdam University Medical Center, University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands

- Nikolas Lessmann
- Radboud University Medical Center, Department of Medical Imaging, Nijmegen, The Netherlands

- Matthijs C. van Eede
- Amsterdam University Medical Center, University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands

- Louis D. van Harten
- Amsterdam University Medical Center, University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands

- Ecem Sogancioglu
- Radboud University Medical Center, Department of Medical Imaging, Nijmegen, The Netherlands

- Friso G. Heslinga
- Eindhoven University of Technology, Department of Biomedical Engineering, Eindhoven, The Netherlands

- Mitko Veta
- Eindhoven University of Technology, Department of Biomedical Engineering, Eindhoven, The Netherlands

- Bram van Ginneken
- Radboud University Medical Center, Department of Medical Imaging, Nijmegen, The Netherlands

- Ivana Išgum
- Amsterdam University Medical Center, University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands
- Amsterdam University Medical Center, University of Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands
- Amsterdam University Medical Center, University of Amsterdam, Amsterdam Cardiovascular Sciences, Heart Failure & Arrhythmias, Amsterdam, The Netherlands
- University of Amsterdam, Informatics Institute, Amsterdam, The Netherlands