1
Zhou H, Hu B, Yi N, Li Q, Ergu D, Liu F. Balancing High-performance and Lightweight: HL-UNet for 3D Cardiac Medical Image Segmentation. Acad Radiol 2024; 31:4340-4351. [PMID: 38902109] [DOI: 10.1016/j.acra.2024.06.008]
Abstract
RATIONALE AND OBJECTIVES: Cardiac magnetic resonance imaging is a crucial tool for analyzing, diagnosing, and formulating treatment plans for cardiovascular diseases. Very little research has focused on balancing cardiac segmentation performance against lightweight design. Although numerous efficient image segmentation algorithms exist, they primarily rely on complex and computationally intensive network models, making them difficult to implement on resource-constrained medical devices. Furthermore, simplified models designed to meet device lightweighting requirements may be limited in how well they understand and use both global and local information for cardiac segmentation.
MATERIALS AND METHODS: We propose HL-UNet, a novel high-performance, lightweight 3D medical image segmentation network for cardiac image segmentation. Specifically, HL-UNet introduces a residual-enhanced adaptive attention (REAA) module that combines residual-enhanced connectivity with an adaptive attention mechanism to efficiently capture key features of input images and optimize their representation, and integrates the Visual Mamba (VSS) module to further enhance performance.
RESULTS: Compared to large-scale models such as TransUNet, HL-UNet increased the Dice scores of the right ventricular cavity (RV), left ventricular myocardium (MYO), and left ventricular cavity (LV), the key indicators of cardiac image segmentation, by 1.61%, 5.03%, and 0.19%, respectively, while reducing parameters by 41.3 M and FLOPs by 31.05 G. Compared to lightweight models such as MISSFormer, HL-UNet improves the Dice scores of RV, MYO, and LV by 4.11%, 3.82%, and 4.33%, respectively, at a comparable or even lower parameter count and computational complexity.
CONCLUSION: The proposed HL-UNet captures local details and edge information in images while remaining lightweight. Experimental results show that, compared with large-scale models, HL-UNet significantly reduces parameter count and computational complexity while maintaining performance, thereby increasing frames per second (FPS). Compared to lightweight models, HL-UNet shows substantial improvements across key metrics at a comparable or even lower parameter count and computational complexity.
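The abstract describes the REAA module only at a high level. As a rough illustration of the general pattern it names (a residual connection wrapped around a 3D convolutional branch that is re-weighted by an adaptive channel-attention gate), here is a minimal PyTorch sketch; class and parameter names are hypothetical, and this is not the authors' implementation.

```python
import torch
import torch.nn as nn

class REAABlock3D(nn.Module):
    """Illustrative residual-enhanced adaptive attention block (not the authors' code).

    A 3D convolutional branch is modulated by a channel-attention gate and
    added back to the input through a residual connection.
    """
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.InstanceNorm3d(channels),
            nn.ReLU(inplace=True),
        )
        # Adaptive (squeeze-and-excitation style) channel attention.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),
            nn.Conv3d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.conv(x)
        y = y * self.gate(y)   # adaptive re-weighting of channels
        return x + y           # residual-enhanced connection


if __name__ == "__main__":
    block = REAABlock3D(channels=32)
    vol = torch.randn(1, 32, 16, 64, 64)   # (batch, channels, D, H, W)
    print(block(vol).shape)                # torch.Size([1, 32, 16, 64, 64])
```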
Affiliation(s)
- Hai Zhou
- College of Electronic and Information, Southwest Minzu University, Chengdu 610225, China; Key Laboratory of Electronic Information Engineering, Southwest Minzu University, Chengdu 610225, China
- Binbin Hu
- College of Electronic and Information, Southwest Minzu University, Chengdu 610225, China; Key Laboratory of Electronic Information Engineering, Southwest Minzu University, Chengdu 610225, China
- Nengmin Yi
- College of Electronic and Information, Southwest Minzu University, Chengdu 610225, China; Key Laboratory of Electronic Information Engineering, Southwest Minzu University, Chengdu 610225, China
- Qingtai Li
- College of Electronic and Information, Southwest Minzu University, Chengdu 610225, China; Key Laboratory of Electronic Information Engineering, Southwest Minzu University, Chengdu 610225, China
- Daji Ergu
- College of Electronic and Information, Southwest Minzu University, Chengdu 610225, China; Key Laboratory of Electronic Information Engineering, Southwest Minzu University, Chengdu 610225, China
- Fangyao Liu
- College of Electronic and Information, Southwest Minzu University, Chengdu 610225, China; Key Laboratory of Electronic Information Engineering, Southwest Minzu University, Chengdu 610225, China.
2
Lawal M, Yi D. Polar contrast attention and skip cross-channel aggregation for efficient learning in U-Net. Comput Biol Med 2024; 181:109047. [PMID: 39182369] [DOI: 10.1016/j.compbiomed.2024.109047]
Abstract
The performance of existing lesion semantic segmentation models has improved steadily with the introduction of mechanisms such as attention, skip connections, and deep supervision. However, these advancements often come at the expense of computational requirements, necessitating powerful graphics processing units with substantial video memory. Consequently, some models perform poorly, or not at all, on more affordable edge devices such as smartphones and other point-of-care devices. To tackle this challenge, our paper introduces a lesion segmentation model with a low parameter count and minimal operations. The model incorporates polar transformations to simplify images, facilitating faster training and improved performance. We leverage the characteristics of polar images by directing the model's focus to the areas most likely to contain segmentation information through a learning-efficient polar-based contrast attention (PCA). This design uses Hadamard products to implement a lightweight attention mechanism without significantly increasing model parameters or complexity. Furthermore, we present a novel skip cross-channel aggregation (SC2A) approach for sharing cross-channel corrections, introducing Gaussian depthwise convolution to enhance nonlinearity. Extensive experiments on the ISIC 2018 and Kvasir datasets demonstrate that our model surpasses state-of-the-art models while using only about 25K parameters. The proposed model also generalizes well to cross-domain data, as confirmed through experiments on the PH2 and CVC-Polyp datasets. In addition, we evaluate the model in a mobile setting against other lightweight models; notably, it outperforms other advanced models in IoU, Dice score, and running time.
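As a rough sketch of the lightweight, Hadamard-product attention pattern the abstract describes (not the published PCA or SC2A modules), a per-pixel gate can be computed with a 1x1 convolution and multiplied element-wise into the feature map, adding only a few hundred parameters. Names and sizes below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HadamardAttention(nn.Module):
    """Illustrative element-wise (Hadamard-product) attention gate.

    A cheap 1x1 convolution produces a gate that is multiplied into the
    feature map; this shows the general lightweight-attention pattern,
    not the published PCA module.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)   # Hadamard product: element-wise re-weighting


if __name__ == "__main__":
    att = HadamardAttention(channels=16)
    feat = torch.randn(1, 16, 128, 128)
    out = att(feat)
    print(out.shape, sum(p.numel() for p in att.parameters()))  # ~256 parameters
```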
Affiliation(s)
- Mohammed Lawal
- Department of Computing Science, University of Aberdeen, Aberdeen, AB24 3UE, United Kingdom
- Dewei Yi
- Department of Computing Science, University of Aberdeen, Aberdeen, AB24 3UE, United Kingdom.
3
Gupta U, Paluru N, Nankani D, Kulkarni K, Awasthi N. A comprehensive review on efficient artificial intelligence models for classification of abnormal cardiac rhythms using electrocardiograms. Heliyon 2024; 10:e26787. [PMID: 38562492] [PMCID: PMC10982903] [DOI: 10.1016/j.heliyon.2024.e26787]
Abstract
Deep learning has made many advances in data classification using electrocardiogram (ECG) waveforms. Over the past decade, data science research has focused on developing artificial intelligence (AI) based models that can analyze ECG waveforms to identify and classify abnormal cardiac rhythms accurately. However, the primary drawback of current AI models is that most are heavy, computationally intensive, and too costly for real-time implementation. In this review, we first discuss the current state-of-the-art AI models used for ECG-based cardiac rhythm classification. Next, we present some of the upcoming modeling methodologies that have the potential to enable real-time, AI-based heart rhythm diagnosis. These models hold significant promise of being lightweight and computationally efficient without compromising accuracy. Contemporary models predominantly use the 12-lead ECG for cardiac rhythm classification and cardiovascular status prediction, which increases the computational burden and makes real-time implementation challenging. We also summarize research studies evaluating the potential of efficient data setups to reduce the number of ECG leads without affecting classification accuracy. Lastly, we present future perspectives on AI's utility in precision medicine, where it provides opportunities for accurate prediction and diagnosis of cardiovascular status in patients.
Affiliation(s)
- Utkarsh Gupta
- Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, 560012, India
- Naveen Paluru
- Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, 560012, India
- Deepankar Nankani
- Department of Computer Science and Engineering, Indian Institute of Technology, Guwahati, Assam, 781039, India
- Kanchan Kulkarni
- IHU-LIRYC, Heart Rhythm Disease Institute, Fondation Bordeaux Université, Pessac, Bordeaux, F-33000, France
- University of Bordeaux, INSERM, Centre de recherche Cardio-Thoracique de Bordeaux, U1045, Bordeaux, F-33000, France
- Navchetan Awasthi
- Faculty of Science, Mathematics and Computer Science, Informatics Institute, University of Amsterdam, Amsterdam, 1090 GH, the Netherlands
- Department of Biomedical Engineering and Physics, Amsterdam UMC, Amsterdam, 1081 HV, the Netherlands
4
Zhang Y, Chen Z, Yang X. Light-M: An efficient lightweight medical image segmentation framework for resource-constrained IoMT. Comput Biol Med 2024; 170:108088. [PMID: 38320339] [DOI: 10.1016/j.compbiomed.2024.108088]
Abstract
The Internet of Medical Things (IoMT) is being incorporated into current healthcare systems. This technology intends to connect patients, IoMT devices, and hospitals over mobile networks, allowing for more secure, quick, and convenient health monitoring and intelligent healthcare services. However, existing intelligent healthcare applications typically rely on large-scale AI models, while standard IoMT devices have significant resource constraints. To ease this tension, we propose a knowledge distillation (KD)-based IoMT end-edge-cloud orchestrated architecture for medical image segmentation tasks, called Light-M, aiming to deploy a lightweight medical model on resource-constrained IoMT devices. Specifically, Light-M trains a large teacher model on the cloud server and performs computation on local nodes through a student model that imitates the teacher via knowledge distillation. Light-M contains two KD strategies for the medical image segmentation task: (1) active exploration and passive transfer (AEPT) and (2) self-attention-based inter-class feature variation (AIFV) distillation. AEPT encourages the student model to learn undiscovered knowledge/features of the teacher model without additional feature layers, aiming to explore new features and outperform the teacher. To improve the student's ability to distinguish between classes, the student learns the self-attention-based feature variation between classes. Since the proposed AEPT and AIFV appear only during training, our framework adds no computational burden to the student model when the segmentation task is deployed. Extensive experiments on cardiac images and public real-scene datasets demonstrate that combining the two knowledge distillation strategies improves the student model's learned representations and outperforms state-of-the-art methods. Moreover, when deployed on an IoT device, the distilled student model takes only 29.6 ms per sample at inference.
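The AEPT and AIFV strategies are specific to the paper, but the underlying teacher-student setup follows standard knowledge distillation: the student is trained with a supervised loss plus a term that pulls its softened per-pixel predictions toward the teacher's. Below is a minimal, generic sketch of such a distillation loss for segmentation logits (illustrative only, not Light-M's losses; temperature and weighting are assumed values).

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    """Generic teacher-student distillation loss for segmentation logits.

    Combines a supervised cross-entropy term with a KL term that pulls the
    student's softened per-pixel distribution toward the teacher's.
    logits: (B, C, H, W), labels: (B, H, W) integer class map.
    """
    ce = F.cross_entropy(student_logits, labels)
    t = temperature
    kd = F.kl_div(
        F.log_softmax(student_logits / t, dim=1),
        F.softmax(teacher_logits / t, dim=1),
        reduction="batchmean",
    ) * (t * t)
    return alpha * ce + (1.0 - alpha) * kd


if __name__ == "__main__":
    s = torch.randn(2, 4, 64, 64, requires_grad=True)   # student logits
    te = torch.randn(2, 4, 64, 64)                      # teacher logits (frozen)
    y = torch.randint(0, 4, (2, 64, 64))
    loss = distillation_loss(s, te, y)
    loss.backward()
    print(float(loss))
```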
Affiliation(s)
- Yifan Zhang
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China
- Zhuangzhuang Chen
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China
- Xuan Yang
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China.
5
Su C, Zhou Y, Ma J, Chi H, Jing X, Jiao J, Yan Q. JANet: A joint attention network for balancing accuracy and speed in left ventricular ultrasound video segmentation. Comput Biol Med 2024; 169:107856. [PMID: 38154159] [DOI: 10.1016/j.compbiomed.2023.107856]
Abstract
Multiple cardiac diseases are closely associated with functional parameters of the left ventricle, but quantifying these parameters still requires manual involvement, a time-consuming and poorly reproducible task. We develop a joint attention network (JANet) in two versions (V1 and V2) that segment the left ventricular region in echocardiograms to assist physicians in diagnosis. V1 is a smaller model with a size of 56.3 MB, and V2 is more accurate. The proposed JANet V1 and V2 achieve mean Dice scores (DSC) of 93.59 and 93.69, respectively, outperforming state-of-the-art models. We grade 1264 patients with 87.24/87.50 (V1/V2) accuracy under the 2-level classification criteria and 83.62/84.18 (V1/V2) under the 5-level criteria. The consistency analysis shows that the proposed method is comparable to clinicians.
Affiliation(s)
- Chenkai Su
- School of Integrated Circuits, Shandong University, Jinan, 250101, China
- Yuxiang Zhou
- School of Integrated Circuits, Shandong University, Jinan, 250101, China
- Jinlian Ma
- School of Integrated Circuits, Shandong University, Jinan, 250101, China; Shenzhen Research Institute of Shandong University, A301 Virtual University Park in South District of Shenzhen, China.
- Haoyu Chi
- School of Integrated Circuits, Shandong University, Jinan, 250101, China
- Xin Jing
- School of Integrated Circuits, Shandong University, Jinan, 250101, China
- Junyan Jiao
- School of Integrated Circuits, Shandong University, Jinan, 250101, China
- Qiqi Yan
- School of Integrated Circuits, Shandong University, Jinan, 250101, China
6
Shoaib MA, Chuah JH, Ali R, Dhanalakshmi S, Hum YC, Khalil A, Lai KW. Fully Automatic Left Ventricle Segmentation Using Bilateral Lightweight Deep Neural Network. Life (Basel) 2023; 13:124. [PMID: 36676073] [PMCID: PMC9864753] [DOI: 10.3390/life13010124]
Abstract
The segmentation of the left ventricle (LV) is one of the fundamental procedures required to obtain quantitative measures of the heart, such as its volume, area, and ejection fraction. In clinical practice, LV delineation is still often performed semi-automatically, leaving it open to operator subjectivity. Automatic LV segmentation from echocardiography images is challenging due to poorly defined boundaries and operator dependency. Recent research has demonstrated that deep learning can perform the segmentation automatically; however, well-known state-of-the-art segmentation models still fall short in accuracy and speed. This study aims to develop a single-stage lightweight segmentation model that precisely and rapidly segments the LV from 2D echocardiography images. A backbone network is used to acquire both low-level and high-level features. Two parallel blocks, the spatial feature unit and the channel feature unit, enhance and refine these features, and an integration unit merges the refined features to segment the LV. The performance of the model and the time taken to segment the LV are compared with established segmentation models: DeepLab, FCN, and Mask R-CNN. The model achieved the highest Dice similarity index (0.9446), intersection over union (0.8445), and accuracy (0.9742). The evaluation metrics and processing time show that the proposed model not only provides superior quantitative results but also trains and segments the LV in less time, indicating improved performance over competing segmentation models.
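The abstract outlines the architecture (backbone features refined by parallel spatial and channel units, then merged by an integration unit) without implementation detail. The following is a minimal sketch of that general pattern under common assumptions (sigmoid gating, squeeze-and-excitation style channel branch); names are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class BilateralRefinement(nn.Module):
    """Illustrative parallel spatial/channel refinement of backbone features.

    One branch re-weights locations (spatial), the other re-weights channels,
    and a 1x1 convolution fuses the two refined maps. Not the published
    spatial/channel feature unit implementation.
    """
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Spatial feature unit: per-pixel gate from a single-channel map.
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )
        # Channel feature unit: squeeze-and-excitation style channel gate.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Integration unit: fuse the two refined feature maps.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = x * self.spatial(x)
        c = x * self.channel(x)
        return self.fuse(torch.cat([s, c], dim=1))


if __name__ == "__main__":
    m = BilateralRefinement(channels=64)
    print(m(torch.randn(1, 64, 56, 56)).shape)   # torch.Size([1, 64, 56, 56])
```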
Affiliation(s)
- Muhammad Ali Shoaib
- Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Faculty of Information and Communication Technology, BUITEMS, Quetta 87300, Pakistan
- Joon Huang Chuah
- Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Raza Ali
- Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Faculty of Information and Communication Technology, BUITEMS, Quetta 87300, Pakistan
- Samiappan Dhanalakshmi
- Department of Electronics and Communication Engineering, SRM Institute of Science and Technology, Kattankulathur 603203, India
- Yan Chai Hum
- Department of Mechatronics and Biomedical Engineering (DMBE), Lee Kong Chian Faculty of Engineering and Science (LKC FES), Universiti Tunku Abdul Rahman (UTAR), Jalan Sungai Long, Bandar Sungai Long, Cheras, Kajang 43000, Malaysia
- Azira Khalil
- Faculty of Science and Technology, Universiti Sains Islam Malaysia (USIM), Nilai 71800, Malaysia
- Khin Wee Lai
- Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Correspondence:
7
Bandwidth Improvement in Ultrasound Image Reconstruction Using Deep Learning Techniques. Healthcare (Basel) 2022; 11:123. [PMID: 36611583] [PMCID: PMC9819580] [DOI: 10.3390/healthcare11010123]
Abstract
Ultrasound (US) imaging is a medical imaging modality that uses the reflection of sound in the range of 2-18 MHz to image internal body structures. In US, the frequency bandwidth (BW) is directly associated with image resolution. BW is a property of the transducer, and greater bandwidth comes at a higher cost. Thus, methods that can transform strongly band-limited ultrasound data into broadband data are essential. In this work, we propose a deep learning (DL) technique to improve image quality for a given bandwidth by learning features provided by broadband data of the same field of view. The performance of several DL architectures and conventional state-of-the-art techniques for image quality improvement and artifact removal has been compared on in vitro US datasets. Two training losses were applied to three architectures: a super-resolution convolutional neural network (SRCNN), U-Net, and a residual encoder-decoder network (REDNet). The models were trained to transform low-bandwidth image reconstructions into high-bandwidth image reconstructions, reduce artifacts, and make the reconstructions visually more appealing. Experiments were performed at 20%, 40%, and 60% fractional bandwidth on the original images and showed improvements as high as 45.5% in RMSE and 3.85 dB in PSNR on datasets with a 20% bandwidth limitation.
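As a concrete anchor for the mapping the abstract describes (band-limited reconstruction in, broadband-looking reconstruction out) and for the reported RMSE/PSNR metrics, here is a minimal SRCNN-style sketch together with the standard metric definitions. The network layout and sizes are illustrative assumptions, not the paper's trained models.

```python
import torch
import torch.nn as nn

class SRCNNLike(nn.Module):
    """Minimal SRCNN-style network: maps a band-limited reconstruction toward
    a broadband-looking one (illustrative sketch, not the paper's models)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=9, padding=4), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1),           nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=5, padding=2),
        )

    def forward(self, x):
        return self.net(x)


def rmse_and_psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0):
    """RMSE and PSNR (dB) for images scaled to [0, max_val]."""
    mse = torch.mean((pred - target) ** 2)
    rmse = torch.sqrt(mse)
    psnr = 10.0 * torch.log10(max_val ** 2 / mse)
    return rmse.item(), psnr.item()


if __name__ == "__main__":
    model = SRCNNLike()
    low_bw = torch.rand(1, 1, 128, 128)     # band-limited input image
    broadband = torch.rand(1, 1, 128, 128)  # broadband target of the same view
    pred = model(low_bw)
    print(rmse_and_psnr(pred.clamp(0, 1), broadband))
```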