1. Zhang S, Cui Y, Xu D, Lin Y. A collaborative inference strategy for medical image diagnosis in mobile edge computing environment. PeerJ Comput Sci 2025; 11:e2708. PMID: 40134868; PMCID: PMC11935760; DOI: 10.7717/peerj-cs.2708.
Abstract
The popularity and convenience of mobile medical image analysis and diagnosis in mobile edge computing (MEC) environments have greatly improved the efficiency and quality of healthcare services, and such analysis relies on deep neural networks (DNNs). However, DNNs face performance and energy constraints when running on the mobile side and are limited by communication costs and privacy issues when running on the edge side, while previous edge-end collaborative approaches have shown unstable performance and low search efficiency when exploring classification strategies. To address these issues, we propose a DNN edge-optimized collaborative inference strategy (MOCI) for medical image diagnosis, which optimizes data transfer and computation allocation by combining compression techniques with multi-agent reinforcement learning (MARL). The MOCI strategy first uses coding- and quantization-based compression to reduce the redundancy of image data transmitted to the edge, and then dynamically partitions the DNN model through MARL and executes it collaboratively between the edge and the mobile device. To improve policy stability and adaptability, MOCI introduces the optimal transport (Wasserstein) distance to optimize the policy update process and uses a long short-term memory (LSTM) network to improve the model's adaptability to dynamic task complexity. The experimental results show that the MOCI strategy can effectively solve the collaborative inference task of medical image diagnosis and significantly reduce latency and energy consumption with less than a 2% loss in classification accuracy, achieving a maximum reduction of 38.5% in processing latency and 71% in energy consumption compared with other inference strategies. In real-world MEC scenarios, MOCI has a wide range of potential applications and can effectively promote the development and adoption of intelligent healthcare.
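To make the edge-mobile split concrete, the sketch below shows a brute-force way to pick a DNN split point from assumed per-layer latency, energy, and feature-map-size profiles; MOCI itself learns this decision with compression and multi-agent reinforcement learning rather than exhaustive search, so treat this purely as an illustration of the trade-off being optimized.
```python
# Illustrative sketch only: exhaustive search over DNN split points using
# assumed per-layer cost profiles; MOCI instead learns the split decision
# with multi-agent reinforcement learning.

def choose_split_point(mobile_latency, edge_latency, transfer_bytes,
                       bandwidth_bps, tx_energy_per_byte, mobile_energy,
                       latency_weight=0.5):
    """Pick the layer index after which to offload to the edge.

    mobile_latency[i] / edge_latency[i] / mobile_energy[i]: assumed per-layer
    cost profiles for layer i (i = 0..n-1).
    transfer_bytes[s]: bytes uplinked if we split after s local layers
    (transfer_bytes[0] is the raw input, transfer_bytes[n] is 0).
    """
    n = len(mobile_latency)
    best_split, best_cost = 0, float("inf")
    for s in range(n + 1):                      # s local layers, n - s edge layers
        t_total = (sum(mobile_latency[:s])
                   + transfer_bytes[s] / bandwidth_bps
                   + sum(edge_latency[s:]))
        e_total = sum(mobile_energy[:s]) + transfer_bytes[s] * tx_energy_per_byte
        cost = latency_weight * t_total + (1 - latency_weight) * e_total
        if cost < best_cost:
            best_split, best_cost = s, cost
    return best_split, best_cost
```
Profiling-based searches like this scale poorly as network conditions and task complexity vary, which is the gap a learned (MARL) policy is meant to close.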
Affiliation(s)
- Shiqian Zhang
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, Henan, China
- Collaborative Innovation Center for Internet Healthcare, Zhengzhou University, Zhengzhou, Henan, China
- Yong Cui
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, Henan, China
- Dandan Xu
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, Henan, China
- Collaborative Innovation Center for Internet Healthcare, Zhengzhou University, Zhengzhou, Henan, China
- Yusong Lin
- Collaborative Innovation Center for Internet Healthcare, Zhengzhou University, Zhengzhou, Henan, China
- School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou, Henan, China
2. Shan C, Zhang Y, Liu C, Jin Z, Cheng H, Chen Y, Yao J, Luo S. LSMD: Long-Short Memory-Based Detection Network for Carotid Artery Detection in B-Mode Ultrasound Video Streams. IEEE Trans Ultrason Ferroelectr Freq Control 2024; 71:1464-1477. PMID: 39514357; DOI: 10.1109/tuffc.2024.3494019.
Abstract
Carotid atherosclerotic plaques are a major complication associated with type II diabetes, and carotid ultrasound is commonly used for diagnosing carotid vascular disease. In primary hospitals, less experienced ultrasound physicians often struggle to consistently capture standard carotid images and identify plaques. To address this issue, we propose a novel approach, the long-short memory-based detection (LSMD) network, for carotid artery detection in ultrasound video streams, facilitating the identification and localization of critical anatomical structures and plaques. This approach models short- and long-distance spatiotemporal features through short-term temporal aggregation (STA) and long-term temporal aggregation (LTA) modules, effectively expanding the temporal receptive field with minimal delay and enhancing the detection efficiency of carotid anatomy and plaques. Specifically, we introduce memory buffers with a dynamic updating strategy to ensure extensive temporal receptive field coverage while minimizing memory and computation costs. The proposed model was trained on 80 carotid ultrasound videos and evaluated on 50, with all videos annotated by physicians for carotid anatomies and plaques. The trained LSMD was evaluated for performance on the validation and test sets using the single-frame image-based single shot multibox detector (SSD) algorithm as a baseline. The results show that the precision, recall, average precision (AP) at ( ), and mean AP (mAP) are 6.83%, 12.29%, 11.23%, and 13.21% higher than the baseline ( ), respectively, while the model's inference latency reaches 6.97 ms on a desktop-level GPU (NVIDIA RTX 3090Ti) and 29.69 ms on an edge computing device (Jetson Orin Nano). These findings demonstrate that LSMD can accurately localize carotid anatomy and plaques with real-time inference, indicating its potential for enhancing diagnostic accuracy in clinical practice.
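As a simple illustration of the memory-buffer idea behind the short- and long-term temporal aggregation described above, the sketch below keeps a dense short-term buffer and a stride-subsampled long-term buffer of per-frame features; the buffer sizes and the stride rule are assumptions rather than the paper's exact design.
```python
# Minimal sketch of a frame-feature memory buffer with a dynamic update rule.
from collections import deque

class TemporalMemory:
    def __init__(self, short_len=4, long_len=16, long_stride=8):
        self.short = deque(maxlen=short_len)   # recent frames, dense coverage
        self.long = deque(maxlen=long_len)     # sparse, wide temporal receptive field
        self.long_stride = long_stride
        self._t = 0

    def update(self, feature):
        """Push a per-frame feature map; keep long-term memory sparse."""
        self.short.append(feature)
        if self._t % self.long_stride == 0:    # subsample to bound memory/compute
            self.long.append(feature)
        self._t += 1

    def context(self):
        """Features a detector head could aggregate with the current frame."""
        return list(self.short), list(self.long)
```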
3. Awais M, Al Taie M, O’Connor CS, Castelo AH, Acidi B, Tran Cao HS, Brock KK. Enhancing Surgical Guidance: Deep Learning-Based Liver Vessel Segmentation in Real-Time Ultrasound Video Frames. Cancers (Basel) 2024; 16:3674. PMID: 39518111; PMCID: PMC11545685; DOI: 10.3390/cancers16213674.
Abstract
BACKGROUND/OBJECTIVES In the field of surgical medicine, the planning and execution of liver resection procedures present formidable challenges, primarily attributable to the intricate and highly individualized nature of liver vascular anatomy. In the current surgical milieu, intraoperative ultrasonography (IOUS) has become indispensable; however, the interpretability of traditional 2D ultrasound imaging is hindered by noise and speckle artifacts. Accurate identification of critical structures for preservation during hepatectomy requires advanced surgical skills. METHODS An AI-based model that can help detect and recognize vessels, including the inferior vena cava (IVC); the right (RHV), middle (MHV), and left (LHV) hepatic veins; and the portal vein (PV) and its major first- and second-order branches, the left portal vein (LPV), right portal vein (RPV), and right anterior (RAPV) and posterior (RPPV) portal veins, for real-time IOUS navigation can be of immense value in liver surgery. This research aims to advance the capabilities of IOUS-guided interventions by applying an innovative AI-based approach, the "2D-weighted U-Net model", for the segmentation of multiple blood vessels in real-time IOUS video frames. RESULTS Our proposed deep learning (DL) model achieved a mean Dice score of 0.92 for IVC, 0.90 for RHV, 0.89 for MHV, 0.86 for LHV, 0.95 for PV, 0.93 for LPV, 0.84 for RPV, 0.85 for RAPV, and 0.96 for RPPV. CONCLUSION In future work, this research will be extended to real-time multi-label segmentation of the extended vasculature of the liver, followed by translation of our model into the surgical suite.
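For reference, the Dice score quoted in the results can be computed per vessel class as in the following sketch; the class-index mapping is hypothetical and the snippet is only a restatement of the standard metric, not the authors' evaluation code.
```python
# Minimal sketch: per-class Dice for a multi-label vessel segmentation mask.
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice = 2*|P & G| / (|P| + |G|) for binary masks of the same shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def per_vessel_dice(pred_labels, gt_labels, class_ids):
    """Dice per vessel class from integer label maps (0 = background)."""
    return {c: dice_score(pred_labels == c, gt_labels == c) for c in class_ids}
```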
Affiliation(s)
- Muhammad Awais
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Mais Al Taie
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Caleb S. O’Connor
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Austin H. Castelo
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Belkacem Acidi
- Department of Surgical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Hop S. Tran Cao
- Department of Surgical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Kristy K. Brock
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
4. Xu M, Ma Q, Zhang H, Kong D, Zeng T. MEF-UNet: An end-to-end ultrasound image segmentation algorithm based on multi-scale feature extraction and fusion. Comput Med Imaging Graph 2024; 114:102370. PMID: 38513396; DOI: 10.1016/j.compmedimag.2024.102370.
Abstract
Ultrasound image segmentation is a challenging task due to the complexity of lesion types, fuzzy boundaries, and low-contrast images, along with the presence of noise and artifacts. To address these issues, we propose an end-to-end multi-scale feature extraction and fusion network (MEF-UNet) for the automatic segmentation of ultrasound images. Specifically, we first design a selective feature extraction encoder, comprising a detail extraction stage and a structure extraction stage, to precisely capture the edge details and overall shape features of the lesions. To enhance the representation capacity of contextual information, we develop a context information storage module in the skip connections, responsible for integrating information from the feature maps of two adjacent layers. In addition, we design a multi-scale feature fusion module in the decoder to merge feature maps with different scales. Experimental results indicate that MEF-UNet significantly improves segmentation results in both quantitative analysis and visual quality.
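A minimal sketch of the general multi-scale fusion idea is given below: decoder feature maps at different resolutions are resized to a common scale, concatenated, and mixed by a 1x1 convolution. This is an illustration under assumed channel counts, not the exact MEF-UNet fusion module.
```python
# Illustrative multi-scale feature fusion block (assumed channel counts).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusion(nn.Module):
    def __init__(self, in_channels_list, out_channels):
        super().__init__()
        total = sum(in_channels_list)
        self.mix = nn.Sequential(
            nn.Conv2d(total, out_channels, kernel_size=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, features):
        # Resize every scale to the largest (finest) feature map, then fuse.
        target_size = features[0].shape[-2:]
        upsampled = [features[0]] + [
            F.interpolate(f, size=target_size, mode="bilinear", align_corners=False)
            for f in features[1:]
        ]
        return self.mix(torch.cat(upsampled, dim=1))
```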
Affiliation(s)
- Mengqi Xu
- School of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing, Jiangsu, 210044, China
- Qianting Ma
- School of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing, Jiangsu, 210044, China
- Huajie Zhang
- School of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing, Jiangsu, 210044, China
- Dexing Kong
- School of Mathematical Sciences, Zhejiang University, Hangzhou, Zhejiang, 310027, China
- Tieyong Zeng
- Department of Mathematics, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region of China
5. Cai S, Lin Y, Chen H, Huang Z, Zhou Y, Zheng Y. Automated analysis of pectoralis major thickness in pec-fly exercises: evolving from manual measurement to deep learning techniques. Vis Comput Ind Biomed Art 2024; 7:8. PMID: 38625580; PMCID: PMC11021386; DOI: 10.1186/s42492-024-00159-6.
Abstract
This study addresses a limitation of prior research on pectoralis major (PMaj) thickness changes during the pectoralis fly exercise using a wearable ultrasound imaging setup. Previous studies relied on manual measurement and subjective evaluation, which limits automation and widespread application. We therefore employed a deep learning model for image segmentation and automated measurement, and studied the additional quantitative information it could provide. Our results revealed increased PMaj thickness changes in the coronal plane within the probe detection region when real-time ultrasound imaging (RUSI) visual biofeedback was incorporated, regardless of load intensity (50% or 80% of one-repetition maximum). Additionally, participants showed uniform thickness changes in the PMaj in response to enhanced RUSI biofeedback. Notably, the differences in PMaj thickness changes between load intensities were reduced by RUSI biofeedback, suggesting altered muscle activation strategies. We identified the optimal measurement location for the maximal PMaj thickness as being close to the rib end and emphasized the lightweight applicability of our model for fitness training and muscle assessment. Further studies can refine load intensities, investigate diverse parameters, and employ different network models to enhance accuracy. This study contributes to our understanding of muscle physiology and the effects of exercise training.
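As an illustration of the automated-measurement step, the sketch below derives a per-column thickness profile from a binary segmentation mask using an assumed pixel spacing; the actual measurement protocol in the study may differ.
```python
# Minimal sketch: muscle thickness from a binary segmentation mask.
import numpy as np

def muscle_thickness_profile(mask, pixel_spacing_mm=0.1):
    """Per-column thickness (mm) of a binary mask (rows = depth, cols = width)."""
    thickness = np.zeros(mask.shape[1])
    for col in range(mask.shape[1]):
        rows = np.flatnonzero(mask[:, col])
        if rows.size:
            thickness[col] = (rows.max() - rows.min() + 1) * pixel_spacing_mm
    return thickness

def max_thickness(mask, pixel_spacing_mm=0.1):
    """Approximate the maximal thickness as the largest per-column value."""
    return muscle_thickness_profile(mask, pixel_spacing_mm).max()
```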
Affiliation(s)
- Shangyu Cai
- School of Biomedical Engineering, Medical School, Shenzhen University, Shenzhen, 518073, China
- Yongsheng Lin
- School of Biomedical Engineering, Medical School, Shenzhen University, Shenzhen, 518073, China
- Haoxin Chen
- School of Biomedical Engineering, Medical School, Shenzhen University, Shenzhen, 518073, China
- Zihao Huang
- Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hong Kong, 999077, China
- Yongjin Zhou
- School of Biomedical Engineering, Medical School, Shenzhen University, Shenzhen, 518073, China
- Yongping Zheng
- Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hong Kong, 999077, China
6. GLAN: GAN Assisted Lightweight Attention Network for Biomedical Imaging Based Diagnostics. Cognit Comput 2023. DOI: 10.1007/s12559-023-10131-w.
7. Ansari MY, Yang Y, Meher PK, Dakua SP. Dense-PSP-UNet: A neural network for fast inference liver ultrasound segmentation. Comput Biol Med 2023; 153:106478. PMID: 36603437; DOI: 10.1016/j.compbiomed.2022.106478.
Abstract
Liver ultrasound (US), or sonography, is widely used because of its real-time output, low cost, ease of use, portability, and non-invasive nature. Segmentation of real-time liver US is essential for diagnosing and analyzing liver conditions (e.g., hepatocellular carcinoma (HCC)) and for assisting surgeons/radiologists in therapeutic procedures. In this paper, we propose a method using a modified Pyramid Scene Parsing (PSP) module in tuned neural network backbones to achieve real-time segmentation without compromising segmentation accuracy. Considering the widespread noise in US data and its impact on outcomes, we study the impact of pre-processing and the influence of loss functions on segmentation performance. We tested our method after annotating a publicly available US dataset containing 2400 images of 8 healthy volunteers (a link to the annotated dataset is provided); the results show that the Dense-PSP-UNet model achieves a high Dice coefficient of 0.913±0.024 while delivering real-time performance of 37 frames per second (FPS).
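The following sketch shows a generic Pyramid Scene Parsing (PSP) pooling block of the kind the Dense-PSP-UNet builds on; the bin sizes and channel arithmetic are assumptions and do not reproduce the paper's modified module.
```python
# Illustrative PSP-style pyramid pooling block (assumed bins and channels).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PSPModule(nn.Module):
    def __init__(self, in_channels, out_channels, bins=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(b),                       # pool to b x b bins
                nn.Conv2d(in_channels, in_channels // len(bins), 1, bias=False),
                nn.ReLU(inplace=True),
            )
            for b in bins
        ])
        self.bottleneck = nn.Conv2d(in_channels * 2, out_channels, 3, padding=1)

    def forward(self, x):
        size = x.shape[-2:]
        pooled = [F.interpolate(stage(x), size=size, mode="bilinear",
                                align_corners=False) for stage in self.stages]
        return self.bottleneck(torch.cat([x] + pooled, dim=1))
```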
Affiliation(s)
| | - Yin Yang
- Hamad Bin Khalifa Uinversity, Doha, Qatar
| | | | | |
8. Computation and memory optimized spectral domain convolutional neural network for throughput and energy-efficient inference. Appl Intell 2023; 53:4499-4523. PMID: 35730044; PMCID: PMC9188280; DOI: 10.1007/s10489-022-03756-1.
Abstract
Conventional convolutional neural networks (CNNs) present a high computational workload and memory access cost (CMC). Spectral-domain CNNs (SpCNNs) offer a computationally efficient approach to CNN training and inference. This paper analytically investigates the CMC of SpCNNs and its contributing components, and then proposes a methodology to optimize CMC, under three strategies, to enhance inference performance. In this methodology, the output feature map (OFM) size, the OFM depth, or both are progressively reduced under an accuracy constraint to compute performance-optimized CNN inference. Before conducting training or testing, the methodology can provide designers with guidelines and preliminary insights regarding techniques for optimum performance, least degradation in accuracy, and a balanced performance-accuracy trade-off. This methodology was evaluated on the MNIST and Fashion-MNIST datasets using the LeNet-5 and AlexNet architectures. When compared to state-of-the-art SpCNN models, LeNet-5 achieves up to 4.2× (batch inference) and 4.1× (single-image inference) higher throughput and 10.5× (batch inference) and 4.2× (single-image inference) greater energy efficiency at a maximum loss of 3% in test accuracy. When compared to the baseline model used in this study, AlexNet delivers 11.6× (batch inference) and 5× (single-image inference) higher throughput and 25× (batch inference) and 8.8× (single-image inference) more energy-efficient inference with just a 4.4% reduction in accuracy.
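The progressive reduction under an accuracy constraint can be pictured with the sketch below, which shrinks an assumed OFM scale factor step by step until a user-supplied evaluation function reports too large an accuracy drop; it is a schematic of the search loop, not the paper's methodology.
```python
# Sketch of progressive OFM reduction under an accuracy constraint.
# `evaluate` and the reduction factors are hypothetical placeholders.

def progressive_ofm_reduction(config, evaluate, max_accuracy_drop=0.03,
                              step=0.9, min_scale=0.3):
    """Return the smallest OFM scale whose accuracy stays within the budget.

    config: dict with an 'ofm_scale' entry in (0, 1].
    evaluate(config) -> test accuracy for that configuration (user-supplied).
    """
    baseline = evaluate(config)
    best = dict(config)
    scale = config["ofm_scale"]
    while scale * step >= min_scale:
        scale *= step
        candidate = dict(config, ofm_scale=scale)
        if baseline - evaluate(candidate) <= max_accuracy_drop:
            best = candidate            # still within the accuracy constraint
        else:
            break                       # constraint violated; keep previous best
    return best
```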
9. Ansari MY, Yang Y, Balakrishnan S, Abinahed J, Al-Ansari A, Warfa M, Almokdad O, Barah A, Omer A, Singh AV, Meher PK, Bhadra J, Halabi O, Azampour MF, Navab N, Wendler T, Dakua SP. A lightweight neural network with multiscale feature enhancement for liver CT segmentation. Sci Rep 2022; 12:14153. PMID: 35986015; PMCID: PMC9391485; DOI: 10.1038/s41598-022-16828-6.
Abstract
Segmentation of abdominal Computed Tomography (CT) scans is essential for analyzing, diagnosing, and treating visceral organ diseases (e.g., hepatocellular carcinoma). This paper proposes a novel neural network (Res-PAC-UNet) that employs a fixed-width residual UNet backbone and Pyramid Atrous Convolutions, providing a low-disk-utilization method for precise liver CT segmentation. The proposed network is trained on the Medical Segmentation Decathlon dataset using a modified surface loss function. Additionally, we evaluate its quantitative and qualitative performance; the Res16-PAC-UNet achieves a Dice coefficient of 0.950 ± 0.019 with less than half a million parameters, while the Res32-PAC-UNet obtains a Dice coefficient of 0.958 ± 0.015 with an acceptable parameter count of approximately 1.2 million.
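A minimal sketch of a pyramid atrous convolution block, the building idea named in the architecture, is shown below; the dilation rates and channel split are illustrative assumptions rather than the Res-PAC-UNet configuration.
```python
# Illustrative pyramid atrous (dilated) convolution block.
import torch
import torch.nn as nn

class PyramidAtrousConv(nn.Module):
    def __init__(self, channels, dilations=(1, 2, 4, 8)):
        super().__init__()
        branch_ch = channels // len(dilations)
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, branch_ch, kernel_size=3,
                      padding=d, dilation=d, bias=False)
            for d in dilations
        ])
        self.fuse = nn.Sequential(nn.BatchNorm2d(branch_ch * len(dilations)),
                                  nn.ReLU(inplace=True))

    def forward(self, x):
        # Parallel dilated convolutions widen the receptive field cheaply.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```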
Collapse
Affiliation(s)
| | - Yin Yang
- Hamad Bin Khalifa University, Doha, Qatar
| | | | | | | | - Mohamed Warfa
- Wake Forest Baptist Medical Center, Winston-Salem, USA
| | | | - Ali Barah
- Hamad Medical Corporation, Doha, Qatar
| | | | | | | | | | | | | | | | | | | |
10. Dinsdale NK, Jenkinson M, Namburete AIL. STAMP: Simultaneous Training and Model Pruning for low data regimes in medical image segmentation. Med Image Anal 2022; 81:102583. PMID: 36037556; DOI: 10.1016/j.media.2022.102583.
Abstract
Acquisition of high-quality manual annotations is vital for the development of segmentation algorithms. However, creating them requires a substantial amount of expert time and knowledge. Large numbers of labels are required to train convolutional neural networks due to the vast number of parameters that must be learned during optimisation. Here, we develop the STAMP algorithm, which allows simultaneous training and pruning of a UNet architecture for medical image segmentation, with targeted channelwise dropout making the network robust to the pruning. We demonstrate the technique across segmentation tasks and imaging modalities. We then show that, through online pruning, we are able to train networks with much higher performance than the equivalent standard UNet models while reducing their size by more than 85% in terms of parameters. This has the potential to allow networks to be trained directly on datasets where very few labels are available.
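To illustrate the online-pruning idea in isolation, the sketch below removes the lowest-magnitude output channels of a convolution by L1 norm; STAMP's actual procedure couples pruning with targeted channelwise dropout and a training schedule, so this is only a simplified stand-in.
```python
# Minimal sketch of magnitude-based channel pruning, not the STAMP algorithm.
import torch
import torch.nn as nn

def prune_lowest_channels(conv: nn.Conv2d, keep_ratio=0.9):
    """Return a smaller Conv2d keeping the output channels with largest L1 norm."""
    with torch.no_grad():
        importance = conv.weight.abs().sum(dim=(1, 2, 3))   # one score per out-channel
        n_keep = max(1, int(conv.out_channels * keep_ratio))
        keep = torch.argsort(importance, descending=True)[:n_keep]
        pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                           stride=conv.stride, padding=conv.padding,
                           bias=conv.bias is not None)
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned, keep   # `keep` indices let downstream layers be re-indexed
```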
Affiliation(s)
- Nicola K Dinsdale
- Wellcome Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, UK; Oxford Machine Learning in NeuroImaging Lab (OMNI), Department of Computer Science, University of Oxford, UK
- Mark Jenkinson
- Wellcome Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, UK; Australian Institute for Machine Learning (AIML), School of Computer Science, University of Adelaide, Adelaide, Australia; South Australian Health and Medical Research Institute (SAHMRI), North Terrace, Adelaide, Australia
- Ana I L Namburete
- Oxford Machine Learning in NeuroImaging Lab (OMNI), Department of Computer Science, University of Oxford, UK
11.
Abstract
Medical imaging is considered one of the most important advances in the history of medicine and has become an essential part of the diagnosis and treatment of patients. Earlier prediction and treatment have been driving the acquisition of higher image resolutions as well as the fusion of different modalities, raising the need for sophisticated hardware and software systems for medical image registration, storage, analysis, and processing. In this scenario, and given the new clinical pipelines and the huge clinical burden of hospitals, these systems are often required to provide both highly accurate and real-time processing of large amounts of imaging data. Additionally, lowering the price of each part of the imaging equipment, as well as of its development and implementation, and increasing its lifespan are crucial to minimizing cost and making healthcare more accessible. This paper focuses on the evolution and application of different hardware architectures (namely, CPU, GPU, DSP, FPGA, and ASIC) in medical imaging through various specific examples, discussing the different options depending on the specific application. The main purpose is to provide a general introduction to hardware acceleration techniques for medical imaging researchers and developers who need to accelerate their implementations.