1. Li M, Tian F, Liang S, Wang Q, Shu X, Guo Y, Wang Y. M4S-Net: a motion-enhanced shape-aware semi-supervised network for echocardiography sequence segmentation. Med Biol Eng Comput 2025. [PMID: 39994151] [DOI: 10.1007/s11517-025-03330-0] [Received: 08/03/2024] [Accepted: 02/11/2025]
Abstract
Sequence segmentation of echocardiograms is of great significance for the diagnosis and treatment of cardiovascular diseases, but the low quality of ultrasound imaging and the complexity of cardiac motion pose great challenges. In addition, the difficulty and cost of labeling echocardiography sequences limit the performance of supervised learning methods. In this paper, we propose a Motion-enhanced Shape-aware Semi-supervised Sequence Segmentation Network named M4S-Net. First, multi-level shape priors enhance the model's shape representation capability, mitigating the low image quality and improving single-frame segmentation. Then, a motion-enhanced optimization module uses optical flow to assist segmentation in a geometric sense, responding robustly to complex motion and ensuring the temporal consistency of sequence segmentation. A hybrid loss function is devised to maximize the effectiveness of each module and further improve the temporal stability of the predicted masks. Furthermore, a parameter-sharing strategy allows the network to perform sequence segmentation in a semi-supervised manner. Extensive experiments on both public and in-house datasets show that M4S-Net outperforms state-of-the-art methods in both spatial and temporal segmentation performance. A downstream apical rocking recognition task built on M4S-Net also achieves an AUC of 0.944, significantly exceeding the performance of specialized physicians.
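The abstract does not spell out the hybrid loss, but the general pattern of pairing a per-frame segmentation term with a temporal-stability term can be sketched as follows. This is a pure-Python illustration on flattened binary masks; the weighting `lam` and the exact temporal penalty are assumptions, not the authors' formulation:

```python
def dice(pred, gt, eps=1e-6):
    """Soft Dice coefficient between two binary masks (flat lists of 0/1)."""
    inter = sum(p * g for p, g in zip(pred, gt))
    return (2 * inter + eps) / (sum(pred) + sum(gt) + eps)

def hybrid_loss(pred_seq, gt_seq, lam=0.1):
    """Per-frame (1 - Dice) segmentation term plus a temporal term that
    penalises large frame-to-frame changes in the predicted masks."""
    seg = sum(1 - dice(p, g) for p, g in zip(pred_seq, gt_seq)) / len(pred_seq)
    temp = sum(
        sum(abs(a - b) for a, b in zip(pred_seq[t], pred_seq[t + 1])) / len(pred_seq[t])
        for t in range(len(pred_seq) - 1)
    ) / max(len(pred_seq) - 1, 1)
    return seg + lam * temp
```

When prediction and ground truth agree and the masks are stable over time, both terms vanish; a prediction that flickers between frames is penalised even if each frame matches its label.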
Affiliation(s)
- Mingshan Li
  - Department of Electronic Engineering, Fudan University, Shanghai, 200433, China
  - Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China
- Fangyan Tian
  - Department of Echocardiography, Zhongshan Hospital, Fudan University, Shanghai Institute of Cardiovascular Disease, Shanghai Institute of Medical Imaging, Shanghai, China
- Shuyu Liang
  - Department of Electronic Engineering, Fudan University, Shanghai, 200433, China
  - Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China
- Qin Wang
  - Department of Electronic Engineering, Fudan University, Shanghai, 200433, China
  - Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China
- Xianhong Shu
  - Department of Echocardiography, Zhongshan Hospital, Fudan University, Shanghai Institute of Cardiovascular Disease, Shanghai Institute of Medical Imaging, Shanghai, China
- Yi Guo
  - Department of Electronic Engineering, Fudan University, Shanghai, 200433, China
  - Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China
- Yuanyuan Wang
  - Department of Electronic Engineering, Fudan University, Shanghai, 200433, China
  - Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China
2. Lin J, Xie W, Kang L, Wu H. Dynamic-Guided Spatiotemporal Attention for Echocardiography Video Segmentation. IEEE Trans Med Imaging 2024;43:3843-3855. [PMID: 38771692] [DOI: 10.1109/tmi.2024.3403687]
Abstract
Left ventricle (LV) endocardium segmentation in echocardiography video has received much attention as an important step in quantifying the LV ejection fraction. Most existing methods are dedicated to exploiting temporal information on top of 2D convolutional networks. In addition to single-frame appearance learning, some works have attempted to introduce motion cues through an optical flow estimation (OFE) task to enhance temporal consistency modeling. However, OFE in these methods is tightly coupled to LV endocardium segmentation, resulting in noisy inter-frame flow predictions, and post-optimization based on these flows accumulates errors. To address these drawbacks, we propose dynamic-guided spatiotemporal attention (DSA) for semi-supervised echocardiography video segmentation. We first fine-tune the off-the-shelf OFE network RAFT on echocardiography data to provide dynamic information. Taking inter-frame flows as additional input, we use a dual-encoder structure to extract motion and appearance features separately. Based on the connection between dynamic continuity and semantic consistency, we propose a bilateral feature calibration module to enhance both features. For temporal consistency modeling, DSA aggregates neighboring-frame context using deformable attention realized by offset grid attention. Dynamic information is introduced into DSA through a bilateral offset estimation module, combining effectively with appearance semantics to predict attention offsets and thereby guide semantic-based spatiotemporal attention. We evaluated our method on two popular echocardiography datasets, CAMUS and EchoNet-Dynamic, and achieved state-of-the-art performance.
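The basic mechanism of carrying segmentation information across frames with inter-frame flow can be illustrated with a minimal backward-warping routine. This is a nearest-neighbour sketch in plain Python; the actual method uses RAFT flows and learned attention offsets, neither of which is reproduced here:

```python
def warp_mask(mask, flow):
    """Backward-warp a 2D binary mask with a per-pixel flow field using
    nearest-neighbour sampling. flow[y][x] = (dy, dx) points from the
    target frame back into the source frame; out-of-bounds samples
    are treated as background."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            sy, sx = round(y + dy), round(x + dx)
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = mask[sy][sx]
    return out
```

A uniform flow of (0, -1), for example, shifts the mask one pixel to the right; comparing such a warped mask against the next frame's prediction is one common way to score temporal consistency.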
3. Ta K, Ahn SS, Thorn SL, Stendahl JC, Zhang X, Langdon J, Staib LH, Sinusas AJ, Duncan JS. Multi-Task Learning for Motion Analysis and Segmentation in 3D Echocardiography. IEEE Trans Med Imaging 2024;43:2010-2020. [PMID: 38231820] [PMCID: PMC11514714] [DOI: 10.1109/tmi.2024.3355383]
Abstract
Characterizing left ventricular deformation and strain using 3D+time echocardiography provides useful insights into cardiac function and can be used to detect and localize myocardial injury. To achieve this, it is imperative to obtain accurate motion estimates of the left ventricle. In many strain analysis pipelines, this step is often accompanied by a separate segmentation step; however, recent works have shown that the two tasks are highly related and can be complementary when optimized jointly. In this work, we present a multi-task learning network that can simultaneously segment the left ventricle and track its motion across multiple time frames. Two task-specific networks are trained using a composite loss function. Cross-stitch units combine the activations of these networks by learning shared representations between the tasks at different levels. We also propose a novel shape-consistency unit that encourages motion-propagated segmentations to match directly predicted segmentations. Using a combined synthetic and in-vivo 3D echocardiography dataset, we demonstrate that our proposed model achieves excellent estimates of left ventricular motion displacement and myocardial segmentation. Additionally, we observe strong correlation of our image-based strain measurements with crystal-based strain measurements, as well as good correspondence with SPECT perfusion mappings. Finally, we demonstrate the clinical utility of the segmentation masks in estimating ejection fraction and sphericity indices that correspond well with benchmark measurements.
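The cross-stitch unit referenced here (Misra et al., CVPR 2016) mixes the activations of two task networks through a small learned linear combination. A minimal sketch for 1-D feature vectors, with the 2x2 mixing matrix `alpha` standing in for the learned parameters:

```python
def cross_stitch(feat_a, feat_b, alpha):
    """Cross-stitch unit: each task's output feature is a learned linear
    combination of both tasks' activations.
    alpha = [[a_aa, a_ab], [a_ba, a_bb]] is the 2x2 mixing matrix."""
    out_a = [alpha[0][0] * a + alpha[0][1] * b for a, b in zip(feat_a, feat_b)]
    out_b = [alpha[1][0] * a + alpha[1][1] * b for a, b in zip(feat_a, feat_b)]
    return out_a, out_b
```

An identity `alpha` keeps the two tasks independent, while off-diagonal weights let the segmentation and motion branches share representations; in the real network the mixing is applied per feature map at several depths and `alpha` is learned by backpropagation.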
4. Ma F, Wang S, Dai C, Qi F, Meng J. A new retinal OCT-angiography diabetic retinopathy dataset for segmentation and DR grading. J Biophotonics 2023;16:e202300052. [PMID: 37421596] [DOI: 10.1002/jbio.202300052] [Received: 02/17/2023] [Revised: 06/29/2023] [Accepted: 06/30/2023]
Abstract
PURPOSE: Diabetic retinopathy (DR) is one of the most common complications of diabetes and can lead to vision loss or even blindness. Wide-field optical coherence tomography (OCT) angiography is a non-invasive imaging technology that is convenient for diagnosing DR.
METHODS: A newly constructed Retinal OCT-Angiography Diabetic retinopathy (ROAD) dataset is used for segmentation and grading tasks. It contains 1200 normal images, 1440 DR images, and 1440 ground truths for DR image segmentation. For DR grading, we propose a novel and effective framework named the projective map attention-based convolutional neural network (PACNet).
RESULTS: The experimental results demonstrate the effectiveness of PACNet, which achieves a DR grading accuracy of 87.5% on the ROAD dataset.
CONCLUSIONS: Information on ROAD is available at https://mip2019.github.io/ROAD. The ROAD dataset should support the development of early DR detection and future research.
TRANSLATIONAL RELEVANCE: The proposed framework provides a valuable method for research and clinical DR grading.
Affiliation(s)
- Fei Ma
  - Qufu Normal University, Rizhao, Shandong, China
- Cuixia Dai
  - College of Science, Shanghai Institute of Technology, Shanghai, China
- Fumin Qi
  - National Supercomputing Center in Shenzhen, Shenzhen, Guangdong, China
- Jing Meng
  - Qufu Normal University, Rizhao, Shandong, China
5. Dai W, Li X, Ding X, Cheng KT. Cyclical Self-Supervision for Semi-Supervised Ejection Fraction Prediction From Echocardiogram Videos. IEEE Trans Med Imaging 2023;42:1446-1461. [PMID: 37015560] [DOI: 10.1109/tmi.2022.3229136]
Abstract
Left-ventricular ejection fraction (LVEF) is an important indicator of heart failure. Existing methods for LVEF estimation from video require large amounts of annotated data to achieve high performance, e.g., using 10,030 labeled echocardiogram videos to achieve a mean absolute error (MAE) of 4.10. However, labeling these videos is time-consuming and limits potential downstream applications to other heart diseases. This paper presents the first semi-supervised approach for LVEF prediction. Unlike general video prediction tasks, LVEF prediction is specifically related to changes in the left ventricle (LV) in echocardiogram videos. By incorporating knowledge learned from predicting LV segmentations into LVEF regression, we can provide additional context to the model for better predictions. To this end, we propose a novel Cyclical Self-Supervision (CSS) method for learning video-based LV segmentation, motivated by the observation that the heartbeat is a cyclical process with temporal repetition. Prediction masks from our segmentation model can then be used as additional input for LVEF regression to provide spatial context for the LV region. We also introduce teacher-student distillation to distill the information from LV segmentation masks into an end-to-end LVEF regression model that requires only video inputs. Results show our method outperforms alternative semi-supervised methods and achieves an MAE of 4.17, competitive with state-of-the-art supervised performance while using half the number of labels. Validation on an external dataset also shows improved generalization from using our method.
6. Monkam P, Jin S, Lu W. An efficient annotated data generation method for echocardiographic image segmentation. Comput Biol Med 2022;149:106090. [PMID: 36115304] [DOI: 10.1016/j.compbiomed.2022.106090] [Received: 04/27/2022] [Revised: 08/12/2022] [Accepted: 09/03/2022]
Abstract
BACKGROUND: In recent years, deep learning techniques have demonstrated promising performance in echocardiography (echo) data segmentation, a critical step in the diagnosis and prognosis of cardiovascular diseases (CVDs). However, their successful implementation requires a large number of high-quality annotated samples, whose acquisition is arduous and expertise-demanding. This study therefore aims to circumvent the tedious, time-consuming, and expertise-demanding data annotation involved in deep learning-based echo data segmentation.
METHODS: We propose a two-phase framework for fast generation of the annotated echo data needed to implement intelligent cardiac structure segmentation systems. First, multi-size and multi-orientation cardiac structures are simulated using a polynomial fitting method. Second, the obtained cardiac structures are embedded onto curated endoscopic ultrasound images using a Fourier transform algorithm, resulting in pairs of annotated samples. The practical significance of the proposed framework is validated by using the generated realistic annotated images as an auxiliary dataset to pretrain deep learning models for automatic segmentation of the left ventricle and the left ventricle wall in real echo data.
RESULTS: Extensive experimental analyses indicate that, compared with training from scratch, fine-tuning after pretraining on the generated dataset always yields significant performance improvement, with margins of up to 12.9% in Dice and 7.74% in IoU.
CONCLUSION: The proposed framework has great potential to overcome the shortage of labeled data hampering the deployment of deep learning approaches in echo data analysis.
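The first phase relies on polynomial fitting to synthesise smooth cardiac contours. The abstract gives no implementation details, but the underlying building block is an ordinary least-squares polynomial fit, sketched here in pure Python via the normal equations and Gaussian elimination; the degree and sampling scheme are assumptions, not the authors' choices:

```python
def polyfit(xs, ys, deg):
    """Least-squares polynomial fit: solve (V^T V) c = V^T y, where V is
    the Vandermonde matrix, using Gaussian elimination with partial
    pivoting. Returns coefficients lowest-order first."""
    n = deg + 1
    # Normal-equation matrix A = V^T V and right-hand side b = V^T y.
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs
```

Fitting low-degree polynomials to sampled boundary points gives smooth, parameterised contours whose size and orientation can then be varied to produce the multi-size, multi-orientation structures described above.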
Affiliation(s)
- Patrice Monkam
  - Easysignal Group, State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
- Songbai Jin
  - Easysignal Group, State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
- Wenkai Lu
  - Easysignal Group, State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
7. Wu M, Awasthi N, Rad NM, Pluim JPW, Lopata RGP. Advanced Ultrasound and Photoacoustic Imaging in Cardiology. Sensors (Basel) 2021;21:7947. [PMID: 34883951] [PMCID: PMC8659598] [DOI: 10.3390/s21237947] [Received: 10/30/2021] [Revised: 11/23/2021] [Accepted: 11/26/2021]
Abstract
Cardiovascular diseases (CVDs) remain the leading cause of death worldwide, and effective management and treatment of CVDs rely heavily on accurate diagnosis. As the most common imaging technique for clinical diagnosis of CVDs, ultrasound (US) imaging has been intensively explored and has advanced tremendously in recent years, especially with the introduction of deep learning (DL) techniques. Photoacoustic imaging (PAI) is one of the most promising new imaging methods alongside the existing clinical modalities. It can characterize different tissue compositions based on optical absorption contrast and can thus assess tissue functionality. This paper reviews major technological developments in both US (combined with deep learning techniques) and PA imaging as applied to the diagnosis of CVDs.
Affiliation(s)
- Min Wu
  - Photoacoustics and Ultrasound Laboratory Eindhoven (PULS/e), Department of Biomedical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
- Navchetan Awasthi
  - Photoacoustics and Ultrasound Laboratory Eindhoven (PULS/e), Department of Biomedical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
  - Medical Image Analysis Group (IMAG/e), Department of Biomedical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
- Nastaran Mohammadian Rad
  - Photoacoustics and Ultrasound Laboratory Eindhoven (PULS/e), Department of Biomedical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
  - Medical Image Analysis Group (IMAG/e), Department of Biomedical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
- Josien P. W. Pluim
  - Medical Image Analysis Group (IMAG/e), Department of Biomedical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
- Richard G. P. Lopata
  - Photoacoustics and Ultrasound Laboratory Eindhoven (PULS/e), Department of Biomedical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands