1. Zhang Z, Yu C, Zhang H, Gao Z. Embedding Tasks Into the Latent Space: Cross-Space Consistency for Multi-Dimensional Analysis in Echocardiography. IEEE Trans Med Imaging 2024; 43:2215-2228. [PMID: 38329865] [DOI: 10.1109/tmi.2024.3362964]
Abstract
Multi-dimensional analysis in echocardiography has attracted attention due to its potential for clinical index quantification and computer-aided diagnosis. It can exploit varied information to estimate multiple cardiac indices. However, it still faces the challenge of inter-task conflict, which arises from regional confusion, global abnormalities, and time-accumulated errors. Task mapping methods have the potential to address inter-task conflict, but they may overlook the inherent differences between tasks, especially multi-level tasks (e.g., pixel-level, image-level, and sequence-level tasks), which can lead to inappropriate local and spurious task constraints. We propose cross-space consistency (CSC) to overcome this challenge. The CSC embeds multi-level tasks into the same level to reduce inherent task differences, allowing multi-level task features to be consistent in a unified latent space. The latent space extracts task-common features and constrains the distance between them, which in turn constrains the task weight region that satisfies multiple task conditions. Extensive experiments compare the CSC with fifteen state-of-the-art echocardiographic analysis methods on five datasets (10,908 patients). The results show that the CSC can provide left ventricular (LV) segmentation (DSC = 0.932), keypoint detection (MAE = 3.06 mm), and keyframe identification (accuracy = 0.943). These results demonstrate that our method can provide a multi-dimensional analysis of cardiac function and is robust on large-scale datasets.
2. Freitas J, Gomes-Fonseca J, Tonelli AC, Correia-Pinto J, Fonseca JC, Queirós S. Automatic multi-view pose estimation in focused cardiac ultrasound. Med Image Anal 2024; 94:103146. [PMID: 38537416] [DOI: 10.1016/j.media.2024.103146]
Abstract
Focused cardiac ultrasound (FoCUS) is a valuable point-of-care method for evaluating cardiovascular structures and function, but its scope is limited by the equipment and the operator's experience, resulting in primarily qualitative 2D exams. This study presents a novel framework to automatically estimate the 3D spatial relationship between standard FoCUS views. The proposed framework uses a multi-view U-Net-like fully convolutional neural network to regress line-based heatmaps representing the most likely areas of intersection between input images. The lines that best fit the regressed heatmaps are then extracted, and a system of nonlinear equations based on the intersections between view triplets is created and solved to determine the relative 3D pose between all input images. The feasibility and accuracy of the proposed pipeline were validated on a novel realistic in silico FoCUS dataset, demonstrating promising results. Interestingly, as shown in preliminary experiments, estimating the 2D images' relative poses enables the application of 3D image analysis methods and paves the way for 3D quantitative assessment in FoCUS examinations.
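The "fit a line to a regressed heatmap, then intersect lines" step can be illustrated with a minimal 2D sketch. This is not the authors' implementation: the helper names are hypothetical, and a weighted-PCA fit stands in for whatever line-fitting the paper uses.

```python
import numpy as np

def fit_line_to_heatmap(heatmap):
    """Fit a 2D line to a heatmap by weighted PCA: returns a point on the
    line (the weighted centroid) and a unit direction vector."""
    ys, xs = np.nonzero(heatmap > 0)
    w = heatmap[ys, xs]
    pts = np.stack([xs, ys], axis=1).astype(float)
    c = (w[:, None] * pts).sum(0) / w.sum()              # weighted centroid
    d = pts - c
    cov = (w[:, None, None] * d[:, :, None] * d[:, None, :]).sum(0) / w.sum()
    evals, evecs = np.linalg.eigh(cov)
    direction = evecs[:, np.argmax(evals)]               # principal axis
    return c, direction

def intersect_lines(p1, d1, p2, d2):
    """Intersection point of two 2D lines p + t*d (assumed non-parallel)."""
    A = np.stack([d1, -d2], axis=1)
    t = np.linalg.solve(A, p2 - p1)
    return p1 + t[0] * d1
```

In the paper this intersection constraint is written per view triplet and solved as a nonlinear system for the 3D poses; the sketch above only shows the planar building block.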
Affiliation(s)
- João Freitas
- Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal; ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal; Algoritmi Center, School of Engineering, University of Minho, Guimarães, Portugal
- João Gomes-Fonseca
- Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal; ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal
- Jorge Correia-Pinto
- Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal; ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal; Department of Pediatric Surgery, Hospital de Braga, Braga, Portugal
- Jaime C Fonseca
- Algoritmi Center, School of Engineering, University of Minho, Guimarães, Portugal
- Sandro Queirós
- Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal; ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal
3. Miao J, Zhou SP, Zhou GQ, Wang KN, Yang M, Zhou S, Chen Y. SC-SSL: Self-Correcting Collaborative and Contrastive Co-Training Model for Semi-Supervised Medical Image Segmentation. IEEE Trans Med Imaging 2024; 43:1347-1364. [PMID: 37995173] [DOI: 10.1109/tmi.2023.3336534]
Abstract
Image segmentation achieves significant improvements with deep neural networks, provided there is a large amount of labeled training data, which is laborious to obtain for medical image tasks. Recently, semi-supervised learning (SSL) has shown great potential in medical image segmentation. However, these SSL methods usually neglect the quality of the learning target for unlabeled data. This study therefore proposes a novel self-correcting co-training scheme to learn a better target, one closer to the ground-truth labels, from collaborative network outputs. Our work has three highlights. First, we cast learning-target generation as a learning task itself, improving the learning confidence for unannotated data with a self-correcting module. Second, we impose a structure constraint to further encourage shape similarity between the improved learning target and the collaborative network outputs. Finally, we propose a pixel-wise contrastive learning loss to boost the representation capacity under the guidance of the improved learning target, exploring unlabeled data more efficiently with awareness of the semantic context. We extensively evaluated our method against state-of-the-art semi-supervised approaches on four publicly available datasets: the ACDC, M&Ms, Pancreas-CT, and Task_07 CT datasets. The experimental results at different labeled-data ratios show our method's superiority over existing methods, demonstrating its effectiveness for semi-supervised medical image segmentation.
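The pixel-wise contrastive idea can be illustrated with a generic InfoNCE-style sketch, treating pixels that share a (pseudo-)label as positives. This is a simplified stand-in, not the paper's exact loss, and `pixel_contrastive_loss` is a hypothetical name.

```python
import numpy as np

def pixel_contrastive_loss(feats, labels, tau=0.1):
    """Generic InfoNCE-style pixel contrastive loss.
    feats: (N, D) pixel embeddings; labels: (N,) ints; tau: temperature.
    Pixels with the same label are positives for each other."""
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = feats @ feats.T / tau                      # (N, N) similarities
    np.fill_diagonal(sim, -np.inf)                   # exclude self-pairs
    exp = np.exp(sim)
    denom = exp.sum(axis=1)
    loss, count = 0.0, 0
    for i in range(len(labels)):
        pos = (labels == labels[i]) & (np.arange(len(labels)) != i)
        if pos.any():
            loss += -np.log(exp[i, pos] / denom[i]).mean()
            count += 1
    return loss / max(count, 1)
```

Embeddings clustered by class give a much lower loss than mismatched labels, which is the pressure that shapes the representation space.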
4. Fan L, Gong X, Zheng C, Li J. Data pyramid structure for optimizing EUS-based GISTs diagnosis in multi-center analysis with missing label. Comput Biol Med 2024; 169:107897. [PMID: 38171262] [DOI: 10.1016/j.compbiomed.2023.107897]
Abstract
This study introduces the Data Pyramid Structure (DPS) to address data sparsity and missing labels in medical image analysis. The DPS optimizes multi-task learning and enables sustainable expansion of multi-center data analysis. Specifically, it facilitates attribute prediction and malignant tumor diagnosis by applying a segmentation and aggregation strategy to data with absent attribute labels. To leverage multi-center data, we propose the Unified Ensemble Learning Framework (UELF) and the Unified Federated Learning Framework (UFLF), which incorporate strategies for data transfer and incremental learning in scenarios with missing labels. The proposed method was evaluated on a challenging EUS patient dataset from five centers, achieving promising diagnostic performance: an average accuracy of 0.984 and an AUC of 0.927 in multi-center analysis, surpassing state-of-the-art approaches. The interpretability of the predictions further highlights the potential clinical relevance of our method.
Affiliation(s)
- Lin Fan
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China; Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province, China; Engineering Research Center of Sustainable Urban Intelligent Transportation, Ministry of Education, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, China
- Xun Gong
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China; Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province, China; Engineering Research Center of Sustainable Urban Intelligent Transportation, Ministry of Education, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, China
- Cenyang Zheng
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China; Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province, China; Engineering Research Center of Sustainable Urban Intelligent Transportation, Ministry of Education, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, China
- Jiao Li
- Department of Gastroenterology, The Third People's Hospital of Chengdu, Affiliated Hospital of Southwest Jiaotong University, Chengdu 610031, China
5. He Y, Ge R, Qi X, Chen Y, Wu J, Coatrieux JL, Yang G, Li S. Learning Better Registration to Learn Better Few-Shot Medical Image Segmentation: Authenticity, Diversity, and Robustness. IEEE Trans Neural Netw Learn Syst 2024; 35:2588-2601. [PMID: 35895657] [DOI: 10.1109/tnnls.2022.3190452]
Abstract
In this work, we address few-shot medical image segmentation (MIS) with a novel framework based on the learning registration to learn segmentation (LRLS) paradigm. To cope with the lack of authenticity, diversity, and robustness in existing LRLS frameworks, we propose the better registration better segmentation (BRBS) framework with three main contributions that are experimentally shown to have substantial practical merit. First, we improve authenticity in the registration-based generation program and propose a knowledge consistency constraint strategy that constrains the registration network to learn according to domain knowledge. This yields semantic-aligned and topology-preserved registration, allowing the generation program to output new data with great spatial and style authenticity. Second, we study the diversity of the generation process in depth and propose a space-style sampling program, which models the transformation paths of style and spatial change between the few atlases and numerous unlabeled images. Sampling along these transformation paths provides much more diverse spatial and style features to the generated data, effectively improving diversity. Third, we are the first to highlight robustness in segmentation learning under the LRLS paradigm and propose mix misalignment regularization, which simulates misalignment distortion and constrains the network to reduce the degree of fitting in misaligned regions, thereby regularizing these regions and improving the robustness of segmentation learning. Without any bells and whistles, our approach achieves new state-of-the-art performance in few-shot MIS on two challenging tasks, outperforming existing LRLS-based few-shot methods. We believe this novel and effective framework will provide a powerful few-shot benchmark for the medical imaging field and efficiently reduce the cost of medical image research. All of our code will be made publicly available online.
6. Mamalakis M, Garg P, Nelson T, Lee J, Swift AJ, Wild JM, Clayton RH. Artificial Intelligence framework with traditional computer vision and deep learning approaches for optimal automatic segmentation of left ventricle with scar. Artif Intell Med 2023; 143:102610. [PMID: 37673578] [DOI: 10.1016/j.artmed.2023.102610]
Abstract
Automatic segmentation of the cardiac left ventricle with scars remains a challenging and clinically significant task, as it is essential for patient diagnosis and treatment pathways. This study aimed to develop a novel framework and cost function to achieve optimal automatic segmentation of the left ventricle with scars using LGE-MRI images. To ensure the generalization of the framework, an unbiased validation protocol was established using out-of-distribution (OOD) internal and external validation cohorts, and intra-observer and inter-observer variability ground truths. The framework combines traditional computer vision techniques and deep learning to achieve optimal segmentation results. The traditional approach uses multi-atlas techniques, active contours, and k-means methods, while the deep learning approach utilizes various deep learning techniques and networks. The study found that the traditional computer vision techniques delivered more accurate results than deep learning, except in cases of breathing misalignment error. The optimal solution of the framework achieved robust and generalized results with Dice scores of 82.8 ± 6.4% and 72.1 ± 4.6% in the internal and external OOD cohorts, respectively. The developed framework offers a high-performance solution for automatic segmentation of the left ventricle with scars using LGE-MRI. Unlike existing state-of-the-art approaches, it achieves unbiased results across different hospitals and vendors without the need for training or tuning on hospital cohorts. This framework offers a valuable tool for experts to accomplish fully automatic segmentation of the left ventricle with scars from a single-modality cardiac scan.
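The Dice score used to report these segmentation results is simple to state; a minimal sketch for binary masks (not taken from the paper's code):

```python
import numpy as np

def dice_score(pred, gt):
    """Dice similarity coefficient between two binary masks:
    2*|pred & gt| / (|pred| + |gt|); 1.0 for two empty masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2.0 * inter / total if total else 1.0
```

A Dice of 82.8% thus means the overlap is large relative to the combined mask sizes; the metric penalizes both over- and under-segmentation.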
Affiliation(s)
- Michail Mamalakis
- Insigneo Institute for in silico Medicine, University of Sheffield, Sheffield, S1 4DP, UK; Department of Computer Science, University of Sheffield, Regent Court, Sheffield, S1 4DP, UK
- Pankaj Garg
- Department of Cardiology, Sheffield Teaching Hospitals, Sheffield S5 7AU, UK
- Tom Nelson
- Department of Cardiology, Sheffield Teaching Hospitals, Sheffield S5 7AU, UK
- Justin Lee
- Department of Cardiology, Sheffield Teaching Hospitals, Sheffield S5 7AU, UK
- Andrew J Swift
- Department of Computer Science, University of Sheffield, Regent Court, Sheffield, S1 4DP, UK; Department of Infection, Immunity & Cardiovascular Disease, University of Sheffield, Sheffield, UK
- James M Wild
- Insigneo Institute for in silico Medicine, University of Sheffield, Sheffield, S1 4DP, UK; Polaris, Imaging Sciences, Department of Infection, Immunity and Cardiovascular Disease, University of Sheffield, Sheffield, UK
- Richard H Clayton
- Insigneo Institute for in silico Medicine, University of Sheffield, Sheffield, S1 4DP, UK; Department of Computer Science, University of Sheffield, Regent Court, Sheffield, S1 4DP, UK
7. Li D, Peng Y, Sun J, Guo Y. A task-unified network with transformer and spatial-temporal convolution for left ventricular quantification. Sci Rep 2023; 13:13529. [PMID: 37598235] [PMCID: PMC10439898] [DOI: 10.1038/s41598-023-40841-y]
Abstract
Quantification of cardiac function is vital for diagnosing and treating cardiovascular diseases. Left ventricular function measurement is the most commonly used measure of cardiac function in clinical practice, and improving the accuracy of left ventricular quantitative assessment has long been a subject of medical research. Although considerable effort has been put into measuring the left ventricle (LV) automatically using deep learning methods, accurate quantification remains challenging because of the heart's changing anatomical structure over the systolic-diastolic cycle. Besides, most methods use direct regression, which lacks visually interpretable analysis. In this work, a deep learning segmentation-and-regression task-unified network with transformer and spatial-temporal convolution is proposed to segment and quantify the LV simultaneously. The segmentation module leverages a U-Net-like 3D Transformer model to predict the contours of three anatomical structures, while the regression module learns spatial-temporal representations from the original images and the reconstructed feature maps from the segmentation path to estimate the desired quantification metrics. Furthermore, we employ a joint task loss function to train the two modules. Our framework is evaluated on the MICCAI 2017 Left Ventricle Full Quantification Challenge dataset. The experimental results demonstrate the effectiveness of our framework, which achieves competitive cardiac quantification metrics while producing visualized segmentation results that support later analysis.
Affiliation(s)
- Dapeng Li
- Shandong University of Science and Technology, Qingdao, China
- Yanjun Peng
- Shandong University of Science and Technology, Qingdao, China; Shandong Province Key Laboratory of Wisdom Mining Information Technology, Qingdao, China
- Jindong Sun
- Shandong University of Science and Technology, Qingdao, China
- Yanfei Guo
- Shandong University of Science and Technology, Qingdao, China
8. Bashkanov O, Rak M, Meyer A, Engelage L, Lumiani A, Muschter R, Hansen C. Automatic detection of prostate cancer grades and chronic prostatitis in biparametric MRI. Comput Methods Programs Biomed 2023; 239:107624. [PMID: 37271051] [DOI: 10.1016/j.cmpb.2023.107624]
Abstract
BACKGROUND AND OBJECTIVE With emerging evidence to improve prostate cancer (PCa) screening, multiparametric magnetic resonance prostate imaging is becoming an essential noninvasive component of the diagnostic routine. Computer-aided diagnostic (CAD) tools powered by deep learning can help radiologists interpret multiple volumetric images. In this work, our objective was to examine promising methods recently proposed for the multigrade prostate cancer detection task and to suggest practical considerations for model training in this context. METHODS We collected 1647 fine-grained biopsy-confirmed findings, including Gleason scores and prostatitis, to form a training dataset. In our experimental framework for lesion detection, all models utilized a 3D nnU-Net architecture that accounts for anisotropy in the MRI data. First, we explored the optimal range of b-values for the diffusion-weighted imaging (DWI) modality and its effect on the detection of clinically significant prostate cancer (csPCa) and prostatitis using deep learning, as the optimal range is not yet clearly defined in this domain. Next, we propose a simulated multimodal shift as a data augmentation technique to compensate for the multimodal shift present in the data. Third, we studied the effect of incorporating the prostatitis class alongside cancer-related findings at three granularities of the prostate cancer class (coarse, medium, and fine) and its impact on the detection rate of the target csPCa. Furthermore, ordinal and one-hot encoded (OHE) output formulations were tested. RESULTS An optimal model configuration with fine class granularity (prostatitis included) and OHE scored a lesion-wise partial Free-Response Receiver Operating Characteristic (FROC) area under the curve (AUC) of 1.94 (95% CI: 1.76-2.11) and a patient-wise ROC AUC of 0.874 (95% CI: 0.793-0.938) in the detection of csPCa. Inclusion of the auxiliary prostatitis class demonstrated a stable relative improvement in specificity at a false positive rate (FPR) of 1.0 per patient, with increases of 3%, 7%, and 4% for the coarse, medium, and fine class granularities, respectively. CONCLUSIONS This paper examines several configurations for model training in the biparametric MRI setup and proposes optimal value ranges. It also shows that the fine-grained class configuration, including prostatitis, is beneficial for detecting csPCa. The ability to detect prostatitis alongside low-risk cancer lesions suggests the potential to improve the quality of early diagnosis of prostate diseases, and it implies improved interpretability of the results by the radiologist.
Affiliation(s)
- Oleksii Bashkanov
- Faculty of Computer Science and Research Campus STIMULATE, University of Magdeburg, Universitätsplatz 2, Magdeburg 39106, Germany
- Marko Rak
- Faculty of Computer Science and Research Campus STIMULATE, University of Magdeburg, Universitätsplatz 2, Magdeburg 39106, Germany
- Anneke Meyer
- Faculty of Computer Science and Research Campus STIMULATE, University of Magdeburg, Universitätsplatz 2, Magdeburg 39106, Germany
- Christian Hansen
- Faculty of Computer Science and Research Campus STIMULATE, University of Magdeburg, Universitätsplatz 2, Magdeburg 39106, Germany
9. Tang S, Yu X, Cheang CF, Ji X, Yu HH, Choi IC. CLELNet: A continual learning network for esophageal lesion analysis on endoscopic images. Comput Methods Programs Biomed 2023; 231:107399. [PMID: 36780717] [DOI: 10.1016/j.cmpb.2023.107399]
Abstract
BACKGROUND AND OBJECTIVE A deep learning-based intelligent diagnosis system can significantly reduce the burden on endoscopists in the daily analysis of esophageal lesions. Considering the need to add new tasks to the diagnosis system, a deep learning model that can incrementally train a series of tasks on endoscopic images is essential for identifying the types and regions of esophageal lesions. METHOD In this paper, we propose a continual learning-based esophageal lesion network (CLELNet), in which a convolutional autoencoder is designed to extract representative features of endoscopic images across different esophageal lesions. The proposed CLELNet consists of shared layers and task-specific layers: shared layers extract common features among different lesions, while task-specific layers complete the different tasks. The first two tasks trained by CLELNet are classification (task 1) and segmentation (task 2). We collected a dataset of esophageal endoscopic images from Macau Kiang Wu Hospital for training and testing CLELNet. RESULTS The experimental results showed that the classification accuracy of task 1 was 95.96%, and the Intersection over Union and Dice Similarity Coefficient of task 2 were 65.66% and 78.08%, respectively. CONCLUSIONS The proposed CLELNet can realize task-incremental learning without forgetting previous tasks and thus serve as a useful computer-aided diagnosis system for esophageal lesion analysis.
Affiliation(s)
- Suigu Tang
- Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR
- Xiaoyuan Yu
- Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR
- Chak Fong Cheang
- Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR
- Xiaoyu Ji
- Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR
- Hon Ho Yu
- Kiang Wu Hospital, Rua de Coelho do Amaral, Macau SAR
- I Cheong Choi
- Kiang Wu Hospital, Rua de Coelho do Amaral, Macau SAR
10. Zhao Y, Wang X, Che T, Bao G, Li S. Multi-task deep learning for medical image computing and analysis: A review. Comput Biol Med 2023; 153:106496. [PMID: 36634599] [DOI: 10.1016/j.compbiomed.2022.106496]
Abstract
The renaissance of deep learning has provided promising solutions to various tasks. While conventional deep learning models are constructed for a single specific task, multi-task deep learning (MTDL), capable of simultaneously accomplishing two or more tasks, has attracted research attention. MTDL is a joint learning paradigm that harnesses the inherent correlation of multiple related tasks to achieve reciprocal benefits: improved performance, enhanced generalizability, and reduced overall computational cost. This review focuses on advanced applications of MTDL for medical image computing and analysis. We first summarize four popular MTDL network architectures (i.e., cascaded, parallel, interacted, and hybrid). Then, we review representative MTDL-based networks for eight application areas, including the brain, eye, chest, cardiac, abdomen, musculoskeletal, pathology, and other human body regions. While MTDL-based medical image processing has been flourishing and demonstrating outstanding performance in many tasks, performance gaps remain in others, and accordingly we discuss the open challenges and prospective trends. For instance, in the 2018 Ischemic Stroke Lesion Segmentation challenge, the reported top Dice score of 0.51 and top recall of 0.55 achieved by the cascaded MTDL model indicate that further research efforts are in high demand to escalate the performance of current models.
Affiliation(s)
- Yan Zhao
- Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
- Xiuying Wang
- School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia
- Tongtong Che
- Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
- Guoqing Bao
- School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia
- Shuyu Li
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, 100875, China
11. Liu L, Liu Y, Zhou J, Guo C, Duan H. A novel MCF-Net: Multi-level context fusion network for 2D medical image segmentation. Comput Methods Programs Biomed 2022; 226:107160. [PMID: 36191351] [DOI: 10.1016/j.cmpb.2022.107160]
Abstract
Medical image segmentation is a crucial step in clinical applications for the diagnosis and analysis of some diseases. U-Net-based convolutional neural networks have achieved impressive performance in medical image segmentation tasks. However, their capacity to integrate multi-level contextual information and extract features is often insufficient. In this paper, we present a novel multi-level context fusion network (MCF-Net) that improves the performance of U-Net on various segmentation tasks by designing three modules, namely the hybrid attention-based residual atrous convolution (HARA) module, the multi-scale feature memory (MSFM) module, and the multi-receptive field fusion (MRFF) module, to fuse multi-scale contextual information. The HARA module effectively extracts multi-receptive-field features by combining atrous spatial pyramid pooling with an attention mechanism. The MSFM and MRFF modules fuse features of different levels and effectively extract contextual information. The proposed MCF-Net was evaluated on the ISIC 2018, DRIVE, BUSI, and Kvasir-SEG datasets, which contain challenging images of many sizes and widely varying anatomy. The experimental results show that MCF-Net is highly competitive with other U-Net models and offers tremendous potential as a general-purpose deep learning model for 2D medical image segmentation.
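Atrous spatial pyramid pooling, which the HARA module builds on, runs parallel convolutions at several dilation rates and fuses them. A toy single-channel sketch with naive loops and a mean fusion (an illustration of the general technique, not the paper's module):

```python
import numpy as np

def dilated_conv2d(x, kernel, rate):
    """'Same'-padded 2D correlation with a 3x3 kernel dilated by `rate`."""
    H, W = x.shape
    xp = np.pad(x, rate)                              # pad by the dilation rate
    out = np.zeros_like(x, dtype=float)
    for i in range(H):
        for j in range(W):
            # sample a 3x3 grid of pixels spaced `rate` apart
            patch = xp[i:i + 2 * rate + 1:rate, j:j + 2 * rate + 1:rate]
            out[i, j] = (patch * kernel).sum()
    return out

def aspp(x, kernel, rates=(1, 2, 4)):
    """Atrous spatial pyramid pooling: parallel dilated convs at several
    rates plus a global-average image-level branch, fused by a mean."""
    branches = [dilated_conv2d(x, kernel, r) for r in rates]
    branches.append(np.full_like(x, x.mean(), dtype=float))
    return np.mean(branches, axis=0)
```

Larger rates enlarge the receptive field without adding parameters, which is why ASPP is a common way to capture multi-scale context.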
Affiliation(s)
- Lizhu Liu
- Engineering Research Center of Automotive Electrics and Control Technology, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China; National Engineering Laboratory of Robot Visual Perception and Control Technology, School of Robotics, Hunan University, Changsha 410082, China
- Yexin Liu
- Engineering Research Center of Automotive Electrics and Control Technology, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China
- Jian Zhou
- Engineering Research Center of Automotive Electrics and Control Technology, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China
- Cheng Guo
- Engineering Research Center of Automotive Electrics and Control Technology, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China
- Huigao Duan
- Engineering Research Center of Automotive Electrics and Control Technology, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China
12. Xiao X, Zhao J, Li S. Task relevance driven adversarial learning for simultaneous detection, size grading, and quantification of hepatocellular carcinoma via integrating multi-modality MRI. Med Image Anal 2022; 81:102554. [DOI: 10.1016/j.media.2022.102554]
13. Song X, Tang H, Yang C, Zhou G, Wang Y, Huang X, Hua J, Coatrieux G, He X, Chen Y. Deformable transformer for endoscopic video super-resolution. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103827]
14
Niyas S, Pawan S, Anand Kumar M, Rajan J. Medical image segmentation with 3D convolutional neural networks: A survey. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.065] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
15
Ni J, Wu J, Elazab A, Tong J, Chen Z. DNL-Net: deformed non-local neural network for blood vessel segmentation. BMC Med Imaging 2022; 22:109. [PMID: 35668351 PMCID: PMC9169317 DOI: 10.1186/s12880-022-00836-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Accepted: 05/31/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The non-local module has been primarily used in the literature to capture long-range dependencies. However, it suffers from prohibitive computational complexity and lacks interactions among positions across the channels. METHODS We present a deformed non-local neural network (DNL-Net) for medical image segmentation, which has two prominent components: a deformed non-local module (DNL) and multi-scale feature fusion. The former optimizes the structure of the non-local block (NL) and hence significantly reduces excessive computation and memory usage. The latter is derived from attention mechanisms to fuse the features of different levels and improve the ability to exchange information across channels. In addition, we introduce a residual squeeze-and-excitation pyramid pooling (RSEP) module, similar to spatial pyramid pooling, to effectively resample the features at different scales and enlarge the network receptive field. RESULTS The proposed method achieved 96.63% and 92.93% for the Dice coefficient and mean intersection over union, respectively, on the intracranial blood vessel dataset. Also, DNL-Net attained 86.64%, 96.10%, and 98.37% for sensitivity, accuracy, and area under the receiver operating characteristic curve, respectively, on the DRIVE dataset. CONCLUSIONS The overall performance of DNL-Net surpasses other current state-of-the-art vessel segmentation methods, which indicates that the proposed network is more suitable for blood vessel segmentation and is of great clinical significance.
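For context, the quadratic cost the abstract attributes to the non-local block comes from every position attending to every other position. The sketch below is a minimal, illustrative NumPy version of the standard non-local (self-attention) operation, not the deformed variant proposed in the paper; the function name and shapes are assumptions for illustration only.

```python
import numpy as np

def nonlocal_block(x):
    """Minimal non-local operation on a flattened feature map
    x of shape (positions, channels): every position attends to
    every other position, giving O(N^2) affinity computation."""
    scores = x @ x.T                             # (N, N) pairwise affinities
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over positions
    return x + attn @ x                          # residual aggregation

rng = np.random.default_rng(1)
x = rng.normal(size=(16, 4))                     # 16 positions, 4 channels
y = nonlocal_block(x)
print(y.shape)                                   # same shape as the input
```

The (N, N) affinity matrix is what the deformed variant restructures to cut computation and memory.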
Affiliation(s)
- Jiajia Ni
- College of Internet of Things Engineering, HoHai University, Changzhou, China
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Jianhuang Wu
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
- Ahmed Elazab
- School of Biomedical Engineering, Shenzhen University, Shenzhen, China
- Computer Science Department, Misr Higher Institute for Commerce and Computers, Mansoura, Egypt
- Jing Tong
- College of Internet of Things Engineering, HoHai University, Changzhou, China
- Zhengming Chen
- College of Internet of Things Engineering, HoHai University, Changzhou, China
16
Shin H, Choi GS, Shon OJ, Kim GB, Chang MC. Development of convolutional neural network model for diagnosing meniscus tear using magnetic resonance image. BMC Musculoskelet Disord 2022; 23:510. [PMID: 35637451 PMCID: PMC9150332 DOI: 10.1186/s12891-022-05468-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/27/2022] [Accepted: 05/23/2022] [Indexed: 11/22/2022] Open
Abstract
Background Deep learning (DL) is an advanced machine learning approach used in diverse areas, such as image analysis, bioinformatics, and natural language processing. A convolutional neural network (CNN) is a representative DL model that is advantageous for image recognition and classification. In this study, we aimed to develop a CNN to detect meniscal tears and classify tear types using coronal and sagittal magnetic resonance (MR) images of each patient. Methods We retrospectively collected 599 cases (medial meniscus tear = 384, lateral meniscus tear = 167, and medial and lateral meniscus tear = 48) of knee MR images from patients with meniscal tears and 449 cases of knee MR images from patients without meniscal tears. To develop the DL model for evaluating the presence of meniscal tears, all the collected knee MR images of 1048 cases were used. To develop the DL model for evaluating the type of meniscal tear, 538 cases with meniscal tears (horizontal tear = 268, complex tear = 147, radial tear = 48, and longitudinal tear = 75) and 449 cases without meniscal tears were used. Additionally, a CNN algorithm was used. To measure the model’s performance, 70% of the included data were randomly assigned to the training set, and the remaining 30% were assigned to the test set. Results The areas under the curve (AUCs) of our model were 0.889, 0.817, and 0.924 for medial meniscal tears, lateral meniscal tears, and medial and lateral meniscal tears, respectively. The AUCs for the horizontal, complex, radial, and longitudinal tears were 0.761, 0.850, 0.601, and 0.858, respectively. Conclusion Our study showed that the CNN model has the potential to be used in diagnosing the presence of meniscal tears and differentiating the types of meniscal tears.
Affiliation(s)
- Hyunkwang Shin
- Department of Information and Communication Engineering, Yeungnam University, Gyeongsan-si, Republic of Korea
- Gyu Sang Choi
- Department of Information and Communication Engineering, Yeungnam University, Gyeongsan-si, Republic of Korea
- Oog-Jin Shon
- Department of Orthopedic Surgery, Yeungnam University College of Medicine, Yeungnam University, 317-1, Daemyungdong, Namku, Daegu, 42415, Republic of Korea
- Gi Beom Kim
- Department of Orthopedic Surgery, Yeungnam University College of Medicine, Yeungnam University, 317-1, Daemyungdong, Namku, Daegu, 42415, Republic of Korea.
- Min Cheol Chang
- Department of Physical Medicine and Rehabilitation, College of Medicine, Yeungnam University, 317-1, Daemyungdong, Namku, Daegu, 42415, Republic of Korea.
17
Cui X, Cao Y, Liu Z, Sui X, Mi J, Zhang Y, Cui L, Li S. TRSA-Net: Task Relation Spatial co-Attention for Joint Segmentation, Quantification and Uncertainty Estimation on Paired 2D Echocardiography. IEEE J Biomed Health Inform 2022; 26:4067-4078. [PMID: 35503848 DOI: 10.1109/jbhi.2022.3171985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
The clinical workflow of cardiac assessment on 2D echocardiography requires both accurate segmentation and quantification of the Left Ventricle (LV) from paired apical 4-chamber and 2-chamber views. Moreover, uncertainty estimation is significant for clinically understanding the performance of a model. However, current research on 2D echocardiography ignores this vital task when jointly performing segmentation and quantification, hence motivating the need for a unified optimization method. In this paper, we propose a multitask model with Task Relation Spatial co-Attention (referred to as TRSA-Net) for joint segmentation, quantification, and uncertainty estimation on paired 2D echo. TRSA-Net achieves multitask joint learning by exploring the spatial correlation between tasks. The task relation spatial co-attention learns the spatial mapping among task-specific features through non-local and co-excitation operations, which tightly couples the embedded spatial information of the segmentation and quantification tasks. The Boundary-aware Structure Consistency (BSC) and Joint Indices Constraint (JIC) are integrated into the multitask learning optimization objective to guide the learning of the segmentation and quantification paths. The BSC promotes structural similarity of predictions, and the JIC explores the internal relationship between three quantitative indices. We validate the efficacy of our TRSA-Net on the public CAMUS dataset. Extensive comparison and ablation experiments show that our approach achieves competitive segmentation performance and highly accurate quantification results.
18
Cui X, Zhang P, Li Y, Liu Z, Xiao X, Zhang Y, Sun L, Cui L, Yang G, Li S. MCAL: An Anatomical Knowledge Learning Model for Myocardial Segmentation in 2-D Echocardiography. IEEE TRANSACTIONS ON ULTRASONICS, FERROELECTRICS, AND FREQUENCY CONTROL 2022; 69:1277-1287. [PMID: 35167446 DOI: 10.1109/tuffc.2022.3151647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Segmentation of the left ventricular (LV) myocardium in 2-D echocardiography is essential for clinical decision making, especially in geometry measurement and index computation. However, segmenting the myocardium is a time-consuming process and challenging due to the fuzzy boundary caused by low image quality. Ground-truth labels are typically employed as pixel-level class associations or shape regularization in segmentation, which limits effective feature enhancement for 2-D echocardiography. We propose a training strategy named multiconstrained aggregate learning (referred to as MCAL), which leverages anatomical knowledge learned through ground-truth labels to infer segmented parts and discriminate boundary pixels. The new framework encourages the model to focus on features in accordance with the learned anatomical representations, and the training objectives incorporate a boundary distance transform weight (BDTW) to enforce a higher weight value on the boundary region, which helps to improve segmentation accuracy. The proposed method is built as an end-to-end framework with a top-down, bottom-up architecture with skip convolution fusion blocks and is carried out on two datasets (our dataset and the public CAMUS dataset). The comparison study shows that the proposed network outperforms the other segmentation baseline models, indicating that our method is beneficial for boundary pixel discrimination in segmentation.
19
X-CTRSNet: 3D cervical vertebra CT reconstruction and segmentation directly from 2D X-ray images. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107680] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
20
He J, Zhou G, Zhou S, Chen Y. Online Hard Patch Mining using Shape Models and Bandit Algorithm for Multi-organ Segmentation. IEEE J Biomed Health Inform 2021; 26:2648-2659. [PMID: 34928809 DOI: 10.1109/jbhi.2021.3136597] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
Hard sample selection can effectively improve model convergence by extracting the most representative samples from a training set. However, due to the large size of medical images, existing sampling strategies suffer from insufficient exploitation of hard samples or a high time cost for sample selection when adopted by 3D patch-based models in the field of multi-organ segmentation. In this paper, we present a novel and effective online hard patch mining (OHPM) algorithm. In our method, an average shape model that can be mapped to all training images is constructed to guide the exploration of hard patches and aggregate feedback from predicted patches. The process of hard mining is formalized as a multi-armed bandit problem and solved with bandit algorithms. With the shape model, OHPM requires negligible time consumption and can intuitively locate difficult anatomical areas during training. The employment of bandit algorithms ensures online and sufficient hard mining. We integrate OHPM with advanced segmentation networks and evaluate them on two datasets containing different anatomical structures. Comparative experiments with other sampling strategies demonstrate the superiority of OHPM in boosting segmentation performance and improving model convergence. The results on each dataset with each network suggest that OHPM significantly outperforms other sampling strategies by nearly 2% in average Dice score.
21
Automatic morphological classification of mitral valve diseases in echocardiographic images based on explainable deep learning methods. Int J Comput Assist Radiol Surg 2021; 17:413-425. [PMID: 34897594 DOI: 10.1007/s11548-021-02542-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2021] [Accepted: 11/30/2021] [Indexed: 10/19/2022]
Abstract
PURPOSE Carpentier's functional classification is a guide to explain the types of mitral valve regurgitation based on morphological features. There are four types of pathological morphologies, regardless of the presence or absence of mitral regurgitation: Type I, normal; Type II, mitral valve prolapse; Type IIIa, mitral valve stenosis; and Type IIIb, restricted mitral leaflet motion. The aim of this study was to automatically classify mitral valves using echocardiographic images. METHODS In our procedure, after the classification of apical 4-chamber (A4C) and parasternal long-axis (PLA) views, we extracted the systolic/diastolic phase of the cardiac cycle by calculating the left ventricular area. Six typical pre-trained models were fine-tuned with a 4-class model for the PLA and a 3-class model for the A4C views. As an additional contribution, to provide explainability, we applied the Gradient-weighted Class Activation Mapping (Grad-CAM) algorithm to visualize areas of echocardiographic images where the different models generated a prediction. RESULTS This approach conferred a proper understanding of where various networks "look" into echocardiographic images to predict the four types of pathological mitral valve morphologies. Considering the accuracy metric and Grad-CAM maps and by applying the Inception-ResNet-v2 architecture to classify Type II in the PLA view and ResNeXt50 architecture to classify the other three classes in the A4C view, we achieved an 80% rate of model accuracy in the test data set. CONCLUSIONS We suggest an explainable, fully automated, and rule-based procedure to classify the four types of mitral valve morphologies based on Carpentier's functional classification using deep learning on transthoracic echocardiographic images. Our study results infer the feasibility of the use of deep learning models to prepare quick and precise assessments of mitral valve morphologies in echocardiograms. 
To our knowledge, our study is the first to provide a public data set regarding the Carpentier classification of MV pathologies.
22
Wang L, Shen M, Chang Q, Shi C, Chen Y, Zhou Y, Zhang Y, Pu J, Chen H. Automated delineation of corneal layers on OCT images using a boundary-guided CNN. PATTERN RECOGNITION 2021; 120:108158. [PMID: 34421131 PMCID: PMC8372529 DOI: 10.1016/j.patcog.2021.108158] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/13/2023]
Abstract
Accurate segmentation of corneal layers depicted on optical coherence tomography (OCT) images is very helpful for quantitatively assessing and diagnosing corneal diseases (e.g., keratoconus and dry eye). In this study, we presented a novel boundary-guided convolutional neural network (CNN) architecture (BG-CNN) to simultaneously extract different corneal layers and delineate their boundaries. The developed BG-CNN architecture used three convolutional blocks to construct two network modules on the basis of the classical U-Net network. We trained and validated the network on a dataset consisting of 1,712 OCT images acquired from 121 subjects using 10-fold cross-validation. Our experiments showed an average Dice similarity coefficient (DSC) of 0.9691, an intersection over union (IOU) of 0.9411, and a Hausdorff distance (HD) of 7.4423 pixels. Compared with several other classical networks, namely U-Net, Attention U-Net, Asymmetric U-Net, BiO-Net, CE-Net, CPFNet, M-Net, and DeepLabv3, on the same dataset, the developed network demonstrated a promising performance, suggesting its unique strength in segmenting corneal layers depicted on OCT images.
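The DSC and IOU figures reported above are standard overlap metrics for binary segmentation masks. As a quick reference, here is a minimal illustrative helper (the function name and example masks are my own, not from the cited paper):

```python
import numpy as np

def dice_and_iou(pred, target):
    """Dice similarity coefficient and intersection-over-union
    for two binary segmentation masks of equal shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = 2.0 * inter / (pred.sum() + target.sum())
    iou = inter / union
    return dice, iou

pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
d, j = dice_and_iou(pred, target)
print(round(d, 3), round(j, 3))  # 0.667 0.5
```

Note the fixed relation Dice = 2·IoU / (1 + IoU), which is why papers often report both even though they carry the same ranking information.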
Affiliation(s)
- Lei Wang
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
- Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China
- Corresponding author. (L. Wang)
- Meixiao Shen
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
- Qian Chang
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
- Ce Shi
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
- Yang Chen
- Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China
- Yuheng Zhou
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
- Yanchun Zhang
- Department of Ophthalmology, Xi’an People’s Hospital (Xi’an Fourth Hospital), Xi’an, China
- Jiantao Pu
- Departments of Radiology and Bioengineering, University of Pittsburgh, Pittsburgh, PA 15213, United States
- Hao Chen
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
23
Convolutional squeeze-and-excitation network for ECG arrhythmia detection. Artif Intell Med 2021; 121:102181. [PMID: 34763803 DOI: 10.1016/j.artmed.2021.102181] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2020] [Revised: 09/22/2021] [Accepted: 09/22/2021] [Indexed: 11/21/2022]
Abstract
Automatic detection of arrhythmia through an electrocardiogram (ECG) is of great significance for the prevention and treatment of cardiovascular diseases. In a convolutional neural network, the ECG signal is converted into multiple feature channels with equal weights through the convolution operation. Multiple feature channels can provide richer and more comprehensive information, but they also contain redundant information that can affect the diagnosis of arrhythmia; feature channels that contain arrhythmia information should therefore be attended to and given larger weights. In this paper, we introduce the Squeeze-and-Excitation (SE) block for the first time for the automatic detection of multiple types of arrhythmias with ECG. Our algorithm combines a residual convolutional module and the SE block to extract features from the original ECG signal. The SE block adaptively enhances discriminative features and suppresses noise by explicitly modeling the interdependence between channels, allowing it to adaptively integrate information from different feature channels of the ECG. A one-dimensional convolution operation over the time dimension is used to extract temporal information, and the shortcut connection of the SE-Residual convolutional module makes the network easier to optimize. Thanks to the powerful feature extraction capability of the network, which can effectively extract discriminative arrhythmia features from multiple feature channels, no extra data preprocessing, including the denoising required by other methods, is needed for our framework. This improves working efficiency and preserves the collected biological information without loss. Experiments were conducted with the 12-lead ECG dataset of the China Physiological Signal Challenge (CPSC) 2018 and the dataset of the PhysioNet/Computing in Cardiology (CinC) Challenge 2017. The results show that our model achieves strong performance and has great potential in clinical practice.
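The channel-reweighting idea behind the SE block can be summarized in a few lines: squeeze by global average pooling over time, excite through two fully connected layers (ReLU then sigmoid), and rescale each channel. The NumPy sketch below is a minimal illustration with randomly initialized weights; the function name, reduction ratio r = 4, and tensor shapes are my assumptions, not taken from the cited paper.

```python
import numpy as np

def se_block_1d(x, w1, b1, w2, b2):
    """Minimal squeeze-and-excitation for a 1-D signal.

    x: (channels, timesteps) feature map.
    Squeeze: global average pooling over time -> (C,).
    Excitation: FC + ReLU, then FC + sigmoid, producing
    per-channel weights in (0, 1) that rescale x.
    """
    z = x.mean(axis=1)                        # squeeze: (C,)
    h = np.maximum(0.0, w1 @ z + b1)          # bottleneck: (C // r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))  # channel weights: (C,)
    return x * s[:, None]                     # reweight each channel

rng = np.random.default_rng(0)
C, T, r = 8, 100, 4
x = rng.normal(size=(C, T))
w1, b1 = rng.normal(size=(C // r, C)), np.zeros(C // r)
w2, b2 = rng.normal(size=(C, C // r)), np.zeros(C)
y = se_block_1d(x, w1, b1, w2, b2)
print(y.shape)  # (8, 100): same shape, channels rescaled
```

Because the sigmoid weights lie in (0, 1), the block can only attenuate channels relative to the input, which is how redundant feature channels are suppressed.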
24
Lyu T, Yang G, Zhao X, Shu H, Luo L, Chen D, Xiong J, Yang J, Li S, Coatrieux JL, Chen Y. Dissected aorta segmentation using convolutional neural networks. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2021; 211:106417. [PMID: 34587564 DOI: 10.1016/j.cmpb.2021.106417] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/18/2021] [Accepted: 09/12/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND AND OBJECTIVE Aortic dissection is a severe cardiovascular pathology in which an injury of the intimal layer of the aorta allows blood to flow into the aortic wall, forcing the wall layers apart. This condition has a high mortality rate and requires an in-depth understanding of the 3-D morphology of the dissected aorta to plan the right treatment. An accurate automatic segmentation algorithm is therefore needed. METHOD In this paper, we propose a deep-learning-based algorithm to segment the dissected aorta on computed tomography angiography (CTA) images. The algorithm consists of two steps. First, a 3-D convolutional neural network (CNN) is applied to divide the 3-D volume into two anatomical portions. Second, two 2-D CNNs based on the pyramid scene parsing network (PSPNet) segment each portion separately. An edge extraction branch was added to the 2-D model to achieve higher segmentation accuracy in the intimal flap area. RESULTS The experiments conducted and the comparisons made show that the proposed solution performs well, with an average Dice index over 92%. The combination of 3-D and 2-D models improves aorta segmentation accuracy compared to 3-D-only models and segmentation robustness compared to 2-D-only models. The edge extraction branch improves the Dice index near aorta boundaries from 73.41% to 81.39%. CONCLUSIONS The proposed algorithm has satisfactory performance for capturing the aorta structure while avoiding false positives on the intimal flaps.
Affiliation(s)
- Tianling Lyu
- Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China
- Guanyu Yang
- Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China
- Xingran Zhao
- Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China
- Huazhong Shu
- Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China
- Limin Luo
- Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China
- Duanduan Chen
- Department of Biomedical Engineering, Beijing Institute of Technology, Beijing, China
- Jian Yang
- School of Optoelectronics, Beijing Institute of Technology, Beijing, China
- Shuo Li
- Digital Imaging Group of London, London, Canada
- Yang Chen
- Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China; School of Cyber Science and Engineering, Southeast University, Nanjing, China; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China.
25
Wu W, Hu D, Niu C, Broeke LV, Butler APH, Cao P, Atlas J, Chernoglazov A, Vardhanabhuti V, Wang G. Deep learning based spectral CT imaging. Neural Netw 2021; 144:342-358. [PMID: 34560584 DOI: 10.1016/j.neunet.2021.08.026] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/27/2020] [Revised: 07/14/2021] [Accepted: 08/20/2021] [Indexed: 10/20/2022]
Abstract
Spectral computed tomography (CT) has attracted much attention for radiation dose reduction, metal artifact removal, tissue quantification, and material discrimination. The x-ray energy spectrum is divided into several bins, and each energy-bin-specific projection has a lower signal-to-noise ratio (SNR) than its current-integrating counterpart, which makes image reconstruction a unique challenge. Traditional wisdom is to use prior-knowledge-based iterative methods. However, this kind of method demands a great computational cost. Inspired by deep learning, here we first develop a deep-learning-based reconstruction method, i.e., U-net with L_p^p-norm, Total variation, Residual learning, and Anisotropic adaption (ULTRA). Specifically, we emphasize multi-scale feature fusion and multichannel filtering enhancement with a densely connected encoding architecture for residual learning and feature fusion. To address the image deblurring problem associated with the L_2^2 loss, we propose a general L_p^p loss, p > 0. Furthermore, since the images from different energy bins share similar structures of the same object, a regularization characterizing the correlations between different energy bins is incorporated into the L_p^p loss function, which helps unify deep-learning-based methods with traditional compressed-sensing-based methods. Finally, an anisotropically weighted total variation is employed to characterize the sparsity in the spatial-spectral domain to regularize the proposed network. In particular, we validate our ULTRA networks on three large-scale spectral CT datasets and obtain excellent results relative to the competing algorithms. In conclusion, our quantitative and qualitative results in numerical simulations and preclinical experiments demonstrate that our proposed approach is accurate, efficient, and robust for high-quality spectral CT image reconstruction.
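The L_p^p loss mentioned in the abstract is a straightforward generalization of the squared-error loss. A minimal sketch for a prediction and a reference image is shown below; the function name, default p, and example arrays are my assumptions, and the exact formulation in the paper (e.g., its regularization terms) may differ.

```python
import numpy as np

def lpp_loss(pred, ref, p=1.5):
    """Generalized L_p^p reconstruction loss: mean(|pred - ref|^p).

    p = 2 recovers the usual squared-error (L2^2) loss; 0 < p < 2
    penalizes large residuals less aggressively, which is the stated
    motivation for replacing L2^2 to mitigate image deblurring.
    """
    return np.mean(np.abs(pred - ref) ** p)

a = np.array([0.0, 1.0, 2.0])
b = np.array([0.0, 2.0, 0.0])
print(lpp_loss(a, b, p=2.0))  # mean of squared residuals (0, 1, 4) = 5/3
```

With p = 1 the same function reduces to the mean absolute error, so a single loss family spans the usual L1/L2 trade-off.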
Affiliation(s)
- Weiwen Wu
- Department of Diagnostic Radiology, Queen Mary Hospital, University of Hong Kong, Hong Kong, People's Republic of China; Biomedical Imaging Center, Center for Biotechnology and Interdisciplinary Studies, Department of Biomedical Engineering, School of Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
- Dianlin Hu
- The Laboratory of Image Science and Technology, Southeast University, Nanjing, People's Republic of China
- Chuang Niu
- Biomedical Imaging Center, Center for Biotechnology and Interdisciplinary Studies, Department of Biomedical Engineering, School of Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
- Lieza Vanden Broeke
- Department of Diagnostic Radiology, Queen Mary Hospital, University of Hong Kong, Hong Kong, People's Republic of China
- Peng Cao
- Department of Diagnostic Radiology, Queen Mary Hospital, University of Hong Kong, Hong Kong, People's Republic of China
- James Atlas
- Department of Radiology, University of Otago, Christchurch, New Zealand
- Varut Vardhanabhuti
- Department of Diagnostic Radiology, Queen Mary Hospital, University of Hong Kong, Hong Kong, People's Republic of China.
- Ge Wang
- Biomedical Imaging Center, Center for Biotechnology and Interdisciplinary Studies, Department of Biomedical Engineering, School of Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
26
Du X, Xu X, Liu H, Li S. TSU-net: Two-stage multi-scale cascade and multi-field fusion U-net for right ventricular segmentation. Comput Med Imaging Graph 2021; 93:101971. [PMID: 34482121 DOI: 10.1016/j.compmedimag.2021.101971] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2021] [Revised: 07/12/2021] [Accepted: 08/06/2021] [Indexed: 01/21/2023]
Abstract
Accurate segmentation of the right ventricle from cardiac magnetic resonance images (MRI) is a critical step in cardiac function analysis and disease diagnosis. It remains an open problem due to difficulties such as a large variety of object sizes and ill-defined borders. In this paper, we present a TSU-net network that extracts deeper features and captures targets of different sizes in the right ventricle with multi-scale cascade and multi-field fusion. TSU-net mainly contains two major components: the Dilated-Convolution Block (DB) and the Multi-Layer-Pool Block (MB). DB extracts and aggregates multi-scale features of the right ventricle. MB mainly relies on multiple effective fields of view to detect objects of different sizes and fill in boundary features. Different from previous networks, we use DB and MB to replace the convolution layers in the encoding layers; thus, we can gather multi-scale information about the right ventricle, detect targets of different sizes, and fill in boundary information in each encoding layer. In addition, in the decoding layers, we use DB to replace the convolution layer, so that we can aggregate the multi-scale features of the right ventricle in each decoding layer. Furthermore, the two-stage U-net structure is used to further improve the utilization of DB and MB through a two-layer encoding/decoding structure. Our method is validated on RVSC, a public right ventricle dataset. The results demonstrate that TSU-net achieved an average Dice coefficient of 0.86 on the endocardium and 0.90 on the epicardium, thereby outperforming other models. It effectively assists doctors in diagnosing disease and promotes the development of medical image analysis. In addition, we also provide an intuitive explanation of our network, which fully explains MB's and TSU-net's ability to detect targets of different sizes and fill in boundary features.
Affiliation(s)
- Xiuquan Du
- Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui University, Hefei, Anhui, China; School of Computer Science and Technology, Anhui University, Hefei, Anhui, China.
- Xiaofei Xu
- School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
- Heng Liu
- Department of Gastroenterology, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
- Shuo Li
- Department of Medical Imaging, Western University, London, ON, Canada
27
Lin L, Tao X, Yang W, Pang S, Su Z, Lu H, Li S, Feng Q, Chen B. Quantifying Axial Spine Images Using Object-Specific Bi-Path Network. IEEE J Biomed Health Inform 2021; 25:2978-2987. [PMID: 33788697 DOI: 10.1109/jbhi.2021.3070235] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Abstract
Automatic estimation of indices from medical images is the main goal of computer-aided quantification (CADq), which speeds up diagnosis and lightens the workload of radiologists. Deep learning techniques are a good choice for implementing CADq. Usually, to acquire high-accuracy quantification, a specific network architecture needs to be designed for a given CADq task. In this study, considering that the target organs are the intervertebral disc and the dural sac, we propose an object-specific bi-path network (OSBP-Net) for axial spine image quantification. Each path of the OSBP-Net comprises a shallow feature extraction layer (SFE) and a deep feature extraction sub-network (DFE). The SFEs use different convolution strides because the two target organs have different anatomical sizes. The DFEs use average pooling for downsampling based on the observation that the target organs have lower intensity than the background. In addition, an inter-path dissimilarity constraint is proposed and applied to the output of the SFEs, taking into account that the activated regions in the feature maps of the two paths should, in theory, be different. An inter-index correlation regularization is introduced and applied to the output of the DFEs based on the observation that the diameter and area of the same object have an approximately linear relation. The prediction results of OSBP-Net are compared to several state-of-the-art machine-learning-based CADq methods. The comparison reveals that the proposed method substantially outperforms the other competing methods, indicating its great potential for spine CADq.
|
28
|
Dezaki FT, Luong C, Ginsberg T, Rohling R, Gin K, Abolmaesumi P, Tsang T. Echo-SyncNet: Self-Supervised Cardiac View Synchronization in Echocardiography. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:2092-2104. [PMID: 33835916 DOI: 10.1109/tmi.2021.3071951] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
In echocardiography (echo), an electrocardiogram (ECG) is conventionally used to temporally align different cardiac views for assessing critical measurements. However, in emergencies or point-of-care situations, acquiring an ECG is often not an option, motivating the need for alternative temporal synchronization methods. Here, we propose Echo-SyncNet, a self-supervised learning framework to synchronize various cross-sectional 2D echo series without any human supervision or external inputs. The proposed framework takes advantage of two types of supervisory signals derived from the input data: spatiotemporal patterns found between the frames of a single cine (intra-view self-supervision) and interdependencies between multiple cines (inter-view self-supervision). The combined supervisory signals are used to learn a feature-rich and low-dimensional embedding space where multiple echo cines can be temporally synchronized. Two intra-view self-supervisions are used: the first is based on the information encoded by the temporal ordering of a cine (temporal intra-view), and the second on the spatial similarities between nearby frames (spatial intra-view). The inter-view self-supervision is used to promote the learning of similar embeddings for frames captured from the same cardiac phase in different echo views. We evaluate the framework with multiple experiments: 1) Using data from 998 patients, Echo-SyncNet shows promising results for synchronizing Apical 2-chamber and Apical 4-chamber cardiac views, which are acquired spatially perpendicular to each other; 2) Using data from 3070 patients, our experiments reveal that the learned representations of Echo-SyncNet outperform a supervised deep learning method that is optimized for automatic detection of fine-grained cardiac cycle phase; 3) We go one step further and show the usefulness of the learned representations in a one-shot learning scenario of cardiac key-frame detection.
Without any fine-tuning, key frames in 1188 validation patient studies are identified by synchronizing them with only one labeled reference cine. We do not make any prior assumption about which specific cardiac views are used for training, and we show that Echo-SyncNet can accurately generalize to views not present in its training set. Project repository: github.com/fatemehtd/Echo-SyncNet.
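The synchronization step itself is simple once frames live in a shared embedding space: each frame of one cine is matched to its nearest neighbour in the other. A toy numpy sketch with hypothetical 1-D phase embeddings (the real Echo-SyncNet embeddings are learned and higher-dimensional):

```python
import numpy as np

def synchronize(emb_a, emb_b):
    """Match each frame of cine A to its nearest neighbour in cine B
    within a shared embedding space (a simplified stand-in for the
    learned embedding)."""
    # pairwise squared Euclidean distances, shape (Ta, Tb)
    d = ((emb_a[:, None, :] - emb_b[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

# Toy "cardiac phase" embeddings: cine B runs at half the frame rate.
phase_a = np.linspace(0.0, 1.0, 8)[:, None]
phase_b = np.linspace(0.0, 1.0, 4)[:, None]
print(synchronize(phase_a, phase_b))  # → [0 0 1 1 2 2 3 3]
```

Each pair of matched indices marks frames captured at (approximately) the same cardiac phase.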
|
29
|
Zhu H, Jiang L, Zhang H, Luo L, Chen Y, Chen Y. An automatic machine learning approach for ischemic stroke onset time identification based on DWI and FLAIR imaging. NEUROIMAGE-CLINICAL 2021; 31:102744. [PMID: 34245995 PMCID: PMC8271155 DOI: 10.1016/j.nicl.2021.102744] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/03/2021] [Revised: 06/22/2021] [Accepted: 06/23/2021] [Indexed: 11/11/2022]
Abstract
We used only two MR imaging modalities (DWI and FLAIR) for fast identification of time since stroke. We constructed a cross-modal convolutional network for lesion ROI segmentation in FLAIR. The network uses ROI features in DWI as prior information for better FLAIR segmentation. Five independent machine learning classifiers were trained, and their votes were combined to obtain the final classification label. Voting over the five classifiers effectively improves classification accuracy.
Current thrombolysis for acute ischemic stroke (AIS) strictly requires a time since stroke (TSS) of less than 4.5 h. However, some patients are excluded from thrombolytic treatment because their TSS is unknown. The diffusion-weighted imaging (DWI) and fluid-attenuated inversion recovery (FLAIR) mismatch can identify TSS simply, since lesion intensities differ at different onset times. In this paper, we propose an automatic machine learning method to classify the TSS as less than or more than 4.5 h. First, we develop a cross-modal convolutional neural network to accurately segment the stroke lesions from DWI and FLAIR images. Second, features are extracted from DWI and FLAIR according to the segmented regions of interest (ROI). Finally, the features are fed to machine learning models to identify TSS. In DWI and FLAIR ROI segmentation, the networks obtain high Dice coefficients of 0.803 and 0.647. The classification test results show that our model achieves an accuracy of 0.805, with a sensitivity of 0.769 and a specificity of 0.840. Our approach outperforms the human-read DWI-FLAIR mismatch model, illustrating its potential for automatic and fast TSS identification.
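The voting of the five classifiers can be sketched as simple hard majority voting over binary labels (the paper may weight or combine votes differently; `majority_vote` is an illustrative name):

```python
def majority_vote(predictions):
    """Hard-voting ensemble: the final binary TSS label (e.g. 1 for
    "TSS < 4.5 h") is the class predicted by more than half of the
    independent classifiers."""
    return int(sum(predictions) > len(predictions) / 2)

# Three of five classifiers vote label 1, so the ensemble outputs 1:
assert majority_vote([1, 0, 1, 1, 0]) == 1
assert majority_vote([0, 0, 1, 0, 1]) == 0
```

Majority voting can only flip the final label when at least three of the five classifiers disagree with the rest, which is why it tends to suppress individual classifiers' errors.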
Affiliation(s)
- Haichen Zhu
- Lab of Image Science and Technology, Key Laboratory of Computer Network and Information Integration (Ministry of Education), School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
| | - Liang Jiang
- Department of Radiology, Nanjing First Hospital, Nanjing Medical University, Nanjing 210006, China
| | - Hong Zhang
- Department of Radiology, Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing 210000, China
| | - Limin Luo
- Lab of Image Science and Technology, Key Laboratory of Computer Network and Information Integration (Ministry of Education), School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
| | - Yang Chen
- Lab of Image Science and Technology, Key Laboratory of Computer Network and Information Integration (Ministry of Education), School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; School of Cyber Science and Engineering, Southeast University, Nanjing 210096, China.
| | - Yuchen Chen
- Department of Radiology, Nanjing First Hospital, Nanjing Medical University, Nanjing 210006, China.
|
30
|
Zhao J, Li D, Xiao X, Accorsi F, Marshall H, Cossetto T, Kim D, McCarthy D, Dawson C, Knezevic S, Chen B, Li S. United adversarial learning for liver tumor segmentation and detection of multi-modality non-contrast MRI. Med Image Anal 2021; 73:102154. [PMID: 34280670 DOI: 10.1016/j.media.2021.102154] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2020] [Revised: 04/13/2021] [Accepted: 06/08/2021] [Indexed: 02/05/2023]
Abstract
Simultaneous segmentation and detection of liver tumors (hemangioma and hepatocellular carcinoma (HCC)) using multi-modality non-contrast magnetic resonance imaging (NCMRI) is crucial for clinical diagnosis. However, it remains a challenging task because: (1) insufficient HCC information on NCMRI makes extraction of liver tumor features difficult; (2) the diverse imaging characteristics of multi-modality NCMRI make feature fusion and selection difficult; and (3) the lack of information distinguishing hemangioma from HCC on NCMRI makes liver tumor detection difficult. In this study, we propose a united adversarial learning framework (UAL) for simultaneous liver tumor segmentation and detection using multi-modality NCMRI. The UAL first utilizes a multi-view aware encoder to extract multi-modality NCMRI information for liver tumor segmentation and detection. In this encoder, a novel edge dissimilarity feature pyramid module is designed to facilitate complementary multi-modality feature extraction. Second, a newly designed fusion and selection channel is used to fuse the multi-modality features and to perform feature selection. Then, the proposed mechanism of coordinate sharing with padding integrates the segmentation and detection tasks so that both can perform united adversarial learning in one discriminator. Lastly, an innovative multi-phase radiomics-guided discriminator exploits clear and specific tumor information to improve multi-task performance via the adversarial learning strategy. The UAL is validated on corresponding multi-modality NCMRI (i.e., T1FS pre-contrast MRI, T2FS MRI, and DWI) and three-phase contrast-enhanced MRI of 255 clinical subjects.
The experiments show that UAL achieves high performance, with a Dice similarity coefficient of 83.63%, a pixel accuracy of 97.75%, an intersection-over-union of 81.30%, a sensitivity of 92.13%, a specificity of 93.75%, and a detection accuracy of 92.94%, demonstrating that UAL has great potential in the clinical diagnosis of liver tumors.
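The overlap metrics quoted here (Dice, pixel accuracy, intersection-over-union, sensitivity, specificity) all derive from the same four confusion-matrix counts over pixels; a small numpy sketch makes the definitions concrete:

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Standard overlap metrics between a binary prediction mask and
    a binary ground-truth mask."""
    tp = np.sum(pred & truth)     # predicted tumor, truly tumor
    fp = np.sum(pred & ~truth)    # predicted tumor, truly background
    fn = np.sum(~pred & truth)    # missed tumor pixels
    tn = np.sum(~pred & ~truth)   # correctly ignored background
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "pixel_accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
truth = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)
m = segmentation_metrics(pred, truth)
print(round(m["dice"], 3))  # → 0.667
```

Note that Dice and IoU measure the same overlap on different scales: IoU = Dice / (2 - Dice), so a Dice of 83.63% is consistent with the reported IoU of 81.30% only because the paper averages them over cases separately.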
Affiliation(s)
- Jianfeng Zhao
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan, 250358, China; Digital Imaging Group of London, London, ON, Canada
| | - Dengwang Li
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan, 250358, China.
| | - Xiaojiao Xiao
- School of Information and Computer, Taiyuan University of Technology, Shanxi, 030000, China; Digital Imaging Group of London, London, ON, Canada
| | - Fabio Accorsi
- Department of Medical Imaging, Western University, London, ON, Canada; Digital Imaging Group of London, London, ON, Canada
| | - Harry Marshall
- Department of Medical Imaging, Western University, London, ON, Canada; Digital Imaging Group of London, London, ON, Canada
| | - Tyler Cossetto
- Department of Medical Imaging, Western University, London, ON, Canada; Digital Imaging Group of London, London, ON, Canada
| | - Dongkeun Kim
- Department of Medical Imaging, Western University, London, ON, Canada; Digital Imaging Group of London, London, ON, Canada
| | - Daniel McCarthy
- Department of Medical Imaging, Western University, London, ON, Canada; Digital Imaging Group of London, London, ON, Canada
| | - Cameron Dawson
- Department of Medical Imaging, Western University, London, ON, Canada; Digital Imaging Group of London, London, ON, Canada
| | - Stefan Knezevic
- Department of Medical Imaging, Western University, London, ON, Canada; Digital Imaging Group of London, London, ON, Canada
| | - Bo Chen
- Digital Imaging Group of London, London, ON, Canada
| | - Shuo Li
- Department of Medical Imaging, Western University, London, ON, Canada; Digital Imaging Group of London, London, ON, Canada.
|
31
|
Ulloa Cerna AE, Jing L, Good CW, vanMaanen DP, Raghunath S, Suever JD, Nevius CD, Wehner GJ, Hartzel DN, Leader JB, Alsaid A, Patel AA, Kirchner HL, Pfeifer JM, Carry BJ, Pattichis MS, Haggerty CM, Fornwalt BK. Deep-learning-assisted analysis of echocardiographic videos improves predictions of all-cause mortality. Nat Biomed Eng 2021; 5:546-554. [PMID: 33558735 DOI: 10.1038/s41551-020-00667-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2020] [Accepted: 11/24/2020] [Indexed: 01/30/2023]
Abstract
Machine learning promises to assist physicians with predictions of mortality and of other future clinical events by learning complex patterns from historical data, such as longitudinal electronic health records. Here we show that a convolutional neural network trained on raw pixel data in 812,278 echocardiographic videos from 34,362 individuals provides superior predictions of one-year all-cause mortality. The model's predictions outperformed the widely used pooled cohort equations, the Seattle Heart Failure score (measured in an independent dataset of 2,404 patients with heart failure who underwent 3,384 echocardiograms), and a machine learning model involving 58 human-derived variables from echocardiograms and 100 clinical variables derived from electronic health records. We also show that cardiologists assisted by the model substantially improved the sensitivity of their predictions of one-year all-cause mortality by 13% while maintaining prediction specificity. Large unstructured datasets may enable deep learning to improve a wide range of clinical prediction models.
Affiliation(s)
- Alvaro E Ulloa Cerna
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA; Electrical and Computer Engineering Department, University of New Mexico, Albuquerque, NM, USA
| | - Linyuan Jing
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
| | | | - David P vanMaanen
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
| | - Sushravya Raghunath
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
| | - Jonathan D Suever
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
| | - Christopher D Nevius
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
| | - Gregory J Wehner
- Department of Biomedical Engineering, University of Kentucky, Lexington, KY, USA
| | - Dustin N Hartzel
- Phenomic Analytics and Clinical Data Core, Geisinger, Danville, PA, USA
| | - Joseph B Leader
- Phenomic Analytics and Clinical Data Core, Geisinger, Danville, PA, USA
| | - Amro Alsaid
- Heart Institute, Geisinger, Danville, PA, USA
| | | | - H Lester Kirchner
- Department of Population Health Sciences, Geisinger, Danville, PA, USA
| | - John M Pfeifer
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA; Heart and Vascular Center, Evangelical Hospital, Lewisburg, PA, USA
| | | | - Marios S Pattichis
- Electrical and Computer Engineering Department, University of New Mexico, Albuquerque, NM, USA
| | - Christopher M Haggerty
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA; Heart Institute, Geisinger, Danville, PA, USA
| | - Brandon K Fornwalt
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA; Heart Institute, Geisinger, Danville, PA, USA; Department of Radiology, Geisinger, Danville, PA, USA.
|
32
|
Su R, Zhang D, Liu J, Cheng C. MSU-Net: Multi-Scale U-Net for 2D Medical Image Segmentation. Front Genet 2021; 12:639930. [PMID: 33679900 PMCID: PMC7928319 DOI: 10.3389/fgene.2021.639930] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2020] [Accepted: 01/20/2021] [Indexed: 11/15/2022] Open
Abstract
To address the limitations of U-Net's fixed-receptive-field convolution kernels and the lack of prior knowledge of the optimal network width, we propose multi-scale U-Net (MSU-Net) for medical image segmentation. First, multiple convolution sequences are used to extract more semantic features from the images. Second, convolution kernels with different receptive fields are used to make the features more diverse. Efficiently integrating convolution kernels with different receptive fields alleviates the problem of the unknown optimal network width. In addition, the multi-scale block is extended to other variants of the original U-Net to verify its universality. Five different medical image segmentation datasets are used to evaluate MSU-Net, covering a variety of imaging modalities such as electron microscopy, dermoscopy, and ultrasound. The Intersection over Union (IoU) scores of MSU-Net on these datasets are 0.771, 0.867, 0.708, 0.900, and 0.702, respectively. Experimental results show that MSU-Net achieves the best performance on the different datasets. Our implementation is available at https://github.com/CN-zdy/MSU_Net.
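The multi-scale idea, applying parallel kernels with different receptive fields to the same input and combining their responses, can be sketched in plain numpy (an illustrative block using averaging kernels, not the actual learned MSU-Net layers):

```python
import numpy as np

def conv2d_same(img, kernel):
    """Naive 'same'-padded 2D convolution for a single-channel image."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def multi_scale_block(img, kernel_sizes=(3, 5, 7)):
    """Hypothetical multi-scale block: convolve the same input with
    kernels of different receptive fields and stack the responses, so
    later layers can pick the scale that suits each structure."""
    maps = [conv2d_same(img, np.ones((k, k)) / k**2) for k in kernel_sizes]
    return np.stack(maps, axis=0)  # shape (scales, H, W)

img = np.random.default_rng(1).random((16, 16))
feats = multi_scale_block(img)
assert feats.shape == (3, 16, 16)
```

Because every branch preserves spatial size, the stacked responses can be concatenated along the channel axis exactly where a single fixed-size convolution would otherwise sit.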
Affiliation(s)
- Run Su
- Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, China
- Science Island Branch of Graduate School, University of Science and Technology of China, Hefei, China
| | - Deyun Zhang
- School of Engineering, Anhui Agricultural University, Hefei, China
| | - Jinhuai Liu
- Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, China
- Science Island Branch of Graduate School, University of Science and Technology of China, Hefei, China
| | - Chuandong Cheng
- Department of Neurosurgery, The First Affiliated Hospital of University of Science and Technology of China (USTC), Hefei, China
- Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Anhui Province Key Laboratory of Brain Function and Brain Disease, Hefei, China
|
33
|
Xu C, Zhang D, Chong J, Chen B, Li S. Synthesis of gadolinium-enhanced liver tumors on nonenhanced liver MR images using pixel-level graph reinforcement learning. Med Image Anal 2021; 69:101976. [PMID: 33535110 DOI: 10.1016/j.media.2021.101976] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2020] [Revised: 01/11/2021] [Accepted: 01/18/2021] [Indexed: 01/24/2023]
Abstract
If successful, synthesis of gadolinium (Gd)-enhanced liver tumors on nonenhanced liver MR images would be critical for liver tumor diagnosis and treatment. Such synthesis would offer a safe, efficient, and low-cost clinical alternative that eliminates the use of contrast agents in the current clinical workflow, significantly benefiting global healthcare systems. In this study, we propose a novel pixel-level graph reinforcement learning method (Pix-GRL). This method directly takes regular nonenhanced liver images as input and outputs AI-enhanced liver tumor images, making them comparable to traditional Gd-enhanced liver tumor images. In Pix-GRL, each pixel has a pixel-level agent; the agent explores the pixel's features and outputs a pixel-level action to iteratively change the pixel value, ultimately generating AI-enhanced liver tumor images. Most importantly, Pix-GRL creatively embeds a graph convolution to represent all the pixel-level agents. The graph convolution is deployed in the agents' feature exploration to improve effectiveness through the aggregation of long-range contextual features, and in their action output to improve efficiency through shared parameter training between agents. Moreover, a novel reward measures each pixel-level action, significantly improving performance by considering how each pixel's action improves its own future state as well as those of neighboring pixels. Pix-GRL upgrades existing medical DRL methods from a single agent to multiple pixel-level agents, becoming the first DRL method for medical image synthesis. Comprehensive experiments on three types of liver tumor datasets (benign, cancerous, and healthy controls) with 325 patients (24,375 images) show that our novel Pix-GRL method outperforms existing medical image synthesis methods.
It achieved an SSIM of 0.85 ± 0.06 and a Pearson correlation coefficient of 0.92 with respect to tumor size. These results demonstrate the potential to develop a successful clinical alternative to Gd-enhanced liver MR imaging.
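The neighbour-aware reward can be illustrated with a toy numpy sketch, assuming (hypothetically) that each pixel-agent's reward mixes its own error reduction with that of its 4-neighbourhood; the weights and border handling are illustrative, not the paper's:

```python
import numpy as np

def pixel_rewards(err_before, err_after, w_self=1.0, w_nbr=0.5):
    """Illustrative neighbour-aware reward: each pixel-level agent is
    rewarded for reducing its own absolute error and, with a smaller
    weight, for error reduction in its 4-neighbourhood (wrap-around
    borders for brevity)."""
    gain = err_before - err_after  # positive where the action helped
    nbr = sum(np.roll(gain, s, axis=ax) for ax in (0, 1) for s in (1, -1))
    return w_self * gain + w_nbr * nbr / 4.0

# A uniform error reduction of 1 yields a reward of 1.0 + 0.5 = 1.5 per pixel.
r = pixel_rewards(np.ones((4, 4)), np.zeros((4, 4)))
```

Coupling each agent's reward to its neighbours discourages actions that improve one pixel at the cost of artifacts around it.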
Affiliation(s)
- Chenchu Xu
- School of Computer Science and Technology, Anhui University, Hefei, China; Department of Medical Imaging, Western University, London ON, Canada
| | - Dong Zhang
- Department of Medical Imaging, Western University, London ON, Canada
| | - Jaron Chong
- Department of Medical Imaging, Western University, London ON, Canada
| | - Bo Chen
- School of Health Science, Western University, London ON, Canada
| | - Shuo Li
- Department of Medical Imaging, Western University, London ON, Canada.
|
34
|
Zhang C, Shu H, Yang G, Li F, Wen Y, Zhang Q, Dillenseger JL, Coatrieux JL. HIFUNet: Multi-Class Segmentation of Uterine Regions From MR Images Using Global Convolutional Networks for HIFU Surgery Planning. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:3309-3320. [PMID: 32356741 DOI: 10.1109/tmi.2020.2991266] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Accurate segmentation of the uterus, uterine fibroids, and spine from MR images is crucial for high-intensity focused ultrasound (HIFU) therapy but remains difficult to achieve because of 1) the large shape and size variations among individuals, 2) the low contrast between adjacent organs and tissues, and 3) the unknown number of uterine fibroids. To tackle this problem, we propose a large-kernel encoder-decoder network based on a 2D segmentation model. The large kernels capture multi-scale contexts by enlarging the valid receptive field. In addition, a deep multiple atrous convolution block is employed to further enlarge the receptive field and extract denser feature maps. Our approach is compared to both conventional and other deep learning methods, and experimental results on a large dataset show its effectiveness.
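The receptive-field arithmetic behind large kernels and atrous (dilated) convolutions is easy to make concrete: a k-tap kernel with dilation d spans k + (k-1)(d-1) pixels, and a stack of stride-1 layers adds each layer's span minus one. A short sketch of that arithmetic:

```python
def effective_kernel(k, dilation):
    """Effective spatial extent of a k-tap kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)

def stacked_receptive_field(layers):
    """Receptive field of a stack of (kernel, dilation) stride-1 conv
    layers: it grows by (k_eff - 1) per layer."""
    rf = 1
    for k, d in layers:
        rf += effective_kernel(k, d) - 1
    return rf

assert effective_kernel(3, 2) == 5
# Three stacked 3x3 atrous convs with dilations 1, 2, 4:
print(stacked_receptive_field([(3, 1), (3, 2), (3, 4)]))  # → 15
```

Dilation widens the receptive field without adding parameters, which is why atrous blocks are a cheap complement to large kernels.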
|
35
|
Wu Z, Ge R, Wen M, Liu G, Chen Y, Zhang P, He X, Hua J, Luo L, Li S. ELNet:Automatic classification and segmentation for esophageal lesions using convolutional neural network. Med Image Anal 2020; 67:101838. [PMID: 33129148 DOI: 10.1016/j.media.2020.101838] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2020] [Revised: 09/06/2020] [Accepted: 09/21/2020] [Indexed: 02/06/2023]
Abstract
Automatic and accurate esophageal lesion classification and segmentation is of great significance for clinically estimating the lesion status of esophageal diseases and devising suitable diagnostic schemes. Due to individual variations and the visual similarities of lesions in shape, color, and texture, current clinical methods remain time-consuming and subject to potentially high-risk errors. In this paper, we propose an Esophageal Lesion Network (ELNet) for automatic esophageal lesion classification and segmentation using deep convolutional neural networks (DCNNs). The method automatically integrates dual-view contextual lesion information to extract global and local features for esophageal lesion classification, and a lesion-specific segmentation network is proposed for automatic esophageal lesion annotation at the pixel level. On an established large-scale clinical database of 1051 white-light endoscopic images, ten-fold cross-validation is used for method validation. Experimental results show that the proposed framework achieves classification with a sensitivity of 0.9034, a specificity of 0.9718, and an accuracy of 0.9628, and segmentation with a sensitivity of 0.8018, a specificity of 0.9655, and an accuracy of 0.9462. All of these indicate that our method enables efficient, accurate, and reliable esophageal lesion diagnosis in clinics.
Affiliation(s)
- Zhan Wu
- School of Cyberspace Security, Southeast University, Nanjing, Jiangsu, China
| | - Rongjun Ge
- School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, China
| | - Minli Wen
- School of Cyberspace Security, Southeast University, Nanjing, Jiangsu, China
| | - Gaoshuang Liu
- Department of Geriatric Gastroenterology, the First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, China
| | - Yang Chen
- School of Cyberspace Security, Southeast University, Nanjing, Jiangsu, China; School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, China; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China; Centre de Recherche en Information Biomedicale Sino-Francais (LIA CRIBs), Rennes, France.
| | - Pinzheng Zhang
- School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, China
| | - Xiaopu He
- Department of Geriatric Gastroenterology, the First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, China.
| | - Jie Hua
- Department of Gastroenterology, the First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, China
| | - Limin Luo
- School of Cyberspace Security, Southeast University, Nanjing, Jiangsu, China; School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, China; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China; Centre de Recherche en Information Biomedicale Sino-Francais (LIA CRIBs), Rennes, France
| | - Shuo Li
- Department of Medical Imaging, Western University, London, Canada
|
36
|
Dynamically constructed network with error correction for accurate ventricle volume estimation. Med Image Anal 2020; 64:101723. [DOI: 10.1016/j.media.2020.101723] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2019] [Revised: 05/07/2020] [Accepted: 05/08/2020] [Indexed: 11/20/2022]
|