1
Zhang J, Ye L, Gong W, Chen M, Liu G, Cheng Y. A Novel Network for Low-Dose CT Denoising Based on Dual-Branch Structure and Multi-Scale Residual Attention. Journal of Imaging Informatics in Medicine 2024:10.1007/s10278-024-01254-z. [PMID: 39261373 DOI: 10.1007/s10278-024-01254-z] [Received: 06/09/2024] [Revised: 08/15/2024] [Accepted: 08/22/2024] [Indexed: 09/13/2024]
Abstract
Deep learning-based denoising of low-dose medical CT images has received great attention from both academic researchers and physicians in recent years, and has shown important application value in clinical practice. In this work, a novel two-branch, multi-scale residual attention-based network for low-dose CT image denoising is proposed. It adopts a two-branch framework that extracts and fuses image features at shallow and deep levels, respectively, to recover as much image texture and structure information as possible. We propose the adaptive dynamic convolution block (ADCB) in the local information extraction layer. It enables the network to better capture the local details and texture features of low-dose CT images, thereby improving the denoising effect and image quality. A multi-scale edge enhancement attention block (MEAB) is proposed in the global information extraction layer to perform feature fusion through dilated convolution and a multi-dimensional attention mechanism. A multi-scale residual convolution block (MRCB) is proposed to integrate feature information and improve the robustness and generalization of the network. To demonstrate the effectiveness of our method, extensive comparison experiments are conducted and performance is evaluated on two publicly available datasets. Our model achieves a PSNR of 29.3004, an SSIM of 0.8659, and an RMSE of 14.0284 on the AAPM-Mayo dataset. On the Qin_LUNG_CT dataset, it is evaluated by adding noise at four levels, σ = 15, 30, 45, and 60, and achieves the best results. Ablation studies show that the proposed ADCB, MEAB, and MRCB modules improve denoising performance significantly. The source code is available at https://github.com/Ye111-cmd/LDMANet.
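The evaluation protocol in this abstract (PSNR and RMSE against a reference, plus Gaussian noise added at several σ levels) can be illustrated with a minimal sketch. The function names, the flat synthetic image, and the 255 intensity range are assumptions made for illustration; they are not taken from the paper or its repository, and SSIM is omitted for brevity.

```python
import numpy as np

def rmse(reference, estimate):
    """Root-mean-square error between two images of the same shape."""
    diff = np.asarray(reference, dtype=np.float64) - np.asarray(estimate, dtype=np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB for a given intensity range."""
    err = rmse(reference, estimate)
    if err == 0.0:
        return float("inf")
    return float(20.0 * np.log10(data_range / err))

def add_gaussian_noise(image, sigma, seed=0):
    """Return a float copy of `image` with zero-mean Gaussian noise of std `sigma`."""
    rng = np.random.default_rng(seed)
    image = np.asarray(image, dtype=np.float64)
    return image + rng.normal(0.0, sigma, size=image.shape)

# Hypothetical flat test image; real experiments would load CT slices instead.
clean = np.full((64, 64), 128.0)
for sigma in (15, 30, 45, 60):
    noisy = add_gaussian_noise(clean, sigma)
    print(sigma, round(rmse(clean, noisy), 2), round(psnr(clean, noisy, 255.0), 2))
```

A denoiser under test would be applied to `noisy`, and the same two functions would then compare its output with `clean`.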
Affiliation(s)
- Ju Zhang
  - College of Information Science and Technology, Hangzhou Normal University, Hangzhou, 310030, China
- Lieli Ye
  - College of Information Science and Technology, Hangzhou Normal University, Hangzhou, 310030, China
- Weiwei Gong
  - College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China
- Mingyang Chen
  - College of Information Science and Technology, Hangzhou Normal University, Hangzhou, 310030, China
- Guangyu Liu
  - College of Information Science and Technology, Hangzhou Normal University, Hangzhou, 310030, China
- Yun Cheng
  - Department of Medical Imaging, Zhejiang Hospital, Hangzhou, 310058, China
2
Huang J, Zhong A, Wei Y. A new visual State Space Model for low-dose CT denoising. Med Phys 2024. [PMID: 39231014 DOI: 10.1002/mp.17387] [Received: 03/25/2024] [Revised: 08/08/2024] [Accepted: 08/19/2024] [Indexed: 09/06/2024]
Abstract
BACKGROUND Low-dose computed tomography (LDCT) can mitigate potential health risks to the public. However, the severe noise and artifacts in LDCT images can impede subsequent clinical diagnosis and analysis. Convolutional neural networks (CNNs) and Transformers stand out as the two most popular backbones in LDCT denoising. Nonetheless, CNNs suffer from a lack of long-range modeling capabilities, while Transformers are hindered by high computational complexity. PURPOSE In this study, our main goal is to develop a simple and efficient model that can both focus on local spatial context and model long-range dependencies with linear computational complexity for LDCT denoising. METHODS In this study, we make the first attempt to apply the State Space Model to LDCT denoising and propose a novel LDCT denoising model named Visual Mamba Encoder-Decoder Network (ViMEDnet). To efficiently and effectively capture both the local and global features, we propose the Mixed State Space Module (MSSM), where the depth-wise convolution, max-pooling, and 2D Selective Scan Module (2DSSM) are coupled together through a partial channel splitting mechanism. 2DSSM is capable of capturing global information with linear computational complexity, while convolution and max-pooling can effectively learn local signals to facilitate detail restoration. Furthermore, the network uses a weighted gradient-sensitive hybrid loss function to facilitate the preservation of image details, improving the overall denoising performance. RESULTS The performance of our proposed ViMEDnet is compared to five state-of-the-art LDCT denoising methods, including an iterative algorithm, two CNN-based methods, and two Transformer-based methods. The comparative experimental results demonstrate that the proposed ViMEDnet can achieve better visual quality and quantitative assessment outcomes. 
In visual evaluation, ViMEDnet effectively removes noise and artifacts, while exhibiting superior performance in restoring fine structures and low-contrast structural edges, resulting in minimal deviation of denoised images from NDCT. In quantitative assessment, ViMEDnet obtains the lowest RMSE and the highest PSNR, SSIM, and FSIM scores, further substantiating the superiority of ViMEDnet. CONCLUSIONS The proposed ViMEDnet possesses excellent LDCT denoising performance and provides a new alternative to LDCT denoising models beyond the existing CNN and Transformer options.
Affiliation(s)
- Jiexing Huang
  - Department of Radiation Oncology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Anni Zhong
  - Department of Digital Hospital Construction, the Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Yajing Wei
  - Department of Obstetrics and Gynecology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
3
Nerella S, Bandyopadhyay S, Zhang J, Contreras M, Siegel S, Bumin A, Silva B, Sena J, Shickel B, Bihorac A, Khezeli K, Rashidi P. Transformers and large language models in healthcare: A review. Artif Intell Med 2024; 154:102900. [PMID: 38878555 DOI: 10.1016/j.artmed.2024.102900] [Received: 09/01/2023] [Revised: 05/28/2024] [Accepted: 05/30/2024] [Indexed: 08/09/2024]
Abstract
With Artificial Intelligence (AI) increasingly permeating various aspects of society, including healthcare, the adoption of the Transformer neural network architecture is rapidly changing many applications. The Transformer is a type of deep learning architecture initially developed to solve general-purpose Natural Language Processing (NLP) tasks and has subsequently been adapted in many fields, including healthcare. In this survey paper, we provide an overview of how this architecture has been adopted to analyze various forms of healthcare data, including clinical NLP, medical imaging, structured Electronic Health Records (EHR), social media, bio-physiological signals, and biomolecular sequences. Furthermore, we also include articles that used the transformer architecture for generating surgical instructions and predicting adverse outcomes after surgeries under the umbrella of critical care. Under diverse settings, these models have been used for clinical diagnosis, report generation, data reconstruction, and drug/protein synthesis. Finally, we discuss the benefits and limitations of using transformers in healthcare and examine issues such as computational cost, model interpretability, fairness, alignment with human values, ethical implications, and environmental impact.
Affiliation(s)
- Subhash Nerella
  - Department of Biomedical Engineering, University of Florida, Gainesville, United States
- Jiaqing Zhang
  - Department of Electrical and Computer Engineering, University of Florida, Gainesville, United States
- Miguel Contreras
  - Department of Biomedical Engineering, University of Florida, Gainesville, United States
- Scott Siegel
  - Department of Biomedical Engineering, University of Florida, Gainesville, United States
- Aysegul Bumin
  - Department of Computer and Information Science and Engineering, University of Florida, Gainesville, United States
- Brandon Silva
  - Department of Computer and Information Science and Engineering, University of Florida, Gainesville, United States
- Jessica Sena
  - Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Benjamin Shickel
  - Department of Medicine, University of Florida, Gainesville, United States
- Azra Bihorac
  - Department of Medicine, University of Florida, Gainesville, United States
- Kia Khezeli
  - Department of Biomedical Engineering, University of Florida, Gainesville, United States
- Parisa Rashidi
  - Department of Biomedical Engineering, University of Florida, Gainesville, United States
4
Chi J, Wei X, Sun Z, Yang Y, Yang B. Low-Dose CT Image Super-resolution Network with Noise Inhibition Based on Feedback Feature Distillation Mechanism. Journal of Imaging Informatics in Medicine 2024; 37:1902-1921. [PMID: 38378965 PMCID: PMC11300784 DOI: 10.1007/s10278-024-00979-1] [Received: 10/09/2023] [Revised: 12/06/2023] [Accepted: 12/07/2023] [Indexed: 02/22/2024]
Abstract
Low-dose computed tomography (LDCT) has been widely used in medical diagnosis. In practice, doctors often zoom in on LDCT slices for clearer lesions and issues, but a simple zooming operation fails to suppress low-dose artifacts, leading to distorted details. Therefore, numerous LDCT super-resolution (SR) methods have been proposed to improve the quality of zooming without increasing the dose in CT scanning. However, existing methods still have some drawbacks that need to be addressed. First, the region of interest (ROI) is not emphasized due to the lack of guidance in the reconstruction process. Second, the convolutional blocks extracting fixed-resolution features fail to concentrate on the essential multi-scale features. Third, a single SR head cannot suppress the residual artifacts. To address these issues, we propose an LDCT joint SR and denoising reconstruction network. Our proposed network consists of global dual-guidance attention fusion modules (GDAFMs) and multi-scale anastomosis blocks (MABs). The GDAFM directs the network to focus on the ROI by fusing the extra mask guidance and average CT image guidance, while the MAB introduces hierarchical features through anastomosis connections to leverage multi-scale features and promote the feature representation ability. To suppress radial residual artifacts, we optimize our network using the feedback feature distillation mechanism (FFDM), which shares the backbone to learn features corresponding to the denoising task. We apply the proposed method to the 3D-IRCADB and PANCREAS datasets to evaluate its ability on LDCT image SR reconstruction. The experimental results compared with state-of-the-art methods illustrate the superiority of our approach with respect to peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and qualitative observations.
Our proposed LDCT joint SR and denoising reconstruction network has been extensively evaluated through ablation, quantitative, and qualitative experiments. The results demonstrate that our method can recover noise-free and detail-sharp images, resulting in better reconstruction results. Code is available at https://github.com/neu-szy/ldct_sr_dn_w_ffdm .
Affiliation(s)
- Jianning Chi
  - Faculty of Robot Science and Engineering, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
  - Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
- Xiaolin Wei
  - Department of Rehabilitation, the Second Hospital of Beijing, No. 36 Youfang Hutong, 100031, Beijing, China
- Zhiyi Sun
  - Faculty of Robot Science and Engineering, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
- Yongming Yang
  - Faculty of Robot Science and Engineering, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
  - State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China
- Bin Yang
  - Department of Radiology, the Second Hospital of Beijing, No. 36 Youfang Hutong, 100031, Beijing, China
5
Chi J, Sun Z, Tian S, Wang H, Wang S. A Hybrid Framework of Dual-Domain Signal Restoration and Multi-depth Feature Reinforcement for Low-Dose Lung CT Denoising. Journal of Imaging Informatics in Medicine 2024; 37:1944-1959. [PMID: 38424278 PMCID: PMC11300419 DOI: 10.1007/s10278-023-00934-6] [Received: 07/06/2023] [Revised: 09/05/2023] [Accepted: 09/06/2023] [Indexed: 03/02/2024]
Abstract
Low-dose computed tomography (LDCT) has been widely used in medical diagnosis. Various denoising methods have been presented to remove noise in LDCT scans. However, existing methods cannot achieve satisfactory results due to the difficulties in (1) distinguishing the characteristics of structures, textures, and noise, which are mixed together in the image domain, and (2) representing local details and global semantics in the hierarchical features. In this paper, we propose a novel denoising method consisting of (1) a 2D dual-domain restoration framework to reconstruct noise-free structure and texture signals separately, and (2) a 3D multi-depth reinforcement U-Net model to further recover image details with enhanced hierarchical features. In the 2D dual-domain restoration framework, convolutional neural networks are adopted in both the image domain, where the image structures are well preserved through spatial continuity, and the sinogram domain, where the textures and noise are separately represented by different wavelet coefficients and processed adaptively. In the 3D multi-depth reinforcement U-Net model, the hierarchical features from the 3D U-Net are enhanced by the cross-resolution attention module (CRAM) and dual-branch graph convolution module (DBGCM). The CRAM preserves local details by integrating adjacent low-level features with different resolutions, while the DBGCM enhances global semantics by building graphs for high-level features in intra-feature and inter-feature dimensions. Experimental results on the LUNA16 dataset and the 2016 NIH-AAPM-Mayo Clinic LDCT Grand Challenge dataset show that the proposed method outperforms state-of-the-art methods in removing noise from LDCT images while keeping structures and textures clear, demonstrating its potential in clinical practice.
Affiliation(s)
- Jianning Chi
  - Faculty of Robot Science and Engineering, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
  - Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
- Zhiyi Sun
  - Faculty of Robot Science and Engineering, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
- Shuyu Tian
  - Graduate School, Dalian Medical University, Lyushunnan, Dalian, 116000, Liaoning, China
- Huan Wang
  - Faculty of Robot Science and Engineering, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
- Siqi Wang
  - Faculty of Robot Science and Engineering, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
6
Ko Y, Song S, Baek J, Shim H. Adapting low-dose CT denoisers for texture preservation using zero-shot local noise-level matching. Med Phys 2024; 51:4181-4200. [PMID: 38478305 DOI: 10.1002/mp.17015] [Received: 02/28/2023] [Revised: 01/27/2024] [Accepted: 01/28/2024] [Indexed: 06/05/2024]
Abstract
BACKGROUND Various denoising methods have achieved meaningful improvements in the image quality of low-dose computed tomography (LDCT). However, they commonly produce over-smoothed results; the denoised images tend to be more blurred than the normal-dose targets (NDCTs). Furthermore, many recent denoising methods employ deep learning (DL)-based models, which require a vast amount of CT images (or image pairs). PURPOSE Our goal is to address the problem of over-smoothed results and design an algorithm that achieves plausible denoising results without needing a large training dataset. Over-smoothed images negatively affect diagnosis and treatment since radiologists have developed their clinical experience with NDCT. Besides, a large-scale training dataset is often not available in clinical situations. To overcome these limitations, we propose locally-adaptive noise-level matching (LANCH), emphasizing that the output should retain the same noise level and characteristics as the NDCT without additional training. METHODS We represent the NDCT image as the pixel-wise weighted sum of an over-smoothed output from an off-the-shelf denoiser (OSD) and the difference between the LDCT image and the OSD output. Herein, LANCH determines a 2D ratio map (i.e., pixel-wise weight matrix) by locally matching the noise level of the output and the NDCT, where the LDCT-to-NDCT device flux (mAs) ratio reveals the NDCT noise level. Thereby, LANCH can preserve important details in LDCT and enhance the sharpness of the noise-free regions. Note that LANCH can enhance any LDCT denoiser without additional training data (i.e., zero-shot). RESULTS The proposed method is applicable to any OSD, reporting significant improvement in texture plausibility over the baseline denoisers both quantitatively and qualitatively.
It is surprising that the denoising accuracy achieved by our method with a zero-shot denoiser was comparable or superior to that of the best training-based denoisers; our result showed 1% and 33% gains in terms of SSIM and DISTS, respectively. A reader study with experienced radiologists shows significant image quality improvements, a gain of +1.18 on a five-point mean opinion score scale. CONCLUSIONS In this paper, we propose a technique to enhance any low-dose CT denoiser by leveraging the fundamental physical relationship between the x-ray flux and noise variance. Our method is capable of operating in a zero-shot condition, which means that only a single low-dose CT image is required for the enhancement process. We demonstrate that our approach is comparable or even superior to supervised DL-based denoisers that are trained using numerous CT images. Extensive experiments illustrate that our method consistently improves the performance of all tested LDCT denoisers.
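The blending idea in this abstract can be sketched in a deliberately simplified form. The sketch assumes that quantum noise variance is inversely proportional to flux (mAs) and that the off-the-shelf denoiser removes all noise, so that adding back a fraction w of the LDCT residual leaves noise with standard deviation w times the LDCT noise. Under those assumptions a single global weight w = sqrt(mAs_LDCT / mAs_NDCT) matches the NDCT noise level. LANCH itself estimates a locally matched 2D ratio map, which is not reproduced here, and all names and dose values below are hypothetical.

```python
import numpy as np

def global_blend_weight(ldct_mas, ndct_mas):
    """Weight that scales LDCT noise std down to the expected NDCT noise std,
    assuming noise variance is inversely proportional to flux."""
    return float(np.sqrt(ldct_mas / ndct_mas))

def blend(ldct, osd_output, weight):
    """Denoiser output plus a weighted share of the residual it removed.
    `weight` may be a scalar or a per-pixel 2D map (NumPy broadcasting)."""
    ldct = np.asarray(ldct, dtype=np.float64)
    osd_output = np.asarray(osd_output, dtype=np.float64)
    return osd_output + weight * (ldct - osd_output)

# Synthetic quarter-dose example with an idealized (perfect) denoiser.
rng = np.random.default_rng(0)
signal = np.linspace(0.0, 100.0, 64 * 64).reshape(64, 64)
ldct = signal + rng.normal(0.0, 40.0, size=signal.shape)
osd_output = signal.copy()          # assumption: denoiser returns the clean signal
w = global_blend_weight(25.0, 100.0)
blended = blend(ldct, osd_output, w)
print(w, float(np.std(blended - signal)))
```

Because `blend` accepts an array for `weight`, a per-pixel map in the spirit of the abstract's 2D ratio map could be passed in place of the scalar.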
Affiliation(s)
- Youngjun Ko
  - School of Integrated Technology, Yonsei University, Incheon, South Korea
- Seongjong Song
  - School of Integrated Technology, Yonsei University, Incheon, South Korea
- Jongduk Baek
  - School of Integrated Technology, Yonsei University, Incheon, South Korea
7
Zhang Y, Zhang R, Cao R, Xu F, Jiang F, Meng J, Ma F, Guo Y, Liu J. Unsupervised low-dose CT denoising using bidirectional contrastive network. Computer Methods and Programs in Biomedicine 2024; 251:108206. [PMID: 38723435 DOI: 10.1016/j.cmpb.2024.108206] [Received: 03/07/2024] [Revised: 04/16/2024] [Accepted: 04/29/2024] [Indexed: 05/31/2024]
Abstract
BACKGROUND AND OBJECTIVE Low-dose computed tomography (LDCT) scans significantly reduce radiation exposure, but introduce higher levels of noise and artifacts that compromise image quality and diagnostic accuracy. Supervised learning methods have proven effective in denoising LDCT images, but are hampered by the need for large, paired datasets, which pose significant challenges in data acquisition. This study aims to develop a robust unsupervised LDCT denoising method that overcomes the reliance on paired LDCT and normal-dose CT (NDCT) samples, paving the way for more accessible and practical denoising techniques. METHODS We propose a novel unsupervised network model, Bidirectional Contrastive Unsupervised Denoising (BCUD), for LDCT denoising. This model innovatively combines a bidirectional network structure with contrastive learning theory to map the precise mutual correspondence between the noisy LDCT image domain and the clean NDCT image domain. Specifically, we employ dual encoders and discriminators for domain-specific data generation, and use unique projection heads for each domain to adaptively learn customized embedded representations. We then align corresponding features across domains within the learned embedding spaces to achieve effective noise reduction. This approach fundamentally improves the model's ability to match features in latent space, thereby improving noise reduction while preserving fine image detail. RESULTS Through extensive experimental validation on the AAPM-Mayo public dataset and real-world clinical datasets, the proposed BCUD method demonstrated superior performance. It achieved a peak signal-to-noise ratio (PSNR) of 31.387 dB, a structural similarity index measure (SSIM) of 0.886, an information fidelity criterion (IFC) of 2.305, and a visual information fidelity (VIF) of 0.373. 
Notably, subjective evaluation by radiologists resulted in a mean score of 4.23, highlighting its advantages over existing methods in terms of clinical applicability. CONCLUSIONS This paper presents an innovative unsupervised LDCT denoising method using a bidirectional contrastive network, which greatly improves clinical applicability by eliminating the need for perfectly matched image pairs. The method sets a new benchmark in unsupervised LDCT image denoising, excelling in noise reduction and preservation of fine structural details.
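The feature-alignment step described in this abstract can be illustrated with a generic contrastive objective. The sketch below uses an InfoNCE-style loss, which is an assumption: the abstract does not state which contrastive loss BCUD uses. The embedding sizes, the temperature, and the function name are also hypothetical.

```python
import numpy as np

def info_nce(queries, keys, temperature=0.07):
    """InfoNCE-style loss: row i of `queries` should match row i of `keys`
    and differ from every other row of `keys`."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = q @ k.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

# Hypothetical projected features for 8 patch locations in each domain.
rng = np.random.default_rng(0)
ldct_features = rng.normal(size=(8, 16))
aligned = info_nce(ldct_features, ldct_features)                        # matching pairs
misaligned = info_nce(ldct_features, np.roll(ldct_features, 1, axis=0))  # shifted pairs
print(aligned, misaligned)
```

A bidirectional variant could sum the loss over both mapping directions (LDCT to NDCT and back), but how BCUD combines them is not specified in the abstract.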
Affiliation(s)
- Yuanke Zhang
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China; Shandong Provincial Key Laboratory of Data Security and Intelligent Computing, Qufu Normal University, Rizhao 276826, China
- Rui Zhang
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Rujuan Cao
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Fan Xu
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Fengjuan Jiang
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Jing Meng
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China; Shandong Provincial Key Laboratory of Data Security and Intelligent Computing, Qufu Normal University, Rizhao 276826, China
- Fei Ma
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China; Shandong Provincial Key Laboratory of Data Security and Intelligent Computing, Qufu Normal University, Rizhao 276826, China
- Yanfei Guo
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China; Shandong Provincial Key Laboratory of Data Security and Intelligent Computing, Qufu Normal University, Rizhao 276826, China
- Jianlei Liu
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China; Shandong Provincial Key Laboratory of Data Security and Intelligent Computing, Qufu Normal University, Rizhao 276826, China
8
Oh J, Wu D, Hong B, Lee D, Kang M, Li Q, Kim K. Texture-preserving low dose CT image denoising using Pearson divergence. Phys Med Biol 2024; 69:115021. [PMID: 38688292 DOI: 10.1088/1361-6560/ad45a4] [Received: 12/29/2023] [Accepted: 04/30/2024] [Indexed: 05/02/2024]
Abstract
Objective. The mean squared error (MSE), also known as L2 loss, has been widely used as a loss function to optimize image denoising models due to its strong performance as a mean estimator of the Gaussian noise model. Recently, various low-dose computed tomography (LDCT) image denoising methods using deep learning combined with the MSE loss have been developed; however, this approach has been observed to suffer from the regression-to-the-mean problem, leading to over-smoothed edges and degradation of texture in the image. Approach. To overcome this issue, we propose a stochastic function in the loss function to improve the texture of the denoised CT images, rather than relying on complicated networks or feature space losses. The proposed loss function includes the MSE loss to learn the mean distribution and the Pearson divergence loss to learn feature textures. Specifically, the Pearson divergence loss is computed in an image space to measure the distance between two intensity measures of denoised low-dose and normal-dose CT images. The evaluation of the proposed model employs a novel approach of multi-metric quantitative analysis utilizing relative texture feature distance. Results. Our experimental results show that the proposed Pearson divergence loss leads to a significant improvement in texture compared to the conventional MSE loss and generative adversarial network (GAN), both qualitatively and quantitatively. Significance. Achieving consistent texture preservation in LDCT is a challenge in conventional GAN-type methods due to adversarial aspects aimed at minimizing noise while preserving texture. By incorporating the Pearson regularizer in the loss function, we can easily achieve a balance between two conflicting properties. Consistent high-quality CT images can significantly help clinicians in diagnoses and support researchers in the development of AI-diagnostic models.
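The combination of an MSE term with a Pearson divergence term can be illustrated numerically. The sketch below estimates the Pearson (chi-squared) divergence, sum over bins of (p - q)^2 / q, from intensity histograms of the two images. The histogram estimator, the bin count, the [0, 1] intensity range, and the weight `lam` are assumptions for illustration; the abstract does not specify how the paper computes or weights the divergence, and a histogram is not differentiable, so this is an evaluation-style sketch and not a training loss.

```python
import numpy as np

def pearson_divergence(denoised, target, bins=32, value_range=(0.0, 1.0), eps=1e-8):
    """Pearson chi-squared divergence between intensity histograms of two images."""
    p, _ = np.histogram(denoised, bins=bins, range=value_range)
    q, _ = np.histogram(target, bins=bins, range=value_range)
    p = p / max(p.sum(), 1)   # normalize counts to probabilities
    q = q / max(q.sum(), 1)
    return float(np.sum((p - q) ** 2 / (q + eps)))

def combined_loss(denoised, target, lam=0.1):
    """MSE plus a weighted Pearson divergence term (`lam` is a hypothetical weight)."""
    denoised = np.asarray(denoised, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = float(np.mean((denoised - target) ** 2))
    return mse + lam * pearson_divergence(denoised, target)

# A textured target versus an over-smoothed estimate that keeps only the mean.
rng = np.random.default_rng(0)
target = rng.uniform(0.0, 1.0, size=(32, 32))
over_smoothed = np.full_like(target, target.mean())
print(pearson_divergence(target, target), pearson_divergence(over_smoothed, target))
```

The over-smoothed image concentrates all intensity mass in one bin, so its divergence from the textured target is large even though it is the best constant estimate under MSE, which is the regression-to-the-mean effect the abstract describes.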
Affiliation(s)
- Jieun Oh
  - Center for Advanced Medical Computing and Analysis (CAMCA), Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, United States of America
  - Chungnam National University College of Medicine, Chungnam National University Hospital, Daejeon, 35015, Republic of Korea
- Dufan Wu
  - Center for Advanced Medical Computing and Analysis (CAMCA), Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, United States of America
- Boohwi Hong
  - Chungnam National University College of Medicine, Chungnam National University Hospital, Daejeon, 35015, Republic of Korea
- Dongheon Lee
  - Chungnam National University College of Medicine, Chungnam National University Hospital, Daejeon, 35015, Republic of Korea
- Minwoong Kang
  - Chungnam National University College of Medicine, Chungnam National University Hospital, Daejeon, 35015, Republic of Korea
- Quanzheng Li
  - Center for Advanced Medical Computing and Analysis (CAMCA), Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, United States of America
- Kyungsang Kim
  - Center for Advanced Medical Computing and Analysis (CAMCA), Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, United States of America
9
Chen Z, Niu C, Gao Q, Wang G, Shan H. LIT-Former: Linking In-Plane and Through-Plane Transformers for Simultaneous CT Image Denoising and Deblurring. IEEE Transactions on Medical Imaging 2024; 43:1880-1894. [PMID: 38194396 DOI: 10.1109/tmi.2024.3351723] [Indexed: 01/11/2024]
Abstract
This paper studies 3D low-dose computed tomography (CT) imaging. Although various deep learning methods have been developed in this context, they typically focus on 2D images and perform denoising (for low dose) and deblurring (for super-resolution) separately. To date, little work has been done on simultaneous in-plane denoising and through-plane deblurring, which is important for obtaining high-quality 3D CT images with lower radiation and faster imaging speed. For this task, a straightforward method is to directly train an end-to-end 3D network. However, it demands much more training data and incurs expensive computational costs. Here, we propose to link in-plane and through-plane transformers for simultaneous in-plane denoising and through-plane deblurring, termed LIT-Former, which can efficiently synergize in-plane and through-plane sub-tasks for 3D CT imaging and enjoy the advantages of both convolution and transformer networks. LIT-Former has two novel designs: efficient multi-head self-attention modules (eMSM) and efficient convolutional feed-forward networks (eCFN). First, eMSM integrates in-plane 2D self-attention and through-plane 1D self-attention to efficiently capture global interactions of 3D self-attention, the core unit of transformer networks. Second, eCFN integrates 2D convolution and 1D convolution to extract local information of 3D convolution in the same fashion. As a result, the proposed LIT-Former synergizes these two sub-tasks, significantly reducing the computational complexity as compared to 3D counterparts and enabling rapid convergence. Extensive experimental results on simulated and clinical datasets demonstrate superior performance over state-of-the-art models. The source code is made available at https://github.com/hao1635/LIT-Former.
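The complexity argument for factorizing 3D operations into in-plane 2D and through-plane 1D parts can be made concrete with a rough count. The sketch below assumes plain dot-product self-attention, counts only token-pair interactions, and counts only spatial kernel weights per channel pair for the convolutions. It ignores heads, channel widths, and any further efficiency measures inside eMSM and eCFN, so the numbers are illustrative and are not taken from the paper.

```python
def full_3d_attention_pairs(depth, height, width):
    """Token pairs scored when every voxel attends to every other voxel."""
    tokens = depth * height * width
    return tokens * tokens

def factorized_attention_pairs(depth, height, width):
    """Token pairs for 2D attention within each slice plus 1D attention
    along the through-plane axis at each in-plane position."""
    in_plane = depth * (height * width) ** 2
    through_plane = (height * width) * depth ** 2
    return in_plane + through_plane

def conv3d_weights(kernel):
    """Spatial weights of one k x k x k kernel."""
    return kernel ** 3

def factorized_conv_weights(kernel):
    """Spatial weights of a k x k in-plane kernel plus a length-k through-plane kernel."""
    return kernel ** 2 + kernel

# Hypothetical volume of 8 slices with 16 x 16 tokens per slice.
print(full_3d_attention_pairs(8, 16, 16), factorized_attention_pairs(8, 16, 16))
print(conv3d_weights(3), factorized_conv_weights(3))
```

For this small example the factorized attention scores roughly one eighth as many pairs as full 3D attention, and the gap widens as the number of slices grows.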
10
Nazir N, Sarwar A, Saini BS. Recent developments in denoising medical images using deep learning: An overview of models, techniques, and challenges. Micron 2024; 180:103615. [PMID: 38471391 DOI: 10.1016/j.micron.2024.103615] [Received: 08/30/2023] [Revised: 02/20/2024] [Accepted: 02/26/2024] [Indexed: 03/14/2024]
Abstract
Medical imaging plays a critical role in diagnosing and treating various medical conditions. However, interpreting medical images can be challenging even for expert clinicians, as they are often degraded by noise and artifacts that can hinder the accurate identification and analysis of diseases, leading to severe consequences such as patient misdiagnosis or mortality. Various types of noise, including Gaussian, Rician, and salt-and-pepper noise, can corrupt the area of interest, limiting the precision and accuracy of algorithms. Denoising algorithms have shown potential in improving the quality of medical images by removing noise and other artifacts that obscure essential information. Deep learning has emerged as a powerful tool for image analysis and has demonstrated promising results in denoising different medical images such as MRI, CT, and PET scans. This review paper provides a comprehensive overview of state-of-the-art deep learning algorithms used for denoising medical images. A total of 120 relevant papers were reviewed, and after screening with specific inclusion and exclusion criteria, 104 papers were selected for analysis. This study aims to provide a thorough understanding for researchers in the field of intelligent denoising by presenting an extensive survey of current techniques and highlighting significant challenges that remain to be addressed. The findings of this review are expected to contribute to the development of intelligent models that enable timely and accurate diagnoses of medical disorders. It was found that 40% of the researchers used models based on deep convolutional neural networks to denoise the images, followed by encoder-decoder models (18%) and other artificial intelligence-based techniques (15%), such as DIP. Transformer-based approaches were used by 13%, generative adversarial networks by 12%, and multilayer perceptrons by 2% of the researchers.
Moreover, Gaussian noise was present in 35% of the images, followed by speckle noise (16%), Poisson noise (14%), artifacts (10%), Rician noise (7%), salt-and-pepper noise (6%), impulse noise (3%), and other types of noise (9%). While the progress in developing novel models for the denoising of medical images is evident, significant work remains to be done in creating standardized denoising models that perform well across a wide spectrum of medical images. Overall, this review highlights the importance of denoising medical images and provides a comprehensive understanding of the current state-of-the-art deep learning algorithms in this field.
|
11
|
An R, Chen K, Li H. Self-supervised dual-domain balanced dropblock-network for low-dose CT denoising. Phys Med Biol 2024; 69:075026. [PMID: 38359449 DOI: 10.1088/1361-6560/ad29ba] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 02/15/2024] [Indexed: 02/17/2024]
Abstract
Objective. Self-supervised learning methods have been successfully applied for low-dose computed tomography (LDCT) denoising, with the advantage of not requiring labeled data. Conventional self-supervised methods operate only in the image domain, ignoring valuable priors in the sinogram domain. Recently proposed dual-domain methods address this limitation but encounter issues with blurring artifacts in the reconstructed image due to the inhomogeneous distribution of noise levels in low-dose sinograms. Approach. To tackle this challenge, this paper proposes SDBDNet, an end-to-end dual-domain self-supervised method for LDCT denoising. With the network designed based on the properties of inhomogeneous noise in low-dose sinograms and the principle of moderate sinogram-domain denoising, SDBDNet achieves effective denoising in dual domains without introducing blurring artifacts. Specifically, we split the sinogram into two subsets based on the positions of detector cells to generate paired training data with high similarity and independent noise. These sub-sinograms are then restored to their original size using 1D interpolation and learning-based correction. To achieve adaptive and moderate smoothing in the sinogram domain, we integrate Dropblock, a type of convolution layer with regularization, into SDBDNet, and set a weighted average between the denoised sinograms and their noisy counterparts, leading to a well-balanced dual-domain approach. Main results. Numerical experiments show that our method outperforms popular non-learning and self-supervised learning methods, demonstrating its effectiveness and superior performance. Significance. While introducing a novel high-performance dual-domain self-supervised LDCT denoising method, this paper also emphasizes and verifies the importance of appropriate sinogram-domain denoising in dual-domain methods, which might inspire future work.
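The detector-based split, 1D interpolation, and weighted averaging described in the Approach can be illustrated with a minimal sketch. It is not the SDBDNet implementation: the even/odd split, the linear interpolation, and the names `split_by_detector`, `upsample_row`, `blend`, and `alpha` are all assumptions made for illustration, and the learning-based correction step is omitted.

```python
def split_by_detector(sinogram):
    """Split each projection row into even- and odd-indexed detector cells.

    The even/odd rule is an assumption; the abstract only says the split is
    based on detector cell positions.
    """
    even = [row[0::2] for row in sinogram]
    odd = [row[1::2] for row in sinogram]
    return even, odd


def upsample_row(samples, positions, full_width):
    """Linearly interpolate samples known at `positions` back to full width.

    Positions outside the known range are filled with the nearest sample.
    """
    out = []
    for x in range(full_width):
        if x <= positions[0]:
            out.append(samples[0])
        elif x >= positions[-1]:
            out.append(samples[-1])
        else:
            # Index of the last known position at or before x.
            j = max(i for i, p in enumerate(positions) if p <= x)
            x0, x1 = positions[j], positions[j + 1]
            t = (x - x0) / (x1 - x0)
            out.append((1 - t) * samples[j] + t * samples[j + 1])
    return out


def blend(denoised, noisy, alpha):
    """Weighted average of a denoised sinogram and its noisy counterpart.

    `alpha` is a hypothetical weight: 1.0 keeps only the denoised values,
    0.0 keeps only the noisy ones.
    """
    return [
        [alpha * d + (1 - alpha) * n for d, n in zip(d_row, n_row)]
        for d_row, n_row in zip(denoised, noisy)
    ]
```

Keeping part of the noisy sinogram in the blend is one simple way to express the "moderate" sinogram-domain smoothing the abstract argues for.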
Affiliation(s)
- Ran An
  - School of Mathematical Sciences, Capital Normal University, Beijing, 100048, People's Republic of China
  - Centre for Mathematical Imaging Techniques, University of Liverpool, Liverpool, L69 7ZL, United Kingdom
- Ke Chen
  - Department of Mathematics and Statistics, University of Strathclyde, Glasgow, G1 1XQ, United Kingdom
- Hongwei Li
  - School of Mathematical Sciences, Capital Normal University, Beijing, 100048, People's Republic of China
|
12
|
Xiong L, Li N, Qiu W, Luo Y, Li Y, Zhang Y. Re-UNet: a novel multi-scale reverse U-shape network architecture for low-dose CT image reconstruction. Med Biol Eng Comput 2024; 62:701-712. [PMID: 37982956 DOI: 10.1007/s11517-023-02966-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 11/03/2023] [Indexed: 11/21/2023]
Abstract
In recent years, the growing awareness of public health has brought attention to low-dose computed tomography (LDCT) scans. However, CT images generated in this way contain considerable noise and artifacts, which has prompted a growing number of researchers to investigate methods to enhance image quality. The advancement of deep learning technology has provided researchers with novel approaches to enhance the quality of LDCT images. In the past, numerous studies based on convolutional neural networks (CNN) have yielded remarkable results in LDCT image reconstruction. Nonetheless, they tend to keep designing new networks on the fixed U-shaped architecture of UNet, which leads to increasingly complex networks. In this paper, we propose a novel network model with a reverse U-shape architecture for noise reduction in the LDCT image reconstruction task. In the model, we further design a novel multi-scale feature extractor and an edge enhancement module, which help the CT images exhibit strong structural characteristics. Evaluated on a public dataset, the experimental results demonstrate that the proposed model outperforms the compared algorithms based on the traditional U-shaped architecture in terms of preserving texture details and reducing noise, as demonstrated by achieving the best PSNR, SSIM, and RMSE values. This study may shed light on the reverse U-shaped network architecture for CT image reconstruction, and its potential for other medical image processing tasks could be investigated.
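For readers comparing the figures quoted across these abstracts, the sketch below shows how RMSE and PSNR are commonly computed for a denoised image against a reference. It is a generic illustration, not code from any listed paper; the function names and the `data_range` argument are assumptions, and SSIM is left out because it needs windowed statistics. Higher PSNR is better, whereas lower RMSE is better.

```python
import math


def rmse(reference, test):
    """Root mean square error between two equally sized 2D images."""
    total = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return math.sqrt(total / count)


def psnr(reference, test, data_range):
    """Peak signal-to-noise ratio in dB; `data_range` is the intensity span."""
    err = rmse(reference, test)
    if err == 0:
        return math.inf  # identical images
    return 20 * math.log10(data_range / err)
```

Reported values depend on the chosen `data_range` and display window, so numbers from different papers are only comparable when those settings match.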
Affiliation(s)
- Lianjin Xiong
  - School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
- Ning Li
  - School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
- Wei Qiu
  - School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
- Yiqian Luo
  - School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
- Yishi Li
  - Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Chongqing Medical University, Chongqing, 400016, China
- Yangsong Zhang
  - School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
  - NHC Key Laboratory of Nuclear Technology Medical Transformation (MIANYANG CENTRAL HOSPITAL), Mianyang, 621000, China
  - Key Laboratory of Testing Technology for Manufacturing Process, Ministry of Education, Southwest University of Science and Technology, Mianyang, 621010, China
|
13
|
Li S, Chen K, Ma X, Liang Z. Semi-supervised low-dose SPECT restoration using sinogram inner-structure aware graph neural network. Phys Med Biol 2024; 69:055016. [PMID: 38324896 DOI: 10.1088/1361-6560/ad2716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Accepted: 02/07/2024] [Indexed: 02/09/2024]
Abstract
Objective. To mitigate the potential radiation risk, low-dose single photon emission computed tomography (SPECT) is of increasing interest. Numerous deep learning-based methods have been developed to perform low-dose imaging while maintaining image quality. However, most existing methods seldom explore the unique inner-structure inherent within sinograms. In addition, traditional supervised learning methods require large-scale labeled data, where the normal-dose data serves as annotation and is intractable to acquire in low-dose imaging. In this study, we aim to develop a novel sinogram inner-structure-aware semi-supervised framework for the task of low-dose SPECT sinogram restoration. Approach. The proposed framework retains the strengths of UNet while introducing a sinogram-structure-based non-local neighbors graph neural network (SSN-GNN) module and a window-based K-nearest neighbors GNN (W-KNN-GNN) module to effectively exploit the inherent inner-structure within SPECT sinograms. Moreover, the proposed framework employs the mean teacher semi-supervised learning approach to leverage the information available in abundant unlabeled low-dose sinograms. Main results. The datasets used in this study were acquired from the extended cardiac-torso (XCAT) anthropomorphic digital phantoms, which provide realistic images for imaging research of various modalities. Quantitative as well as qualitative results demonstrate that the proposed framework achieves superior performance compared to several state-of-the-art reconstruction methods. To further validate the effectiveness of the proposed framework, ablation and robustness experiments were also performed. The experimental results show that each component of the proposed framework effectively improves the model performance, and the framework exhibits superior robustness with respect to various noise levels. In addition, the proposed semi-supervised paradigm demonstrates the benefit of incorporating supplementary unlabeled low-dose sinograms. Significance. The proposed framework improves the quality of low-dose SPECT reconstructed images by utilizing sinogram inner-structure and incorporating supplementary unlabeled data, which provides an important tool for dose reduction without sacrificing the image quality.
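The mean teacher approach mentioned in the Approach can be summarized in a few lines. The sketch below shows the generic technique, not this paper's training code: the teacher's weights track an exponential moving average of the student's weights, and a consistency term penalizes disagreement between the two on unlabeled inputs. The names `ema_update`, `consistency_loss`, `total_loss`, `decay`, and `weight` are hypothetical.

```python
def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the matching student weight."""
    return [decay * t + (1 - decay) * s for t, s in zip(teacher, student)]


def consistency_loss(student_pred, teacher_pred):
    """Mean squared disagreement between student and teacher predictions."""
    n = len(student_pred)
    return sum((s - t) ** 2 for s, t in zip(student_pred, teacher_pred)) / n


def total_loss(supervised, consistency, weight):
    """Supervised loss on labeled data plus weighted consistency on unlabeled data."""
    return supervised + weight * consistency
```

Only the student is updated by gradient descent; the teacher is refreshed with `ema_update` after each step, which is what lets unlabeled low-dose sinograms contribute to training.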
Affiliation(s)
- Si Li
  - School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, People's Republic of China
- Keming Chen
  - School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, People's Republic of China
- Xiangyuan Ma
  - Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, People's Republic of China
- Zengguo Liang
  - School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, People's Republic of China
|
14
|
Gao X, Jiang B, Wang X, Huang L, Tu Z. Chest x-ray diagnosis via spatial-channel high-order attention representation learning. Phys Med Biol 2024; 69:045026. [PMID: 38347732 DOI: 10.1088/1361-6560/ad2014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 01/18/2024] [Indexed: 02/15/2024]
Abstract
Objective. Chest x-ray image representation and learning is an important problem in the computer-aided diagnosis area. Existing methods usually adopt CNNs or Transformers for feature representation learning and focus on learning effective representations for chest x-ray images. Although good performance can be obtained, these works remain limited mainly because they neglect the correlations between channels and pay little attention to the local context-aware feature representation of chest x-ray images. Approach. To address these problems, in this paper, we propose a novel spatial-channel high-order attention model (SCHA) for chest x-ray image representation and diagnosis. The proposed network architecture mainly contains three modules, i.e. CEBN, SHAM and CHAM. Specifically, we first introduce a context-enhanced backbone network that employs multi-head self-attention to extract initial features from the input chest x-ray images. Then, we develop a novel SCHA which contains both spatial and channel high-order attention learning branches. For the spatial branch, we develop a novel local biased self-attention mechanism which can capture both local and long-range global dependencies between positions to learn rich context-aware representations. For the channel branch, we employ Brownian Distance Covariance to encode the correlation information of channels and regard it as the image representation. Finally, the two learning branches are integrated for the final multi-label diagnosis classification and prediction. Main results. Experiments on commonly used datasets, including ChestX-ray14 and CheXpert, demonstrate that our proposed SCHA approach obtains better performance than many related approaches. Significance. This study obtains a more discriminative method for chest x-ray classification and provides a technique for computer-aided diagnosis.
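Brownian Distance Covariance, used in the channel branch, measures dependence between two variables from their pairwise distance matrices. The sketch below computes the standard squared sample distance covariance for two scalar sequences, for example the activations of two channels over the same spatial positions. That pairing is an assumption for illustration; the abstract does not say how the paper builds its channel representation, and the function names are hypothetical.

```python
def centered_distances(values):
    """Pairwise absolute distances, double-centered by row, column, and grand means."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]  # equals the column means (symmetric)
    grand_mean = sum(row_mean) / n
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(x, y):
    """Squared sample distance covariance between two equal-length sequences."""
    a = centered_distances(x)
    b = centered_distances(y)
    n = len(x)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)
```

Unlike ordinary covariance, the population version of this statistic is zero only when the two variables are independent, which is why it can capture nonlinear relations between channels.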
Affiliation(s)
- Xinyue Gao
  - The School of Computer Science and Technology, Anhui University, Hefei 230601, People's Republic of China
- Bo Jiang
  - The School of Computer Science and Technology, Anhui University, Hefei 230601, People's Republic of China
- Xixi Wang
  - The School of Computer Science and Technology, Anhui University, Hefei 230601, People's Republic of China
- Lili Huang
  - The School of Computer Science and Technology, Anhui University, Hefei 230601, People's Republic of China
- Zhengzheng Tu
  - The School of Computer Science and Technology, Anhui University, Hefei 230601, People's Republic of China
|
15
|
Azad R, Kazerouni A, Heidari M, Aghdam EK, Molaei A, Jia Y, Jose A, Roy R, Merhof D. Advances in medical image analysis with vision Transformers: A comprehensive review. Med Image Anal 2024; 91:103000. [PMID: 37883822 DOI: 10.1016/j.media.2023.103000] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/09/2023] [Revised: 09/30/2023] [Accepted: 10/11/2023] [Indexed: 10/28/2023]
Abstract
The remarkable performance of the Transformer architecture in natural language processing has recently also triggered broad interest in Computer Vision. Among other merits, Transformers are witnessed as capable of learning long-range dependencies and spatial correlations, which is a clear advantage over convolutional neural networks (CNNs), which have been the de facto standard in Computer Vision problems so far. Thus, Transformers have become an integral part of modern medical image analysis. In this review, we provide an encyclopedic review of the applications of Transformers in medical imaging. Specifically, we present a systematic and thorough review of relevant recent Transformer literature for different medical image analysis tasks, including classification, segmentation, detection, registration, synthesis, and clinical report generation. For each of these applications, we investigate the novelty, strengths and weaknesses of the different proposed strategies and develop taxonomies highlighting key properties and contributions. Further, if applicable, we outline current benchmarks on different datasets. Finally, we summarize key challenges and discuss different future research directions. In addition, we have provided cited papers with their corresponding implementations in https://github.com/mindflow-institue/Awesome-Transformer.
Affiliation(s)
- Reza Azad
  - Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Amirhossein Kazerouni
  - School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- Moein Heidari
  - School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- Amirali Molaei
  - School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
- Yiwei Jia
  - Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Abin Jose
  - Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Rijo Roy
  - Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Dorit Merhof
  - Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany; Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany
|
16
|
Ahn C, Kim JH. AntiHalluciNet: A Potential Auditing Tool of the Behavior of Deep Learning Denoising Models in Low-Dose Computed Tomography. Diagnostics (Basel) 2023; 14:96. [PMID: 38201404 PMCID: PMC10795730 DOI: 10.3390/diagnostics14010096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 12/14/2023] [Accepted: 12/30/2023] [Indexed: 01/12/2024] Open
Abstract
Gaining the ability to audit the behavior of deep learning (DL) denoising models is of crucial importance to prevent potential hallucinations and adversarial clinical consequences. We present a preliminary version of AntiHalluciNet, which is designed to predict spurious structural components embedded in the residual noise from DL denoising models in low-dose CT and assess its feasibility for auditing the behavior of DL denoising models. We created a paired set of structure-embedded and pure noise images and trained AntiHalluciNet to predict spurious structures in the structure-embedded noise images. The performance of AntiHalluciNet was evaluated by using a newly devised residual structure index (RSI), which represents the prediction confidence based on the presence of structural components in the residual noise image. We also evaluated whether AntiHalluciNet could assess the image fidelity of a denoised image by using only a noise component instead of measuring the SSIM, which requires both reference and test images. Then, we explored the potential of AntiHalluciNet for auditing the behavior of DL denoising models. AntiHalluciNet was applied to three DL denoising models (two pre-trained models, RED-CNN and CTformer, and a commercial software, ClariCT.AI [version 1.2.3]), and whether AntiHalluciNet could discriminate between the noise purity performances of DL denoising models was assessed. AntiHalluciNet demonstrated an excellent performance in predicting the presence of structural components. The RSI values for the structural-embedded and pure noise images measured using the 50% low-dose dataset were 0.57 ± 31 and 0.02 ± 0.02, respectively, showing a substantial difference with a p-value < 0.0001. 
The AntiHalluciNet-derived RSI could differentiate between the quality of the degraded denoised images, with measurement values of 0.27, 0.41, 0.48, and 0.52 for the 25%, 50%, 75%, and 100% mixing rates of the degradation component, which showed a higher differentiation potential compared with the SSIM values of 0.9603, 0.9579, 0.9490, and 0.9333. The RSI measurements from the residual images of the three DL denoising models showed a distinct distribution, being 0.28 ± 0.06, 0.21 ± 0.06, and 0.15 ± 0.03 for RED-CNN, CTformer, and ClariCT.AI, respectively. AntiHalluciNet has the potential to predict the structural components embedded in the residual noise from DL denoising models in low-dose CT. With AntiHalluciNet, it is feasible to audit the performance and behavior of DL denoising models in clinical environments where only residual noise images are available.
Affiliation(s)
- Chulkyun Ahn
  - Department of Transdisciplinary Studies, Program in Biomedical Radiation Sciences, Graduate School of Convergence Science and Technology, Seoul National University, Seoul 08826, Republic of Korea
  - ClariPi Research, ClariPi, Seoul 03088, Republic of Korea
- Jong Hyo Kim
  - Department of Transdisciplinary Studies, Program in Biomedical Radiation Sciences, Graduate School of Convergence Science and Technology, Seoul National University, Seoul 08826, Republic of Korea
  - ClariPi Research, ClariPi, Seoul 03088, Republic of Korea
  - Department of Applied Bioengineering, Graduate School of Convergence Science and Technology, Seoul National University, Seoul 08826, Republic of Korea
  - Department of Radiology, Seoul National University College of Medicine, Seoul 03080, Republic of Korea
  - Department of Radiology, Seoul National University Hospital, Seoul 03080, Republic of Korea
  - Center for Medical-IT Convergence Technology Research, Advanced Institutes of Convergence Technology, Suwon-si 16229, Republic of Korea
|
17
|
Nadkarni R, Clark DP, Allphin AJ, Badea CT. A Deep Learning Approach for Rapid and Generalizable Denoising of Photon-Counting Micro-CT Images. Tomography 2023; 9:1286-1302. [PMID: 37489470 PMCID: PMC10366887 DOI: 10.3390/tomography9040102] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/09/2023] [Revised: 06/27/2023] [Accepted: 06/30/2023] [Indexed: 07/26/2023] Open
Abstract
Photon-counting CT (PCCT) is powerful for spectral imaging and material decomposition but produces noisy weighted filtered backprojection (wFBP) reconstructions. Although iterative reconstruction effectively denoises these images, it requires extensive computation time. To overcome this limitation, we propose a deep learning (DL) model, UnetU, which quickly estimates iterative reconstruction from wFBP. Utilizing a 2D U-net convolutional neural network (CNN) with a custom loss function and transformation of wFBP, UnetU promotes accurate material decomposition across various photon-counting detector (PCD) energy threshold settings. UnetU outperformed multi-energy non-local means (ME NLM) and a conventional denoising CNN called UnetwFBP in terms of root mean square error (RMSE) in test set reconstructions and their respective matrix inversion material decompositions. Qualitative results in reconstruction and material decomposition domains revealed that UnetU is the best approximation of iterative reconstruction. In reconstructions with varying undersampling factors from a high dose ex vivo scan, UnetU consistently gave higher structural similarity (SSIM) and peak signal-to-noise ratio (PSNR) to the fully sampled iterative reconstruction than ME NLM and UnetwFBP. This research demonstrates UnetU's potential as a fast (i.e., 15 times faster than iterative reconstruction) and generalizable approach for PCCT denoising, holding promise for advancing preclinical PCCT research.
Affiliation(s)
- Rohan Nadkarni
  - Quantitative Imaging and Analysis Lab, Department of Radiology, Duke University Medical Center, Durham, NC 27710, USA
- Darin P Clark
  - Quantitative Imaging and Analysis Lab, Department of Radiology, Duke University Medical Center, Durham, NC 27710, USA
- Alex J Allphin
  - Quantitative Imaging and Analysis Lab, Department of Radiology, Duke University Medical Center, Durham, NC 27710, USA
- Cristian T Badea
  - Quantitative Imaging and Analysis Lab, Department of Radiology, Duke University Medical Center, Durham, NC 27710, USA
|
18
|
Wang S, Liu Y, Zhang P, Chen P, Li Z, Yan R, Li S, Hou R, Gui Z. Compound feature attention network with edge enhancement for low-dose CT denoising. JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY 2023; 31:915-933. [PMID: 37355934 DOI: 10.3233/xst-230064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2023]
Abstract
BACKGROUND: Low-dose CT (LDCT) images usually contain serious noise and artifacts, which weaken the readability of the image. OBJECTIVE: To solve this problem, we propose a compound feature attention network with edge enhancement for LDCT denoising (CFAN-Net), which consists of an edge-enhanced module and a proposed compound feature attention block (CFAB). METHODS: The edge enhancement module extracts edge details with the trainable Sobel convolution. CFAB consists of an interactive feature learning module (IFLM), a multi-scale feature fusion module (MFFM), and a joint attention module (JAB), which removes noise from LDCT images in a coarse-to-fine manner. First, in IFLM, the noise is initially removed by cross-latitude interactive judgment learning. Second, in MFFM, multi-scale and pixel attention are integrated to explore fine noise removal. Finally, in JAB, we focus on key information, extract useful features, and improve the efficiency of network learning. To construct a high-quality image, we repeat the above operation by cascading CFAB. RESULTS: By applying CFAN-Net to process the 2016 NIH AAPM-Mayo LDCT challenge test dataset, experiments show that the peak signal-to-noise ratio value is 33.9692 and the structural similarity value is 0.9198. CONCLUSIONS: Compared with several existing LDCT denoising algorithms, CFAN-Net effectively preserves the texture of CT images while removing noise and artifacts.
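Sobel-based edge extraction, which the METHODS mention in trainable form, can be sketched with plain lists. This is a minimal illustration using the fixed, standard Sobel kernels; the scalar `alpha` is a hypothetical stand-in for a learnable scale factor and does not reproduce how CFAN-Net parameterizes its trainable Sobel convolution.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate_valid(image, kernel):
    """Slide the kernel over the image without padding (valid region only)."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [
        [
            sum(kernel[u][v] * image[i + u][j + v] for u in range(kh) for v in range(kw))
            for j in range(w - kw + 1)
        ]
        for i in range(h - kh + 1)
    ]


def sobel_edges(image, alpha=1.0):
    """Gradient magnitude from Sobel responses scaled by `alpha`."""
    kx = [[alpha * k for k in row] for row in SOBEL_X]
    ky = [[alpha * k for k in row] for row in SOBEL_Y]
    gx = correlate_valid(image, kx)
    gy = correlate_valid(image, ky)
    return [[math.hypot(a, b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]
```

In a network, the resulting edge map would typically be concatenated with the input or with feature maps so that later layers see explicit edge information.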
Affiliation(s)
- Shubin Wang
  - State Key Laboratory of Dynamic Testing Technology, School of Information and Communication Engineering, North University of China, Taiyuan Shanxi Province, China
- Yi Liu
  - State Key Laboratory of Dynamic Testing Technology, School of Information and Communication Engineering, North University of China, Taiyuan Shanxi Province, China
- Pengcheng Zhang
  - State Key Laboratory of Dynamic Testing Technology, School of Information and Communication Engineering, North University of China, Taiyuan Shanxi Province, China
- Ping Chen
  - State Key Laboratory of Dynamic Testing Technology, School of Information and Communication Engineering, North University of China, Taiyuan Shanxi Province, China
- Zhiyuan Li
  - State Key Laboratory of Dynamic Testing Technology, School of Information and Communication Engineering, North University of China, Taiyuan Shanxi Province, China
- Rongbiao Yan
  - State Key Laboratory of Dynamic Testing Technology, School of Information and Communication Engineering, North University of China, Taiyuan Shanxi Province, China
- Shu Li
  - State Key Laboratory of Dynamic Testing Technology, School of Information and Communication Engineering, North University of China, Taiyuan Shanxi Province, China
- Ruifeng Hou
  - State Key Laboratory of Dynamic Testing Technology, School of Information and Communication Engineering, North University of China, Taiyuan Shanxi Province, China
- Zhiguo Gui
  - State Key Laboratory of Dynamic Testing Technology, School of Information and Communication Engineering, North University of China, Taiyuan Shanxi Province, China
|
19
|
Lina J, Xu H, Aimin H, Beibei J, Zhiguo G. A densely connected LDCT image denoising network based on dual-edge extraction and multi-scale attention under compound loss. JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY 2023; 31:1207-1226. [PMID: 37742690 DOI: 10.3233/xst-230132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]
Abstract
BACKGROUND: Low-dose computed tomography (LDCT) uses a lower radiation dose, but the reconstructed images contain higher noise that can have a negative impact on disease diagnosis. Although deep learning with edge extraction operators preserves edge information well, applying the edge extraction operators only to the input LDCT images does not yield overall satisfactory results. OBJECTIVE: To improve LDCT image quality, this study proposes and tests a dual edge extraction multi-scale attention mechanism convolution neural network (DEMACNN) based on a compound loss. METHODS: The network uses edge extraction operators to extract edge information from both the input images and the feature maps in the network, improving the utilization of the edge operators and retaining the edge information of the images. The feature enhancement block is constructed by fusing the attention mechanism and a multi-scale module, enhancing effective information while suppressing useless information. Residual learning is used to train the network, improving its performance and mitigating the vanishing gradient problem. In addition to the network structure, a compound loss function, which consists of the MSE loss, the proposed joint total variation loss, and the edge loss, is proposed to enhance the denoising ability of the network and preserve the edges of images. RESULTS: Compared with other advanced methods (REDCNN, CT-former and EDCNN), the proposed new network achieves the best PSNR and SSIM values in LDCT images of the abdomen, which are 33.3486 and 0.9104, respectively. In addition, the new network also performs well on head and chest image data. CONCLUSION: The experimental results demonstrate that the proposed new network structure and denoising algorithm not only effectively remove the noise in LDCT images, but also protect the edges and details of the images well.
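A compound loss of the kind described in the METHODS combines a pixel term with smoothness and edge terms. The sketch below is a stand-in under stated assumptions: it uses ordinary anisotropic total variation in place of the paper's "joint total variation loss", finite differences in place of its edge operators, and hypothetical weights `w_tv` and `w_edge`. Images are 2D lists of at least 2x2 pixels.

```python
def flatten(img):
    return [v for row in img for v in row]


def mse(a, b):
    """Mean squared error between two equally shaped 2D lists."""
    fa, fb = flatten(a), flatten(b)
    return sum((x - y) ** 2 for x, y in zip(fa, fb)) / len(fa)


def gradients(img):
    """Horizontal and vertical finite differences."""
    dx = [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in img]
    dy = [
        [img[i + 1][j] - img[i][j] for j in range(len(img[0]))]
        for i in range(len(img) - 1)
    ]
    return dx, dy


def total_variation(img):
    """Anisotropic total variation: sum of absolute finite differences."""
    dx, dy = gradients(img)
    return sum(abs(v) for v in flatten(dx)) + sum(abs(v) for v in flatten(dy))


def edge_loss(pred, target):
    """MSE between the finite-difference maps of prediction and target."""
    pdx, pdy = gradients(pred)
    tdx, tdy = gradients(target)
    return mse(pdx, tdx) + mse(pdy, tdy)


def compound_loss(pred, target, w_tv=0.1, w_edge=0.1):
    """Pixel MSE plus weighted smoothness and edge terms (weights are hypothetical)."""
    return (
        mse(pred, target)
        + w_tv * total_variation(pred)
        + w_edge * edge_loss(pred, target)
    )
```

The total variation term discourages residual noise in the prediction, while the edge term penalizes gradients that differ from the reference, so the two pull in complementary directions.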
Affiliation(s)
- Jia Lina
  - School of Physics and Electronic Engineering, Shanxi University, Taiyuan, China
- He Xu
  - School of Physics and Electronic Engineering, Shanxi University, Taiyuan, China
- Huang Aimin
  - School of Physics and Electronic Engineering, Shanxi University, Taiyuan, China
- Jia Beibei
  - School of Physics and Electronic Engineering, Shanxi University, Taiyuan, China
- Gui Zhiguo
  - State Key Laboratory of Dynamic Measurement Technology, North University of China, Taiyuan, China
|
20
|
Yan H, Fang C, Liu P, Qiao Z. CGP-Uformer: A low-dose CT image denoising Uformer based on channel graph perception. JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY 2023; 31:1189-1205. [PMID: 37718835 DOI: 10.3233/xst-230158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/19/2023]
Abstract
BACKGROUND: An effective method for achieving low-dose CT is to keep the number of projection angles constant while reducing radiation dose at each angle. However, this leads to high-intensity noise in the reconstructed image, adversely affecting subsequent image processing, analysis, and diagnosis. OBJECTIVE: This paper proposes a novel Channel Graph Perception based U-shaped Transformer (CGP-Uformer) network, aiming to achieve high-performance denoising of low-dose CT images. METHODS: The network consists of convolutional feed-forward Transformer (ConvF-Transformer) blocks, a channel graph perception block (CGPB), and spatial cross-attention (SC-Attention) blocks. The ConvF-Transformer blocks enhance the ability of feature representation and information transmission through the CNN-based feed-forward network. The CGPB introduces a Graph Convolutional Network (GCN) for channel-to-channel feature extraction, promoting the propagation of information across distinct channels and enabling inter-channel information interchange. The SC-Attention blocks reduce the semantic difference in feature fusion between the encoder and decoder by computing spatial cross-attention. RESULTS: By applying CGP-Uformer to process the 2016 NIH AAPM-Mayo LDCT challenge dataset, experiments show that the peak signal-to-noise ratio value is 35.56 and the structural similarity value is 0.9221. CONCLUSIONS: Compared with four other representative denoising networks, this new network demonstrates superior denoising performance and better preservation of image details.
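The channel-to-channel propagation attributed to the CGPB can be pictured with a generic graph convolution layer in which each node is one feature channel. The sketch below follows the widely used normalized-adjacency formulation, ReLU(Â H W) with self-loops added; it is an assumption that the paper uses this variant, and the adjacency matrix, channel descriptors, and function names are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Add self-loops and apply symmetric degree normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two 2D lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, features, weights):
    """One graph convolution: mix neighboring channels, project, apply ReLU.

    `features` holds one descriptor row per channel (graph node).
    """
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(v, 0.0) for v in row] for row in matmul(propagated, weights)]
```

With two connected channels, each output row becomes a normalized mix of both channels' descriptors, which is the inter-channel information exchange the abstract refers to.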
Affiliation(s)
- Huimin Yan: School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi, China
- Chenyun Fang: School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi, China
- Peng Liu: School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi, China
- Zhiwei Qiao: School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi, China
|
21
|
Liu Y, Yan R, Liu Y, Zhang P, Chen Y, Gui Z. Enhancement based convolutional dictionary network with adaptive window for low-dose CT denoising. JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY 2023; 31:1165-1187. [PMID: 37694333 DOI: 10.3233/xst-230094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]
Abstract
BACKGROUND CNN-based methods, which learn the mapping from low-dose CT (LDCT) to normal-dose CT (NDCT) images, have recently emerged as a promising approach to suppressing noise and artifacts in LDCT images. However, most CNN-based methods are purely data-driven, so they lack sufficient interpretability and often lose details. OBJECTIVE To solve this problem, we propose a deep convolutional dictionary learning method for LDCT denoising, in which a novel convolutional dictionary learning model with adaptive window (CDL-AW) is designed and a corresponding enhancement-based convolutional dictionary learning network (ECDAW-Net) is constructed to unfold the CDL-AW model iteratively using proximal gradient descent. METHODS The adaptive window-constrained convolutional dictionary atom is proposed to alleviate spectrum leakage caused by data truncation during convolution. In the unfolding iterations of ECDAW-Net, a multi-scale edge extraction module consisting of Laplacian of Gaussian (LoG) and Sobel convolution layers supplements lost textures and details. To further improve detail retention, ECDAW-Net is trained with a compound loss that combines a pixel-level MSE loss with the proposed patch-level loss, which helps retain richer structural information. RESULTS On the Mayo dataset, ECDAW-Net obtained the highest peak signal-to-noise ratio (33.94) and sub-optimal structural similarity (0.92). CONCLUSIONS Compared with some state-of-the-art methods, the interpretable ECDAW-Net performs well in suppressing noise/artifacts and preserving tissue textures.
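The edge extraction module is described as combining LoG and Sobel convolution layers. The sketch below shows what fixed LoG and Sobel filters compute on a single image, using NumPy only. It is an illustrative sketch under stated assumptions, not the ECDAW-Net module itself: the 5x5 LoG kernel is a common discrete approximation, and the helper names `filter2d` and `edge_features`, the edge-replicating padding, and the choice of output channels are hypothetical.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

# A common 5x5 discrete approximation of the Laplacian of Gaussian
# (entries sum to zero, so flat regions give zero response).
LOG_5X5 = np.array(
    [
        [0, 0, -1, 0, 0],
        [0, -1, -2, -1, 0],
        [-1, -2, 16, -2, -1],
        [0, -1, -2, -1, 0],
        [0, 0, -1, 0, 0],
    ],
    dtype=float,
)


def filter2d(image, kernel):
    """Cross-correlate a 2D image with a kernel, keeping the input shape."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (kh, kw))
    return np.einsum("ijkl,kl->ij", windows, kernel)


def edge_features(image):
    """Stack a Sobel gradient magnitude map and a LoG response map."""
    gx = filter2d(image, SOBEL_X)
    gy = filter2d(image, SOBEL_Y)
    sobel_magnitude = np.hypot(gx, gy)
    log_response = filter2d(image, LOG_5X5)
    return np.stack([sobel_magnitude, log_response])


# Toy usage: an 8x8 image with a vertical step edge between columns 3 and 4.
step = np.zeros((8, 8))
step[:, 4:] = 1.0
features = edge_features(step)
```

The two filters use different kernel sizes (3x3 and 5x5), so they respond to edges at different spatial extents, which gives a rough intuition for why such a module is called multi-scale. In a network, these kernels would typically be applied as convolution layers and their outputs fused with learned features.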
Affiliation(s)
- Yi Liu: The State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
- Rongbiao Yan: The State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
- Yuhang Liu: The State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
- Pengcheng Zhang: The State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
- Yang Chen: The Key Laboratory of Computer Network and Information Integration, Southeast University, Ministry of Education, Nanjing, China
- Zhiguo Gui: The State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
|