1
Wang H, Xie Q, Zhao Q, Li Y, Liang Y, Zheng Y, Meng D. RCDNet: An Interpretable Rain Convolutional Dictionary Network for Single Image Deraining. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:8668-8682. PMID: 37018568. DOI: 10.1109/TNNLS.2022.3231453.
Abstract
As a common weather phenomenon, rain streaks adversely degrade image quality and tend to hurt the performance of outdoor computer vision systems. Removing rain from a single image is therefore an important, albeit ill-posed, problem. To handle this single image deraining task, in this article we build a novel deep architecture, called the rain convolutional dictionary network (RCDNet), which embeds the intrinsic priors of rain streaks and has clear interpretability. Specifically, we first establish a rain convolutional dictionary (RCD) model for representing rain streaks and use the proximal gradient descent technique to design an iterative algorithm, containing only simple operators, for solving the model. By unfolding this algorithm, we then build the RCDNet, in which every network module has a clear physical meaning and corresponds to an operation of the algorithm. This interpretability greatly facilitates visualizing and analyzing what happens inside the network and why it works well at inference time. Moreover, to account for the domain gap in real scenarios, we further design a novel dynamic RCDNet, in which the rain kernels are dynamically inferred from the input rainy image. These kernels shrink the space for estimating the rain layer with only a few rain maps, ensuring good generalization when rain types differ between training and testing data. By training such an interpretable network end to end, all involved rain kernels and proximal operators can be automatically extracted, faithfully characterizing the features of both the rain and clean background layers and thus naturally leading to better deraining performance.
Comprehensive experiments on a series of representative synthetic and real datasets substantiate the superiority of our method, both visually and quantitatively, over state-of-the-art single image derainers, especially its generality to diverse testing scenarios and the interpretability of all its modules. Code is available at https://github.com/hongwang01/DRCDNet.
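As a hedged illustration of the kind of update RCDNet unfolds (not the authors' exact architecture), the following minimal sketch runs one proximal gradient (ISTA) iteration for a convolutional sparse coding model of a rain map `M` under a fixed rain kernel; the 1D signals, kernel, step size, and sparsity weight are all illustrative assumptions:

```python
import numpy as np

def soft_threshold(x, tau):
    # Proximal operator of the L1 norm: shrinks toward zero by tau
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista_step(M, kernel, y, step, tau):
    # One proximal gradient iteration for
    #   min_M  0.5 * ||y - kernel * M||^2 + tau * ||M||_1
    # where * is convolution; unfolding networks replace soft_threshold
    # with a learned proximal module but keep this overall structure.
    residual = np.convolve(M, kernel, mode="same") - y
    grad = np.correlate(residual, kernel, mode="same")  # adjoint of convolution
    return soft_threshold(M - step * grad, step * tau)
```

Iterating this step from `M = 0` monotonically decreases the objective for a small enough step size, which is the behavior each unfolded network stage mimics.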
2
Fan F, Ritschl L, Beister M, Biniazan R, Wagner F, Kreher B, Gottschalk TM, Kappler S, Maier A. Simulation-driven training of vision transformers enables metal artifact reduction of highly truncated CBCT scans. Med Phys 2024; 51:3360-3375. PMID: 38150576. DOI: 10.1002/mp.16919.
Abstract
BACKGROUND Because of the high attenuation of metals, severe artifacts occur in cone beam computed tomography (CBCT). Metal segmentation in CBCT projections usually serves as a prerequisite for metal artifact reduction (MAR) algorithms. PURPOSE Truncation caused by the limited detector size leads to incomplete metal masks when threshold-based methods are applied in the CBCT volume. Therefore, this work pursues segmenting metal directly in CBCT projections. METHODS Since generating high-quality clinical training data is a constant challenge, this study proposes to generate simulated digital radiographs (data I) based on real CT data combined with self-designed computer-aided design (CAD) implants. In addition to the simulated projections generated from 3D volumes, 2D x-ray images combined with projections of implants serve as a complementary dataset (data II) to improve network performance. For metal segmentation, SwinConvUNet is proposed, consisting of shifted-window (Swin) vision transformers (ViTs) with patch merging as the encoder. RESULTS The model's performance is evaluated on accurately labeled test datasets obtained from cadaver scans as well as on unlabeled clinical projections. When trained on data I only, the convolutional neural network (CNN) encoder-based networks UNet and TransUNet achieve only limited performance on the cadaver test data, with average Dice scores of 0.821 and 0.850, respectively. After using both data I and data II during training, the average Dice scores for the two models increase to 0.906 and 0.919, respectively. By replacing the CNN encoder with a Swin transformer, the proposed SwinConvUNet reaches an average Dice score of 0.933 on cadaver projections when trained on data I only. Furthermore, SwinConvUNet achieves the highest average Dice score, 0.953, on cadaver projections when trained on the combined dataset.
CONCLUSIONS Our experiments quantitatively demonstrate the effectiveness of combining projections simulated through the two pathways for network training. Moreover, the proposed SwinConvUNet, trained on the simulated projections, delivers state-of-the-art, robust metal segmentation, as demonstrated in experiments on cadaver and clinical datasets. With accurate segmentations from the proposed model, MAR can be conducted even for highly truncated CBCT scans.
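The Dice scores quoted above follow the standard definition 2|A∩B|/(|A|+|B|) between a predicted and a reference binary mask; a minimal sketch (the `eps` smoothing term and example masks are illustrative, not from the paper):

```python
import numpy as np

def dice_score(pred, target, eps=1e-8):
    # Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|).
    # eps guards against division by zero when both masks are empty.
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / (pred.sum() + target.sum() + eps)
```

Identical masks score (near) 1.0 and disjoint masks score 0.0; averaging the per-projection scores gives figures like the 0.953 reported above.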
Affiliation(s)
- Fuxin Fan
- Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Fabian Wagner
- Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Andreas Maier
- Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
3
Selles M, van Osch JAC, Maas M, Boomsma MF, Wellenberg RHH. Advances in metal artifact reduction in CT images: A review of traditional and novel metal artifact reduction techniques. Eur J Radiol 2024; 170:111276. PMID: 38142571. DOI: 10.1016/j.ejrad.2023.111276.
Abstract
Metal artifacts degrade CT image quality, hampering clinical assessment. Numerous metal artifact reduction methods are available to improve the image quality of CT images with metal implants. This review provides an overview of traditional methods, including the modification of acquisition and reconstruction parameters, projection-based metal artifact reduction (MAR) techniques, dual-energy CT (DECT), and combinations of these techniques. Furthermore, it discusses the added value and challenges of novel metal artifact reduction techniques introduced over the past years, such as photon-counting CT (PCCT) and deep learning-based metal artifact reduction.
Collapse
Affiliation(s)
- Mark Selles
- Department of Radiology, Isala, 8025 AB Zwolle, the Netherlands; Department of Radiology and Nuclear Medicine, Amsterdam University Medical Centre, 1105 AZ Amsterdam, the Netherlands; Amsterdam Movement Sciences, 1081 BT Amsterdam, the Netherlands
- Mario Maas
- Department of Radiology and Nuclear Medicine, Amsterdam University Medical Centre, 1105 AZ Amsterdam, the Netherlands; Amsterdam Movement Sciences, 1081 BT Amsterdam, the Netherlands
- Ruud H H Wellenberg
- Department of Radiology and Nuclear Medicine, Amsterdam University Medical Centre, 1105 AZ Amsterdam, the Netherlands; Amsterdam Movement Sciences, 1081 BT Amsterdam, the Netherlands
4
Du M, Liang K, Zhang L, Gao H, Liu Y, Xing Y. Deep-Learning-Based Metal Artefact Reduction With Unsupervised Domain Adaptation Regularization for Practical CT Images. IEEE Transactions on Medical Imaging 2023; 42:2133-2145. PMID: 37022909. DOI: 10.1109/TMI.2023.3244252.
Abstract
CT metal artefact reduction (MAR) methods based on supervised deep learning are often hampered by the domain gap between simulated training data and real-application data: methods trained on simulation do not generalize well to practical data. Unsupervised MAR methods can be trained directly on practical data, but they learn MAR through indirect metrics and often perform unsatisfactorily. To tackle the domain gap, we propose a novel MAR method, UDAMAR, based on unsupervised domain adaptation (UDA). Specifically, we introduce a UDA regularization loss into a typical image-domain supervised MAR method, which mitigates the domain discrepancy between simulated and practical artefacts through feature-space alignment. Our adversarial UDA focuses on a low-level feature space, where the domain difference of metal artefacts mainly lies. UDAMAR can simultaneously learn MAR from simulated data with known labels and extract critical information from unlabeled practical data. Experiments on both clinical dental and torso datasets show the superiority of UDAMAR, which outperforms its supervised backbone and two state-of-the-art unsupervised methods. We carefully analyze UDAMAR through experiments on simulated metal artefacts and various ablation studies. On simulation, its performance close to the supervised methods and its advantages over the unsupervised methods justify its efficacy. Ablation studies on the influence of the UDA regularization loss weight, the UDA feature layers, and the amount of practical data used for training further demonstrate the robustness of UDAMAR. UDAMAR provides a simple, clean design and is easy to implement. These advantages make it a feasible solution for practical CT MAR.
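To make the combined objective concrete, here is a hedged numpy sketch of the two ingredients such a UDA regularizer balances: a supervised MAR loss on simulated pairs and a domain discriminator's cross-entropy on features from both domains. The linear discriminator and all tensors below are illustrative stand-ins, not UDAMAR's actual adversarial network:

```python
import numpy as np

def bce(p, y, eps=1e-8):
    # Binary cross-entropy between probabilities p and labels y
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def uda_style_losses(pred, target, feat_sim, feat_real, w):
    # Supervised MAR loss, available only for simulated data with labels
    l_sup = np.mean((pred - target) ** 2)
    # Domain loss: a (toy, linear) discriminator w tries to tell
    # simulated features (label 0) from practical features (label 1);
    # adversarial training pushes the generator to confuse it.
    feats = np.concatenate([feat_sim, feat_real])
    labels = np.concatenate([np.zeros(len(feat_sim)), np.ones(len(feat_real))])
    probs = 1.0 / (1.0 + np.exp(-(feats @ w)))
    l_dom = bce(probs, labels)
    return l_sup, l_dom
```

A maximally confused discriminator outputs 0.5 everywhere, so its loss saturates at ln 2; the total training objective would weight `l_dom` against `l_sup` with the regularization weight studied in the ablations.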
5
Wang H, Li Y, Zhang H, Meng D, Zheng Y. InDuDoNet+: A deep unfolding dual domain network for metal artifact reduction in CT images. Med Image Anal 2023; 85:102729. PMID: 36623381. DOI: 10.1016/j.media.2022.102729.
Abstract
During computed tomography (CT) imaging, metallic implants within patients often cause harmful artifacts, which degrade the visual quality of reconstructed CT images and negatively affect subsequent clinical diagnosis. For the metal artifact reduction (MAR) task, current deep learning-based methods have achieved promising performance. However, most share two common limitations: (1) the CT physical imaging geometry constraint is not comprehensively incorporated into the deep network structure; and (2) the framework has weak interpretability for the specific MAR task, so the role of each network module is difficult to evaluate. To alleviate these issues, in this paper we construct a novel deep unfolding dual-domain network, termed InDuDoNet+, into which the CT imaging process is finely embedded. Concretely, we derive a joint spatial and Radon domain reconstruction model and propose an optimization algorithm with only simple operators for solving it. By unfolding the iterative steps of the proposed algorithm into corresponding network modules, we easily build InDuDoNet+ with clear interpretability. Furthermore, we analyze CT values among different tissues and merge these prior observations into a prior network for InDuDoNet+, which significantly improves its generalization performance. Comprehensive experiments on synthesized and clinical data substantiate the superiority of the proposed method, including generalization performance beyond current state-of-the-art (SOTA) MAR methods. Code is available at https://github.com/hongwang01/InDuDoNet_plus.
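The Radon-domain data consistency that such unfolding networks interleave with learned prior modules can be sketched as a plain gradient step on 0.5·||Ax − y||²; the small dense matrix `A` below is a toy stand-in for the CT forward projector, not the paper's operator:

```python
import numpy as np

def data_consistency_step(x, A, y, eta):
    # One gradient step on 0.5 * ||A x - y||^2: pulls the current image
    # estimate x toward agreement with the measured sinogram y.
    # Unfolded dual-domain networks alternate this kind of step with
    # learned proximal/prior modules in image and sinogram space.
    return x - eta * A.T @ (A @ x - y)
```

With a step size below 1/||A||², repeating this update drives the sinogram-domain residual down, which is the physics constraint the unfolded architecture embeds.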
Affiliation(s)
- Haimiao Zhang
- Beijing Information Science and Technology University, Beijing, China
- Deyu Meng
- Xi'an Jiaotong University, Xi'an, China; Peng Cheng Laboratory, Shenzhen, China; Macau University of Science and Technology, Taipa, Macao
6
Zhu M, Zhu Q, Song Y, Guo Y, Zeng D, Bian Z, Wang Y, Ma J. Physics-informed sinogram completion for metal artifact reduction in CT imaging. Phys Med Biol 2023; 68. PMID: 36808913. DOI: 10.1088/1361-6560/acbddf.
Abstract
OBJECTIVE Metal artifacts in computed tomography (CT) imaging are unavoidably detrimental to clinical diagnosis and treatment outcomes. Most metal artifact reduction (MAR) methods easily suffer from over-smoothing and loss of structural detail near the metal implants, especially implants with irregular elongated shapes. To address this problem, we present the physics-informed sinogram completion (PISC) method for MAR in CT imaging, which reduces metal artifacts and recovers more structural texture. APPROACH Specifically, the original uncorrected sinogram is first completed by a normalized linear interpolation algorithm to reduce metal artifacts. Simultaneously, the uncorrected sinogram is also corrected with a beam-hardening correction physical model, to recover the latent structure information in the metal trajectory region by leveraging the attenuation characteristics of different materials. Both corrected sinograms are fused with pixel-wise adaptive weights, which are manually designed according to the shape and material information of the metal implants. To further reduce artifacts and improve CT image quality, a post-processing frequency-split algorithm is adopted to yield the final corrected CT image after reconstructing the fused sinogram. MAIN RESULTS We qualitatively and quantitatively evaluated the presented PISC method on two simulated datasets and three real datasets. All results demonstrate that PISC can effectively correct artifacts from metal implants of various shapes and materials, in terms of both artifact suppression and structure preservation. SIGNIFICANCE We propose a sinogram-domain MAR method that compensates for the over-smoothing problem of most MAR methods by taking advantage of physical prior knowledge, and that has the potential to improve the performance of deep learning-based MAR approaches.
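The linear interpolation at the core of such sinogram completion can be sketched per detector row as follows; the normalization by a prior sinogram and PISC's adaptive fusion weights are omitted, so this is only the classic interpolation baseline:

```python
import numpy as np

def li_complete_row(row, metal_mask):
    # Replace sinogram values inside the metal trace with linear
    # interpolation anchored on the unaffected detector bins.
    # (Normalized variants divide by a prior sinogram first, interpolate,
    # then multiply back, which flattens the profile being interpolated.)
    idx = np.arange(len(row))
    good = ~metal_mask
    out = row.copy()
    out[metal_mask] = np.interp(idx[metal_mask], idx[good], row[good])
    return out
```

On a locally linear sinogram profile this interpolation is exact, which is why the method works well away from strong structure but over-smooths detail crossing the metal trace, as the abstract notes.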
Affiliation(s)
- Manman Zhu
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China; Pazhou Lab (Huangpu), Guangzhou 510700, People's Republic of China
- Qisen Zhu
- Pazhou Lab (Huangpu), Guangzhou 510700, People's Republic of China
- Yuyan Song
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China; Pazhou Lab (Huangpu), Guangzhou 510700, People's Republic of China
- Yi Guo
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China; Pazhou Lab (Huangpu), Guangzhou 510700, People's Republic of China
- Dong Zeng
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China; Pazhou Lab (Huangpu), Guangzhou 510700, People's Republic of China
- Zhaoying Bian
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China; Pazhou Lab (Huangpu), Guangzhou 510700, People's Republic of China
- Yongbo Wang
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China; Pazhou Lab (Huangpu), Guangzhou 510700, People's Republic of China
- Jianhua Ma
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China; Pazhou Lab (Huangpu), Guangzhou 510700, People's Republic of China
7
Koetzier LR, Mastrodicasa D, Szczykutowicz TP, van der Werf NR, Wang AS, Sandfort V, van der Molen AJ, Fleischmann D, Willemink MJ. Deep Learning Image Reconstruction for CT: Technical Principles and Clinical Prospects. Radiology 2023; 306:e221257. PMID: 36719287. PMCID: PMC9968777. DOI: 10.1148/radiol.221257.
Abstract
Filtered back projection (FBP) has been the standard CT image reconstruction method for 4 decades. A simple, fast, and reliable technique, FBP has delivered high-quality images in several clinical applications. However, with faster and more advanced CT scanners, FBP has become increasingly obsolete. Higher image noise and more artifacts are especially noticeable in lower-dose CT imaging using FBP. This performance gap was partly addressed by model-based iterative reconstruction (MBIR). Yet, its "plastic" image appearance and long reconstruction times have limited widespread application. Hybrid iterative reconstruction partially addressed these limitations by blending FBP with MBIR and is currently the state-of-the-art reconstruction technique. In the past 5 years, deep learning reconstruction (DLR) techniques have become increasingly popular. DLR uses artificial intelligence to reconstruct high-quality images from lower-dose CT faster than MBIR. However, the performance of DLR algorithms relies on the quality of data used for model training. Higher-quality training data will become available with photon-counting CT scanners. At the same time, spectral data would greatly benefit from the computational abilities of DLR. This review presents an overview of the principles, technical approaches, and clinical applications of DLR, including metal artifact reduction algorithms. In addition, emerging applications and prospects are discussed.
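The filtering step of FBP can be sketched by applying a Ram-Lak (ramp) frequency response to each projection before back projection; this ignores apodization windows and the interpolated back projection itself, so it is only the core idea:

```python
import numpy as np

def ramp_filter(n):
    # Ram-Lak filter: frequency response |f| over the n FFT bins
    # of a projection; suppresses the DC term and the low-frequency
    # blur that unfiltered back projection would produce.
    return np.abs(np.fft.fftfreq(n))

def filter_projection(proj):
    # Frequency-domain filtering of one projection (one sinogram row)
    return np.real(np.fft.ifft(np.fft.fft(proj) * ramp_filter(len(proj))))
```

Because the filter is zero at DC, a constant projection filters to zero; iterative (MBIR) and deep learning (DLR) reconstruction replace or augment this fixed analytic filter with model- or data-driven steps.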
Affiliation(s)
- Timothy P. Szczykutowicz, Niels R. van der Werf, Adam S. Wang, Veit Sandfort, Aart J. van der Molen, Dominik Fleischmann, Martin J. Willemink
- From the Department of Radiology (L.R.K., D.M., A.S.W., V.S., D.F., M.J.W.) and Stanford Cardiovascular Institute (D.M., D.F., M.J.W.), Stanford University School of Medicine, 300 Pasteur Dr, Stanford, CA 94305-5105; Department of Radiology, University of Wisconsin–Madison, School of Medicine and Public Health, Madison, Wis (T.P.S.); Department of Radiology, Erasmus Medical Center, Rotterdam, the Netherlands (N.R.v.d.W.); Clinical Science Western Europe, Philips Healthcare, Best, the Netherlands (N.R.v.d.W.); and Department of Radiology, Leiden University Medical Center, Leiden, the Netherlands (A.J.v.d.M.)
8
Liu Y, Yan R, Liu Y, Zhang P, Chen Y, Gui Z. Enhancement based convolutional dictionary network with adaptive window for low-dose CT denoising. Journal of X-Ray Science and Technology 2023; 31:1165-1187. PMID: 37694333. DOI: 10.3233/XST-230094.
Abstract
BACKGROUND Recently, one promising approach to suppressing noise and artifacts in low-dose CT (LDCT) images is the CNN-based approach, which learns the mapping from LDCT to normal-dose CT (NDCT). However, most CNN-based methods are purely data-driven, thus lacking interpretability and often losing details. OBJECTIVE To solve this problem, we propose a deep convolutional dictionary learning method for LDCT denoising, in which a novel convolutional dictionary learning model with adaptive window (CDL-AW) is designed, and a corresponding enhancement-based convolutional dictionary learning network (ECDAW-Net) is constructed to unfold the CDL-AW model iteratively using the proximal gradient descent technique. METHODS In detail, an adaptive window-constrained convolutional dictionary atom is proposed to alleviate the spectrum leakage caused by data truncation during convolution. Furthermore, in the unfolding iterations of ECDAW-Net, a multi-scale edge extraction module consisting of LoG and Sobel convolution layers is proposed to supplement lost textures and details. Additionally, to further improve detail retention, ECDAW-Net is trained with a compound loss function combining pixel-level MSE loss and a proposed patch-level loss, which helps retain richer structural information. RESULTS Applied to the Mayo dataset, ECDAW-Net obtained the highest peak signal-to-noise ratio (33.94) and sub-optimal structural similarity (0.92). CONCLUSIONS Compared with some state-of-the-art methods, the interpretable ECDAW-Net performs well in suppressing noise and artifacts while preserving tissue textures.
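The reported PSNR can be reproduced with the standard definition 10·log₁₀(peak²/MSE); a minimal sketch (the `data_range` default is an assumption about image normalization, not stated in the abstract):

```python
import numpy as np

def psnr(x, ref, data_range=1.0):
    # Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)
    mse = np.mean((x.astype(np.float64) - ref.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```

For example, an image off by a uniform 0.1 on a [0, 1] range has MSE 0.01 and hence PSNR 20 dB; higher values, such as the 33.94 reported above, indicate smaller reconstruction error.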
Affiliation(s)
- Yi Liu
- The State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
- Rongbiao Yan
- The State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
- Yuhang Liu
- The State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
- Pengcheng Zhang
- The State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
- Yang Chen
- The Key Laboratory of Computer Network and Information Integration, Southeast University, Ministry of Education, Nanjing, China
- Zhiguo Gui
- The State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
9
Liu S, Cai T, Tang X, Zhang Y, Wang C. COVID-19 diagnosis via chest X-ray image classification based on multiscale class residual attention. Comput Biol Med 2022; 149:106065. PMID: 36081225. PMCID: PMC9433340. DOI: 10.1016/j.compbiomed.2022.106065.
Abstract
To detect COVID-19 effectively, a multiscale class residual attention (MCRA) network is proposed for chest X-ray (CXR) image classification. First, to overcome data shortage and improve the robustness of the network, pixel-level image mixing of local regions is introduced to achieve data augmentation and reduce noise. Second, a multi-scale fusion strategy is adopted to extract global contextual information at different scales and enhance semantic representation. Finally, class residual attention is employed to generate spatial attention for each class, which avoids inter-class interference and enhances related features to further improve COVID-19 detection. Experimental results show that the network achieves superior diagnostic performance on the COVIDx dataset, with accuracy, PPV, sensitivity, specificity, and F1-score of 97.71%, 96.76%, 96.56%, 98.96%, and 96.64%, respectively; moreover, heat maps endow the deep model with some interpretability.
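The metrics reported above follow the standard definitions from binary confusion counts; a minimal sketch (the example counts in the usage note are illustrative, not the paper's):

```python
def classification_metrics(tp, fp, tn, fn):
    # Standard binary classification metrics from confusion counts:
    # true/false positives (tp/fp) and true/false negatives (tn/fn).
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    sensitivity = tp / (tp + fn)   # recall / true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return accuracy, ppv, sensitivity, specificity, f1
```

For instance, counts of tp=50, fp=10, tn=30, fn=10 give accuracy 0.80 and F1 ≈ 0.833; multi-class results like COVIDx's are typically averaged over per-class one-vs-rest counts.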