1. Liu T, Chen M, Duan Z, Cui A. Multi-focused image fusion algorithm based on multi-scale hybrid attention residual network. PLoS One 2024;19:e0302545. PMID: 38787800; PMCID: PMC11125476; DOI: 10.1371/journal.pone.0302545.
Abstract
To improve the detection of focused regions in image fusion and realize end-to-end decision map optimization, we design a multi-focus image fusion network based on deep learning. The network is trained with unsupervised learning, and a multi-scale hybrid attention residual network model is introduced to extract features at different levels of the image. In the training stage, multi-scale features are extracted from two source images with different focal points using hybrid multi-scale residual blocks (MSRB), and an up-down projection module (UDP) is introduced to obtain multi-scale edge information; the extracted features are then processed further to obtain deeper image features. These blocks exploit multi-scale feature information effectively without increasing the number of parameters. In the test phase, the deep features of the image are extracted and their activity level is measured in the spatial frequency domain to obtain an initial decision map, and post-processing techniques eliminate edge errors. Finally, the optimized decision map is combined with the source images to produce the final fused image. Comparative experiments show that the proposed model achieves better fusion performance in subjective evaluation, producing more robust fused images with richer details, and it also scores higher on the objective evaluation metrics.
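
As a concrete illustration of the test-phase decision step described above, the sketch below computes a per-window spatial frequency as the activity measure and derives an initial decision map from it. This is a minimal NumPy rendering of the general technique, not the authors' code: the window size, the use of plain arrays in place of the network's deep features, and the hard comparison rule are all assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def spatial_frequency(img: np.ndarray, win: int = 7) -> np.ndarray:
    """Per-pixel spatial frequency over a win x win neighbourhood."""
    rf = np.zeros_like(img, dtype=np.float64)
    cf = np.zeros_like(img, dtype=np.float64)
    rf[:, 1:] = (img[:, 1:] - img[:, :-1]) ** 2   # row-frequency term
    cf[1:, :] = (img[1:, :] - img[:-1, :]) ** 2   # column-frequency term
    # uniform_filter gives the windowed mean of each squared-difference map
    return np.sqrt(uniform_filter(rf, win) + uniform_filter(cf, win))

def initial_decision_map(feat_a: np.ndarray, feat_b: np.ndarray) -> np.ndarray:
    """1 where source A is judged in focus, 0 where source B is."""
    return (spatial_frequency(feat_a) >= spatial_frequency(feat_b)).astype(np.uint8)

def fuse(src_a: np.ndarray, src_b: np.ndarray, dmap: np.ndarray) -> np.ndarray:
    """Combine the sources pixel-wise according to the (optimized) decision map."""
    return dmap * src_a + (1 - dmap) * src_b
```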

Affiliations
- Tingting Liu: Sichuan Key Laboratory of Artificial Intelligence, Sichuan University of Science and Engineering, Yibin, Sichuan, China; School of Automation and Information, Sichuan University of Science and Engineering, Yibin, Sichuan, China
- Mingju Chen: Sichuan Key Laboratory of Artificial Intelligence, Sichuan University of Science and Engineering, Yibin, Sichuan, China; School of Automation and Information, Sichuan University of Science and Engineering, Yibin, Sichuan, China
- Zhengxu Duan: Sichuan Key Laboratory of Artificial Intelligence, Sichuan University of Science and Engineering, Yibin, Sichuan, China; School of Automation and Information, Sichuan University of Science and Engineering, Yibin, Sichuan, China
- Anle Cui: Sichuan Key Laboratory of Artificial Intelligence, Sichuan University of Science and Engineering, Yibin, Sichuan, China; School of Automation and Information, Sichuan University of Science and Engineering, Yibin, Sichuan, China

2. He D, Li W, Wang G, Huang Y, Liu S. LRFNet: A real-time medical image fusion method guided by detail information. Comput Biol Med 2024;173:108381. PMID: 38569237; DOI: 10.1016/j.compbiomed.2024.108381.
Abstract
Multimodal medical image fusion (MMIF) technology plays a crucial role in medical diagnosis and treatment by integrating different images to obtain fused images with comprehensive information. Deep learning-based fusion methods have demonstrated superior performance, but some still encounter challenges such as imbalanced retention of color and texture information and low fusion efficiency. To alleviate these issues, this paper presents a real-time MMIF method called the lightweight residual fusion network (LRFNet). First, a feature extraction framework with three branches is designed: two independent branches fully extract brightness and texture information, while the fusion branch lets different modal information interact at a shallow level, thereby better retaining brightness and texture. Furthermore, a lightweight residual unit replaces the conventional residual convolution in the model, improving fusion efficiency and reducing the overall model size to roughly one-fifth. Finally, considering that the high-frequency image produced by the wavelet transform contains abundant edge and texture information, an adaptive strategy is proposed for assigning weights to the loss function based on the information content of the high-frequency image. This strategy effectively guides the model toward preserving intricate details. Experimental results on MRI and functional images demonstrate that the proposed method exhibits superior fusion performance and efficiency compared to alternative approaches. The code of LRFNet is available at https://github.com/HeDan-11/LRFNet.
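
The adaptive loss-weighting idea, scaling how strongly detail is penalized by the amount of high-frequency wavelet content, can be sketched roughly as below. The 'haar' wavelet, the energy-ratio formula, and the way the weight enters the loss are illustrative assumptions, not the paper's exact strategy.

```python
import numpy as np
import pywt  # PyWavelets

def highfreq_weight(ref: np.ndarray, eps: float = 1e-8) -> float:
    """Heuristic weight: ratio of wavelet detail-band energy to total energy.
    More edge/texture content in the reference -> larger detail-loss weight."""
    _, (lh, hl, hh) = pywt.dwt2(ref, "haar")   # one-level 2D DWT
    hf_energy = np.mean(lh**2) + np.mean(hl**2) + np.mean(hh**2)
    return float(hf_energy / (np.mean(ref**2) + eps))

# Usage inside a training loop (l_intensity and l_detail are assumed terms):
# loss = l_intensity + highfreq_weight(ref) * l_detail
```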

Affiliations
- Dan He: School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Weisheng Li: School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China; Chongqing Key Laboratory of Image Recognition, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China; Key Laboratory of Cyberspace Big Data Intelligent Security (Chongqing University of Posts and Telecommunications), Ministry of Education, Chongqing, 400065, China
- Guofen Wang: College of Computer and Information Science, Chongqing Normal University, Chongqing, 401331, China
- Yuping Huang: School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Shiqiang Liu: School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China

3. Liu X, Su S, Gu W, Yao T, Shen J, Mo Y. Super-Resolution Reconstruction of CT Images Based on Multi-scale Information Fused Generative Adversarial Networks. Ann Biomed Eng 2024;52:57-70. PMID: 38064116; DOI: 10.1007/s10439-023-03412-w.
Abstract
The popularization and widespread use of computed tomography (CT) in medicine has drawn public attention to the radiation exposure endured by patients. Reducing the radiation dose may introduce scattering noise and lower the resolution, which can adversely affect radiologists' judgment. Hence, this paper introduces a new network called PANet-UP-ESRGAN (PAUP-ESRGAN), specifically designed to obtain low-dose CT (LDCT) images with high peak signal-to-noise ratio (PSNR) and high resolution (HR). The model is trained on synthetic medical image data based on a generative adversarial network (GAN). A degradation modeling process is introduced to accurately represent realistic degradation complexities. To reconstruct image edge textures, a pyramidal attention module called PANet is inserted in the middle of the multiple residual dense blocks (MRDB) in the generator to focus on high-frequency image information. A U-Net discriminator with spectral normalization is also designed to improve efficiency and stabilize the training dynamics. The proposed PAUP-ESRGAN model was evaluated on abdomen and lung image datasets and demonstrated a significant improvement in model robustness and LDCT image detail reconstruction compared to the latest Real-ESRGAN network. Results showed that the mean PSNR increased by 19.1%, 25.05%, and 21.25%, the mean SSIM increased by 0.4% at each scale, and the mean NRMSE decreased by 0.25%, 0.25%, and 0.35% at the 2×, 4×, and 8× super-resolution scales, respectively. Experimental results demonstrate that our method outperforms state-of-the-art super-resolution methods on restoring CT images with respect to the PSNR, structural similarity (SSIM), and normalized root-mean-square error (NRMSE) indices.
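
The degradation modeling step, synthesizing low-resolution inputs from high-resolution CT slices, might look roughly like the following first-order pipeline (blur, downsample, add noise). Kernel width, noise level, and the subsampling scheme are assumptions; the paper's actual degradation model is more elaborate.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(hr: np.ndarray, scale: int = 4, sigma_blur: float = 1.2,
            sigma_noise: float = 0.01) -> np.ndarray:
    """Make a synthetic LR training input from an HR slice in [0, 1]."""
    blurred = gaussian_filter(hr, sigma_blur)            # isotropic Gaussian blur
    lr = blurred[::scale, ::scale]                       # simple subsampling
    lr = lr + np.random.normal(0.0, sigma_noise, lr.shape)  # additive noise
    return np.clip(lr, 0.0, 1.0)
```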

Affiliations
- Xiaobao Liu: Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, No. 727, Jingming South Road, Chenggong District, Kunming, 650500, China
- Shuailin Su: Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, No. 727, Jingming South Road, Chenggong District, Kunming, 650500, China
- Wenjuan Gu: Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, No. 727, Jingming South Road, Chenggong District, Kunming, 650500, China
- Tingqiang Yao: Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, No. 727, Jingming South Road, Chenggong District, Kunming, 650500, China
- Jihong Shen: The First Department of Urology, The First Affiliated Hospital of Kunming Medical University, 295 Xichang Road, Chenggong District, Kunming, 650032, China
- Yin Mo: The First Department of Urology, The First Affiliated Hospital of Kunming Medical University, 295 Xichang Road, Chenggong District, Kunming, 650032, China

4. Huang W, Zhang H, Cheng Y, Quan X. DRCM: a disentangled representation network based on coordinate and multimodal attention for medical image fusion. Front Physiol 2023;14:1241370. PMID: 38028809; PMCID: PMC10656763; DOI: 10.3389/fphys.2023.1241370.
Abstract
Recent studies on deep learning-based medical image fusion have made remarkable progress, but the common and exclusive features of different modalities, and especially their subsequent enhancement, are often ignored. Since medical images of different modalities carry unique information, dedicated learning of exclusive features is needed to express this information and obtain a fused medical image with more detail. We therefore propose an attention-based disentangled representation network for medical image fusion, which designs coordinate attention and multimodal attention to extract and strengthen common and exclusive features. First, the common and exclusive features of each modality are obtained by cross mutual information and adversarial objective methods, respectively. Then, coordinate attention enhances the common and exclusive features of the different modalities, and the exclusive features are weighted by multimodal attention. Finally, these two kinds of features are fused. The effectiveness of the three novel modules is verified by ablation experiments. Furthermore, eight comparison methods are selected for qualitative analysis, and four metrics are used for quantitative comparison. The DRCM achieves better results on the SCD, Nabf, and MS-SSIM metrics, indicating that it further improves the visual quality of the fused image, retaining more information from the source images with less noise. The comprehensive comparison and analysis of the experimental results show that the DRCM outperforms the comparison methods.
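
For readers unfamiliar with coordinate attention, a minimal PyTorch sketch in the style of Hou et al.'s original formulation is given below. The reduction ratio and the simplified activations are assumptions; the DRCM's exact module is not specified in this abstract.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate attention: direction-aware pooling along H and W produces
    two attention maps that jointly gate the input. reduction=8 is assumed."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        mid = max(channels // reduction, 8)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        pool_h = x.mean(dim=3, keepdim=True)                        # N,C,H,1
        pool_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)    # N,C,W,1
        y = self.act(self.conv1(torch.cat([pool_h, pool_w], dim=2)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                       # N,C,H,1
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))   # N,C,1,W
        return x * a_h * a_w
```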

Affiliations
- Han Zhang: College of Artificial Intelligence, Nankai University, Tianjin, China

5. Ye S, Wang T, Ding M, Zhang X. F-DARTS: Foveated Differentiable Architecture Search Based Multimodal Medical Image Fusion. IEEE Trans Med Imaging 2023;42:3348-3361. PMID: 37285248; DOI: 10.1109/tmi.2023.3283517.
Abstract
Multimodal medical image fusion (MMIF) is highly significant in fields such as disease diagnosis and treatment. Traditional MMIF methods struggle to provide satisfactory fusion accuracy and robustness because of hand-crafted components such as image transforms and fusion strategies. Existing deep learning-based fusion methods also find it difficult to guarantee fusion quality, as they adopt human-designed network structures and relatively simple loss functions and ignore human visual characteristics during weight learning. To address these issues, we present a foveated differentiable architecture search (F-DARTS) based unsupervised MMIF method. In this method, a foveation operator is introduced into the weight learning process to fully exploit human visual characteristics for effective image fusion. Meanwhile, a distinctive unsupervised loss function is designed for network training by integrating mutual information, the sum of the correlations of differences, structural similarity, and an edge preservation value. Based on the presented foveation operator and loss function, an end-to-end encoder-decoder network architecture is searched using F-DARTS to produce the fused image. Experimental results on three multimodal medical image datasets demonstrate that F-DARTS performs better than several traditional and deep learning-based fusion methods, providing visually superior fused results and better objective evaluation metrics.
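
The flavor of such a composite unsupervised fusion loss can be sketched as follows, combining structural similarity with an edge-preservation term between the fused image and both sources. The weights and the Sobel-based edge term are assumptions, and the mutual-information and SCD terms the paper also uses are omitted for brevity.

```python
import numpy as np
from scipy.ndimage import sobel
from skimage.metrics import structural_similarity as ssim

def edge_map(img: np.ndarray) -> np.ndarray:
    """Sobel gradient magnitude as a simple edge strength map."""
    return np.hypot(sobel(img, 0), sobel(img, 1))

def fusion_loss(fused, src_a, src_b, w_ssim: float = 1.0, w_edge: float = 0.5):
    """Lower is better: SSIM toward both sources plus edge preservation."""
    l_ssim = (2.0 - ssim(fused, src_a, data_range=1.0)
                  - ssim(fused, src_b, data_range=1.0))
    # the fused edges should match the stronger of the two source edges
    l_edge = np.mean(np.abs(edge_map(fused)
                            - np.maximum(edge_map(src_a), edge_map(src_b))))
    return w_ssim * l_ssim + w_edge * l_edge
```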

6. Lu X, Liu X, Xiao Z, Zhang S, Huang J, Yang C, Liu S. Self-supervised dual-head attentional bootstrap learning network for prostate cancer screening in transrectal ultrasound images. Comput Biol Med 2023;165:107337. PMID: 37672927; DOI: 10.1016/j.compbiomed.2023.107337.
Abstract
Current convolutional neural network-based automatic ultrasound classification models for prostate cancer often rely on extensive manual labeling. Although self-supervised learning (SSL) has shown promise in addressing this problem, data from medical scenarios contain intra-class similarity conflicts, so loss calculations that directly include positive and negative sample pairs can mislead training. SSL methods also tend to focus on global consistency at the image level and do not consider the internal informative relationships of the feature map. To improve the efficiency of prostate cancer diagnosis by using SSL to learn key diagnostic information in ultrasound images, we propose a self-supervised dual-head attentional bootstrap learning network (SDABL), consisting of an Online-Net and a Target-Net. A self-position attention module (SPAM) and an adaptive maximum channel attention module (CAAM) are inserted into both paths simultaneously; they capture positional and inter-channel attention of the original feature map with a small number of parameters, solving the information optimization problem of feature maps in SSL. In the loss calculation, we discard the construction of negative sample pairs and instead guide the network to learn consistency in position and channel space by continuously drawing the embedding representations of positive samples closer. Numerous experiments on our prostate transrectal ultrasound (TRUS) dataset show that SDABL pre-training has significant advantages over both mainstream contrastive learning methods and other attention-based methods. Specifically, the SDABL pre-trained backbone achieves 80.46% accuracy on our TRUS dataset after fine-tuning.
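
The negative-free bootstrap objective, pulling the Online-Net's prediction toward a stop-gradient Target-Net embedding, can be sketched in PyTorch as below. The cosine-style loss and the EMA momentum value are assumptions drawn from BYOL-style methods, not SDABL's published code.

```python
import torch
import torch.nn.functional as F

def bootstrap_loss(online_pred: torch.Tensor, target_proj: torch.Tensor) -> torch.Tensor:
    """Negative-free consistency loss: pull the online prediction toward the
    (stop-gradient) target embedding. Shapes: (B, D)."""
    p = F.normalize(online_pred, dim=-1)
    z = F.normalize(target_proj.detach(), dim=-1)   # no gradient into Target-Net
    return (2 - 2 * (p * z).sum(dim=-1)).mean()

@torch.no_grad()
def ema_update(target_net, online_net, tau: float = 0.99):
    """Target-Net tracks Online-Net by exponential moving average."""
    for pt, po in zip(target_net.parameters(), online_net.parameters()):
        pt.mul_(tau).add_(po, alpha=1 - tau)
```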

Affiliations
- Xu Lu: Guangdong Polytechnic Normal University, Guangzhou 510665, China; Guangdong Provincial Key Laboratory of Intellectual Property & Big Data, Guangzhou 510665, China; Pazhou Lab, Guangzhou 510330, China
- Xiangjun Liu: Guangdong Polytechnic Normal University, Guangzhou 510665, China
- Zhiwei Xiao: Guangdong Polytechnic Normal University, Guangzhou 510665, China
- Shulian Zhang: Guangdong Polytechnic Normal University, Guangzhou 510665, China
- Jun Huang: Department of Ultrasonography, The First Affiliated Hospital of Jinan University, Guangzhou 510630, China
- Chuan Yang: Department of Ultrasonography, The First Affiliated Hospital of Jinan University, Guangzhou 510630, China
- Shaopeng Liu: Guangdong Polytechnic Normal University, Guangzhou 510665, China

7. Li H, Zeng P, Bai C, Wang W, Yu Y, Yu P. PMJAF-Net: Pyramidal multi-scale joint attention and adaptive fusion network for explainable skin lesion segmentation. Comput Biol Med 2023;165:107454. PMID: 37716246; DOI: 10.1016/j.compbiomed.2023.107454.
Abstract
Traditional convolutional neural networks have achieved remarkable success in skin lesion segmentation. However, successive pooling operations and convolution strides reduce feature resolution and hinder dense prediction of spatial information, resulting in blurred boundaries, low accuracy, and poor interpretability when segmenting irregular lesions under low contrast. To solve these issues, a pyramidal multi-scale joint attention and adaptive fusion network (PMJAF-Net) for explainable skin lesion segmentation is proposed. First, an adaptive spatial attention module is designed to establish long-range correlations between pixels, enrich global and local contextual information, and refine detailed features. Subsequently, an efficient pyramidal multi-scale channel attention module is proposed to capture multi-scale information and edge features. Meanwhile, a channel attention module is devised to establish long-range correlations between channels and highlight the most relevant feature channels, capturing the multi-scale key information on each channel. Thereafter, a multi-scale adaptive fusion attention module is put forward to efficiently fuse the features from different decoding stages. Finally, a novel hybrid loss function based on region-salient features and boundary quality is presented to guide the network to learn at the map, patch, and pixel levels and to accurately predict lesion regions with clear boundaries. In addition, attention weight maps are visualized to enhance the interpretability of the proposed model. Comprehensive experiments on four public skin lesion datasets demonstrate that the proposed network outperforms state-of-the-art methods, improving the segmentation evaluation metrics Dice, JI, and ACC to 92.65%, 87.86%, and 96.26%, respectively.

Affiliations
- Haiyan Li: School of Information, Yunnan University, Kunming, 650504, China
- Peng Zeng: School of Information, Yunnan University, Kunming, 650504, China
- Chongbin Bai: Otolaryngology Department, Honghe Prefecture Second People's Hospital, Jianshui, 654300, China
- Wei Wang: School of Software, Yunnan University, Kunming, 650504, China
- Ying Yu: School of Information, Yunnan University, Kunming, 650504, China
- Pengfei Yu: School of Information, Yunnan University, Kunming, 650504, China

8. Li X, Li X. Multimodal brain image fusion based on error texture elimination and salient feature detection. Front Neurosci 2023;17:1204263. PMID: 37521686; PMCID: PMC10372795; DOI: 10.3389/fnins.2023.1204263.
Abstract
As an important clinically oriented information fusion technology, multimodal medical image fusion integrates useful information from different modal images into a comprehensive fused image. Nevertheless, existing methods routinely consider only energy information when fusing low-frequency or base layers, ignoring the fact that useful texture information may exist in pixels with lower energy values. Thus, erroneous textures may be introduced into the fusion results. To resolve this problem, we propose a novel multimodal brain image fusion algorithm based on error texture elimination. A two-layer decomposition scheme is first implemented to generate the high- and low-frequency subbands. We propose a salient feature detection operator based on gradient difference and entropy, which integrates the gradient difference and the amount of information in the high-frequency subbands to effectively identify clearly detailed information. Subsequently, we detect the energy information of the low-frequency subband by using the local phase feature of each pixel as the intensity measurement within a random walk algorithm. Finally, we propose a rolling guidance filtering iterative least-squares model to reconstruct the texture information in the low-frequency components. Through extensive experiments, we demonstrate that the proposed algorithm outperforms several state-of-the-art methods. Our source code is publicly available at https://github.com/ixilai/ETEM.
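
A rough sketch of a saliency operator of the kind described, combining local gradient magnitude with local entropy to pick the more informative high-frequency coefficient, is given below. The window radius, the product combination, and the rescaling needed by the rank-entropy filter are assumptions; the authors' exact operator is in their linked repository.

```python
import numpy as np
from scipy.ndimage import sobel
from skimage.filters.rank import entropy
from skimage.morphology import disk
from skimage.util import img_as_ubyte

def saliency(band: np.ndarray, win: int = 5) -> np.ndarray:
    """Gradient-difference x entropy saliency for a high-frequency subband."""
    grad = np.hypot(sobel(band, 0), sobel(band, 1))
    # rank entropy needs uint8 input; rescale the subband to [0, 1] first
    ent = entropy(img_as_ubyte(np.clip(band, -1, 1) * 0.5 + 0.5), disk(win))
    return grad * ent

def fuse_highfreq(band_a: np.ndarray, band_b: np.ndarray) -> np.ndarray:
    """Keep, per pixel, the coefficient from the more salient source."""
    return np.where(saliency(band_a) >= saliency(band_b), band_a, band_b)
```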

9. Dinh PH. Medical image fusion based on enhanced three-layer image decomposition and Chameleon swarm algorithm. Biomed Signal Process Control 2023. DOI: 10.1016/j.bspc.2023.104740.

10. Yu W, Wang R, Hu X. Learning Attentional Communication with a Common Network for Multiagent Reinforcement Learning. Comput Intell Neurosci 2023;2023:5814420. PMID: 37416594; PMCID: PMC10322483; DOI: 10.1155/2023/5814420.
Abstract
For multiagent communication and cooperation tasks in partially observable environments, most existing works use only the information contained in a network's hidden layers at the current moment, limiting the source of information. In this paper, we propose a novel algorithm named multiagent attentional communication with the common network (MAACCN), which adds a consensus information module to expand the source of communication information. We regard the overall network that has performed best over historical moments as the common network, and we extract consensus knowledge by leveraging this network. In particular, we combine current observation information with the consensus knowledge through an attention mechanism to infer more effective information as input for decision-making. Experiments conducted on the StarCraft multiagent challenge (SMAC) demonstrate the effectiveness of MAACCN in comparison to a set of baselines and reveal that MAACCN improves performance by more than 20%, especially in a super-hard scenario.
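
The attention step, using the current observation as the query over consensus features from the common network, can be sketched as scaled dot-product attention; the concatenation-based fusion of observation and context is an assumption.

```python
import torch
import torch.nn.functional as F

def attend_consensus(obs: torch.Tensor, consensus: torch.Tensor) -> torch.Tensor:
    """obs: (B, D) current observation embedding; consensus: (B, T, D)
    consensus features extracted from the common network. Returns the
    observation concatenated with an attention-weighted consensus context."""
    q = obs.unsqueeze(1)                                             # B,1,D
    attn = F.softmax(q @ consensus.transpose(1, 2)
                     / obs.size(-1) ** 0.5, dim=-1)                  # B,1,T
    ctx = (attn @ consensus).squeeze(1)                              # B,D
    return torch.cat([obs, ctx], dim=-1)                             # B,2D
```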

Affiliations
- Wenwu Yu: University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
- Rui Wang: Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
- Xiaohui Hu: Institute of Software, Chinese Academy of Sciences, Beijing 100190, China

11. Fu J, He B, Yang J, Liu J, Ouyang A, Wang Y. CDRNet: Cascaded dense residual network for grayscale and pseudocolor medical image fusion. Comput Methods Programs Biomed 2023;234:107506. PMID: 37003041; DOI: 10.1016/j.cmpb.2023.107506.
Abstract
OBJECTIVE Multimodal medical fusion images are widely used in clinical medicine, computer-aided diagnosis, and other fields, but existing multimodal medical image fusion algorithms generally suffer from shortcomings such as complex calculations, blurred details, and poor adaptability. To solve these problems, we propose a cascaded dense residual network and apply it to grayscale and pseudocolor medical image fusion. METHODS The cascaded dense residual network uses a multiscale dense network and a residual network as its basic architecture, and a multilevel converged network is obtained through cascading. The network contains three stages: the first-stage network takes two images of different modalities as input and produces fused image 1, the second-stage network takes fused image 1 as input and produces fused image 2, and the third-stage network takes fused image 2 as input and produces fused image 3. The multimodal medical images are trained through each stage of the network, and the output fused image is enhanced step by step. RESULTS As the number of cascaded stages increases, the fused image becomes increasingly clear. In numerous fusion experiments, the fused images of the proposed algorithm show higher edge strength, richer details, and better performance on the objective indicators than the reference algorithms. CONCLUSION Compared with the reference algorithms, the proposed algorithm preserves the original information better and achieves higher edge strength, richer details, and improvements in the four objective metrics SF, AG, MZ, and EN.

Affiliations
- Jun Fu: School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou, 563006, China
- Baiqing He: Nanchang Institute of Technology, Nanchang, Jiangxi, 330044, China
- Jie Yang: School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou, 563006, China
- Jianpeng Liu: School of Science, East China Jiaotong University, Nanchang, Jiangxi, 330013, China
- Aijia Ouyang: School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou, 563006, China
- Ya Wang: School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou, 563006, China

12. Song E, Zhan B, Liu H, Cetinkaya C, Hung CC. NMNet: Learning Multi-level semantic information from scale extension domain for improved medical image segmentation. Biomed Signal Process Control 2023. DOI: 10.1016/j.bspc.2023.104651.

13. Zhou T, Cheng Q, Lu H, Li Q, Zhang X, Qiu S. Deep learning methods for medical image fusion: A review. Comput Biol Med 2023;160:106959. PMID: 37141652; DOI: 10.1016/j.compbiomed.2023.106959.
Abstract
Image fusion methods based on deep learning have become a research hotspot in computer vision in recent years. This paper reviews these methods from five aspects. First, the principles and advantages of deep learning-based image fusion methods are expounded. Second, the methods are categorized as end-to-end or non-end-to-end: according to the task deep learning performs in the feature processing stage, non-end-to-end methods are divided into deep learning for decision mapping and deep learning for feature extraction, while end-to-end methods are divided by network type into those based on convolutional neural networks, generative adversarial networks, and encoder-decoder networks. Third, the application of deep learning-based image fusion in the medical image field is summarized in terms of methods and datasets. Fourth, evaluation metrics commonly used in medical image fusion are sorted out from 14 aspects. Fifth, the main challenges faced by medical image fusion are discussed with respect to datasets and fusion methods, and future development directions are anticipated. This paper systematically summarizes deep learning-based image fusion methods and offers positive guidance for the in-depth study of multimodal medical images.

Affiliations
- Tao Zhou: School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China
- QianRu Cheng: School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China
- HuiLing Lu: School of Science, Ningxia Medical University, Yinchuan, 750004, China
- Qi Li: School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China
- XiangXiang Zhang: School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China
- Shi Qiu: Key Laboratory of Spectral Imaging Technology CAS, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, 710119, China

14. FDGNet: A pair feature difference guided network for multimodal medical image fusion. Biomed Signal Process Control 2023. DOI: 10.1016/j.bspc.2022.104545.

15. Jiang Y, Cheng T, Dong J, Liang J, Zhang Y, Lin X, Yao H. Dermoscopic image segmentation based on Pyramid Residual Attention Module. PLoS One 2022;17:e0267380. PMID: 36112649; PMCID: PMC9481037; DOI: 10.1371/journal.pone.0267380.
Abstract
We propose a stacked convolutional neural network incorporating a novel and efficient pyramid residual attention (PRA) module for the automatic segmentation of dermoscopic images. Precise segmentation is a significant and challenging step for computer-aided diagnosis in skin lesion diagnosis and treatment. The proposed PRA has the following characteristics. First, it combines three widely used components: the pyramid structure extracts feature information of the lesion area at different scales, the residual connections ensure efficient model training, and the attention mechanism screens effective feature maps. Thanks to the PRA, our network obtains precise boundary information that distinguishes healthy skin from diseased areas even for blurred lesion regions. Second, efficient stacking of the PRA increases the segmentation ability of a single module for lesion regions. Third, we incorporate the encoder-decoder idea into the overall network architecture. Compared with traditional networks, we divide the segmentation procedure into three levels and construct the pyramid residual attention network (PRAN): the shallow layer mainly processes spatial information, the middle layer refines both spatial and semantic information, and the deep layer intensively learns semantic information. The PRA is the basic module of PRAN and is sufficient to ensure the efficiency of the three-level architecture. We extensively evaluate our method on the ISIC2017 and ISIC2018 datasets. The experimental results demonstrate that PRAN obtains segmentation performance competitive with state-of-the-art deep learning models under the same experimental conditions.
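
A minimal PyTorch sketch of a pyramid residual attention block in this spirit is shown below: multi-scale pooled branches form a spatial attention map that gates a residual path. The pyramid scales and the 1x1 fusion convolution are assumptions, not PRAN's published configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PRA(nn.Module):
    """Pyramid residual attention sketch: pooled multi-scale context is fused
    into an attention map that modulates a residual convolution branch."""
    def __init__(self, channels: int, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.fuse = nn.Conv2d(channels * len(scales), channels, 1)
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[2:]
        # pool at several scales, then upsample everything back to (h, w)
        pyr = [F.interpolate(
                   F.adaptive_avg_pool2d(x, (max(h // s, 1), max(w // s, 1))),
                   size=(h, w), mode="bilinear", align_corners=False)
               for s in self.scales]
        attn = torch.sigmoid(self.fuse(torch.cat(pyr, dim=1)))
        return x + self.body(x) * attn      # attention-gated residual path
```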

Affiliations
- Yun Jiang (corresponding author): College of Computer Science and Engineering, Lanzhou, Gansu, China
- Tongtong Cheng (corresponding author): College of Computer Science and Engineering, Lanzhou, Gansu, China
- Jinkun Dong: College of Computer Science and Engineering, Lanzhou, Gansu, China
- Jing Liang: College of Computer Science and Engineering, Lanzhou, Gansu, China
- Yuan Zhang: College of Computer Science and Engineering, Lanzhou, Gansu, China
- Xin Lin: College of Computer Science and Engineering, Lanzhou, Gansu, China
- Huixia Yao: College of Computer Science and Engineering, Lanzhou, Gansu, China

16. Alzheimer’s disease classification using distilled multi-residual network. Appl Intell 2022. DOI: 10.1007/s10489-022-04084-0.

17. Ukwuoma CC, Qin Z, Belal Bin Heyat M, Akhtar F, Bamisile O, Muad AY, Addo D, Al-Antari MA. A Hybrid Explainable Ensemble Transformer Encoder for Pneumonia Identification from Chest X-ray Images. J Adv Res 2022:S2090-1232(22)00202-8. PMID: 36084812; DOI: 10.1016/j.jare.2022.08.021.
Abstract
INTRODUCTION Pneumonia is a microbial infection that causes chronic inflammation of the human lung cells. Chest X-ray imaging is the best-known screening approach for detecting pneumonia in its early stages. Since chest X-ray images are often blurry with low illumination, a strong feature extraction approach is required for promising identification performance. OBJECTIVES A new hybrid explainable deep learning framework is proposed for accurate pneumonia identification from chest X-ray images. METHODS The proposed hybrid workflow fuses the capabilities of ensemble convolutional networks and a Transformer Encoder. The ensemble learning backbone extracts strong features from the raw input X-ray images in two scenarios: ensemble A (DenseNet201, VGG16, and GoogleNet) and ensemble B (DenseNet201, InceptionResNetV2, and Xception). The Transformer Encoder is built on the self-attention mechanism with a multilayer perceptron (MLP) for accurate disease identification. Visual explainable saliency maps are derived to emphasize the crucial predicted regions in the input X-ray images. End-to-end training of the proposed deep learning models is performed for binary and multi-class classification tasks in all scenarios. RESULTS The proposed hybrid deep learning model recorded 99.21% overall accuracy and F1-score for the binary classification task, and 98.19% accuracy and 97.29% F1-score for the multi-class task. For the ensemble binary identification scenario, ensemble A recorded 97.22% accuracy and 97.14% F1-score, while ensemble B achieved 96.44% for both. For the ensemble multi-class identification scenario, ensemble A recorded 97.2% accuracy and 95.8% F1-score, while ensemble B recorded 96.4% accuracy and 94.9% F1-score. CONCLUSION The proposed hybrid deep learning framework provides promising explainable identification performance compared with individual models, ensemble models, and even the latest models in the literature. The code is available here: https://github.com/chiagoziemchima/Pneumonia_Identificaton.
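
The fusion head, treating the ensemble backbones' feature vectors as a short token sequence for a Transformer Encoder with an MLP classifier, might be sketched as below; the embedding dimension, depth, and mean-pooling readout are assumptions.

```python
import torch
import torch.nn as nn

class EnsembleTransformerHead(nn.Module):
    """Sketch: one feature token per backbone, self-attention across tokens,
    then an MLP classifier over the pooled representation."""
    def __init__(self, dim: int = 512, n_classes: int = 2,
                 depth: int = 2, heads: int = 8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.mlp = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, n_classes))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, n_backbones, dim), e.g. projected CNN features
        tokens = self.encoder(feats)
        return self.mlp(tokens.mean(dim=1))   # pool tokens, then classify
```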

Affiliations
- Chiagoziem C Ukwuoma: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Zhiguang Qin: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Md Belal Bin Heyat: IoT Research Center, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong 518060, China; International Institute of Information Technology, Hyderabad, Telangana 500032, India; Department of Science and Engineering, Novel Global Community Educational Foundation, Hebersham, NSW 2770, Australia
- Faijan Akhtar: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
- Olusola Bamisile: Sichuan Industrial Internet Intelligent Monitoring and Application Engineering Technology Research Center, Chengdu University of Technology, China
- Abdullah Y Muad: Department of Studies in Computer Science, University of Mysore, Manasagangothri, Mysore, India
- Daniel Addo: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Mugahed A Al-Antari: Department of Artificial Intelligence, College of Software & Convergence Technology, Daeyang AI Center, Sejong University, Seoul 05006, Korea

18. Multi-modal medical image fusion based on densely-connected high-resolution CNN and hybrid transformer. Neural Comput Appl 2022. DOI: 10.1007/s00521-022-07635-1.

19. Multi-level difference information replenishment for medical image fusion. Appl Intell 2022. DOI: 10.1007/s10489-022-03819-3.

20. Zhang Y, Jin M, Huang G. Medical image fusion based on improved multi-scale morphology gradient-weighted local energy and visual saliency map. Biomed Signal Process Control 2022. DOI: 10.1016/j.bspc.2022.103535.

21. Li W, Li R, Fu J, Peng X. MSENet: A multi-scale enhanced network based on unique features guidance for medical image fusion. Biomed Signal Process Control 2022. DOI: 10.1016/j.bspc.2022.103534.

22. Feng Y, Yang X, Qiu D, Zhang H, Wei D, Liu J. PCXRNet: Pneumonia diagnosis from Chest X-Ray Images using Condense attention block and Multiconvolution attention block. IEEE J Biomed Health Inform 2022;26:1484-1495. PMID: 35120015; DOI: 10.1109/jbhi.2022.3148317.
Abstract
Coronavirus disease 2019 (COVID-19) has become a global pandemic. Many recognition approaches based on convolutional neural networks have been proposed for COVID-19 chest X-ray images, but only a few make good use of the potential inter- and intra-relationships of feature maps. Considering this limitation, this paper proposes an attention-based convolutional neural network, called PCXRNet, for pneumonia diagnosis from chest X-ray images. To utilize the information in the channels of the feature maps, we add a novel condense attention module (CDSE) comprising two steps: a condensation step and a squeeze-excitation step. Unlike traditional channel attention modules, CDSE first downsamples the feature map channel by channel to condense the information, and then performs the squeeze-excitation step, in which the channel weights are calculated. To make the model pay more attention to informative spatial parts of every feature map, we propose a multi-convolution spatial attention module (MCSA), which reduces the number of parameters and introduces more nonlinearity. The CDSE and MCSA complement each other in series to tackle the redundancy in feature maps and provide useful information from and between them. We used the ChestXRay2017 dataset to explore the internal structure of PCXRNet and applied the proposed network to COVID-19 diagnosis; additional experiments on a tuberculosis dataset verify the effectiveness of PCXRNet. As a result, the network achieves an accuracy of 94.619%, recall of 94.753%, precision of 95.286%, and F1-score of 94.996% on the COVID-19 dataset.
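
A hedged sketch of the CDSE idea, spatial condensation followed by squeeze-excitation channel weighting, is given below in PyTorch. The average-pooling condensation and the reduction ratio are assumptions about details the abstract leaves open.

```python
import torch
import torch.nn as nn

class CDSE(nn.Module):
    """Condense + squeeze-excitation sketch: each channel is spatially
    downsampled first, then SE-style channel weights gate the input."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.condense = nn.AvgPool2d(kernel_size=2)      # condensation step
        self.squeeze = nn.AdaptiveAvgPool2d(1)           # squeeze step
        self.excite = nn.Sequential(                     # excitation step
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        w = self.squeeze(self.condense(x)).view(n, c)
        w = self.excite(w).view(n, c, 1, 1)
        return x * w
```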

23. Zhu R, Li X, Zhang X, Wang J. HID: The Hybrid Image Decomposition Model for MRI and CT Fusion. IEEE J Biomed Health Inform 2021;26:727-739. PMID: 34270437; DOI: 10.1109/jbhi.2021.3097374.
Abstract
Multimodal medical image fusion can combine salient information from different source images of the same body part and reduce information redundancy. In this paper, an efficient hybrid image decomposition (HID) method is proposed. It combines the advantages of spatial-domain and transform-domain methods and breaks through the limitations of algorithms based on a single category of features. Accurate separation of the base layer and texture details is conducive to more effective fusion rules. First, the source anatomical images are decomposed into a series of high frequencies and a low frequency via the nonsubsampled shearlet transform (NSST). Second, the low frequency is further decomposed using a designed optimization model based on structural similarity and the structure tensor to obtain an energy texture layer and a base layer. Then, the modified choosing maximum (MCM) rule is designed to fuse the base layers, and the sum of modified Laplacian (SML) is used to fuse the high frequencies and energy texture layers. Finally, the fused low frequency is obtained by adding the fused energy texture layer and base layer, and the fused image is reconstructed by the inverse NSST. The superiority of the proposed method is verified by extensive experiments on 50 pairs of magnetic resonance imaging (MRI) and computed tomography (CT) images, among others, and by comparison with 12 state-of-the-art medical image fusion methods. The results demonstrate that the proposed hybrid decomposition model has a better ability to extract texture information than conventional ones.
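
The sum of modified Laplacian used to fuse the high frequencies and energy texture layers can be sketched as below in NumPy; the window size and the absolute (rather than squared) Laplacian terms are assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def sml(img: np.ndarray, win: int = 3) -> np.ndarray:
    """Sum of modified Laplacian: a focus/activity measure. The modified
    Laplacian is computed per pixel, then summed over a win x win window."""
    p = np.pad(img, 1, mode="edge")
    ml = (np.abs(2 * p[1:-1, 1:-1] - p[:-2, 1:-1] - p[2:, 1:-1])
          + np.abs(2 * p[1:-1, 1:-1] - p[1:-1, :-2] - p[1:-1, 2:]))
    return uniform_filter(ml, win) * win * win   # windowed mean -> windowed sum

def fuse_by_sml(band_a: np.ndarray, band_b: np.ndarray) -> np.ndarray:
    """Keep, per pixel, the coefficient whose neighbourhood has higher SML."""
    return np.where(sml(band_a) >= sml(band_b), band_a, band_b)
```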