1. Li S, Ji M, Chen M, Chen L. Facial length and angle feature recognition for digital libraries. PLoS One 2024; 19:e0306250. [PMID: 39046954] [PMCID: PMC11268703] [DOI: 10.1371/journal.pone.0306250]
Abstract
With the continuous progress of technology, facial recognition is widely used in many scenarios as a mature biometric technology, but the accuracy of facial feature recognition remains a major challenge. This study proposes a facial length-feature and angle-feature recognition method for digital libraries, targeting the recognition of different facial features. First, the architecture of an attention-based facial action network is studied in depth to provide more accurate and comprehensive facial features. Second, an expression recognition network built on the length and angle features of facial expressions is explored to improve the recognition rate of different expressions. Finally, an end-to-end, attention-based framework for facial feature points is constructed to improve the accuracy and stability of the facial feature recognition network. To verify the effectiveness of the proposed method, experiments were conducted on the facial expression dataset FER-2013. The average recognition rate for the seven common expressions was between 97.28% and 99.97%: the highest rate, for happiness and surprise, was 99.97%, while the relatively low rate for anger, fear, and neutrality was 97.18%. These results verify that the method can effectively recognize and distinguish different facial expressions with high accuracy and robustness. The attention-based recognition of facial feature points effectively optimizes the recognition of facial length and angle features and significantly improves the stability of facial expression recognition, especially in complex environments, providing reliable technical support for digital libraries and other fields. This study aims to promote the development of facial recognition technology in digital libraries and to improve their service quality and user experience.
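None of the cited implementations are reproduced here, but the geometry behind length and angle features is simple enough to state concretely. The sketch below is a minimal illustration, assuming 2-D landmark coordinates; the landmark choices (mouth corners, upper-lip midpoint) and the helper names are hypothetical, not taken from the paper.

```python
import numpy as np

def length_feature(p, q):
    """Euclidean distance between two 2-D landmarks."""
    return float(np.linalg.norm(np.asarray(p) - np.asarray(q)))

def angle_feature(a, b, c):
    """Angle (in degrees) at vertex b formed by landmarks a-b-c."""
    v1 = np.asarray(a) - np.asarray(b)
    v2 = np.asarray(c) - np.asarray(b)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Hypothetical landmarks: mouth corners and upper-lip midpoint.
left_corner, right_corner, lip_top = (110, 220), (190, 222), (150, 205)
mouth_width = length_feature(left_corner, right_corner)
mouth_angle = angle_feature(left_corner, lip_top, right_corner)
```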
Affiliation(s)
- Shuangyan Li
- School of Art and Design, Qingdao University of Technology, Qingdao, China
- Min Ji
- Department of Fine Arts, Cangzhou Normal University, Cangzhou, China
- Ming Chen
- Department of Fine Arts, Cangzhou Normal University, Cangzhou, China
- Lanzhi Chen
- College of Arts and Sports, Dankook University, Yongin, South Korea
2. He D, Li W, Wang G, Huang Y, Liu S. LRFNet: A real-time medical image fusion method guided by detail information. Comput Biol Med 2024; 173:108381. [PMID: 38569237] [DOI: 10.1016/j.compbiomed.2024.108381]
Abstract
Multimodal medical image fusion (MMIF) technology plays a crucial role in medical diagnosis and treatment by integrating different images to obtain fusion images with comprehensive information. Deep learning-based fusion methods have demonstrated superior performance, but some still face challenges such as imbalanced retention of color and texture information and low fusion efficiency. To alleviate these issues, this paper presents a real-time MMIF method, a lightweight residual fusion network (LRFNet). First, a feature extraction framework with three branches is designed: two independent branches fully extract brightness and texture information, while the fusion branch lets the different modal information interact and fuse at a shallow level, thereby better retaining brightness and texture. Furthermore, a lightweight residual unit is designed to replace the conventional residual convolution in the model, improving fusion efficiency and reducing the overall model size roughly fivefold. Finally, since the high-frequency image decomposed by the wavelet transform contains abundant edge and texture information, an adaptive strategy is proposed that assigns weights to the loss function based on the information content of the high-frequency image; this strategy effectively guides the model toward preserving intricate details. Experimental results on MRI and functional images demonstrate that the proposed method achieves superior fusion performance and efficiency compared with alternative approaches. The code of LRFNet is available at https://github.com/HeDan-11/LRFNet.
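The abstract does not define the lightweight residual unit (the linked repository does), so the sketch below shows one plausible construction using a depthwise-separable 3x3 convolution, a standard way to cut parameters several-fold; treat the block as an assumption, not the authors' exact design.

```python
import torch
import torch.nn as nn

class LightweightResidualUnit(nn.Module):
    """Residual block built from a depthwise-separable 3x3 convolution.

    A depthwise conv (one filter per channel) followed by a 1x1
    pointwise conv uses far fewer parameters than a dense 3x3 conv,
    which is one common way to shrink a fusion model.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, 3, padding=1,
                                   groups=channels, bias=False)
        self.pointwise = nn.Conv2d(channels, channels, 1, bias=False)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.pointwise(self.depthwise(self.act(x))))

x = torch.randn(1, 32, 64, 64)
print(LightweightResidualUnit(32)(x).shape)  # torch.Size([1, 32, 64, 64])
```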
Affiliation(s)
- Dan He
- School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Weisheng Li
- School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China; Chongqing Key Laboratory of Image Recognition, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China; Key Laboratory of Cyberspace Big Data Intelligent Security (Chongqing University of Posts and Telecommunications), Ministry of Education, Chongqing, 400065, China
- Guofen Wang
- College of Computer and Information Science, Chongqing Normal University, Chongqing, 401331, China
- Yuping Huang
- School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Shiqiang Liu
- School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
3. Safari M, Fatemi A, Archambault L. MedFusionGAN: multimodal medical image fusion using an unsupervised deep generative adversarial network. BMC Med Imaging 2023; 23:203. [PMID: 38062431] [PMCID: PMC10704723] [DOI: 10.1186/s12880-023-01160-w]
Abstract
PURPOSE This study proposes an end-to-end unsupervised medical fusion generative adversarial network, MedFusionGAN, that fuses computed tomography (CT) and high-resolution isotropic 3D T1-Gd magnetic resonance imaging (MRI) sequences to generate an image combining CT bone structure with MRI soft-tissue contrast, in order to improve target delineation and reduce radiotherapy planning time. METHODS We used a publicly available multicenter medical dataset (GLIS-RT, 230 patients) from The Cancer Imaging Archive. To improve the model's generalization, we considered different imaging protocols and patients with various brain tumor types, including metastases. MedFusionGAN consists of one generator network and one discriminator network trained in an adversarial scenario. Content, style, and L1 losses were used to train the generator to preserve the texture and structure information of the MRI and CT images. RESULTS MedFusionGAN successfully generates fused images with MRI soft-tissue and CT bone contrast. Its results were compared quantitatively and qualitatively with seven traditional and eight deep learning (DL) state-of-the-art methods. Qualitatively, our method fused the source images at the highest spatial resolution without introducing image artifacts. We report nine quantitative metrics quantifying the preservation of structural similarity, contrast, distortion level, and image edges in the fused images. Our method outperformed both the traditional and DL methods on six of the nine metrics, and ranked second on three and two of the metrics against the traditional and DL methods, respectively. To compare soft-tissue contrast, intensity profiles along the tumor and tumor contours were evaluated; MedFusionGAN provides a more consistent, better intensity profile and better segmentation performance. CONCLUSIONS The proposed end-to-end unsupervised method successfully fused MRI and CT images. The fused image could improve the delineation of targets and organs at risk (OARs), an important aspect of radiotherapy treatment planning.
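The combination of content, style, and L1 losses is a standard recipe; a minimal PyTorch sketch follows, assuming a Gram-matrix style term and placeholder weights (the paper's actual feature extractor and weighting are not given in the abstract).

```python
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    """Channel-wise Gram matrix of a feature map, used for the style term."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def generator_loss(fused_feat, src_feat, fused, src,
                   w_content=1.0, w_style=10.0, w_l1=100.0):
    """Content + style + L1 terms; the weights are placeholders,
    not the values used in the paper."""
    content = F.mse_loss(fused_feat, src_feat)
    style = F.mse_loss(gram_matrix(fused_feat), gram_matrix(src_feat))
    l1 = F.l1_loss(fused, src)
    return w_content * content + w_style * style + w_l1 * l1
```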
Affiliation(s)
- Mojtaba Safari
- Département de Physique, de génie Physique et d'Optique, et Centre de Recherche sur le Cancer, Université Laval, Québec City, QC, Canada
- Service de Physique Médicale et Radioprotection, Centre Intégré de Cancérologie, CHU de Québec - Université Laval et Centre de recherche du CHU de Québec, Québec City, QC, Canada
- Ali Fatemi
- Department of Physics, Jackson State University, Jackson, MS, USA
- Department of Radiation Oncology, Gamma Knife Center, Merit Health Central, Jackson, MS, USA
- Louis Archambault
- Département de Physique, de génie Physique et d'Optique, et Centre de Recherche sur le Cancer, Université Laval, Québec City, QC, Canada
- Service de Physique Médicale et Radioprotection, Centre Intégré de Cancérologie, CHU de Québec - Université Laval et Centre de recherche du CHU de Québec, Québec City, QC, Canada
4. Zhang W, Lu Y, Zheng H, Yu L. MBRARN: multibranch residual attention reconstruction network for medical image fusion. Med Biol Eng Comput 2023; 61:3067-3085. [PMID: 37624534] [DOI: 10.1007/s11517-023-02902-2]
Abstract
Medical image fusion aims to integrate complementary information from multimodal medical images and has been widely applied in medicine, including clinical diagnosis, pathology analysis, and healing examinations. Feature extraction is a crucial step in the fusion task. To capture the significant information embedded in medical images, many deep learning-based algorithms have been proposed recently and have achieved good fusion results; however, most of them can hardly capture the independent and underlying features, which leads to unsatisfactory fusion results. To address these issues, a multibranch residual attention reconstruction network (MBRARN) is proposed for the medical image fusion task. The network consists of three parts: feature extraction, feature fusion, and feature reconstruction. First, the input medical images are converted into three scales by an image pyramid operation and fed into the three branches of the network, capturing both local detail and global structural information. Then, convolutions with residual attention modules are designed, which not only enhance the salient captured features but also make the network converge quickly and stably. Finally, feature fusion is performed with the designed strategy: a new, more effective fusion strategy based on the Euclidean norm, called the feature distance ratio (FDR), is designed specifically for MRI-SPECT. Experimental results on the Harvard whole brain atlas dataset demonstrate that the proposed network achieves better results in both subjective and objective evaluation than several state-of-the-art medical image fusion algorithms.
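The abstract states only that the feature distance ratio (FDR) is based on the Euclidean norm; the sketch below shows one plausible reading, weighting each feature map by its norm relative to the sum of both norms. The function name and the exact formula are assumptions.

```python
import torch

def fdr_fuse(f_mri, f_spect, eps=1e-8):
    """Fuse two feature maps with weights proportional to their
    channel-wise Euclidean norms (one plausible reading of a
    norm-ratio strategy; the paper's exact formula may differ)."""
    n1 = torch.linalg.vector_norm(f_mri, dim=1, keepdim=True)
    n2 = torch.linalg.vector_norm(f_spect, dim=1, keepdim=True)
    w1 = n1 / (n1 + n2 + eps)
    return w1 * f_mri + (1.0 - w1) * f_spect
```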
Affiliation(s)
- Weihao Zhang
- College of Computer and Information Science, Chongqing Normal University, Chongqing, 401331, China
- Yuting Lu
- School of Big Data and Software Engineering, Chongqing University, Chongqing, 401331, China
- Haodong Zheng
- College of Computer and Information Science, Chongqing Normal University, Chongqing, 401331, China
- Lei Yu
- College of Computer and Information Science, Chongqing Normal University, Chongqing, 401331, China
5. Zhang F, Wang L, Zhao J, Zhang X. Medical applications of generative adversarial network: a visualization analysis. Acta Radiol 2023; 64:2757-2767. [PMID: 37603577] [DOI: 10.1177/02841851231189035]
Abstract
BACKGROUND Deep learning (DL) is one of the latest approaches to artificial intelligence. As an unsupervised DL method, a generative adversarial network (GAN) can be used to synthesize new data. PURPOSE To explore GAN applications in medicine and point out their significance for clinical medical research, and to provide a visual bibliometric analysis of GAN applications in the medical field using the scientometric software CiteSpace combined with statistical analysis methods. MATERIAL AND METHODS PubMed, MEDLINE, Web of Science, and Google Scholar were searched to identify studies of GAN in medical applications between 2017 and 2022. Eligibility criteria were the full texts of peer-reviewed journal articles reporting the application of GANs in medicine, published in English between 1 January 2017 and 1 December 2022. This study was performed and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. CiteSpace was used to analyze the number of publications, authors, institutions, and keywords of articles related to GAN in medical applications. RESULTS The applications of GAN in medicine are not limited to medical image processing; they are also penetrating wider and more complex fields and may be applied directly in clinical medicine. CONCLUSION GAN has been broadly applied to the medical field and will be used more deeply and widely in clinical medicine, especially for privacy protection and medical diagnosis. However, clinical applications of GAN require consideration of ethical and legal issues, and GAN-based applications should be well validated by expert radiologists.
Affiliation(s)
- Fan Zhang
- Radiology department, Huaihe Hospital of Henan University, Kaifeng, PR China
- Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, PR China
- Luyao Wang
- School of Computer and Information Engineering, Henan University, Kaifeng, PR China
- Jiayin Zhao
- School of Software, Henan University, Kaifeng, PR China
- Xinhong Zhang
- School of Software, Henan University, Kaifeng, PR China
6. Fu J, He B, Yang J, Liu J, Ouyang A, Wang Y. CDRNet: Cascaded dense residual network for grayscale and pseudocolor medical image fusion. Comput Methods Programs Biomed 2023; 234:107506. [PMID: 37003041] [DOI: 10.1016/j.cmpb.2023.107506]
Abstract
OBJECTIVE Multimodal medical fusion images are widely used in clinical medicine, computer-aided diagnosis and other fields, but existing multimodal medical image fusion algorithms generally suffer from complex calculations, blurred details and poor adaptability. To solve this problem, we propose a cascaded dense residual network and apply it to grayscale and pseudocolor medical image fusion. METHODS The cascaded dense residual network uses a multiscale dense network and a residual network as its basic architecture, and a multilevel converged network is obtained by cascading. The network contains three levels: the first-level network takes two images of different modalities and produces fused image 1, the second-level network takes fused image 1 as input and produces fused image 2, and the third-level network takes fused image 2 as input and produces fused image 3. The multimodal medical images are trained through each level of the network, and the output fusion image is enhanced step by step. RESULTS As the number of cascaded networks increases, the fused image becomes increasingly clear. Across numerous fusion experiments, the fused images of the proposed algorithm show higher edge strength, richer details, and better performance on the objective indicators than the reference algorithms. CONCLUSION Compared with the reference algorithms, the proposed algorithm better preserves the original information and achieves higher edge strength, richer details and improvements in the four objective indicator metrics SF, AG, MZ and EN.
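The cascade described in METHODS maps directly onto a three-level network. The sketch below uses stand-in plain convolution blocks where the paper uses multiscale dense and residual networks, so only the cascading structure, not the per-level architecture, reflects the paper.

```python
import torch
import torch.nn as nn

class CascadedFusion(nn.Module):
    """Three networks chained so each level refines the previous
    fused image (stand-in conv blocks; the paper's levels are
    dense residual networks)."""
    def __init__(self, ch=1, width=16):
        super().__init__()
        def block(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(width, ch, 3, padding=1))
        self.level1 = block(2 * ch)   # fuses the two source modalities
        self.level2 = block(ch)       # refines fused image 1
        self.level3 = block(ch)       # refines fused image 2

    def forward(self, a, b):
        f1 = self.level1(torch.cat([a, b], dim=1))
        f2 = self.level2(f1)
        return self.level3(f2)

fused = CascadedFusion()(torch.rand(1, 1, 128, 128), torch.rand(1, 1, 128, 128))
print(fused.shape)  # torch.Size([1, 1, 128, 128])
```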
Affiliation(s)
- Jun Fu
- School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou, 563006, China
- Baiqing He
- Nanchang Institute of Technology, Nanchang, Jiangxi, 330044, China
- Jie Yang
- School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou, 563006, China
- Jianpeng Liu
- School of Science, East China Jiaotong University, Nanchang, Jiangxi, 330013, China
- Aijia Ouyang
- School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou, 563006, China
- Ya Wang
- School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou, 563006, China
7. Ding Z, Li H, Guo Y, Zhou D, Liu Y, Xie S. M4FNet: Multimodal medical image fusion network via multi-receptive-field and multi-scale feature integration. Comput Biol Med 2023; 159:106923. [PMID: 37075601] [DOI: 10.1016/j.compbiomed.2023.106923]
Abstract
The main purpose of multimodal medical image fusion is to aggregate significant information from different modalities into an informative image that provides comprehensive content and may help boost other image processing tasks. Many existing deep learning methods neglect the extraction and retention of multi-scale features of medical images and the construction of long-distance relationships between depth feature blocks. Therefore, a robust multimodal medical image fusion network via multi-receptive-field and multi-scale features (M4FNet) is proposed to preserve detailed textures and highlight structural characteristics. Specifically, a dual-branch dense hybrid dilated convolution block (DHDCB) is proposed to extract depth features from the modalities by expanding the receptive field of the convolution kernel and reusing features, thereby establishing long-range dependencies. To make full use of the semantic features of the source images, the depth features are decomposed into a multi-scale domain by combining a 2-D scale function with a wavelet function. Subsequently, the down-sampled depth features are fused by the proposed attention-aware fusion strategy and inverted back to a feature space of the same size as the source images. Ultimately, the fusion result is reconstructed by a deconvolution block. To force the fusion network to balance information preservation, a local standard-deviation-driven structural similarity is proposed as the loss function. Extensive experiments show that the proposed fusion network outperforms six state-of-the-art methods, with gains of about 12.8%, 4.1%, 8.5% and 9.7% in SD, MI, QABF and QEP, respectively.
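A dense hybrid dilated convolution block can be sketched as densely connected 3x3 convolutions with growing dilation rates; the layer count and the rates (1, 2, 3) below are assumptions, chosen only to illustrate how dilation widens the receptive field while dense connections reuse earlier features.

```python
import torch
import torch.nn as nn

class HybridDilatedBlock(nn.Module):
    """Densely connected 3x3 convolutions with dilation rates 1, 2, 3
    (a sketch of the DHDCB idea; layer counts and rates are assumed)."""
    def __init__(self, ch=16):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv2d(ch * (i + 1), ch, 3, padding=d, dilation=d)
            for i, d in enumerate((1, 2, 3))])
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            # Each layer sees the concatenation of all earlier outputs.
            feats.append(self.act(conv(torch.cat(feats, dim=1))))
        return feats[-1]

print(HybridDilatedBlock()(torch.rand(1, 16, 64, 64)).shape)
```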
Affiliation(s)
- Zhaisheng Ding
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China
- Haiyan Li
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China
- Yi Guo
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China
- Dongming Zhou
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China
- Yanyu Liu
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China
- Shidong Xie
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China
8. FDGNet: A pair feature difference guided network for multimodal medical image fusion. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104545]
9. Fan C, Lin H, Qiu Y, Yang L. DAGM-fusion: A dual-path CT-MRI image fusion model based multi-axial gated MLP. Comput Biol Med 2023; 155:106620. [PMID: 36774887] [DOI: 10.1016/j.compbiomed.2023.106620]
Abstract
Medical imaging technology provides a good understanding of human tissue structure: MRI provides high-resolution soft-tissue information, and CT provides high-quality bone density information. By creating CT-MRI fusion images for complex diagnostic situations, experts can develop diagnoses and treatment plans more quickly and precisely. We propose a dual-path CT-MRI image fusion model based on a multi-axial gated MLP to create high-quality CT-MRI fusion images. The model employs the feature fusion module SFT-block to effectively integrate detailed Local-Path information under the guidance of global Global-Path information, and the fusion is completed through triple constraints: global, local, and overall. We design a multi-axial gated MLP module (Ag-MLP); the multi-axial structure keeps the computational complexity linear and increases the MLP's inductive bias, allowing the MLP to work in shallower networks or pixel-level tasks on small datasets. Ag-MLP and CNN are combined in the network so that the model captures both global and local information. In addition, we design a patch-based loss (patch-loss) that adaptively generates a weight for each patch based on image pixel intensity; using patch-loss efficiently enhances image details. Numerous experiments demonstrate that the results of our model are superior to those of the latest mainstream fusion models and accord better with actual clinical diagnostic standards. Ablation studies validate the performance of the model's constituent parts. Notably, the model also generalizes excellently to other modal image fusion tasks.
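The abstract describes a patch loss whose weights are adaptively generated from pixel intensity. One plausible reading is an L1 loss pooled per patch and weighted by the mean intensity of the corresponding source patch, as sketched below; the paper's exact weighting may differ.

```python
import torch
import torch.nn.functional as F

def patch_weighted_l1(fused, source, patch=16, eps=1e-8):
    """L1 loss computed per patch, weighted by the mean pixel
    intensity of the source patch (an assumed weighting scheme)."""
    err = F.avg_pool2d((fused - source).abs(), patch)   # per-patch mean L1
    w = F.avg_pool2d(source, patch)                     # per-patch intensity
    w = w / (w.sum(dim=(-2, -1), keepdim=True) + eps)   # normalize weights
    return (w * err).sum(dim=(-2, -1)).mean()
```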
Affiliation(s)
- Chao Fan
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, Henan, China; Key Laboratory of Grain Information Processing and Control, Ministry of Education, Zhengzhou, Henan, China
- Hao Lin
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan, China
- Yingying Qiu
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan, China
- Litao Yang
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan, China
10. Li W, Zhang Y, Wang G, Huang Y, Li R. DFENet: A dual-branch feature enhanced network integrating transformers and convolutional feature learning for multimodal medical image fusion. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104402]
11. Fan C, Lin H, Qiu Y. U-Patch GAN: A Medical Image Fusion Method Based on GAN. J Digit Imaging 2023; 36:339-355. [PMID: 36038702] [PMCID: PMC9984622] [DOI: 10.1007/s10278-022-00696-7]
Abstract
Although medical imaging is frequently used to diagnose diseases, in complex diagnostic situations specialists typically need to consult image information from different modalities. A composite multimodal medical image can help professionals make quick and accurate diagnoses. The fused images of many medical image fusion algorithms, however, frequently fail to precisely retain the functional and structural information of the source images. To enhance fusion quality, this work develops an end-to-end GAN-based model (U-Patch GAN) for the self-supervised fusion of multimodal brain images. The model uses the classical U-net as the generator, and a dual adversarial mechanism based on the Markovian discriminator (PatchGAN) to sharpen the generator's attention to high-frequency information. To ensure that the network satisfies Lipschitz continuity, the spectral norm is applied to each layer of the network. We also propose improved adversarial and feature losses (a feature matching loss and a VGG-16 perceptual loss) based on the F-norm, which significantly enhance the quality of the fused images. We performed extensive tests on public datasets. First, we studied the clinical usefulness of the fused image. The model's performance on single-slice and continuous-slice images was then confirmed by comparison with six other popular mainstream fusion approaches. Finally, we verified the effectiveness of the adversarial loss and feature loss.
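Applying the spectral norm to every layer is directly supported by PyTorch's torch.nn.utils.spectral_norm. The sketch below applies it to a conventional PatchGAN-style discriminator; the layer sizes follow the common 70x70 PatchGAN recipe, not necessarily the paper's exact configuration.

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

def make_patch_discriminator(in_ch=2, width=64):
    """PatchGAN-style discriminator with spectral normalization on
    every convolution to enforce Lipschitz continuity."""
    def sn_conv(cin, cout, stride):
        return spectral_norm(nn.Conv2d(cin, cout, 4, stride, 1))
    return nn.Sequential(
        sn_conv(in_ch, width, 2), nn.LeakyReLU(0.2, inplace=True),
        sn_conv(width, width * 2, 2), nn.LeakyReLU(0.2, inplace=True),
        sn_conv(width * 2, width * 4, 2), nn.LeakyReLU(0.2, inplace=True),
        sn_conv(width * 4, width * 8, 1), nn.LeakyReLU(0.2, inplace=True),
        sn_conv(width * 8, 1, 1))  # per-patch real/fake score map
```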
Affiliation(s)
- Chao Fan
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou City, 450001, Henan Province, China
- Key Laboratory of Grain Information Processing and Control, Ministry of Education, Zhengzhou City, 450001, Henan Province, China
- Hao Lin
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou City, 450001, Henan Province, China
- Yingying Qiu
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou City, 450001, Henan Province, China
12. A Time Series Attention Mechanism Based Model For Tourism Demand Forecasting. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.01.095]
13. VANet: a medical image fusion model based on attention mechanism to assist disease diagnosis. BMC Bioinformatics 2022; 23:548. [PMID: 36536297] [PMCID: PMC9762055] [DOI: 10.1186/s12859-022-05072-4]
Abstract
BACKGROUND Today's biomedical imaging technology can present the morphological structure or functional metabolic information of organisms at different scales, such as organ, tissue, cell, molecule and gene. However, different imaging modes have different scopes of application, advantages and disadvantages. To enhance the role of medical images in disease diagnosis, the fusion of biomedical image information across imaging modes and scales has become an important research direction in medical imaging. Traditional medical image fusion methods are designed around measuring activity levels and fusion rules; they do not mine the contextual features of the different image modes, which hinders improvements in fused image quality. METHOD In this paper, an attention-multiscale network medical image fusion model based on contextual features (VANet) is proposed. The model selects five backbone modules of the VGG-16 network to build encoders that extract the contextual features of medical images. It builds an attention mechanism branch to fuse the global contextual features and designs a residual multiscale detail-processing branch to fuse the local contextual features. Finally, the decoder performs cascade reconstruction of the features to obtain the fused image. RESULTS Ten sets of images related to five diseases are selected from the AANLIB database to validate the VANet model. Structural images are derived from high-resolution MR images, and functional images are derived from SPECT and PET images, which are good at describing organ blood flow levels and tissue metabolism. Fusion experiments are performed on twelve fusion algorithms, including the VANet model. Eight metrics covering different aspects are selected to build a fusion quality evaluation system, and Friedman's test and the post hoc Nemenyi test are introduced for rigorous statistical comparison, demonstrating the superiority of the VANet model. CONCLUSIONS The VANet model fully captures and fuses the texture details and color information of the source images. In the fusion results, the metabolic and structural information is well expressed, and the color information does not interfere with the structure and texture. In the objective evaluation system, the metric values of the VANet model are generally higher than those of the other methods; in terms of efficiency, the time consumption of the model is acceptable; and in terms of scalability, the model is unaffected by the input order of the source images and can be extended to tri-modal fusion.
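The five VGG-16 backbone modules used as encoders correspond to the standard convolutional stages of VGG-16, which can be sliced out of torchvision's model as below; which stage outputs VANet feeds to its two branches is not specified in the abstract, so the split shown here is simply VGG-16's own block structure.

```python
import torch
import torchvision.models as models

# Split the VGG-16 convolutional stack at its pooling layers to get
# five feature stages (standard VGG-16 blocks; an untrained copy is
# used here, whereas the paper would load pretrained weights).
vgg = models.vgg16(weights=None).features.eval()
stage_ends = [4, 9, 16, 23, 30]   # indices of the five max-pool layers

def encode(x):
    feats, start = [], 0
    for end in stage_ends:
        for layer in vgg[start:end]:
            x = layer(x)
        feats.append(x)           # one contextual feature map per stage
        start = end
    return feats

feats = encode(torch.rand(1, 3, 224, 224))
print([f.shape[1] for f in feats])  # [64, 128, 256, 512, 512]
```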
14. Kong W, Li C, Lei Y. Multimodal medical image fusion using convolutional neural network and extreme learning machine. Front Neurorobot 2022; 16:1050981. [PMID: 36467563] [PMCID: PMC9708736] [DOI: 10.3389/fnbot.2022.1050981]
Abstract
The emergence of multimodal medical imaging technology greatly increases the accuracy of clinical diagnosis and etiological analysis. Nevertheless, each medical imaging modality unavoidably has its own limitations, so the fusion of multimodal medical images can be an effective solution. In this paper, a novel fusion method for multimodal medical images exploiting a convolutional neural network (CNN) and an extreme learning machine (ELM) is proposed. As a typical representative of deep learning, the CNN has been gaining popularity in the field of image processing, but it often suffers from drawbacks such as high computational costs and intensive human intervention. To this end, a convolutional extreme learning machine (CELM) is constructed by incorporating the ELM into the traditional CNN model. The CELM serves as an important tool to extract and capture features of the source images from a variety of angles, and the final fused image is obtained by integrating the significant features. Experimental results indicate that the proposed method not only helps to enhance the accuracy of lesion detection and localization, but is also superior to the current state of the art in terms of both subjective visual performance and objective criteria.
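The ELM component has a well-known closed-form training step: the hidden weights are random and fixed, and only the output weights are solved by least squares. A minimal NumPy sketch follows; in the paper's CELM the inputs would be CNN feature maps rather than the plain vectors used here.

```python
import numpy as np

class ELM:
    """Single-hidden-layer extreme learning machine: random, fixed
    hidden weights; output weights solved in closed form."""
    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((n_in, n_hidden))
        self.b = rng.standard_normal(n_hidden)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, Y):
        # Least-squares solve for the output weights beta: H @ beta ~ Y.
        H = self._hidden(X)
        self.beta, *_ = np.linalg.lstsq(H, Y, rcond=None)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

elm = ELM(8, 64).fit(np.random.rand(100, 8), np.random.rand(100, 2))
print(elm.predict(np.random.rand(5, 8)).shape)  # (5, 2)
```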
Affiliation(s)
- Weiwei Kong
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, China
- Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an, China
- Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an, China
- Chi Li
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, China
- Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an, China
- Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an, China
- Yang Lei
- College of Cryptography Engineering, Engineering University of PAP, Xi'an, China
15. Wei L, Yan Q, Liu W, Luo D. Perceptual quality assessment for no-reference image via optimization-based meta-learning. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.07.163]
16. Zhao J, Hou X, Pan M, Zhang H. Attention-based generative adversarial network in medical imaging: A narrative review. Comput Biol Med 2022; 149:105948. [PMID: 35994931] [DOI: 10.1016/j.compbiomed.2022.105948]
Abstract
As a popular probabilistic generative model, the generative adversarial network (GAN) has been successfully used not only in natural image processing, but also in medical image analysis and computer-aided diagnosis. Despite its various advantages, the application of GAN to medical image analysis faces new challenges. The introduction of attention mechanisms, which resemble the human visual system's focus on task-related local image areas for information extraction, has drawn increasing interest. Recently proposed transformer-based architectures that leverage the self-attention mechanism encode long-range dependencies and learn highly expressive representations. This motivates us to summarize the applications of transformer-based GANs for medical image analysis. We review recent advances in techniques combining various attention modules with different adversarial training schemes, and their applications in medical segmentation, synthesis and detection. Several recent studies have shown that attention modules can be effectively incorporated into a GAN model to detect lesion areas and extract diagnosis-related feature information precisely, providing a useful tool for medical image processing and diagnosis. This review indicates that, despite the great potential, research on GANs and attention mechanisms for medical imaging analysis is still at an early stage. We highlight that the attention-based generative adversarial network is an efficient and promising computational model for advancing future research and applications in medical image analysis.
Affiliation(s)
- Jing Zhao
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China
- Xiaoyuan Hou
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China
- Meiqing Pan
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China
- Hui Zhang
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Beijing, 100191, China; Key Laboratory of Big Data-Based Precision Medicine (Beihang University), Ministry of Industry and Information Technology of the People's Republic of China, Beijing, 100191, China
17. Multi-level difference information replenishment for medical image fusion. Appl Intell 2022. [DOI: 10.1007/s10489-022-03819-3]
18. Multimodal medical image fusion based on multichannel coupled neural P systems and max-cloud models in spectral total variation domain. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.01.059]
19. Li W, Li R, Fu J, Peng X. MSENet: A multi-scale enhanced network based on unique features guidance for medical image fusion. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103534]
20. Kong W, Miao Q, Liu R, Lei Y, Cui J, Xie Q. Multimodal medical image fusion using gradient domain guided filter random walk and side window filtering in framelet domain. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2021.11.033]