1. Hu Y, Cheng M, Wei H, Liang Z. A joint learning framework for multisite CBCT-to-CT translation using a hybrid CNN-transformer synthesizer and a registration network. Front Oncol 2024; 14:1440944. PMID: 39175474; PMCID: PMC11338897; DOI: 10.3389/fonc.2024.1440944.
Abstract
Background: Cone-beam computed tomography (CBCT) is a convenient imaging modality for adaptive radiation therapy (ART), but its application is often hindered by poor image quality. We aimed to develop a unified deep learning model that can consistently enhance the quality of CBCT images across anatomical sites by generating synthetic CT (sCT) images.
Methods: A dataset of paired CBCT and planning CT images from 135 cancer patients, covering head-and-neck, chest, and abdominal tumors, was collected. This dataset, with its rich anatomical diversity and range of scanning parameters, was selected to ensure comprehensive model training. Because registration between paired images is imperfect, local structural misalignment can lead to suboptimal model performance. To address this limitation, we propose SynREG, a supervised learning framework that integrates a hybrid CNN-transformer architecture for generating high-fidelity sCT images with a registration network that dynamically corrects local structural misalignment during training. An independent test set of 23 additional patients was used to evaluate image quality, and the results were compared with those of several benchmark models (pix2pix, CycleGAN, and SwinIR). The performance of an autosegmentation application was also assessed.
Results: The proposed model disentangled sCT generation from anatomical correction, leading to a more rational optimization process. As a result, it effectively suppressed noise and artifacts in multisite applications, significantly enhancing CBCT image quality. Specifically, the mean absolute error (MAE) of SynREG was reduced to 16.81 ± 8.42 HU and the structural similarity index (SSIM) increased to 94.34 ± 2.85%, compared with an MAE of 26.74 ± 10.11 HU and an SSIM of 89.73 ± 3.46% for the raw CBCT data. The enhanced image quality was particularly beneficial for organs with low contrast resolution, significantly increasing automatic segmentation accuracy in these regions. Notably, for the brainstem, the mean Dice similarity coefficient (DSC) increased from 0.61 to 0.89 and the mean distance to agreement (MDA) decreased from 3.72 mm to 0.98 mm, indicating a substantial improvement in segmentation accuracy.
Conclusions: SynREG effectively alleviates residual anatomical differences between paired datasets and enhances the quality of CBCT images.
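
The joint-learning idea can be illustrated with a minimal PyTorch sketch; this is not the authors' released code, and the module shapes, warping scheme, and loss weights below are illustrative assumptions. The key point is that the similarity loss is computed only after the registration network warps the sCT toward the planning CT, so residual misalignment is absorbed by the deformation field rather than forcing the synthesizer to blur anatomy:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Synthesizer(nn.Module):
    """Toy stand-in for the hybrid CNN-transformer generator (CBCT -> sCT)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, x):
        return self.net(x)

class RegNet(nn.Module):
    """Predicts a dense 2-channel displacement field from an (sCT, CT) pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1))

    def forward(self, sct, ct):
        return self.net(torch.cat([sct, ct], dim=1))

def warp(img, flow):
    """Bilinearly resample img with a per-pixel displacement field (in pixels)."""
    _, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack([xs, ys], dim=-1).float().to(img.device)  # (h, w, 2), x-y order
    grid = base + flow.permute(0, 2, 3, 1)                       # (b, h, w, 2)
    gx = 2 * grid[..., 0] / (w - 1) - 1                          # normalize to [-1, 1]
    gy = 2 * grid[..., 1] / (h - 1) - 1
    return F.grid_sample(img, torch.stack([gx, gy], dim=-1), align_corners=True)

syn, reg = Synthesizer(), RegNet()
opt = torch.optim.Adam(list(syn.parameters()) + list(reg.parameters()), lr=1e-4)

cbct, ct = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)  # toy paired batch
sct = syn(cbct)                       # synthesis step
flow = reg(sct.detach(), ct)          # registration corrects residual misalignment
aligned = warp(sct, flow)             # loss is computed on the warped sCT
loss = F.l1_loss(aligned, ct) + 0.1 * flow.abs().mean()  # crude field-magnitude penalty
opt.zero_grad(); loss.backward(); opt.step()
```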
Affiliation(s)
- Ying Hu
- School of Mathematics and Statistics, Hubei University of Education, Wuhan, Hubei, China
- Bigdata Modeling and Intelligent Computing Research Institute, Hubei University of Education, Wuhan, Hubei, China
- Mengjie Cheng
- Nutrition Department, Renmin Hospital of Wuhan University, Wuhan, China
- Hui Wei
- Department of Radiotherapy, Affiliated Hospital of Hebei Engineering University, Handan, China
- Zhiwen Liang
- Cancer Center, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Key Laboratory of Precision Radiation Oncology, Wuhan, China

2. Duong MT, Nguyen Thi BT, Lee S, Hong MC. Multi-Branch Network for Color Image Denoising Using Dilated Convolution and Attention Mechanisms. Sensors (Basel) 2024; 24:3608. PMID: 38894398; PMCID: PMC11175289; DOI: 10.3390/s24113608.
Abstract
Image denoising, which removes additive noise introduced by imaging sensors, is regarded as an ill-posed problem in computer vision. Recently, several convolutional neural network (CNN)-based image-denoising methods have achieved remarkable advances. However, it is difficult for a simple denoising network to recover visually pleasing images owing to the complexity of image content. Therefore, this study proposes a multi-branch network to improve denoising performance. First, the proposed network is built on a conventional autoencoder to learn multi-level contextual features from input images. Subsequently, we integrate two modules into the network, the Pyramid Context Module (PCM) and the Residual Bottleneck Attention Module (RBAM), to extract salient information during training. More specifically, the PCM is applied at the beginning of the network, where its dilated convolutions enlarge the receptive field and counteract the loss of global information. Meanwhile, the RBAM is inserted between the encoder and decoder to suppress degraded features and reduce undesired artifacts. Finally, extensive experiments demonstrate the superiority of the proposed method over state-of-the-art deep learning methods in terms of both objective and subjective performance.
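
The core mechanism the abstract attributes to the PCM, parallel dilated convolutions that widen the receptive field without downsampling, follows a standard pattern. The sketch below is a minimal, generic illustration of that pattern, not the paper's exact PCM; the channel widths and dilation rates are assumptions:

```python
import torch
import torch.nn as nn

class PyramidContext(nn.Module):
    """Parallel dilated convolutions fused by a 1x1 conv: a generic
    pyramid-context block in the spirit of the paper's PCM."""
    def __init__(self, channels: int, dilations=(1, 2, 4, 8)):
        super().__init__()
        # padding=d with dilation=d keeps the spatial size for 3x3 kernels.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations)
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # Each branch sees a different receptive field; concatenation keeps
        # both local detail and wider context before the 1x1 fusion.
        feats = [self.act(branch(x)) for branch in self.branches]
        return self.act(self.fuse(torch.cat(feats, dim=1))) + x  # residual add

x = torch.rand(1, 32, 64, 64)
print(PyramidContext(32)(x).shape)  # torch.Size([1, 32, 64, 64])
```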
Affiliation(s)
- Minh-Thien Duong
- Department of Information and Telecommunication Engineering, Soongsil University, Seoul 06978, Republic of Korea
- Bao-Tran Nguyen Thi
- Department of Information and Telecommunication Engineering, Soongsil University, Seoul 06978, Republic of Korea
- Seongsoo Lee
- Department of Intelligent Semiconductor, Soongsil University, Seoul 06978, Republic of Korea
- Min-Cheol Hong
- School of Electronic Engineering, Soongsil University, Seoul 06978, Republic of Korea

3. Kuznetsova V, Coogan Á, Botov D, Gromova Y, Ushakova EV, Gun'ko YK. Expanding the Horizons of Machine Learning in Nanomaterials to Chiral Nanostructures. Adv Mater 2024; 36:e2308912. PMID: 38241607; PMCID: PMC11167410; DOI: 10.1002/adma.202308912.
Abstract
Machine learning holds significant research potential in nanotechnology: it enables nanomaterial structure and property predictions, facilitates materials design and discovery, and reduces the need for time-consuming and labor-intensive experiments and simulations. In contrast to its use for achiral nanomaterials, the application of machine learning to chiral nanomaterials is still in its infancy, with a limited number of publications to date. This is despite the great potential of machine learning to advance the development of new sustainable chiral materials with high optical activity, circularly polarized luminescence, and enantioselectivity, as well as to support the analysis of structural chirality by electron microscopy. In this review, an analysis of the machine learning methods used for studying achiral nanomaterials is provided, followed by guidance on adapting and extending this work to chiral nanomaterials. An overview of chiral nanomaterials within the framework of synthesis-structure-property-application relationships is presented, along with insights on how to leverage machine learning for the study of these highly complex relationships. Key recent publications on the application of machine learning to chiral nanomaterials are then reviewed and discussed. Finally, the review captures the key achievements, ongoing challenges, and prospective outlook for this very important research field.
Affiliation(s)
- Vera Kuznetsova
- School of Chemistry, CRANN and AMBER Research Centres, Trinity College Dublin, College Green, Dublin, D02 PN40, Ireland
- Áine Coogan
- School of Chemistry, CRANN and AMBER Research Centres, Trinity College Dublin, College Green, Dublin, D02 PN40, Ireland
- Dmitry Botov
- Everypixel Media Innovation Group, 021 Fillmore St., PMB 15, San Francisco, CA, 94115, USA
- Neapolis University Pafos, 2 Danais Avenue, Pafos, 8042, Cyprus
- Yulia Gromova
- Department of Molecular and Cellular Biology, Harvard University, 52 Oxford St., Cambridge, MA, 02138, USA
- Elena V Ushakova
- Department of Materials Science and Engineering, and Centre for Functional Photonics (CFP), City University of Hong Kong, Hong Kong SAR, 999077, P. R. China
- Yurii K Gun'ko
- School of Chemistry, CRANN and AMBER Research Centres, Trinity College Dublin, College Green, Dublin, D02 PN40, Ireland

4. Rini PL, Gayathri KS. Revolutionizing dementia detection: Leveraging vision and Swin transformers for early diagnosis. Am J Med Genet B Neuropsychiatr Genet 2024:e32979. PMID: 38619385; DOI: 10.1002/ajmg.b.32979.
Abstract
Dementia, an increasingly prevalent neurological disorder with a projected threefold rise globally by 2050, necessitates early detection for effective management. The risk notably increases after age 65. Dementia leads to a progressive decline in cognitive functions, affecting memory, reasoning, and problem-solving abilities. This decline can impact the individual's ability to perform daily tasks and make decisions, underscoring the crucial importance of timely identification. With the advent of technologies like computer vision and deep learning, the prospect of early detection becomes even more promising. Employing sophisticated algorithms on imaging data, such as positron emission tomography scans, facilitates the recognition of subtle structural brain changes, enabling diagnosis at an earlier stage for potentially more effective interventions. In an experimental study, the Swin transformer algorithm demonstrated superior overall accuracy compared to the vision transformer and convolutional neural network, emphasizing its efficiency. Detecting dementia early is essential for proactive management, personalized care, and implementing preventive measures, ultimately enhancing outcomes for individuals and lessening the overall burden on healthcare systems.
Affiliation(s)
- Rini P L
- Department of Information Technology, Sri Sivasubramaniya Nadar College of Engineering, Kalavakkam, India
- Gayathri K S
- Department of Information Technology, Sri Sivasubramaniya Nadar College of Engineering, Kalavakkam, India

5. Venkatasaichandrakanth P, Iyapparaja M. GNViT: An enhanced image-based groundnut pest classification using Vision Transformer (ViT) model. PLoS One 2024; 19:e0301174. PMID: 38527074; PMCID: PMC10962840; DOI: 10.1371/journal.pone.0301174.
Abstract
Crop losses caused by diseases and pests present substantial challenges to global agriculture, and groundnut crops are particularly vulnerable to their detrimental effects. This study introduces the Groundnut Vision Transformer (GNViT) model, a novel approach that harnesses a Vision Transformer (ViT) pre-trained on the ImageNet dataset. The primary goal is to detect and classify various pests affecting groundnut crops. Rigorous training and evaluation were conducted using a comprehensive dataset from IP102, encompassing pests such as thrips, aphids, armyworms, and wireworms. The GNViT model's effectiveness was assessed using reliability metrics, including the F1-score, recall, and overall accuracy. Data augmentation with GNViT resulted in a significant increase in training accuracy, reaching 99.52%. Comparative analysis highlighted the GNViT model's superior performance, particularly in accuracy, over state-of-the-art methodologies. These findings underscore the potential of deep learning models such as GNViT to provide reliable pest classification for groundnut crops. The deployment of such advanced technological solutions brings us closer to the overarching goal of reducing crop losses and enhancing global food security for the growing population.
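
The workflow the abstract describes, fine-tuning an ImageNet-pretrained ViT with data augmentation on the 102-class IP102 dataset, can be sketched as follows. This is a minimal illustration rather than the authors' code; the augmentation policy and hyperparameters are assumptions, and torchvision's stock ViT-B/16 stands in for GNViT:

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

NUM_PEST_CLASSES = 102  # IP102 covers 102 pest categories

# Augmentations of the kind the abstract credits for the accuracy gain
# (the exact GNViT policy is an assumption here).
train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# ImageNet-pretrained ViT-B/16 with a new classification head.
model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
model.heads.head = nn.Linear(model.heads.head.in_features, NUM_PEST_CLASSES)

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch.
images = torch.rand(4, 3, 224, 224)
labels = torch.randint(0, NUM_PEST_CLASSES, (4,))
loss = criterion(model(images), labels)
optimizer.zero_grad(); loss.backward(); optimizer.step()
```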
Affiliation(s)
- Venkatasaichandrakanth P.
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, Tamilnadu, India
- Iyapparaja M.
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, Tamilnadu, India

6. Ye L, Wang D, Yang D, Ma Z, Zhang Q. VELIE: A Vehicle-Based Efficient Low-Light Image Enhancement Method for Intelligent Vehicles. Sensors (Basel) 2024; 24:1345. PMID: 38400503; PMCID: PMC10892397; DOI: 10.3390/s24041345.
Abstract
In Advanced Driver Assistance Systems (ADAS), Automated Driving Systems (ADS), and Driver Assistance Systems (DAS), RGB camera sensors are extensively utilized for object detection, semantic segmentation, and object tracking. Despite their popularity due to low cost, RGB cameras exhibit weak robustness in complex environments and underperform markedly in low-light conditions, which is a significant concern. To address these challenges, multi-sensor fusion systems and specialized low-light cameras have been proposed, but their high costs render them unsuitable for widespread deployment. Improvements in post-processing algorithms, by contrast, offer a more economical and effective solution. However, current low-light image enhancement research still shows substantial gaps in detail enhancement on nighttime driving datasets and is characterized by high deployment costs, failing to achieve real-time inference and edge deployment. Therefore, this paper proposes Vehicle-based Efficient Low-light Image Enhancement (VELIE), a deep learning network that combines a Swin Transformer with a gamma-transformation-integrated U-Net for decoupled enhancement of low-light inputs. VELIE achieves state-of-the-art performance on various driving datasets with a processing time of only 0.19 s, significantly enhancing high-dimensional environmental perception tasks in low-light conditions.
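
The gamma transformation integrated into VELIE's pipeline is a simple global operation, which is why it can handle coarse exposure normalization and leave detail and color restoration to the learned stage. A minimal sketch follows; the exponent 0.45 is an illustrative assumption, not a value taken from the paper:

```python
import numpy as np

def gamma_correct(img: np.ndarray, gamma: float = 0.45) -> np.ndarray:
    """Global gamma transform on [0, 1] intensities: out = in ** gamma.
    gamma < 1 brightens shadows, giving a roughly exposure-normalized
    image that a downstream network can refine."""
    img = np.clip(img, 0.0, 1.0)
    return np.power(img, gamma)

dark = np.random.rand(64, 64, 3) * 0.2    # toy underexposed image
bright = gamma_correct(dark)
print(dark.mean(), bright.mean())          # mean intensity rises
```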
Affiliation(s)
- Linwei Ye
- Department of Electrical and Electronic Engineering, University of Liverpool, Liverpool L69 3BX, UK
- School of Advanced Technology, Xi’an Jiaotong-Liverpool University, 111 Ren’ai Road, Suzhou 215123, China
- Dong Wang
- Department of Electrical and Electronic Engineering, University of Liverpool, Liverpool L69 3BX, UK
- School of Advanced Technology, Xi’an Jiaotong-Liverpool University, 111 Ren’ai Road, Suzhou 215123, China
- Dongyi Yang
- Department of Electrical and Electronic Engineering, University of Liverpool, Liverpool L69 3BX, UK
- School of Advanced Technology, Xi’an Jiaotong-Liverpool University, 111 Ren’ai Road, Suzhou 215123, China
- Zhiyuan Ma
- Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712, USA
- Quan Zhang
- Department of Electrical and Electronic Engineering, University of Liverpool, Liverpool L69 3BX, UK
- School of Advanced Technology, Xi’an Jiaotong-Liverpool University, 111 Ren’ai Road, Suzhou 215123, China

7. Shah ZH, Müller M, Hübner W, Wang TC, Telman D, Huser T, Schenck W. Evaluation of Swin Transformer and knowledge transfer for denoising of super-resolution structured illumination microscopy data. Gigascience 2024; 13:giad109. PMID: 38217407; PMCID: PMC10787368; DOI: 10.1093/gigascience/giad109.
Abstract
Background: Convolutional neural network (CNN)-based methods have shown excellent performance in the denoising and reconstruction of super-resolved structured illumination microscopy (SR-SIM) data and have therefore been the focus of existing studies. However, the Swin Transformer, an alternative, recently proposed deep learning-based image restoration architecture, has not been fully investigated for denoising SR-SIM images. Nor has it been fully explored how well transfer learning strategies work for denoising SR-SIM images with different noise characteristics and recorded cell structures across these different types of deep learning-based methods. Currently, the scarcity of publicly available SR-SIM datasets limits the exploration of the performance and generalization capabilities of deep learning methods.
Results: In this work, we present SwinT-fairSIM, a novel method based on the Swin Transformer for restoring SR-SIM images with a low signal-to-noise ratio. The experimental results show that SwinT-fairSIM outperforms previous CNN-based denoising methods. As a second contribution, two types of transfer learning, direct transfer and fine-tuning, were benchmarked in combination with SwinT-fairSIM and CNN-based methods for denoising SR-SIM data. Direct transfer did not prove to be a viable strategy, but fine-tuning produced results comparable to conventional training from scratch while saving computational time and potentially reducing the amount of training data required. As a third contribution, we publish four datasets of raw SIM images and already reconstructed SR-SIM images. These datasets cover two types of cell structures, tubulin filaments and vesicle structures, with different noise levels available for the tubulin filaments.
Conclusion: The SwinT-fairSIM method is well suited for denoising SR-SIM images. By fine-tuning, already trained models can easily be adapted to different noise characteristics and cell structures. Furthermore, the provided datasets are structured so that the research community can readily use them for research on denoising, super-resolution, and transfer learning strategies.
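
The two transfer strategies benchmarked here differ only in whether the pretrained weights continue to train on the new data. A minimal sketch of the distinction, using a toy convolutional denoiser in place of the actual SwinT-fairSIM model (the learning rate and architecture are assumptions):

```python
import copy
import torch

def direct_transfer(pretrained: torch.nn.Module) -> torch.nn.Module:
    """Direct transfer: reuse trained weights on the new dataset
    with no further training at all."""
    model = copy.deepcopy(pretrained)
    model.eval()
    for p in model.parameters():
        p.requires_grad = False
    return model

def fine_tune_setup(pretrained: torch.nn.Module, lr: float = 1e-5):
    """Fine-tuning: continue training all weights on the new data,
    typically with a smaller learning rate than training from scratch."""
    model = copy.deepcopy(pretrained)
    model.train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    return model, optimizer

# Toy denoiser standing in for a trained SwinT-fairSIM/CNN model.
base = torch.nn.Sequential(torch.nn.Conv2d(1, 8, 3, padding=1),
                           torch.nn.ReLU(),
                           torch.nn.Conv2d(8, 1, 3, padding=1))

frozen = direct_transfer(base)         # evaluated as-is on new data
tuned, opt = fine_tune_setup(base)     # adapted on new data

noisy, clean = torch.rand(2, 1, 32, 32), torch.rand(2, 1, 32, 32)
loss = torch.nn.functional.mse_loss(tuned(noisy), clean)
opt.zero_grad(); loss.backward(); opt.step()
```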
Affiliation(s)
- Zafran Hussain Shah
- Faculty of Engineering and Mathematics, Bielefeld University of Applied Sciences and Arts, 33619 Bielefeld, Germany
- Marcel Müller
- Faculty of Physics, Bielefeld University, 33615 Bielefeld, Germany
- Wolfgang Hübner
- Faculty of Physics, Bielefeld University, 33615 Bielefeld, Germany
- Tung-Cheng Wang
- Faculty of Physics, Bielefeld University, 33615 Bielefeld, Germany
- Leica Microsystems CMS GmbH, 68165 Mannheim, Germany
- Daniel Telman
- Faculty of Engineering and Mathematics, Bielefeld University of Applied Sciences and Arts, 33619 Bielefeld, Germany
- Thomas Huser
- Faculty of Physics, Bielefeld University, 33615 Bielefeld, Germany
- Wolfram Schenck
- Faculty of Engineering and Mathematics, Bielefeld University of Applied Sciences and Arts, 33619 Bielefeld, Germany

8. Muksimova S, Umirzakova S, Mardieva S, Cho YI. Enhancing Medical Image Denoising with Innovative Teacher-Student Model-Based Approaches for Precision Diagnostics. Sensors (Basel) 2023; 23:9502. PMID: 38067873; PMCID: PMC10708859; DOI: 10.3390/s23239502.
Abstract
The realm of medical imaging is a critical frontier in precision diagnostics, where the clarity of the image is paramount. Despite advancements in imaging technology, noise remains a pervasive challenge that can obscure crucial details and impede accurate diagnoses. Addressing this, we introduce a novel teacher-student network model that leverages the potency of our bespoke NoiseContextNet Block to discern and mitigate noise with unprecedented precision. This innovation is coupled with an iterative pruning technique aimed at refining the model for heightened computational efficiency without compromising the fidelity of denoising. We substantiate the superiority and effectiveness of our approach through a comprehensive suite of experiments, showcasing significant qualitative enhancements across a multitude of medical imaging modalities. The visual results from a vast array of tests firmly establish our method's dominance in producing clearer, more reliable images for diagnostic purposes, thereby setting a new benchmark in medical image denoising.
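
The teacher-student principle the abstract describes can be sketched generically: a fixed, larger teacher provides soft targets that a slimmer student learns to match alongside the ground truth, after which the student is pruned for efficiency. The architectures, loss weighting, and pruning ratio below are illustrative assumptions; the NoiseContextNet block itself is not reproduced here:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import prune

# Toy stand-ins: a large "teacher" denoiser and a slimmer "student".
teacher = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(64, 1, 3, padding=1))
student = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 1, 3, padding=1))

teacher.eval()  # teacher is fixed; only the student trains
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

noisy, clean = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)
with torch.no_grad():
    soft_target = teacher(noisy)      # teacher's denoised output

pred = student(noisy)
# The student matches both the ground truth and the teacher's behavior;
# the 0.5 weighting is an illustrative choice.
loss = F.l1_loss(pred, clean) + 0.5 * F.mse_loss(pred, soft_target)
opt.zero_grad(); loss.backward(); opt.step()

# Iterative magnitude pruning of the kind the abstract mentions:
# zero out the 30% smallest-magnitude weights in each conv layer.
for m in student.modules():
    if isinstance(m, nn.Conv2d):
        prune.l1_unstructured(m, name="weight", amount=0.3)
```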
Affiliation(s)
- S. Muksimova
- Department of IT Convergence Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Republic of Korea
- Sabina Umirzakova
- Department of IT Convergence Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Republic of Korea
- S. Mardieva
- Department of IT Convergence Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Republic of Korea
- Young-Im Cho
- Department of IT Convergence Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Republic of Korea