101
Nassir A, Rosenthal G, Zadka Y, Houri S, Doron O, Barnea O. Estimating intracranial parameters using an inverse mathematical model with viscoelastic elements that closely predicts complex ICP morphologies. Comput Methods Biomech Biomed Engin 2025;28:972-984. PMID: 38303646. DOI: 10.1080/10255842.2024.2308695.
Abstract
The quantitative relationship between arterial blood pressure (ABP) and intracranial pressure (ICP) waveforms has not been adequately explained. We hypothesized that the ICP waveform results from interference between propagating and reflected pressure waves occurring in the cranium following the initiating arterial waveform. To demonstrate cranial effects on wave interference and the generation of ICP waveform morphology, we modified our previously reported mathematical model to include viscoelastic elements that affect propagation velocity. Using patient data, we implemented an inverse model methodology to generate simulated ICP waveforms in response to given ABP waveforms. We used an open database of traumatic brain injury patients and studied 65 pairs of ICP and ABP waveforms from 13 patients (five pairs from each). Incorporating viscoelastic elements into the model resulted in model-generated ICP waveforms that closely resembled the measured waveforms, with a 16-fold increase in similarity index relative to the model with only pure elasticity elements. The mean similarity index for the pure elasticity model was 0.06 ± 0.12 SD, compared to 0.96 ± 0.28 SD for the model with viscoelastic components. The normalized root mean squared error (NRMSE) improved substantially for the model with viscoelastic elements compared to the model with purely elastic elements (NRMSE of 2.09% ± 0.62 vs. 15.2% ± 4.8, respectively). The ability of the model to generate complex ICP waveforms indicates that it may indeed reflect intracranial dynamics. Our results suggest that the model may allow the estimation of intracranial biomechanical parameters with potential clinical significance and represents a first step toward estimating inaccessible intracranial parameters.
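For reference, the NRMSE reported above is typically computed as the RMSE normalized by the signal range; a minimal numpy sketch (the normalization convention and the toy waveforms are assumptions, not taken from the paper):

```python
import numpy as np

def nrmse_percent(measured, predicted):
    """Root-mean-squared error normalized by the measured signal's range, in percent."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((measured - predicted) ** 2))
    return 100.0 * rmse / (np.max(measured) - np.min(measured))

# Toy ICP-like beat and a slightly perturbed "model" waveform (illustrative only).
t = np.linspace(0.0, 1.0, 500)
measured = 10.0 + 2.0 * np.sin(2 * np.pi * t) + 0.5 * np.sin(6 * np.pi * t)
predicted = measured + np.random.default_rng(0).normal(0.0, 0.05, t.size)
print(f"NRMSE: {nrmse_percent(measured, predicted):.2f}%")
```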
Affiliation(s)
- Abed Nassir
- Department of Biomedical Engineering, Faculty of Engineering, Tel Aviv University, Tel Aviv, Israel
- Guy Rosenthal
- Department of Neurosurgery, Hadassah Hebrew University Medical Center, Jerusalem, Israel
- Yuliya Zadka
- Department of Biomedical Engineering, Faculty of Engineering, Tel Aviv University, Tel Aviv, Israel
- Saadit Houri
- Department of Neurosurgery, Hadassah Hebrew University Medical Center, Jerusalem, Israel
- Omer Doron
- Department of Neurosurgery, Massachusetts General Hospital, Boston, MA, USA
- Ofer Barnea
- Department of Biomedical Engineering, Faculty of Engineering, Tel Aviv University, Tel Aviv, Israel
102
Xiang L, Zhao X, Wang J, Wang B. An Enhanced Human Evolutionary Optimization Algorithm for Global Optimization and Multi-Threshold Image Segmentation. Biomimetics (Basel) 2025;10:282. PMID: 40422113. DOI: 10.3390/biomimetics10050282.
Abstract
Threshold-based image segmentation divides an image into regions with different feature attributes to facilitate the extraction of image features for image detection and pattern recognition. However, existing threshold segmentation methods easily fall into locally optimal thresholds, resulting in poor segmentation. To improve segmentation performance, this study proposes an enhanced Human Evolutionary Optimization Algorithm (HEOA), termed CLNBHEOA, which incorporates Otsu's method as the objective function. In the CLNBHEOA, population diversity is first enhanced using a refraction opposition-based learning strategy built on Chebyshev-Tent chaotic mapping. Second, an adaptive learning strategy combining differential learning and adaptive factors is proposed to improve the algorithm's ability to escape locally optimal thresholds. In addition, a nonlinear control factor is introduced to better balance the global exploration and local exploitation phases of the algorithm. Finally, a three-point guidance strategy based on Bernstein polynomials is proposed, which strengthens local exploitation and effectively improves the efficiency of the optimal-threshold search. The optimization performance of the CLNBHEOA was then evaluated on the CEC2017 benchmark functions, where it outperformed the comparison algorithms in over 90% of the test cases, exhibiting higher optimization performance and search efficiency. Finally, the CLNBHEOA was applied to six multi-threshold image-segmentation problems; it achieved a winning rate of over 95% in terms of fitness function value, peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and feature similarity (FSIM), suggesting that it is a promising approach for multi-threshold image segmentation.
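For context, Otsu's method scores a candidate set of thresholds by the between-class variance of the resulting gray-level classes, and a metaheuristic such as the CLNBHEOA searches for the thresholds that maximize this objective. A minimal numpy sketch of the objective, with names and test data that are illustrative only:

```python
import numpy as np

def otsu_between_class_variance(hist, thresholds):
    """Objective to maximize: sum over classes of w_k * (mu_k - mu_total)^2."""
    p = hist.astype(float) / hist.sum()          # normalized histogram
    levels = np.arange(p.size)
    mu_total = float((p * levels).sum())
    bounds = [0, *sorted(int(t) for t in thresholds), p.size]
    score = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        w = p[lo:hi].sum()                       # class probability mass
        if w > 0:
            mu = (p[lo:hi] * levels[lo:hi]).sum() / w
            score += w * (mu - mu_total) ** 2
    return score

# Brute-force a single threshold on a toy bimodal histogram as a sanity check;
# an optimizer would search this objective for several thresholds at once.
rng = np.random.default_rng(1)
pixels = np.concatenate([rng.normal(60, 10, 4000), rng.normal(180, 12, 4000)])
hist, _ = np.histogram(pixels, bins=256, range=(0, 256))
best = max(range(1, 256), key=lambda t: otsu_between_class_variance(hist, [t]))
print("best single threshold:", best)   # lands between the two modes
```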
Affiliation(s)
- Liang Xiang
- Department of Space and Culture Design, Graduate School of Techno Design (TED), Kookmin University, Seoul 02707, Republic of Korea
- Xiajie Zhao
- College of Design, Hanyang University, Ansan 15588, Republic of Korea
- Jianfeng Wang
- College of Design, Hanyang University, Ansan 15588, Republic of Korea
- Bin Wang
- College of Design, Hanyang University, Ansan 15588, Republic of Korea
103
Khetan N, Mertz J. Plane wave compounding with adaptive joint coherence factor weighting. Ultrasonics 2025;149:107573. PMID: 39893756. DOI: 10.1016/j.ultras.2025.107573.
Abstract
Coherent Plane Wave Compounding (CPWC) is widely used for ultrasound imaging. This technique involves transmitting plane waves into a sample at different transmit angles and recording the resultant backscattered echo at different receive positions. The time-delayed signals from the different combinations of transmit angles and receive positions are then coherently summed to produce a beamformed image. Various techniques have been developed to characterize the quality of CPWC beamforming based on the measured coherence across the transmit or receive apertures. Here, we propose a more granular approach in which the signals from every transmit/receive combination are separately evaluated using a quality metric based on their joint spatio-angular coherence. The signals are then individually weighted according to their measured Joint Coherence Factor (JCF) prior to being coherently summed. To facilitate the comparison of JCF beamforming with alternative techniques, we further propose a method of image display standardization based on contrast matching. We show results from tissue-mimicking phantoms and human soft-tissue imaging. Fine-grained JCF weighting is found to improve CPWC image quality compared to alternative approaches.
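The idea of coherence-based weighting before summation can be illustrated with the classical coherence factor; the paper's Joint Coherence Factor is a more granular, per-transmit/receive variant, so the following is only a simplified sketch:

```python
import numpy as np

def coherence_factor(signals):
    """CF = |sum_k s_k|^2 / (N * sum_k |s_k|^2), computed per pixel over axis 0."""
    coherent = np.abs(signals.sum(axis=0)) ** 2
    incoherent = signals.shape[0] * (np.abs(signals) ** 2).sum(axis=0)
    return np.divide(coherent, incoherent,
                     out=np.zeros_like(coherent), where=incoherent > 0)

rng = np.random.default_rng(0)
# signals[k, y, x]: time-delayed data for each transmit/receive combination.
coherent_target = np.ones((16, 64, 64))          # perfectly coherent echoes
clutter = rng.standard_normal((16, 64, 64))      # speckle-like clutter
print(coherence_factor(coherent_target).mean())  # -> 1.0
print(coherence_factor(clutter).mean())          # -> about 1/16
# A CF-weighted image is then coherence_factor(s) * s.sum(axis=0) before display.
```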
Affiliation(s)
- Nikunj Khetan
- Boston University Mechanical Engineering, 110 Cummington Mall, Boston, MA 02215, USA
- Jerome Mertz
- Boston University Biomedical Engineering, 44 Cummington Mall, Boston, MA 02215, USA
104
Chen Y, Wang Y, Zhang H. Unsupervised Range-Nullspace Learning Prior for Multispectral Images Reconstruction. IEEE Trans Image Process 2025;34:2513-2528. PMID: 40249693. DOI: 10.1109/tip.2025.3560430.
Abstract
Snapshot Spectral Imaging (SSI) techniques, with the ability to capture both spectral and spatial information in a single exposure, have been found useful in a wide range of applications. SSI systems generally operate within the 'encoding-decoding' framework, leveraging the synergism of optical hardware and reconstruction algorithms. Typically, reconstructing desired spectral images from SSI measurements is an ill-posed and challenging problem. Existing studies utilize either model-based or deep learning-based methods, but both have their drawbacks. Model-based algorithms suffer from high computational costs, while supervised learning-based methods rely on large paired training data. In this paper, we propose a novel Unsupervised range-Nullspace learning (UnNull) prior for spectral image reconstruction. UnNull explicitly models the data via subspace decomposition, offering enhanced interpretability and generalization ability. Specifically, UnNull considers that the spectral images can be decomposed into the range and null subspaces. The features projected onto the range subspace are mainly low-frequency information, while features in the nullspace represent high-frequency information. Comprehensive multispectral demosaicing and reconstruction experiments demonstrate the superior performance of our proposed algorithm.
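The subspace decomposition underlying UnNull can be illustrated with a pseudo-inverse: the component A⁺(Ax) is pinned down by the measurement, while the nullspace component is invisible to it and must come from the learned prior. A toy numpy sketch with an assumed operator and sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((32, 128))     # toy wide forward operator (SSI encoder)
x = rng.standard_normal(128)           # toy spectral signal

A_pinv = np.linalg.pinv(A)
x_range = A_pinv @ (A @ x)             # determined by the measurement y = A @ x
x_null = x - x_range                   # lies in the nullspace: A @ x_null ~ 0

print(np.allclose(A @ x_null, 0.0, atol=1e-9))   # True
print(np.allclose(x_range + x_null, x))          # True: exact decomposition
```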
105
Ni Z, Xiao R, Yang W, Wang H, Wang Z, Xiang L, Sun L. M2Trans: Multi-Modal Regularized Coarse-to-Fine Transformer for Ultrasound Image Super-Resolution. IEEE J Biomed Health Inform 2025;29:3112-3123. PMID: 39226206. DOI: 10.1109/jbhi.2024.3454068.
Abstract
Ultrasound image super-resolution (SR) aims to transform low-resolution images into high-resolution ones, thereby restoring intricate details crucial for improved diagnostic accuracy. However, prevailing methods relying solely on image-modality guidance and pixel-wise loss functions struggle to capture the distinct characteristics of medical images, such as unique texture patterns and specific colors harboring critical diagnostic information. To overcome these challenges, this paper introduces the Multi-Modal Regularized Coarse-to-Fine Transformer (M2Trans) for ultrasound image SR. By integrating the text modality, we establish joint image-text guidance during training, leveraging the medical CLIP model to incorporate richer priors from text descriptions into the SR optimization process, enhancing detail, structure, and semantic recovery. Furthermore, we propose a novel coarse-to-fine transformer comprising multiple branches infused with self-attention and frequency transforms to efficiently capture signal dependencies across different scales. Extensive experimental results demonstrate significant improvements over state-of-the-art methods on benchmark datasets, including CCA-US, US-CASE, and our newly created dataset MMUS1K, with minimum PSNR improvements of 0.17 dB, 0.30 dB, and 0.28 dB, respectively.
106
Barbano R, Denker A, Chung H, Roh TH, Arridge S, Maass P, Jin B, Ye JC. Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Medical Image Reconstruction. IEEE Trans Med Imaging 2025;44:2093-2104. PMID: 40030859. DOI: 10.1109/tmi.2024.3524797.
Abstract
Denoising diffusion models have emerged as the go-to generative framework for solving inverse problems in imaging. A critical concern regarding these models is their performance on out-of-distribution tasks, which remains an under-explored challenge. When a diffusion model is used on an out-of-distribution dataset, realistic reconstructions can be generated, but they may hallucinate image features that are present only in the training dataset. To address this discrepancy and improve reconstruction accuracy, we introduce a novel test-time adaptation sampling framework called Steerable Conditional Diffusion. Specifically, this framework adapts the diffusion model, concurrently with image reconstruction, based solely on the information provided by the available measurement. Utilising the proposed method, we achieve substantial enhancements in out-of-distribution performance across diverse imaging modalities, advancing the robust deployment of denoising diffusion models in real-world applications.
107
Tagawa H, Fushimi Y, Fujimoto K, Nakajima S, Okuchi S, Sakata A, Otani S, Wicaksono KP, Wang Y, Ikeda S, Ito S, Umehana M, Shimotake A, Kuzuya A, Nakamoto Y. Generation of high-resolution MPRAGE-like images from 3D head MRI localizer (AutoAlign Head) images using a deep learning-based model. Jpn J Radiol 2025;43:761-769. PMID: 39794660. PMCID: PMC12053187. DOI: 10.1007/s11604-024-01728-8.
Abstract
PURPOSE Magnetization prepared rapid gradient echo (MPRAGE) is a useful three-dimensional (3D) T1-weighted sequence, but is not a priority in routine brain examinations. We hypothesized that converting 3D MRI localizer (AutoAlign Head) images to MPRAGE-like images with deep learning (DL) would be beneficial for diagnosing and researching dementia and neurodegenerative diseases. We aimed to establish and evaluate a DL-based model for generating MPRAGE-like images from MRI localizers. MATERIALS AND METHODS Brain MRI examinations including MPRAGE taken at a single institution for investigation of mild cognitive impairment, dementia, and epilepsy between January 2020 and December 2022 were included retrospectively. Images taken in 2020 or 2021 were assigned to the training and validation datasets, and images from 2022 were used for the test dataset. Using the training and validation sets, we selected one model by visual evaluation by radiologists, with reference to the image quality metrics peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and Learned Perceptual Image Patch Similarity (LPIPS). The test dataset was evaluated by visual assessment and quality metrics. Voxel-based morphometric analysis was also performed; Dice scores between generated and original images were evaluated, and volume differences for major structures were calculated as absolute symmetrized percent change. RESULTS The training, validation, and test datasets comprised 340 patients (mean age, 56.1 ± 24.4 years; 195 women), 36 patients (67.3 ± 18.3 years; 20 women), and 193 patients (59.5 ± 24.4 years; 111 women), respectively. The test dataset showed: PSNR, 35.4 ± 4.91; SSIM, 0.871 ± 0.058; and LPIPS, 0.045 ± 0.017. No overfitting was observed. Dice scores for the segmentation of main structures ranged from 0.788 (left amygdala) to 0.926 (left ventricle). Quadratic weighted Cohen kappa values of the visual score for the medial temporal lobe between original and generated images were 0.80-0.88. CONCLUSION Images generated using our DL-based model can be used for post-processing and visual evaluation of medial temporal lobe atrophy.
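The evaluation quantities named here are standard; the following sketch shows common definitions of the Dice coefficient and absolute symmetrized percent change (the exact conventions in the paper may differ, and the masks below are synthetic):

```python
import numpy as np

def dice(a, b):
    """Dice coefficient between two binary masks (1.0 when both are empty)."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def aspc(v_generated, v_original):
    """Absolute symmetrized percent change: |dv| over the mean volume, in %."""
    return 200.0 * abs(v_generated - v_original) / (v_generated + v_original)

rng = np.random.default_rng(0)
orig = rng.random((64, 64, 64)) > 0.7      # toy structure mask on original image
gen = orig.copy()
gen[:2] = ~gen[:2]                          # perturb two slices of the "generated" mask
print(f"Dice: {dice(orig, gen):.3f}")
print(f"ASPC: {aspc(gen.sum(), orig.sum()):.2f}%")
```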
Affiliation(s)
- Hiroshi Tagawa
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-Cho, Sakyo-Ku, Kyoto, 606-8507, Japan
- Yasutaka Fushimi
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-Cho, Sakyo-Ku, Kyoto, 606-8507, Japan
- Koji Fujimoto
- Department of Advanced Imaging in Medical Magnetic Resonance, Graduate School of Medicine, Kyoto University, Kyoto, 606-8507, Japan
- Satoshi Nakajima
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-Cho, Sakyo-Ku, Kyoto, 606-8507, Japan
- Sachi Okuchi
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-Cho, Sakyo-Ku, Kyoto, 606-8507, Japan
- Akihiko Sakata
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-Cho, Sakyo-Ku, Kyoto, 606-8507, Japan
- Sayo Otani
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-Cho, Sakyo-Ku, Kyoto, 606-8507, Japan
- Yang Wang
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-Cho, Sakyo-Ku, Kyoto, 606-8507, Japan
- Satoshi Ikeda
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-Cho, Sakyo-Ku, Kyoto, 606-8507, Japan
- Shuichi Ito
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-Cho, Sakyo-Ku, Kyoto, 606-8507, Japan
- Masaki Umehana
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-Cho, Sakyo-Ku, Kyoto, 606-8507, Japan
- Akira Kuzuya
- Department of Neurology, Graduate School of Medicine, Kyoto University, Kyoto, 606-8507, Japan
- Yuji Nakamoto
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-Cho, Sakyo-Ku, Kyoto, 606-8507, Japan
108
Gao M, Sun J, Li Q, Khan MA, Shang J, Zhu X, Jeon G. Towards trustworthy image super-resolution via symmetrical and recursive artificial neural network. Image Vis Comput 2025;158:105519. DOI: 10.1016/j.imavis.2025.105519.
109
Xie X, Zhang X, Tang X, Zhao J, Xiong D, Ouyang L, Yang B, Zhou H, Ling BWK, Teo KL. MACTFusion: Lightweight Cross Transformer for Adaptive Multimodal Medical Image Fusion. IEEE J Biomed Health Inform 2025;29:3317-3328. PMID: 38640042. DOI: 10.1109/jbhi.2024.3391620.
Abstract
Multimodal medical image fusion aims to integrate complementary information from different modalities of medical images. Deep learning methods, especially recent vision Transformers, have effectively improved image fusion performance. However, Transformers have limitations in image fusion, such as a lack of local feature extraction and of cross-modal feature interaction, resulting in insufficient multimodal feature extraction and integration; in addition, their computational cost is high. To address these challenges, in this work we develop an adaptive cross-modal fusion strategy for unsupervised multimodal medical image fusion. Specifically, we propose a novel lightweight cross Transformer based on a cross multi-axis attention mechanism. It includes cross-window attention and cross-grid attention to mine and integrate both local and global interactions of multimodal features. The cross Transformer is further guided by a spatial adaptation fusion module, which allows the model to focus on the most relevant information. Moreover, we design a dedicated feature extraction module that combines multiple gradient residual dense convolutional and Transformer layers to obtain local features from coarse to fine and capture global features. The proposed strategy significantly boosts fusion performance while minimizing computational costs. Extensive experiments, including clinical brain tumor image fusion, have shown that our model can achieve clearer texture details and better visual quality than other state-of-the-art fusion methods.
110
Cai Y, Zhang W, Chen H, Cheng KT. MedIAnomaly: A comparative study of anomaly detection in medical images. Med Image Anal 2025;102:103500. PMID: 40009901. DOI: 10.1016/j.media.2025.103500.
Abstract
Anomaly detection (AD) aims at detecting abnormal samples that deviate from the expected normal patterns. Generally, it can be trained merely on normal data, without a requirement for abnormal samples, and thereby plays an important role in rare disease recognition and health screening in the medical domain. Despite the emergence of numerous methods for medical AD, the lack of a fair and comprehensive evaluation causes ambiguous conclusions and hinders the development of this field. To address this problem, this paper builds a benchmark for unified comparison. Seven medical datasets spanning five image modalities, including chest X-rays, brain MRIs, retinal fundus images, dermatoscopic images, and histopathology images, are curated for extensive evaluation. Thirty typical AD methods, including reconstruction- and self-supervised-learning-based methods, are compared on image-level anomaly classification and pixel-level anomaly segmentation. Furthermore, for the first time, we systematically investigate the effect of key components in existing methods, revealing unresolved challenges and potential future directions. The datasets and code are available at https://github.com/caiyu6666/MedIAnomaly.
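As a concrete reference point for the reconstruction-based family this benchmark evaluates: a model trained only on normal images scores anomalies by reconstruction error, at image level and per pixel. A toy sketch in which a simple blur stands in for a trained autoencoder (an assumption for illustration only):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def anomaly_scores(images, reconstruct):
    """Image-level score = mean squared reconstruction error; also return the maps."""
    err = (images - reconstruct(images)) ** 2
    return err.reshape(len(images), -1).mean(axis=1), err

# A blur stands in for a trained autoencoder (illustrative assumption).
reconstruct = lambda x: uniform_filter(x, size=(1, 5, 5))

rng = np.random.default_rng(0)
normal = rng.normal(0.5, 0.02, (8, 64, 64))
abnormal = normal.copy()
abnormal[:, 20:30, 20:30] += 0.5            # synthetic lesion
s_normal, _ = anomaly_scores(normal, reconstruct)
s_abnormal, maps = anomaly_scores(abnormal, reconstruct)
print(bool(s_abnormal.mean() > s_normal.mean()))   # anomalies score higher: True
```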
Affiliation(s)
- Yu Cai
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
- Weiwen Zhang
- Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
- Hao Chen
- Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China; Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong, China; Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China
- Kwang-Ting Cheng
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China; Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
111
Wang C, Qu K, Li S, Yu Y, He J, Zhang C, Shen Y. ArtiDiffuser: A unified framework for artifact restoration and synthesis for histology images via counterfactual diffusion model. Med Image Anal 2025;102:103567. PMID: 40188685. DOI: 10.1016/j.media.2025.103567.
Abstract
Artifacts in histology images pose challenges for accurate diagnosis with deep learning models, often leading to misinterpretations. Existing artifact restoration methods primarily rely on Generative Adversarial Networks (GANs), which approach the problem as image-to-image translation. However, those approaches are prone to mode collapse and can unexpectedly alter morphological features or staining styles. To address this issue, we propose ArtiDiffuser, a counterfactual diffusion model tailored to restore only artifact-distorted regions while preserving the integrity of the rest of the image. Additionally, we take an innovative perspective on artifact-induced misdiagnosis by using artifact synthesis as data augmentation, and thereby leverage ArtiDiffuser to unify artifact synthesis and restoration capabilities. This synergy significantly surpasses the performance of conventional methods that handle artifact restoration or synthesis separately. We propose a Swin-Transformer denoising network backbone to capture both local and global attention, further enhanced with a class-guided Mixture of Experts (MoE) to process features related to specific artifact categories. Moreover, the model utilizes adaptable class-specific tokens for enhanced feature discrimination and a mask-weighted loss function to specifically target and correct artifact-affected regions, thus addressing data imbalance. In downstream applications, ArtiDiffuser employs a consistency regularization strategy that ensures the model's predictive accuracy is maintained across original and artifact-augmented images. We also contribute the first comprehensive histology artifact dataset, comprising 723 annotated patches across various artifact categories, to facilitate further research. Evaluations on four distinct datasets for both restoration and synthesis demonstrate ArtiDiffuser's effectiveness compared to GAN-based approaches used for either pre-processing or augmentation. The code is available at https://github.com/wagnchogn/ArtiDiffuser.
Affiliation(s)
- Chong Wang
- College of Medical Engineering, Xinxiang Medical University, Xinxiang 453000, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang 453000, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang 453000, China
- Kaili Qu
- College of Medical Engineering, Xinxiang Medical University, Xinxiang 453000, China
- Shuxin Li
- College of Medical Engineering, Xinxiang Medical University, Xinxiang 453000, China
- Yi Yu
- College of Medical Engineering, Xinxiang Medical University, Xinxiang 453000, China
- Junjun He
- Shanghai AI Laboratory, Shanghai, 200232, China
- Chen Zhang
- Department of Laboratory Animal Sciences, School of Basic Medical Sciences, Capital Medical University, Beijing, 100069, China; College of Basic Medicine, Inner Mongolia Medical University, Hohhot 010110, China; State Key Laboratory of Neurology and Oncology Drug Development, Nanjing 210000, China
- Yiqing Shen
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
112
Xiong Z, Li W, Zhao X, Zhang B, Tao R, Du Q. PRF-Net: A Progressive Remote Sensing Image Registration and Fusion Network. IEEE Trans Neural Netw Learn Syst 2025;36:9437-9450. PMID: 39042547. DOI: 10.1109/tnnls.2024.3429156.
Abstract
Most existing fusion algorithms are not robust to unregistered input images. Even after image registration, nonlinear misregistration may persist in local areas of the images, degrading the quality of the fused image. To tackle these challenges, this article proposes a progressive remote sensing image registration and fusion network, named PRF-Net, which is particularly useful when the two images come from different platforms. First, a registration network is designed to register the input image patches; it comprises a global spatial transform network (GSTN) and a local spatial warp network (LSWN). The GSTN is primarily used for coarse registration, applying a rigid transformation to globally align the input images. After coarse registration, the preliminarily registered moving image is input into the LSWN for local fine-tuning to maximize the correlation between the input image patches. Subsequently, the finely registered images are degraded and input into the fusion network to generate the fused image. To retain sufficient spectral and spatial information in the fused image, a multiscale feature extraction (MSFE) block with a highly interpretable spatial details attention (SDA) block is designed, which enhances the fusion network's ability to extract and preserve spatial details and spectral information. Three groups of experiments conducted on four types of remote sensing images demonstrate that the proposed PRF-Net exhibits excellent performance at both reduced and full resolution, showcasing its outstanding registration and fusion quality.
113
Chen M, Wang K, Dohopolski M, Morgan H, Sher D, Wang J. TransAnaNet: Transformer-based anatomy change prediction network for head and neck cancer radiotherapy. Med Phys 2025;52:3015-3029. PMID: 39887473. PMCID: PMC12059511. DOI: 10.1002/mp.17655.
Abstract
BACKGROUND Adaptive radiotherapy (ART) can compensate for the dosimetric impact of anatomic change during radiotherapy of head-neck cancer (HNC) patients. However, implementing ART universally poses challenges in clinical workflow and resource allocation, given the variability in patient response and the constraints of available resources. Therefore, the prediction of anatomical change during radiotherapy for HNC patients is of importance to optimize patient clinical benefit and treatment resources. Current studies focus on developing binary ART eligibility classification models to identify patients who would experience significant anatomical change, but these models lack the ability to present the complex patterns and variations in anatomical changes over time. Vision Transformers (ViTs) represent a recent advancement in neural network architectures, utilizing self-attention mechanisms to process image data. Unlike traditional Convolutional Neural Networks (CNNs), ViTs can capture global contextual information more effectively, making them well-suited for image analysis and image generation tasks that involve complex patterns and structures, such as predicting anatomical changes in medical imaging. PURPOSE The purpose of this study is to assess the feasibility of using a ViT-based neural network to predict radiotherapy-induced anatomic change of HNC patients. METHODS We retrospectively included 121 HNC patients treated with definitive chemoradiotherapy (CRT) or radiation alone. We collected the planning computed tomography image (pCT), planned dose, cone beam computed tomography images (CBCTs) acquired at the initial treatment (CBCT01) and Fraction 21 (CBCT21), and primary tumor volume (GTVp) and involved nodal volume (GTVn) delineated on both pCT and CBCTs of each patient for model construction and evaluation. A UNet-style Swin-Transformer-based ViT network was designed to learn the spatial correspondence and contextual information from embedded image patches of CT, dose, CBCT01, GTVp, and GTVn. The deformation vector field between CBCT01 and CBCT21 was estimated by the model as the prediction of anatomic change, and deformed CBCT01 was used as the prediction of CBCT21. We also generated binary masks of GTVp, GTVn, and patient body for volumetric change evaluation. We used data from 101 patients for training and validation, and the remaining 20 patients for testing. Image and volumetric similarity metrics including mean square error (MSE), peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), Dice coefficient, and average surface distance were used to measure the similarity between the target image and predicted CBCT. Anatomy change prediction performance of the proposed model was compared to a CNN-based prediction model and a traditional ViT-based prediction model. RESULTS The predicted image from the proposed method yielded the best similarity to the real image (CBCT21) over pCT, CBCT01, and predicted CBCTs from other comparison models. The average MSE, PSNR, and SSIM between the normalized predicted CBCT and CBCT21 are 0.009, 20.266, and 0.933, while the average Dice coefficient between body mask, GTVp mask, and GTVn mask is 0.972, 0.792, and 0.821, respectively. CONCLUSIONS The proposed method showed promising performance for predicting radiotherapy-induced anatomic change, which has the potential to assist in the decision-making of HNC ART.
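The model's core output is a deformation vector field (DVF) that warps CBCT01 into a predicted CBCT21. A minimal 2D backward-warping sketch with scipy, under assumed conventions (the paper's 3D pipeline is more involved):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_with_dvf(image, dvf):
    """Backward-warp: output(y, x) = image(y + dvf[0], x + dvf[1]), bilinear."""
    yy, xx = np.meshgrid(np.arange(image.shape[0]),
                         np.arange(image.shape[1]), indexing="ij")
    coords = np.stack([yy + dvf[0], xx + dvf[1]])
    return map_coordinates(image, coords, order=1, mode="nearest")

# Toy "CBCT01" slice and a smooth, constant predicted field.
img = np.zeros((64, 64)); img[24:40, 24:40] = 1.0
dvf = np.stack([np.full((64, 64), 3.0), np.full((64, 64), -2.0)])
pred = warp_with_dvf(img, dvf)
print(pred[21:37, 26:42].sum())   # the square lands 3 px up, 2 px right: 256.0
```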
Affiliation(s)
- Meixu Chen
- Medical Artificial Intelligence and Automation (MAIA) Lab, Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
- Kai Wang
- Medical Artificial Intelligence and Automation (MAIA) Lab, Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
- Department of Radiation Oncology, University of Maryland Medical Center, Baltimore, Maryland, USA
- Michael Dohopolski
- Medical Artificial Intelligence and Automation (MAIA) Lab, Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
- Howard Morgan
- Medical Artificial Intelligence and Automation (MAIA) Lab, Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
- Department of Radiation Oncology, Central Arkansas Radiation Therapy Institute, Little Rock, Arkansas, USA
- David Sher
- Medical Artificial Intelligence and Automation (MAIA) Lab, Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
- Jing Wang
- Medical Artificial Intelligence and Automation (MAIA) Lab, Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
114
Sun Q, He N, Yang P, Zhao X. Low dose computed tomography reconstruction with momentum-based frequency adjustment network. Comput Methods Programs Biomed 2025;263:108673. PMID: 40023964. DOI: 10.1016/j.cmpb.2025.108673.
Abstract
BACKGROUND AND OBJECTIVE Recent investigations into Low-Dose Computed Tomography (LDCT) reconstruction methods have brought Model-Based Data-Driven (MBDD) approaches to the forefront. One prominent architecture within MBDD entails the integration of Model-Based Iterative Reconstruction (MBIR) with Deep Learning (DL). While this approach offers the advantage of harnessing information from both the sinogram and image domains, it also has several deficiencies. First, the efficacy of DL methods within MBDD requires careful enhancement, as it directly impacts computational cost and the quality of reconstructed images. Second, high computational costs and large iteration counts limit the development of MBDD methods. Finally, CT reconstruction is sensitive to pixel accuracy, and the choice of loss function within DL methods is crucial for meeting this requirement. METHODS This paper advances MBDD methods through three principal contributions. First, we introduce an innovative Frequency Adjustment Network (FAN) that effectively adjusts both high- and low-frequency components during the inference phase, resulting in substantial enhancements in reconstruction performance. Second, we develop the Momentum-based Frequency Adjustment Network (MFAN), which leverages momentum terms as an extrapolation strategy to amplify changes across successive iterations, yielding a rapidly converging framework. Lastly, we examine the visual properties of CT images and present a unique loss function named Focal Detail Loss (FDL), which preserves fine details throughout the training phase and significantly improves reconstruction quality. RESULTS In validation experiments on the AAPM-Mayo public dataset and real-world piglet datasets, the three contributions demonstrated superior performance. As an iterative method, MFAN achieved convergence in 10 iterations, faster than other methods. Ablation studies further highlight the contribution of each component. CONCLUSIONS This paper presents an MBDD-based LDCT reconstruction method using a momentum-based frequency adjustment network with a focal detail loss function. This approach significantly reduces the number of iterations required for convergence while achieving superior reconstruction results in visual and numerical analyses.
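The momentum extrapolation that MFAN uses to accelerate convergence can be sketched in its classical (Nesterov/FISTA-style) form on a linear least-squares reconstruction; a plain gradient step stands in for the learned frequency-adjustment step, so this shows the acceleration idea only:

```python
import numpy as np

def momentum_recon(A, y, step, iters):
    """Gradient descent on ||Ax - y||^2 with Nesterov/FISTA momentum extrapolation."""
    x = np.zeros(A.shape[1]); z = x.copy(); t = 1.0
    for _ in range(iters):
        x_new = z - step * A.T @ (A @ z - y)          # data-fidelity step
        # (MFAN would apply its frequency adjustment network to x_new here)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x) # momentum extrapolation
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((80, 40)) / np.sqrt(80)       # toy system matrix
x_true = rng.standard_normal(40)
x_hat = momentum_recon(A, A @ x_true, step=0.3, iters=200)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))  # small residual
```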
Affiliation(s)
- Qixiang Sun
- School of Mathematical Sciences, Capital Normal University, Beijing, 100048, China
- Ning He
- Smart City College, Beijing Union University, Beijing, 100101, China
- Ping Yang
- School of Mathematical Sciences, Capital Normal University, Beijing, 100048, China
- Xing Zhao
- School of Mathematical Sciences, Capital Normal University, Beijing, 100048, China
115
Sun Y, Xu Z, Guo Y, Huang J, Huang G, Huang T, Zhao L, Jiang S, Zheng Z, Liu J, Zhang X, Huang X. Scale-Adaptive viable tumor burden estimation via histopathological microscopy image segmentation. Comput Biol Med 2025;189:109915. PMID: 40088715. DOI: 10.1016/j.compbiomed.2025.109915.
Abstract
Cancer segmentation in whole-slide images is a fundamental step for estimating tumor burden, which is crucial for cancer assessment. However, challenges such as vague boundaries and small regions dissociated from viable tumor areas make it a complex task. Considering the usefulness of multi-scale features in various vision-related tasks, we present a structure-aware, scale-adaptive feature selection method for efficient and accurate cancer segmentation. Built on a segmentation network with a popular encoder-decoder architecture, a scale-adaptive module is proposed to select more robust features that better represent vague, non-rigid boundaries. Furthermore, a structural similarity metric is introduced to enhance tissue structure awareness and improve small region segmentation. Additionally, advanced designs, including several attention mechanisms and selective-kernel convolutions, are incorporated into the baseline network for comparative study purposes. Extensive experimental results demonstrate that the proposed structure-aware, scale-adaptive network achieves outstanding performance in liver cancer segmentation compared to the top submitted results in the PAIP 2019 challenge. Further evaluation of colorectal cancer segmentation shows that the scale-adaptive module either improves the baseline network or outperforms other advanced attention mechanism designs, particularly when considering the trade-off between efficiency and accuracy. The source code is publicly available at https://github.com/IMOP-lab/Scale-Adaptive-Net.
Affiliation(s)
- Yibao Sun
- Pengcheng Laboratory, Nanshan District, Shenzhen, 518055, Guangdong, China
- Zhaoyang Xu
- Department of Paediatrics, University of Cambridge, Cambridge, CB2 0QQ, United Kingdom
- Yihao Guo
- Hangzhou Dianzi University, Hangzhou, 310018, Zhejiang, China
- Jian Huang
- Hangzhou Dianzi University, Hangzhou, 310018, Zhejiang, China
- Gaopeng Huang
- Hangzhou Dianzi University, Hangzhou, 310018, Zhejiang, China
- Tangsen Huang
- Hangzhou Dianzi University, Hangzhou, 310018, Zhejiang, China
- Lou Zhao
- Hangzhou Dianzi University, Hangzhou, 310018, Zhejiang, China
- Shaowei Jiang
- Hangzhou Dianzi University, Hangzhou, 310018, Zhejiang, China
- Zhiwen Zheng
- Hangzhou Dianzi University, Hangzhou, 310018, Zhejiang, China
- Jin Liu
- Hangzhou Dianzi University, Hangzhou, 310018, Zhejiang, China
- Xiaoshuai Zhang
- Department of Information Science and Engineering, Ocean University of China, Qingdao, 266100, Shandong, China
- Xingru Huang
- School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London, E1 4NS, United Kingdom; Hangzhou Dianzi University, Hangzhou, 310018, Zhejiang, China
116
Ma X, Wu J, Liu W. SAC-BL: A hypothesis testing framework for unsupervised visual anomaly detection and location. Neural Netw 2025;185:107147. PMID: 39892355. DOI: 10.1016/j.neunet.2025.107147.
Abstract
Reconstruction-based methods achieve promising performance for visual anomaly detection (AD), relying on the underlying assumption that anomalies cannot be accurately reconstructed. However, this assumption does not always hold, especially in the presence of weak anomalous (i.e., normal-like) examples. More significantly, existing methods are primarily devoted to obtaining strong discriminative score functions while neglecting a systematic investigation of the decision rule based on the proposed score function. Unlike previous work, this paper tackles the AD problem starting from the decision rule within a statistical framework, providing a new insight for the AD community. Specifically, we frame the AD task as a multiple hypothesis testing problem. We then propose a novel betting-like (BL) procedure with an embedded strong anomaly constraint network (SACNet), called SAC-BL, to address this testing problem. In SAC-BL, the BL procedure serves as the decision rule and SACNet is trained to capture the critical discriminative information from weak anomalies. Theoretically, our SAC-BL can control the false discovery rate (FDR) at a prescribed level. Finally, we conduct extensive experiments to verify the superiority of SAC-BL over previous methods.
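For readers unfamiliar with the decision-rule framing: an FDR-controlling procedure decides which anomaly scores to flag, not merely how to rank them. The betting-like procedure is the paper's contribution; the standard Benjamini-Hochberg procedure below illustrates the kind of guarantee being targeted and is not the authors' method:

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean rejection mask controlling the false discovery rate at level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    k = int(np.max(np.nonzero(below)[0]) + 1) if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

rng = np.random.default_rng(0)
p = np.concatenate([rng.uniform(size=95),          # 95 normal samples
                    rng.uniform(0, 1e-4, size=5)]) # 5 strong anomalies
print(int(benjamini_hochberg(p).sum()), "discoveries at FDR 0.05")
```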
Affiliation(s)
- Xinsong Ma
- School of Computer Science, Wuhan University, 299 Ba Yi Road, Wuchang District, Wuhan, 430072, Hubei, China
- Jie Wu
- School of Computer Science, Wuhan University, 299 Ba Yi Road, Wuchang District, Wuhan, 430072, Hubei, China
- Weiwei Liu
- School of Computer Science, Wuhan University, 299 Ba Yi Road, Wuchang District, Wuhan, 430072, Hubei, China
117
Huang Y, Liu C, Li B, Huang H, Zhang R, Ke W, Jing X. Frequency-Aware Divide-and-Conquer for Efficient Real Noise Removal. IEEE Trans Neural Netw Learn Syst 2025;36:8429-8441. PMID: 39178076. DOI: 10.1109/tnnls.2024.3439591.
Abstract
Deep-learning-based approaches have achieved remarkable progress in denoising complex real scenarios, yet their accuracy-efficiency tradeoff remains understudied, which is particularly critical for mobile devices. Because real noise is unevenly distributed relative to the underlying signal across frequency bands, we introduce a frequency-aware divide-and-conquer strategy to develop a frequency-aware denoising network (FADN). FADN is materialized by stacking frequency-aware denoising blocks (FADBs), in which a denoised image is progressively predicted by a series of frequency-aware noise dividing and conquering operations. For noise dividing, FADBs decompose the noisy and clean image pairs into low- and high-frequency representations via a wavelet transform (WT) followed by an invertible network, and recover the final denoised image by integrating the denoised information from the different frequency bands. For noise conquering, the separated low-frequency representation of the noisy image is kept as clean as possible under the supervision of its clean counterpart, while the high-frequency representation, combined with the estimated residual from the successive FADB, is purified under the corresponding supervision for residual compensation. Since our FADN denoises progressively and pertinently by frequency band, the accuracy-efficiency tradeoff can be controlled as required via the number of FADBs. Experimental results on the SIDD, DND, and NAM datasets show that our FADN outperforms state-of-the-art methods, improving the peak signal-to-noise ratio (PSNR) while decreasing model parameters. The code is released at https://github.com/NekoDaiSiki/FADN.
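The dividing step, splitting an image into low- and high-frequency bands with a wavelet transform, can be sketched with PyWavelets; a crude soft-threshold stands in here for FADN's learned invertible conquering step:

```python
import numpy as np
import pywt

rng = np.random.default_rng(0)
clean = np.outer(np.hanning(128), np.hanning(128))
noisy = clean + rng.normal(0.0, 0.05, clean.shape)

# Divide: one-level 2D DWT gives a low band cA and high bands (cH, cV, cD).
cA, (cH, cV, cD) = pywt.dwt2(noisy, "haar")

# Conquer (toy version): keep the low band, soft-threshold the high bands.
soft = lambda c, t: np.sign(c) * np.maximum(np.abs(c) - t, 0.0)
denoised = pywt.idwt2((cA, (soft(cH, 0.1), soft(cV, 0.1), soft(cD, 0.1))), "haar")
print(bool(np.mean((denoised - clean) ** 2) < np.mean((noisy - clean) ** 2)))
```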
118
Moncomble A, Alloyeau D, Moreaud M, Khelfa A, Wang G, Ortiz-Peña N, Amara H, Gatti R, Moreau R, Ricolleau C, Nelayah J. aquaDenoising: AI-enhancement of in situ liquid phase STEM video for automated quantification of nanoparticles growth. Ultramicroscopy 2025;271:114121. PMID: 40058164. DOI: 10.1016/j.ultramic.2025.114121.
Abstract
Automatic processing and full analysis of in situ liquid-phase scanning transmission electron microscopy (LP-STEM) acquisitions are not yet achievable with available techniques. This is particularly true for extracting information on the nucleation and growth of nanoparticles (NPs) in liquid, as several parasitic processes degrade the signal of interest. These degradations hinder the use of classical and state-of-the-art techniques, making NP formation difficult to study. In this context, we propose aquaDenoising, a novel simulation-based deep neural framework to address the challenges of denoising LP-STEM images and videos. Trained on synthetic pairs of clean and noisy images obtained from kinematic-model-based simulations, our model achieves a fifteen-fold improvement in the signal-to-noise ratio of videos of gold NPs growing in water. The enhanced data unleash unprecedented possibilities for automatic segmentation and extraction of structures at different scales, from assemblies of objects down to individual NPs, with the same precision as manual segmentation performed by experts but with higher throughput. The present denoising method can easily be adapted to other nanomaterials imaged in liquid media. All the code developed in this work is open and freely available.
Affiliation(s)
- Adrien Moncomble
- Université Paris Cité, Laboratoire Matériaux et Phénomènes Quantiques, CNRS, F-75013 Paris, France
- Damien Alloyeau
- Université Paris Cité, Laboratoire Matériaux et Phénomènes Quantiques, CNRS, F-75013 Paris, France
- Maxime Moreaud
- IFP Energies nouvelles, Rond-point de l'échangeur de Solaize, BP 3, 69360 Solaize, France
- Abdelali Khelfa
- Université Paris Cité, Laboratoire Matériaux et Phénomènes Quantiques, CNRS, F-75013 Paris, France
- Guillaume Wang
- Université Paris Cité, Laboratoire Matériaux et Phénomènes Quantiques, CNRS, F-75013 Paris, France
- Nathaly Ortiz-Peña
- Université Paris Cité, Laboratoire Matériaux et Phénomènes Quantiques, CNRS, F-75013 Paris, France
- Hakim Amara
- Université Paris Cité, Laboratoire Matériaux et Phénomènes Quantiques, CNRS, F-75013 Paris, France; Université Paris-Saclay, ONERA, CNRS, Laboratoire d'étude des microstructures (LEM), F-92322 Châtillon, France
- Riccardo Gatti
- Université Paris-Saclay, ONERA, CNRS, Laboratoire d'étude des microstructures (LEM), F-92322 Châtillon, France
- Romain Moreau
- Université Paris-Saclay, ONERA, CNRS, Laboratoire d'étude des microstructures (LEM), F-92322 Châtillon, France
- Christian Ricolleau
- Université Paris Cité, Laboratoire Matériaux et Phénomènes Quantiques, CNRS, F-75013 Paris, France
- Jaysen Nelayah
- Université Paris Cité, Laboratoire Matériaux et Phénomènes Quantiques, CNRS, F-75013 Paris, France
119
Sastry K, Aborahama Y, Luo Y, Zhang Y, Cui M, Cao R, Ku G, Wang LV. Transcranial Photoacoustic Tomography De-Aberrated Using Boundary Elements. IEEE Trans Med Imaging 2025;44:2068-2078. PMID: 40030863. DOI: 10.1109/tmi.2025.3526000.
Abstract
Photoacoustic tomography holds tremendous potential for neuroimaging due to its functional magnetic resonance imaging (fMRI)-like functional contrast and greater specificity, richer contrast, portability, open platform, faster imaging, magnet-free and quieter operation, and lower cost. However, accounting for the skull-induced acoustic distortion remains a long-standing challenge due to the problem size. This is aggravated in functional imaging, where high accuracy is needed to detect minuscule functional changes. Here, we develop an acoustic solver based on the boundary-element method (BEM) to model the skull and de-aberrate the images. BEM uses boundary meshes and compression for superior computational efficiency compared to volumetric discretization-based methods. We demonstrate BEM's higher accuracy and favorable scalability relative to the widely used pseudo-spectral time-domain method (PSTD). In imaging through an ex-vivo adult human skull, BEM outperforms PSTD in several metrics. Our work establishes BEM as a valuable and naturally suited technique in photoacoustic tomography and lays the foundation for BEM-based de-aberration methods.
120
Liebert A, Schreiter H, Kapsner LA, Eberle J, Ehring CM, Hadler D, Brock L, Erber R, Emons J, Laun FB, Uder M, Wenkel E, Ohlmeyer S, Bickelhaupt S. Impact of non-contrast-enhanced imaging input sequences on the generation of virtual contrast-enhanced breast MRI scans using neural network. Eur Radiol 2025;35:2603-2616. PMID: 39455455. PMCID: PMC12021982. DOI: 10.1007/s00330-024-11142-3.
Abstract
OBJECTIVE To investigate how different combinations of T1-weighted (T1w), T2-weighted (T2w), and diffusion-weighted imaging (DWI) impact the performance of virtual contrast-enhanced (vCE) breast MRI. MATERIALS AND METHODS The IRB-approved, retrospective study included 1064 multiparametric breast MRI scans (age: 52 ± 12 years) obtained from 2017 to 2020 (single site, two 3-T MRI scanners). Eleven independent neural networks were trained to derive vCE images from varying input combinations of T1w, T2w, and multi-b-value DWI sequences (b-value = 50-1500 s/mm2). Three readers evaluated the vCE images with regard to qualitative scores of diagnostic image quality, image sharpness, satisfaction with contrast/signal-to-noise ratio, and lesion/non-mass enhancement conspicuity. Quantitative metrics (SSIM, PSNR, NRMSE, and median symmetrical accuracy) were analyzed and statistically compared between the input combinations for the full breast volume and for both enhancing and non-enhancing target findings. RESULTS The independent test set consisted of 187 cases. The quantitative metrics improved significantly in target findings when multi-b-value DWI sequences were included during vCE training (p < 0.05). Non-significant effects (p > 0.05) were observed for the quantitative metrics on the full breast volume when comparing input combinations including T1w. Using T1w and DWI acquisitions during vCE training is necessary to achieve high satisfaction with contrast/SNR and good conspicuity of enhancing findings. The input combination of T1w, T2w, and DWI sequences with three b-values showed the best qualitative performance. CONCLUSION vCE breast MRI performance is significantly influenced by the input sequences. Quantitative metrics and visual quality of vCE images benefit significantly when multi-b-value DWI is added to morphologic T1w/T2w sequences as input for model training. KEY POINTS Question: How do different MRI sequences impact the performance of virtual contrast-enhanced (vCE) breast MRI? Findings: The input combination of T1-weighted, T2-weighted, and diffusion-weighted imaging sequences with three b-values showed the best qualitative performance. Clinical relevance: While neural networks providing virtual contrast-enhanced images might further improve access to breast MRI in the future, the significant influence of the input data needs to be considered during translational research.
Affiliation(s)
- Andrzej Liebert
- Institute of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Hannes Schreiter
- Institute of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Lorenz A Kapsner
- Institute of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Lehrstuhl für Medizinische Informatik, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Jessica Eberle
- Institute of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Chris M Ehring
- Institute of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Dominique Hadler
- Institute of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Luise Brock
- Institute of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Ramona Erber
- Institute of Pathology, Universitätsklinikum Erlangen, Erlangen, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Julius Emons
- Department of Gynecology and Obstetrics, Erlangen University Hospital, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Frederik B Laun
- Institute of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Michael Uder
- Institute of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Evelyn Wenkel
- Medizinische Fakultät, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Radiologie München, München, Germany
- Sabine Ohlmeyer
- Institute of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Sebastian Bickelhaupt
- Institute of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- German Cancer Research Center (DKFZ), Heidelberg, Germany
121
Lin H, Seitz S, Tan Y, Lugagne JB, Wang L, Ding G, He H, Rauwolf TJ, Dunlop MJ, Connor JH, Porco JA, Tian L, Cheng JX. Label-free nanoscopy of cell metabolism by ultrasensitive reweighted visible stimulated Raman scattering. Nat Methods 2025;22:1040-1050. PMID: 39820753. PMCID: PMC12074879. DOI: 10.1038/s41592-024-02575-1.
Abstract
Super-resolution imaging of cell metabolism is hindered by the incompatibility of small metabolites with fluorescent dyes and the limited resolution of imaging mass spectrometry. We present ultrasensitive reweighted visible stimulated Raman scattering (URV-SRS), a label-free vibrational imaging technique for multiplexed nanoscopy of intracellular metabolites. We developed a visible SRS microscope with extensive pulse chirping to improve the detection limit to ~4,000 molecules and introduced a self-supervised multi-agent denoiser to suppress non-independent noise in SRS by over 7.2 dB, resulting in a 50-fold sensitivity enhancement over near-infrared SRS. Leveraging the enhanced sensitivity, we employed Fourier reweighting to amplify sub-100-nm spatial frequencies that were previously overwhelmed by noise. Validated by Fourier ring correlation, we achieved a lateral resolution of 86 nm in cell imaging. We visualized the reprogramming of metabolic nanostructures associated with virus replication in host cells and subcellular fatty acid synthesis in engineered bacteria, demonstrating its capability towards nanoscopic spatial metabolomics.
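The Fourier reweighting step, amplifying fine spatial frequencies once denoising has lifted them above the noise floor, can be sketched as a radial filter in the 2D FFT domain; the flat boost factor, cutoff, and pixel size below are assumptions rather than the paper's calibrated weights:

```python
import numpy as np

def fourier_reweight(image, pixel_nm, cutoff_nm=100.0, boost=3.0):
    """Amplify spatial frequencies above 1/cutoff_nm by a flat factor."""
    F = np.fft.fftshift(np.fft.fft2(image))
    fy = np.fft.fftshift(np.fft.fftfreq(image.shape[0], d=pixel_nm))
    fx = np.fft.fftshift(np.fft.fftfreq(image.shape[1], d=pixel_nm))
    radius = np.hypot(*np.meshgrid(fy, fx, indexing="ij"))   # cycles per nm
    weights = np.where(radius > 1.0 / cutoff_nm, boost, 1.0)
    return np.fft.ifft2(np.fft.ifftshift(F * weights)).real

rng = np.random.default_rng(0)
img = rng.random((256, 256))
sharpened = fourier_reweight(img, pixel_nm=20.0)   # 20 nm pixels (assumed)
print(sharpened.shape)
```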
Collapse
Affiliation(s)
- Haonan Lin
- Department of Electrical and Computer Engineering, Boston University, Boston, MA, USA
- Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Photonics Center, Boston University, Boston, MA, USA
| | - Scott Seitz
- Department of Virology, Immunology, and Microbiology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- National Emerging Infectious Diseases Laboratories, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
| | - Yuying Tan
- Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Photonics Center, Boston University, Boston, MA, USA
| | - Jean-Baptiste Lugagne
- Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Biological Design Center, Boston University, Boston, MA, USA
| | - Le Wang
- Department of Electrical and Computer Engineering, Boston University, Boston, MA, USA
- Photonics Center, Boston University, Boston, MA, USA
| | - Guangrui Ding
- Department of Electrical and Computer Engineering, Boston University, Boston, MA, USA
- Photonics Center, Boston University, Boston, MA, USA
| | - Hongjian He
- Department of Electrical and Computer Engineering, Boston University, Boston, MA, USA
- Photonics Center, Boston University, Boston, MA, USA
| | - Tyler J Rauwolf
- Department of Chemistry, Boston University, Boston, MA, USA
- Center for Molecular Discovery (BU-CMD), Boston University, Boston, MA, USA
| | - Mary J Dunlop
- Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Biological Design Center, Boston University, Boston, MA, USA
| | - John H Connor
- Department of Virology, Immunology, and Microbiology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- National Emerging Infectious Diseases Laboratories, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
| | - John A Porco
- Department of Chemistry, Boston University, Boston, MA, USA
- Center for Molecular Discovery (BU-CMD), Boston University, Boston, MA, USA
| | - Lei Tian
- Department of Electrical and Computer Engineering, Boston University, Boston, MA, USA
- Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Photonics Center, Boston University, Boston, MA, USA
| | - Ji-Xin Cheng
- Department of Electrical and Computer Engineering, Boston University, Boston, MA, USA.
- Department of Biomedical Engineering, Boston University, Boston, MA, USA.
- Photonics Center, Boston University, Boston, MA, USA.
- Department of Chemistry, Boston University, Boston, MA, USA.
| |
Collapse
|
122
|
Liu W, Duinkharjav B, Sun Q, Zhang SQ. FovealNet: Advancing AI-Driven Gaze Tracking Solutions for Efficient Foveated Rendering in Virtual Reality. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2025; 31:3183-3193. [PMID: 40067704 DOI: 10.1109/tvcg.2025.3549577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/14/2025]
Abstract
Leveraging real-time eye tracking, foveated rendering optimizes hardware efficiency and enhances visual quality in virtual reality (VR). Eye tracking determines where the user is looking, allowing the system to render high-resolution graphics only in the foveal region (the small area of the retina where visual acuity is highest) while rendering the peripheral view at lower resolution. However, modern deep learning-based gaze-tracking solutions often exhibit a long-tail distribution of tracking errors, which can degrade user experience and reduce the benefits of foveated rendering by causing misalignment and decreased visual quality. This paper introduces FovealNet, an advanced AI-driven gaze tracking framework designed to optimize system performance by strategically enhancing gaze tracking accuracy. To reduce the implementation cost of the gaze tracking algorithm, FovealNet employs an event-based cropping method that eliminates over 64.8% of irrelevant pixels from the input image. Additionally, it incorporates a simple yet effective token-pruning strategy that dynamically removes tokens on the fly without compromising tracking accuracy. Finally, to support different runtime rendering configurations, we propose a system performance-aware multi-resolution training strategy, allowing the gaze tracking DNN to adapt and optimize overall system performance more effectively. Evaluation results demonstrate that FovealNet achieves at least a 1.42× speedup compared to previous methods and a 13% increase in perceptual quality for foveated output. The code is available at https://github.com/wl3181/FovealNet.
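For illustration, a minimal sketch of the gaze-centered cropping idea (discarding pixels far from a coarse gaze estimate before running the tracking network); the function name, crop size, and pupil-estimate input are assumptions for this sketch, not FovealNet's actual event-based method:

import numpy as np

def crop_around_center(frame, center, size=128):
    """Return a size x size crop centered on `center`, clamped to the frame borders."""
    h, w = frame.shape[:2]
    cy, cx = center
    y0 = int(np.clip(cy - size // 2, 0, h - size))
    x0 = int(np.clip(cx - size // 2, 0, w - size))
    return frame[y0:y0 + size, x0:x0 + size]

frame = np.zeros((480, 640), dtype=np.uint8)   # stand-in eye-camera frame
pupil = (240, 320)                             # coarse center estimate (row, col)
roi = crop_around_center(frame, pupil)         # only this ROI is fed to the DNN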
Collapse
|
123
|
Lee J, Kim S, Ahn J, Wang AS, Baek J. X-ray CT metal artifact reduction using neural attenuation field prior. Med Phys 2025. [PMID: 40305006 DOI: 10.1002/mp.17859] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2024] [Revised: 03/26/2025] [Accepted: 04/14/2025] [Indexed: 05/02/2025] Open
Abstract
BACKGROUND The presence of metal objects in computed tomography (CT) imaging introduces severe artifacts that degrade image quality and hinder accurate diagnosis. While several deep learning-based metal artifact reduction (MAR) methods have been proposed, they often exhibit poor performance on unseen data and require large datasets to train neural networks. PURPOSE In this work, we propose a sinogram inpainting method for metal artifact reduction that leverages a neural attenuation field (NAF) as a prior. This new method, dubbed NAFMAR, operates in a self-supervised manner by optimizing a model-based neural field, thus eliminating the need for large training datasets. METHODS NAF is optimized to generate prior images, which are then used to inpaint metal traces in the original sinogram. To address the corruption of x-ray projections caused by metal objects, a 3D forward projection of the original corrupted image is performed to identify metal traces. NAF is then optimized using a metal trace-masked ray sampling strategy that selectively uses uncorrupted rays to supervise the network. Moreover, a metal-aware loss function is proposed to prioritize metal-associated regions during optimization, helping the network learn more informed representations of anatomical features. After optimization, the NAF images are rendered to generate NAF prior images, which serve as priors to correct the original projections through interpolation. Experiments are conducted to compare NAFMAR with other prior-based inpainting MAR methods. RESULTS The proposed method provides an accurate prior without requiring extensive datasets. Images corrected using NAFMAR showed sharp features and preserved anatomical structures. Our comprehensive evaluation, involving simulated dental CT and clinical pelvic CT images, demonstrated the effectiveness of the NAF prior compared to other priors, including linear interpolation and data-driven convolutional neural networks (CNNs). NAFMAR outperformed all compared baselines in terms of structural similarity index measure (SSIM) values, and its peak signal-to-noise ratio (PSNR) value was comparable to that of the dual-domain CNN method. CONCLUSIONS NAFMAR presents an effective, high-fidelity solution for metal artifact reduction in 3D tomographic imaging without the need for large datasets.
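For illustration, a minimal sketch of the interpolation step that prior-guided sinogram inpainting builds on; here plain 1D linear interpolation across masked detector bins stands in for the NAF-prior projections:

import numpy as np

def inpaint_sinogram(sinogram, metal_trace):
    """Fill masked detector bins in each view by linear interpolation."""
    out = sinogram.copy()
    bins = np.arange(sinogram.shape[1])
    for v in range(sinogram.shape[0]):          # loop over projection views
        mask = metal_trace[v]
        if mask.any():
            out[v, mask] = np.interp(bins[mask], bins[~mask], sinogram[v, ~mask])
    return out

sino = np.random.rand(360, 512)                 # stand-in sinogram (views x bins)
trace = np.zeros_like(sino, dtype=bool)
trace[:, 250:260] = True                        # stand-in metal trace
corrected = inpaint_sinogram(sino, trace)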
Collapse
Affiliation(s)
- Jooho Lee
- Department of Artificial Intelligence, Yonsei University, Seoul, Republic of Korea
| | - Seongjun Kim
- School of Integrated Technology, Yonsei University, Seoul, Republic of Korea
| | - Junhyun Ahn
- School of Integrated Technology, Yonsei University, Seoul, Republic of Korea
| | - Adam S Wang
- Department of Radiology, Stanford University, California, USA
| | - Jongduk Baek
- Department of Artificial Intelligence, Yonsei University, Seoul, Republic of Korea
| |
Collapse
|
124
|
Li Z, Sun Z, Lv L, Liu Y, Wang X, Xu J, Xing J, Babyn P, Sun FR. Ultra-sparse view lung CT image reconstruction using generative adversarial networks and compressed sensing. JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY 2025:8953996251329214. [PMID: 40296779 DOI: 10.1177/08953996251329214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/30/2025]
Abstract
X-ray ionizing radiation from Computed Tomography (CT) scanning increases cancer risk for patients, thus making sparse view CT, which diminishes X-ray exposure by lowering the number of projections, highly significant in diagnostic imaging. However, reducing the number of projections inherently degrades image quality, negatively impacting clinical diagnosis. Consequently, attaining reconstructed images that meet diagnostic imaging criteria for sparse view CT is challenging. This paper presents a novel network (CSUF), specifically designed for ultra-sparse view lung CT image reconstruction. The CSUF network consists of three cohesive components: (1) a compressed sensing-based CT image reconstruction module (VdCS module), (2) a U-shaped end-to-end network, CT-RDNet, enhanced with a self-attention mechanism, acting as the generator in a Generative Adversarial Network (GAN) for CT image restoration and denoising, and (3) a feedback loop. The VdCS module enriches CT-RDNet with enhanced features, while CT-RDNet supplies the VdCS module with prior images infused with rich details and minimized artifacts, facilitated by the feedback loop. Simulation experiments demonstrate the robustness of the CSUF network and its potential to deliver lung CT images with diagnostic imaging quality even under ultra-sparse view conditions.
Collapse
Affiliation(s)
- Zhaoguang Li
- School of Integrated Circuits, Shandong University, Jinan, China
| | - Zhengxiang Sun
- Faculty of Science, The University of Sydney, NSW, Australia
| | - Lin Lv
- School of Integrated Circuits, Shandong University, Jinan, China
| | - Yuhan Liu
- School of Integrated Circuits, Shandong University, Jinan, China
| | - Xiuying Wang
- Faculty of Engineering, The University of Sydney, NSW, Australia
| | - Jingjing Xu
- School of Integrated Circuits, Shandong University, Jinan, China
| | - Jianping Xing
- School of Integrated Circuits, Shandong University, Jinan, China
| | - Paul Babyn
- Department of Medical Imaging, University of Saskatchewan and Saskatoon Health Region, Saskatoon, Canada
| | - Feng-Rong Sun
- School of Integrated Circuits, Shandong University, Jinan, China
| |
Collapse
|
125
|
Goodwin E, Davies M, Bakiro M, Desroche E, Tumino F, Aloisio M, Crudden CM, Ragogna PJ, Karttunen M, Barry ST. Atomic Layer Restructuring of Gold Surfaces by N-Heterocyclic Carbenes over Large Surface Areas. ACS NANO 2025; 19:15617-15626. [PMID: 40239036 DOI: 10.1021/acsnano.4c17517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/18/2025]
Abstract
Even highly planar, polished metal surfaces display varying levels of roughness that can affect their optical and electronic properties, impacting performance in state-of-the-art microelectronics. Current methods for smoothing rough metallic surfaces require either the removal or addition of substantial amounts of material using complex processes that are incompatible with 3-dimensional nanoscale features needed for state-of-the-art applications. We present a vapor-phase process that results in up to a 60% smoothing of nanometer-scale rough gold surfaces through a single exposure to a class of ligands called N-heterocyclic carbenes. This process does not require removal or addition of metal from the surface and provides smoothing at the Ångström scale. Smoothing occurs in a single deposition, giving quantifiable differences in the adsorption behavior of the resulting surfaces. The process takes place through an adatom-extraction-driven destabilization and restructuring of the surface in a self-limiting manner. This process is achieved without the use of harsh chemical etchants or mechanical intervention, takes only minutes, and can easily be integrated with vapor-phase processing in situ in microfabrication workflows. Our observations demonstrate atomic layer restructuring, a technique that complements atomic layer deposition and atomic layer etching in the fabrication and processing of high-precision materials.
Collapse
Affiliation(s)
- Eden Goodwin
- Department of Chemistry, Carleton University, Ottawa, Ontario K1S 5B6, Canada
- Carbon to Metal Coating Institute, Queen's University, Kingston, Ontario K7L 3N6, Canada
| | - Matthew Davies
- Carbon to Metal Coating Institute, Queen's University, Kingston, Ontario K7L 3N6, Canada
- Department of Physics and Astronomy, Western University, London, Ontario N6A 3K7, Canada
- Department of Chemistry, Western University, London, Ontario N6A 3K7, Canada
| | - Maram Bakiro
- Department of Chemistry, Carleton University, Ottawa, Ontario K1S 5B6, Canada
- Carbon to Metal Coating Institute, Queen's University, Kingston, Ontario K7L 3N6, Canada
| | - Emmett Desroche
- Carbon to Metal Coating Institute, Queen's University, Kingston, Ontario K7L 3N6, Canada
- Department of Chemistry, Queen's University, Kingston, Ontario K7L 3N6, Canada
| | - Francesco Tumino
- Carbon to Metal Coating Institute, Queen's University, Kingston, Ontario K7L 3N6, Canada
- Department of Chemistry, Queen's University, Kingston, Ontario K7L 3N6, Canada
| | - Mark Aloisio
- Carbon to Metal Coating Institute, Queen's University, Kingston, Ontario K7L 3N6, Canada
- Department of Chemistry, Queen's University, Kingston, Ontario K7L 3N6, Canada
| | - Cathleen M Crudden
- Carbon to Metal Coating Institute, Queen's University, Kingston, Ontario K7L 3N6, Canada
- Department of Chemistry, Queen's University, Kingston, Ontario K7L 3N6, Canada
| | - Paul J Ragogna
- Carbon to Metal Coating Institute, Queen's University, Kingston, Ontario K7L 3N6, Canada
- Department of Chemistry, Western University, London, Ontario N6A 3K7, Canada
- Surface Science Western, 999 Collip Circle, London, ON N6G 0J3, Canada
| | - Mikko Karttunen
- Carbon to Metal Coating Institute, Queen's University, Kingston, Ontario K7L 3N6, Canada
- Department of Physics and Astronomy, Western University, London, Ontario N6A 3K7, Canada
- Department of Chemistry, Western University, London, Ontario N6A 3K7, Canada
| | - Seán T Barry
- Department of Chemistry, Carleton University, Ottawa, Ontario K1S 5B6, Canada
- Carbon to Metal Coating Institute, Queen's University, Kingston, Ontario K7L 3N6, Canada
| |
Collapse
|
126
|
Ding L, Zhang C, Lyu X, Cheng D, Xu S. Unified Framework for Enhancement of Low-Quality Fundus Images. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2025:10.1007/s10278-025-01509-3. [PMID: 40301293 DOI: 10.1007/s10278-025-01509-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/24/2025] [Revised: 04/14/2025] [Accepted: 04/15/2025] [Indexed: 05/01/2025]
Abstract
Compared to desktop fundus cameras, handheld ones offer portability and affordability, although they often produce lower-quality images. This paper addresses the reduced image quality commonly associated with images captured by handheld fundus cameras. We first collected 538 fundus images obtained from handheld devices to form a dataset called Mule. A unified framework consisting of three main modules is then proposed to enhance the quality of fundus images. The Light Balance Module is employed first to suppress overexposure and underexposure. This is followed by the Super Resolution Module to enhance vascular details. Finally, the Vessel Enhancement Module is applied to improve image contrast. A dedicated preservation strategy is additionally applied to retain macular features in the final fundus image. Objective evaluations demonstrate that the proposed framework yields the most promising results. Further experiments also suggest that it improves accuracy in downstream tasks, such as vessel segmentation, optic disc/optic cup detection, macula detection, and fundus image quality assessment. Our code is available at: https://github.com/Alen880/UFELQ.
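For illustration, a minimal sketch of the three-stage pipeline ordering described above, with placeholder callables standing in for the paper's trained modules:

def enhance_fundus(img, light_balance, super_resolution, vessel_enhance):
    """Apply the stages in the order the framework prescribes."""
    img = light_balance(img)      # suppress over-/under-exposure first
    img = super_resolution(img)   # then recover vascular detail
    return vessel_enhance(img)    # finally boost vessel contrast

# Usage with identity stand-ins for the three modules:
raw = [[0.1, 0.9]]                # stand-in image
identity = lambda x: x
enhanced = enhance_fundus(raw, identity, identity, identity)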
Collapse
Affiliation(s)
- Lihua Ding
- School of Information Science and Technology, HangZhou Normal University, Hangzhou, 311100, Zhejiang, China
| | - Chengyi Zhang
- School of Information Science and Technology, HangZhou Normal University, Hangzhou, 311100, Zhejiang, China
| | - Xingzheng Lyu
- Hangzhou Mocular Medical Technology Inc., Lin'an District Future Eye Valley, Hangzhou, 311100, Zhejiang, China
| | - Deji Cheng
- Hangzhou Mocular Medical Technology Inc., Lin'an District Future Eye Valley, Hangzhou, 311100, Zhejiang, China
| | - Shuchang Xu
- School of Information Science and Technology, HangZhou Normal University, Hangzhou, 311100, Zhejiang, China.
| |
Collapse
|
127
|
Demofonti A, Germanotta M, Zingaro A, Bailo G, Insalaco S, Cordella F, Aprile IG, Zollo L. Restoring Somatotopic Sensory Feedback in Lower Limb Amputees through Noninvasive Nerve Stimulation. CYBORG AND BIONIC SYSTEMS 2025; 6:0243. [PMID: 40302942 PMCID: PMC12038349 DOI: 10.34133/cbsystems.0243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2024] [Revised: 02/10/2025] [Accepted: 02/18/2025] [Indexed: 05/02/2025] Open
Abstract
Patients with lower limb amputation experience ambulation disorders because they must rely exclusively on visual information together with the tactile information they receive from the stump-socket interface. The lack of sensory feedback in commercial lower limb prostheses is a major factor in their abandonment by patients with transtibial amputation (TTA) or transfemoral amputation (TFA). Recent studies have obtained promising results using invasive interfaces with the peripheral nervous system, which present drawbacks related to surgery. This paper aims to (a) investigate the potential of transcutaneous electrical nerve stimulation (TENS) as a noninvasive means for restoring somatotopic sensory feedback in lower limb amputees and (b) evaluate the effect of the system over a 4-week experimental protocol. The first phase of the study involved 13 participants (6 with TTA and 7 with TFA), and the second evaluated the long-term effect of TENS on the ambulation performance of 2 participants (S1 with TTA and S7 with TFA). The proposed system enhanced participants' ambulation, significantly increasing the body weight distribution between legs (S1: from 134% to 143%, P < 0.0055; S7: from 66% to 72%, P < 0.0055) and gait symmetry (S1: step length symmetry index from 11% to 5%, P < 0.0055; S7: stance phase symmetry index from -4% to -2%, P < 0.0055). It also led to a reduction in postamputation neuropathic pain in S1 (neuropathic pain symptom inventory score diminished from 6 to 0). This demonstrates how TENS enhanced prosthesis embodiment, enabling greater load bearing and more physiological gait patterns. This study highlights TENS as a noninvasive solution for restoring somatotopic sensory feedback, addressing the current limitations and paving the way for further research.
Collapse
Affiliation(s)
- Andrea Demofonti
- Research Unit of Advanced Robotics and Human-Centred Technologies (CREO Lab), Università Campus Bio-Medico di Roma, 00121 Rome, Italy
- IRCCS Fondazione Don Carlo Gnocchi ONLUS, 50143 Florence, Italy
| | | | - Andrea Zingaro
- Research Unit of Advanced Robotics and Human-Centred Technologies (CREO Lab), Università Campus Bio-Medico di Roma, 00121 Rome, Italy
| | - Gaia Bailo
- IRCCS Fondazione Don Carlo Gnocchi ONLUS, 50143 Florence, Italy
| | - Sabina Insalaco
- IRCCS Fondazione Don Carlo Gnocchi ONLUS, 50143 Florence, Italy
| | - Francesca Cordella
- Research Unit of Advanced Robotics and Human-Centred Technologies (CREO Lab), Università Campus Bio-Medico di Roma, 00121 Rome, Italy
| | | | - Loredana Zollo
- Research Unit of Advanced Robotics and Human-Centred Technologies (CREO Lab), Università Campus Bio-Medico di Roma, 00121 Rome, Italy
| |
Collapse
|
128
|
Zuo B, Sun W, Zhao Z, Yuan X, Wang Y. NP-Hand: Novel Perspective Hand Image Synthesis Guided by Normals. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2025; 34:2435-2449. [PMID: 40249692 DOI: 10.1109/tip.2025.3560241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/20/2025]
Abstract
Synthesizing multi-view images that are geometrically consistent with a given single-view image has been one of the hot issues in AIGC in recent years. Existing methods have achieved impressive performance on objects with symmetry or rigidity, but they are inappropriate for the human hand, because an image-captured hand exhibits more diverse poses and less distinctive textures. In this paper, we propose NP-Hand, a framework that elegantly combines the diffusion model and the generative adversarial network: a multi-step diffusion is trained to synthesize low-resolution novel-perspective images, while a single-step generator is exploited to further enhance synthesis quality. To maintain consistency between inputs and synthesis, we introduce normal maps into NP-Hand to guide the whole synthesizing process. Comprehensive evaluations have demonstrated that the proposed framework is superior to existing state-of-the-art models and more suitable for synthesizing hand images with faithful structures and realistic appearance details. The code will be released on our website.
Collapse
|
129
|
Du Y, Liu Y, Wu H, Kang J, Gui Z, Zhang P, Ren Y. Combination of edge enhancement and cold diffusion model for low dose CT image denoising. BIOMED ENG-BIOMED TE 2025; 70:157-169. [PMID: 39501464 DOI: 10.1515/bmt-2024-0362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2024] [Accepted: 10/16/2024] [Indexed: 04/05/2025]
Abstract
OBJECTIVES Since the quality of low dose CT (LDCT) images is often severely affected by noise and artifacts, it is very important to maintain high image quality while effectively reducing the radiation dose. METHODS In recent years, the ability of diffusion models to produce high quality images with stable training has attracted wide attention. The extension of the cold diffusion model beyond the classical diffusion model gives it greater flexibility in application. Inspired by the cold diffusion model, we propose a low dose CT image denoising method, called CECDM, based on the combination of edge enhancement and the cold diffusion model. The LDCT image is taken as the end point of the forward diffusion process and the starting point of the reverse sampling process. An improved Sobel operator and a Convolutional Block Attention Module are added to the network, and a compound loss function is adopted. RESULTS The experimental results show that CECDM can effectively remove noise and artifacts from LDCT images, and the inference time for a single image is reduced to 0.41 s. CONCLUSIONS Compared with existing LDCT image post-processing methods, CECDM shows a significant improvement in all indexes.
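For illustration, a minimal sketch of feeding a Sobel edge map alongside the LDCT slice as a two-channel network input; the plain scipy Sobel here is an assumption, not the paper's improved operator:

import numpy as np
from scipy import ndimage

def edge_augmented_input(ldct):
    """Stack the LDCT slice with its Sobel gradient magnitude as a 2-channel input."""
    gx = ndimage.sobel(ldct, axis=1)
    gy = ndimage.sobel(ldct, axis=0)
    edges = np.hypot(gx, gy)
    edges /= edges.max() + 1e-8          # normalize edge map to [0, 1]
    return np.stack([ldct, edges], axis=0)

slice_ = np.random.rand(512, 512).astype(np.float32)   # stand-in LDCT slice
x = edge_augmented_input(slice_)                        # shape (2, 512, 512)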
Collapse
Affiliation(s)
- Yinglin Du
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
| | - Yi Liu
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
| | - Han Wu
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
| | - Jiaqi Kang
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
| | - Zhiguo Gui
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
| | - Pengcheng Zhang
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
| | - Yali Ren
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, China
| |
Collapse
|
130
|
Chen J, Ye Z, Zhang R, Li H, Fang B, Zhang LB, Wang W. Medical image translation with deep learning: Advances, datasets and perspectives. Med Image Anal 2025; 103:103605. [PMID: 40311301 DOI: 10.1016/j.media.2025.103605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2024] [Revised: 03/07/2025] [Accepted: 04/12/2025] [Indexed: 05/03/2025]
Abstract
Traditional medical image generation often lacks patient-specific clinical information, limiting its clinical utility despite enhancing downstream task performance. In contrast, medical image translation precisely converts images from one modality to another, preserving both anatomical structures and cross-modal features, thus enabling efficient and accurate modality transfer and offering unique advantages for model development and clinical practice. This paper reviews the latest advances in deep learning (DL)-based medical image translation. Initially, it elaborates on the diverse tasks and practical applications of medical image translation. Subsequently, it provides an overview of fundamental models, including convolutional neural networks (CNNs), transformers, and state space models (SSMs). Additionally, it delves into generative models such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Autoregressive Models (ARs), Diffusion Models, and Flow Models. Evaluation metrics for assessing translation quality are discussed, emphasizing their importance. Commonly used datasets in this field are also analyzed, highlighting their unique characteristics and applications. Looking ahead, the paper identifies future trends and challenges, and proposes research directions and solutions in medical image translation. It aims to serve as a valuable reference and inspiration for researchers, driving continued progress and innovation in this area.
Collapse
Affiliation(s)
- Junxin Chen
- School of Software, Dalian University of Technology, Dalian 116621, China.
| | - Zhiheng Ye
- School of Software, Dalian University of Technology, Dalian 116621, China.
| | - Renlong Zhang
- Institute of Research and Clinical Innovations, Neusoft Medical Systems Co., Ltd., Beijing, China.
| | - Hao Li
- School of Computing Science, University of Glasgow, Glasgow G12 8QQ, United Kingdom.
| | - Bo Fang
- School of Computer Science, The University of Sydney, Sydney, NSW 2006, Australia.
| | - Li-Bo Zhang
- Department of Radiology, General Hospital of Northern Theater Command, Shenyang 110840, China.
| | - Wei Wang
- Guangdong-Hong Kong-Macao Joint Laboratory for Emotion Intelligence and Pervasive Computing, Artificial Intelligence Research Institute, Shenzhen MSU-BIT University, Shenzhen 518172, China; School of Medical Technology, Beijing Institute of Technology, Beijing 100081, China.
| |
Collapse
|
131
|
Jiang C, Xing X, Nan Y, Fang Y, Zhang S, Walsh S, Yang G, Shen D. A lung structure and function information-guided residual diffusion model for predicting idiopathic pulmonary fibrosis progression. Med Image Anal 2025; 103:103604. [PMID: 40315576 DOI: 10.1016/j.media.2025.103604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2024] [Revised: 03/30/2025] [Accepted: 04/12/2025] [Indexed: 05/04/2025]
Abstract
Idiopathic Pulmonary Fibrosis (IPF) is a progressive lung disease that continuously scars and thickens lung tissue, leading to respiratory difficulties. Timely assessment of IPF progression is essential for developing treatment plans and improving patient survival rates. However, current clinical standards require multiple (usually two) CT scans at certain intervals to assess disease progression. This presents a dilemma: the disease progression is identified only after the disease has already progressed. To address this issue, a feasible solution is to generate the follow-up CT image from the patient's initial CT image to achieve early prediction of IPF. To this end, we propose a lung structure and function information-guided residual diffusion model. The key components of our model include (1) using a 2.5D generation strategy to reduce computational cost of generating 3D images with the diffusion model; (2) designing structural attention to mitigate negative impact of spatial misalignment between the two CT images on generation performance; (3) employing residual diffusion to accelerate model training and inference while focusing more on differences between the two CT images (i.e., the lesion areas); and (4) developing a CLIP-based text extraction module to extract lung function test information and further using such extracted information to guide the generation. Extensive experiments demonstrate that our method can effectively predict IPF progression and achieve superior generation performance compared to state-of-the-art methods.
Collapse
Affiliation(s)
- Caiwen Jiang
- School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China; Bioengineering Department and Imperial-X, Imperial College London, London, UK.
| | - Xiaodan Xing
- Bioengineering Department and Imperial-X, Imperial College London, London, UK
| | - Yang Nan
- Bioengineering Department and Imperial-X, Imperial College London, London, UK
| | - Yingying Fang
- Bioengineering Department and Imperial-X, Imperial College London, London, UK
| | - Sheng Zhang
- Bioengineering Department and Imperial-X, Imperial College London, London, UK
| | - Simon Walsh
- Bioengineering Department and Imperial-X, Imperial College London, London, UK
| | - Guang Yang
- Bioengineering Department and Imperial-X, Imperial College London, London, UK; National Heart and Lung Institute, Imperial College London, London, UK; Cardiovascular Research Centre, Royal Brompton Hospital, London, UK; School of Biomedical Engineering & Imaging Sciences, King's College London, London, UK.
| | - Dinggang Shen
- School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China; Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China; Shanghai Clinical Research and Trial Center, Shanghai, China.
| |
Collapse
|
132
|
Yang B, Han H, Zhang W, Li H. General retinal image enhancement via reconstruction: Bridging distribution shifts using latent diffusion adaptors. Med Image Anal 2025; 103:103603. [PMID: 40300379 DOI: 10.1016/j.media.2025.103603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2024] [Revised: 01/21/2025] [Accepted: 04/12/2025] [Indexed: 05/01/2025]
Abstract
Deep learning-based fundus image enhancement has attracted extensive research attention recently, which has shown remarkable effectiveness in improving the visibility of low-quality images. However, these methods are often constrained to specific datasets and degradations, leading to poor generalization capabilities and having challenges in the fine-tuning process. Therefore, a general method for fundus image enhancement is proposed for improved generalizability and flexibility, which decomposes the enhancement task into reconstruction and adaptation phases. In the reconstruction phase, self-supervised training with unpaired data is employed, allowing the utilization of extensive public datasets to improve the generalizability of the model. During the adaptation phase, the model is fine-tuned according to the target datasets and their degradations, utilizing the pre-trained weights from the reconstruction. The proposed method improves the feasibility of latent diffusion models for retinal image enhancement. Adaptation loss and enhancement adaptor are proposed in autoencoders and diffusion networks for fewer paired training data, fewer trainable parameters, and faster convergence compared with training from scratch. The proposed method can be easily fine-tuned and experiments demonstrate the adaptability for different datasets and degradations. Additionally, the reconstruction-adaptation framework can be utilized in different backbones and other modalities, which shows its generality.
Collapse
Affiliation(s)
- Bingyu Yang
- Beijing Institute of Technology, Beijing, 100081, China
| | - Haonan Han
- Beijing Institute of Technology, Beijing, 100081, China
| | - Weihang Zhang
- Beijing Institute of Technology, Beijing, 100081, China
| | - Huiqi Li
- Beijing Institute of Technology, Beijing, 100081, China.
| |
Collapse
|
133
|
Zariry Z, Lamberton F, Frost R, Gaass T, Troalen T, Rayson H, Slipsager JM, Richard N, van der Kouwe A, Bonaiuto J, Hiba B. An in-vivo approach to quantify in-MRI head motion tracking accuracy: comparison of markerless optical tracking versus fat-navigators. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2025:2025.04.23.25326185. [PMID: 40313280 PMCID: PMC12045414 DOI: 10.1101/2025.04.23.25326185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2025]
Abstract
Purpose Head-motion tracking and correction remain a key area of research in MRI, but the lack of rigorous evaluation approaches hinders their optimization and comparison. This study introduces an in-vivo framework to assess head-motion tracking methods and compares a markerless optical system (MOS) to a fat-signal navigator (FatNav). Methods Six participants underwent 3T brain MRI using a T1-weighted (T1w) pulse-sequence with a fat-navigator module. Participants performed head-rotations of 2° or 4°, each visually guided by MOS feedback around a single primary axis (X or Z). MOS and FatNav estimates were evaluated against rigid registration of the T1w images, taken as the gold standard, across seven different head positions. Results The MOS outperformed FatNav in tracking primary head-rotations and unintentional translations, while FatNav showed marginally better accuracy for subtle secondary rotations. Neck-masking of fat-navigators further improved FatNav estimates of pitch rotations. The quality of T1w-images collected motionless and with head-rotations of 2° and 4° was also investigated. The MOS outperformed FatNav in restoring image fidelity, evidenced by higher structural similarity index, peak signal-to-noise ratio and focus measure. However, image quality evaluation lacked sensitivity to subtle improvements in FatNav with neck-masking. Conclusion The proposed in-vivo framework enables direct quantitative evaluation of intra-MRI head-motion tracking methods. MOS outperformed FatNav in estimating primary head rotations and unintentional translations, with a moderate FatNav advantage for small unintentional secondary rotations. Quality assessment of the motion-corrected images confirmed the superiority of MOS in practice. However, it proved less sensitive than direct comparison of movement estimates in detecting the nuanced improvements of FatNav with neck-masking.
Collapse
|
134
|
Sippel F, Seiler J, Kaup A. Multispectral Snapshot Image Registration Using Learned Cross Spectral Disparity Estimation and a Deep Guided Occlusion Reconstruction Network. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2025; 34:2338-2350. [PMID: 40193269 DOI: 10.1109/tip.2025.3556602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/09/2025]
Abstract
Multispectral imaging aims at recording images in different spectral bands. This is extremely beneficial in diverse discrimination applications, for example in agriculture, recycling or healthcare. One approach for snapshot multispectral imaging, which is capable of recording multispectral videos, is by using camera arrays, where each camera records a different spectral band. Since the cameras are at different spatial positions, a registration procedure is necessary to map every camera to the same view. In this paper, we present a multispectral snapshot image registration with three novel components. First, a cross spectral disparity estimation network is introduced, which is trained on a popular stereo database using pseudo spectral data augmentation. Subsequently, this disparity estimation is used to accurately detect occlusions by warping the disparity map in a layer-wise manner. Finally, these detected occlusions are reconstructed by a learned deep guided neural network, which leverages the structure from other spectral components. It is shown that each element of this registration process as well as the final result is superior to the current state of the art. In terms of PSNR, our registration achieves an improvement of over 3 dB. At the same time, the runtime is decreased by a factor of over 3 on a CPU. Additionally, the registration is executable on a GPU, where the runtime can be decreased by a factor of 113. The source code and the data is available at https://github.com/FAU-LMS/MSIR.
Collapse
|
135
|
Li W, Xia J, Gao W, Hu Z, Nie S, Li Y. Dual-way magnetic resonance image translation with transformer-based adversarial network. Med Phys 2025. [PMID: 40270088 DOI: 10.1002/mp.17837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2024] [Accepted: 04/05/2025] [Indexed: 04/25/2025] Open
Abstract
BACKGROUND The magnetic resonance (MR) image translation model is designed to generate MR images of a required sequence from images of an existing sequence. However, the generalization performance of MR image generation models on external datasets tends to be unsatisfactory due to inconsistency in the data distribution of MR images across different centers or scanners. PURPOSE The aim of this study is to propose a cross-sequence MR image synthesis model that can generate high-quality synthetic MR images with high transferability to small external datasets. METHODS We proposed a dual-way magnetic resonance image translation model using a transformer-based adversarial network (DMTrans) for MR image synthesis across sequences. It integrates a transformer-based generative architecture with an innovative discriminator design. The shifted window-based multi-head self-attention mechanism in DMTrans enables efficient capture of global and local features from MR images. The sequential dual-scale discriminator is designed to distinguish features of the generated images at multiple scales. RESULTS We pre-trained the DMTrans model for bi-directional image synthesis on a T1/T2-weighted MR image dataset comprising 4229 slices. It demonstrates superior performance to baseline methods on both qualitative and quantitative measurements. The SSIM, PSNR, and MAE metrics for synthetic T1 image generation from T2 images are 0.91 ± 0.04, 25.30 ± 2.40, and 24.65 ± 10.46, while the values are 0.90 ± 0.04, 24.72 ± 1.62, and 23.28 ± 7.40 for the opposite direction. Fine-tuning is then utilized to adapt the model to another public dataset with T1/T2/proton-density-weighted (PD) images, so that only 6 patients (500 slices) are required for model adaptation to achieve high-quality T1/T2, T1/PD, and T2/PD image translation results. CONCLUSIONS The proposed DMTrans achieves state-of-the-art performance for cross-sequence MR image conversion, which can provide more information to assist clinical diagnosis and treatment. It also offers a versatile and efficient solution to the need for high-quality MR image synthesis under data-scarce conditions at different centers.
Collapse
Affiliation(s)
- Wenxin Li
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, PR China
| | - Jun Xia
- Department of Radiology, The First Affiliated Hospital of Shenzhen University, Shenzhen University, Shenzhen Second People's Hospital, Shenzhen, PR China
| | - Weilin Gao
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, PR China
| | - Zaiqi Hu
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, PR China
| | - Shengdong Nie
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, PR China
| | - Yafen Li
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, PR China
| |
Collapse
|
136
|
Mendoza KL, Ni H, Varnavides G, Chi M, Ophus C, Petford-Long A, Phatak C. Quantitative phase retrieval and characterization of magnetic nanostructures via Lorentz (scanning) transmission electron microscopy. JOURNAL OF PHYSICS. CONDENSED MATTER : AN INSTITUTE OF PHYSICS JOURNAL 2025; 37:205301. [PMID: 40153940 DOI: 10.1088/1361-648x/adc6e3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/27/2024] [Accepted: 03/28/2025] [Indexed: 04/01/2025]
Abstract
Phase reconstruction of magnetic materials from Lorentz transmission electron microscopy (LTEM) measurements has traditionally been achieved using longstanding methods such as the off-axis holography (OAH) fast-Fourier-transform technique and the transport-of-intensity equation (TIE). Increased access to processing power alongside the development of advanced algorithms has allowed phase retrieval of nanoscale magnetic materials with greater efficacy and resolution. Specifically, reverse-mode automatic differentiation (RMAD) and the extended electron ptychography iterative engine (ePIE) are two recent developments in phase retrieval that can be applied to analyzing micro- to nanoscale magnetic materials. This work evaluates phase retrieval using TIE, RMAD, and ePIE in simulations of Permalloy (Ni80Fe20) nanoscale islands, or nanomagnets. Extending beyond simulations, we demonstrate total phase retrieval and image reconstructions of a NiFe nanowire using OAH and RMAD in LTEM and ePIE in Lorentz-mode 4D scanning transmission electron microscopy experiments, and determine the saturation magnetization through corroboration with micromagnetic modeling. Finally, we demonstrate the efficacy of these methods in retrieving the total phase and highlight their use in characterizing and analyzing the proximity effect of the magnetic nanostructures.
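For illustration, a minimal sketch of the classical Fourier-space TIE solver under a uniform-intensity assumption; the regularizer `eps` and the example parameters are illustrative simplifications, not this work's implementation:

import numpy as np

def tie_phase(d_I_dz, I0, wavelength, pixel_size, eps=1e-6):
    """Recover phase from the through-focus intensity derivative dI/dz (uniform I0)."""
    ny, nx = d_I_dz.shape
    fx = np.fft.fftfreq(nx, d=pixel_size)
    fy = np.fft.fftfreq(ny, d=pixel_size)
    k2 = (fx[None, :] ** 2 + fy[:, None] ** 2) * (2 * np.pi) ** 2   # |k|^2
    rhs = -(2 * np.pi / wavelength) * d_I_dz / I0
    # inverse Laplacian in Fourier space; eps regularizes the k = 0 term
    # (phase is only defined up to a constant anyway)
    phase = np.real(np.fft.ifft2(np.fft.fft2(rhs) / -(k2 + eps)))
    return phase - phase.mean()

dIdz = np.random.randn(128, 128)   # stand-in through-focus intensity derivative
phi = tie_phase(dIdz, I0=1.0, wavelength=1.97e-12, pixel_size=5e-9)  # ~300 kV electrons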
Collapse
Affiliation(s)
- Kayna L Mendoza
- Department of Materials Science and Engineering, Northwestern University, Evanston, IL, United States of America
- Materials Science Division, Argonne National Laboratory, Lemont, IL, United States of America
| | - Haoyang Ni
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
| | - Georgios Varnavides
- Miller Institute for Basic Research in Science, University of California, Berkeley, CA, United States of America
- National Center for Electron Microscopy, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Miaofang Chi
- Center for Nanophase Materials Sciences, Oak Ridge National Laboratory, Oak Ridge, TN, United States of America
| | - Colin Ophus
- National Center for Electron Microscopy, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Amanda Petford-Long
- Department of Materials Science and Engineering, Northwestern University, Evanston, IL, United States of America
- Materials Science Division, Argonne National Laboratory, Lemont, IL, United States of America
| | - Charudatta Phatak
- Department of Materials Science and Engineering, Northwestern University, Evanston, IL, United States of America
- Materials Science Division, Argonne National Laboratory, Lemont, IL, United States of America
| |
Collapse
|
137
|
Sun H, Qin J, Liu Z, Jia X, Yan K, Wang L, Liu Z, Gong S. Generation driven understanding of localized 3D scenes with 3D diffusion model. Sci Rep 2025; 15:14385. [PMID: 40274914 PMCID: PMC12022287 DOI: 10.1038/s41598-025-98705-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2025] [Accepted: 04/14/2025] [Indexed: 04/26/2025] Open
Abstract
In recent years, diffusion models have been widely used in work on 3D scenes. However, existing diffusion models primarily focus on global structure and are constrained by predefined dataset categories, so they cannot accurately resolve the detailed structure of complex 3D scenes. This study therefore fuses Denoising Diffusion Probabilistic Models (DDPM) with the 3D U-Net architecture (Learning Dense Volumetric Segmentation from Sparse Annotation) and proposes a novel approach to generation-driven understanding of localized 3D scenes, namely a customized 3D diffusion model (3D-UDDPM) for local cubes. In contrast to conventional global or local single-structure analysis techniques, the 3D-UDDPM framework prioritizes the capture and recovery of local details during the generation of localized 3D scenes. In addition to accurately predicting the distribution of the noise tensor, the framework significantly enhances the understanding of localized scenes by effectively integrating spatial context information. Specifically, 3D-UDDPM combines Markov chain Monte Carlo (MCMC) sampling and variational inference to reconstruct clear structural details in a stepwise backward inference manner, thereby driving the generation and understanding of local 3D scenes by internalizing geometric features as prior knowledge. The diffusion process enables the model to recover fine local details while maintaining global structural coherence during gradual denoising. Combined with the spatial convolutional properties of the 3D U-Net architecture, the modelling accuracy and generation quality of complex 3D shapes are further enhanced, ensuring strong performance in complex environments. The results demonstrate superior performance on two benchmark datasets in comparison to existing methods.
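For illustration, a minimal sketch of one standard DDPM reverse (denoising) step of the kind such stepwise backward inference relies on; the linear beta schedule and the zero-noise stand-in for the 3D U-Net predictor are assumptions, not the paper's configuration:

import torch

def ddpm_reverse_step(x_t, t, eps_model, betas):
    """One reverse step x_t -> x_{t-1} using the predicted noise."""
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t]
    eps = eps_model(x_t, t)                                   # predicted noise
    mean = (x_t - beta_t / torch.sqrt(1 - alpha_bar) * eps) / torch.sqrt(alpha_t)
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    return mean + torch.sqrt(beta_t) * noise

betas = torch.linspace(1e-4, 0.02, 1000)                      # standard linear schedule
x = torch.randn(1, 1, 16, 16, 16)                             # stand-in local 3D cube
x_prev = ddpm_reverse_step(x, 999, lambda x_, t_: torch.zeros_like(x_), betas)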
Collapse
Affiliation(s)
- Hao Sun
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Junping Qin
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China.
| | - Zheng Liu
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Xinglong Jia
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Kai Yan
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Lei Wang
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Zhiqiang Liu
- College of Information Engineering, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Shaofei Gong
- Inner Mongolia Smart Animal Husbandry Information Technology Group, Hohhot, 010013, China
| |
Collapse
|
138
|
Wang X, Lv X, Shi T, Bu L, Bai W, Peng K. The reconstruction method for static exterior model of digital twin railway station based on mobile vehicle. Sci Rep 2025; 15:14222. [PMID: 40274960 PMCID: PMC12022246 DOI: 10.1038/s41598-025-96535-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2024] [Accepted: 03/28/2025] [Indexed: 04/26/2025] Open
Abstract
Constructing a static exterior model of a railway passenger station is a preliminary and crucial step in achieving a digital twin station. Station modeling often relies on manual techniques, requiring significant labor for stations spanning tens of thousands of square meters. While the interior decoration is symmetrical, the asymmetrical placement of equipment often leads to confusion in manual modeling. This paper proposes a novel station static model reconstruction method - MSCRAGS (Mobile vehicle-Sparse Sampling-Colmap-Resolution adjustment-Gaussian Splatting). MSCRAGS addresses the requirements for flexible control of static exterior models by utilizing a mobile vehicle for data collection and incorporates a sparse multi-view spatial sampling approach. It involves collecting multi-height and multi-angle appearance color data of passenger operation elements to construct preliminary point clouds. Subsequently, 3D Gaussian splatting is employed for rendering, achieving high-fidelity reconstruction of production elements. Moreover, according to the requirements for precision control of static exterior models, the rendering results are reshaped with resolution adjustment to obtain static exterior models of station production elements at various resolutions. Experiments conducted at Qinghe Station demonstrate that, compared to other state-of-the-art modeling methods, our approach significantly reduces modeling time and improves modeling accuracy, showing superior performance on high-fidelity modeling indices.
Collapse
Affiliation(s)
- Xiaoshu Wang
- Institute of Computing Technology, China Academy of Railway Sciences Corporation Limited, Beijing, 100081, China
| | - Xiaojun Lv
- Institute of Computing Technology, China Academy of Railway Sciences Corporation Limited, Beijing, 100081, China.
| | - Tianyun Shi
- Department of Science, Technology and Information Technology, China Academy of Railway Sciences Corporation Limited, Beijing, 100081, China
| | - Lingbin Bu
- School of Information Network Security, People's Public Security University of China, Beijing, 100038, China
| | - Wei Bai
- Institute of Computing Technology, China Academy of Railway Sciences Corporation Limited, Beijing, 100081, China
| | - Kaibei Peng
- Institute of Computing Technology, China Academy of Railway Sciences Corporation Limited, Beijing, 100081, China
| |
Collapse
|
139
|
Chen J, Yue J, Zhou H, Hu Z. NAF-MEEF: A Nonlinear Activation-Free Network Based on Multi-Scale Edge Enhancement and Fusion for Railway Freight Car Image Denoising. SENSORS (BASEL, SWITZERLAND) 2025; 25:2672. [PMID: 40363110 PMCID: PMC12074444 DOI: 10.3390/s25092672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/24/2025] [Revised: 04/20/2025] [Accepted: 04/21/2025] [Indexed: 05/15/2025]
Abstract
Railway freight cars operating in heavy-load and complex outdoor environments are frequently subject to adverse conditions such as haze, temperature fluctuations, and transmission interference, which significantly degrade the quality of the acquired images and introduce substantial noise. Furthermore, the structural complexity of freight cars, coupled with the small size, diversity, and complex structure of defect areas, poses serious challenges for image denoising. Specifically, it becomes extremely difficult to remove noise while simultaneously preserving fine-grained textures and edge details. These challenges distinguish railway freight car image denoising from conventional image restoration tasks, necessitating the design of specialized algorithms that can achieve both effective noise suppression and precise structural detail preservation. To address the challenges of incomplete denoising and poor preservation of details and edge information in railway freight car images, this paper proposes a novel image denoising algorithm named the Nonlinear Activation-Free Network based on Multi-Scale Edge Enhancement and Fusion (NAF-MEEF). The algorithm constructs a Multi-scale Edge Enhancement Initialization Layer to strengthen edge information at multiple scales. Additionally, it employs a Nonlinear Activation-Free feature extractor that effectively captures local and global image information. Leveraging the network's multi-branch parallelism, a Multi-scale Rotation Fusion Attention Mechanism is developed to perform weight analysis on information across various scales and dimensions. To ensure consistency in image details and structure, this paper introduces a fusion loss function. The experimental results show that, compared with recent advanced methods, the proposed algorithm has better noise suppression and edge preservation performance. The proposed method achieves significant denoising performance on railway freight car images affected by Gaussian, composite, and simulated real-world noise, with PSNR gains of 1.20 dB, 1.45 dB, and 0.69 dB, and SSIM improvements of 2.23%, 2.72%, and 1.08%, respectively. On public benchmarks, it attains average PSNRs of 30.34 dB (Set12) and 28.94 dB (BSD68), outperforming several state-of-the-art methods. In addition, this method also performs well in railway image dehazing tasks and demonstrates good generalization ability in denoising tests of remote sensing ship images, further proving its robustness and practical application value in diverse image restoration tasks.
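For illustration, a minimal sketch of computing the PSNR and SSIM figures of merit quoted above with scikit-image; the random arrays are placeholders for ground-truth/denoised image pairs:

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

gt = np.random.rand(256, 256)                                # stand-in clean image
den = np.clip(gt + 0.01 * np.random.randn(256, 256), 0, 1)   # stand-in denoised output

psnr = peak_signal_noise_ratio(gt, den, data_range=1.0)
ssim = structural_similarity(gt, den, data_range=1.0)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}")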
Collapse
Affiliation(s)
- Jiawei Chen
- School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing 100044, China; (J.C.); (Z.H.)
| | - Jianhai Yue
- School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing 100044, China; (J.C.); (Z.H.)
| | - Hang Zhou
- School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China;
| | - Zhunqing Hu
- School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing 100044, China; (J.C.); (Z.H.)
| |
Collapse
|
140
|
Tatana MM, Tsoeu MS, Maswanganyi RC. Low-Light Image and Video Enhancement for More Robust Computer Vision Tasks: A Review. J Imaging 2025; 11:125. [PMID: 40278041 PMCID: PMC12027663 DOI: 10.3390/jimaging11040125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2024] [Revised: 01/27/2025] [Accepted: 03/25/2025] [Indexed: 04/26/2025] Open
Abstract
Computer vision aims to enable machines to understand the visual world. It encompasses numerous tasks, namely action recognition, object detection and image classification. Much research has focused on solving these tasks, but one area that remains relatively uncharted is light enhancement (LE). Low-light enhancement (LLE) is crucial because computer vision tasks fail in the absence of sufficient lighting, forcing systems to rely on additional peripherals such as sensors. This review paper sheds light on this subfield of computer vision (with a focus on video enhancement), along with the aforementioned computer vision tasks. The review analyzes both traditional and deep learning-based enhancers and provides a comparative analysis of recent models in the field. It also analyzes how popular computer vision tasks are improved and made more robust when coupled with light enhancement algorithms. Results show that deep learners outperform traditional enhancers, with supervised learners obtaining the best results followed by zero-shot learners, while computer vision tasks are improved by light enhancement coupling. The review concludes by highlighting major findings, such as that although supervised learners obtain the best results, a shift to zero-shot learners is required due to the lack of real-world data and limited robustness to new data.
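For illustration, a minimal sketch of a traditional (non-learning) enhancer of the kind the review contrasts with deep learners, here plain gamma correction; the gamma value is an illustrative choice:

import numpy as np

def gamma_enhance(img, gamma=0.5):
    """Brighten dark regions: out = in**gamma, where gamma < 1 lifts low intensities."""
    norm = img.astype(np.float32) / 255.0
    return (np.power(norm, gamma) * 255.0).astype(np.uint8)

dark = (np.random.rand(120, 160) * 40).astype(np.uint8)   # stand-in low-light frame
bright = gamma_enhance(dark)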
Affiliation(s)
- Mpilo M. Tatana: Department of Electronic and Computer Engineering, Durban University of Technology, Durban 4001, South Africa
- Mohohlo S. Tsoeu: Steve Biko Campus, Durban University of Technology, Durban 4001, South Africa
- Rito C. Maswanganyi: Department of Electronic and Computer Engineering, Durban University of Technology, Durban 4001, South Africa
141
Hwang I, Oh T. Design and experimental research of on-device style transfer models for mobile environments. Sci Rep 2025; 15:13724. [PMID: 40259046 PMCID: PMC12012059 DOI: 10.1038/s41598-025-98545-4]
Abstract
This study develops a neural style transfer (NST) model optimized for real-time execution on mobile devices through on-device AI, eliminating reliance on cloud servers. By embedding AI models directly into mobile hardware, this approach reduces operational costs and enhances user privacy. However, designing deep learning models for mobile deployment presents a trade-off between computational efficiency and visual quality, as reducing model size often leads to performance degradation. To address this challenge, we propose a set of lightweight NST models incorporating depthwise separable convolutions, residual bottlenecks, and optimized upsampling techniques inspired by MobileNet and ResNet architectures. Five model variations are designed and evaluated based on parameters, floating-point operations, memory usage, and image transformation quality. Experimental results demonstrate that our optimized models achieve a balance between efficiency and performance, enabling high-quality real-time style transfer on resource-constrained mobile environments. These findings highlight the feasibility of deploying NST applications on mobile devices, paving the way for advancements in real-time artistic image processing in mobile photography, augmented reality, and creative applications.
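The parameter saving behind the MobileNet-inspired design mentioned above can be seen in a few lines of PyTorch. This sketch only contrasts a depthwise separable convolution with a standard one; it is not one of the paper's five model variants.

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in)
        self.pointwise = nn.Conv2d(c_in, c_out, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

def n_params(m):
    return sum(p.numel() for p in m.parameters())

standard = nn.Conv2d(128, 128, 3, padding=1)
separable = DepthwiseSeparableConv(128, 128)
print(n_params(standard), n_params(separable))  # 147584 vs 17792 parameters
```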
Affiliation(s)
- Igeon Hwang: Seoul AI School, aSSIST University, 46 Ewhayeodae-gil, Seodaemun-gu, Seoul, 03767, Republic of Korea
- Taeyeon Oh: Seoul AI School, aSSIST University, 46 Ewhayeodae-gil, Seodaemun-gu, Seoul, 03767, Republic of Korea
142
Zhu P, Liu C, Fu Y, Chen N, Qiu A. Cycle-conditional diffusion model for noise correction of diffusion-weighted images using unpaired data. Med Image Anal 2025; 103:103579. [PMID: 40273728 DOI: 10.1016/j.media.2025.103579]
Abstract
Diffusion-weighted imaging (DWI) is a key modality for studying brain microstructure, but its signals are highly susceptible to noise due to the thermal motion of water molecules and interactions with tissue microarchitecture, leading to significant signal attenuation and a low signal-to-noise ratio (SNR). In this paper, we propose a novel approach, a Cycle-Conditional Diffusion Model (Cycle-CDM) using unpaired data learning, aimed at improving DWI quality and reliability through noise correction. Cycle-CDM leverages a cycle-consistent translation architecture to bridge the domain gap between noise-contaminated and noise-free DWIs, enabling the restoration of high-quality images without requiring paired datasets. By utilizing two conditional diffusion models, Cycle-CDM establishes data interrelationships between the two types of DWIs, while incorporating synthesized anatomical priors from the cycle translation process to guide noise removal. In addition, we introduce specific constraints to preserve anatomical fidelity, allowing Cycle-CDM to effectively learn the underlying noise distribution and achieve accurate denoising. Our experiments were conducted on simulated datasets, as well as on datasets of children and adolescents with strong clinical relevance. The results demonstrate that Cycle-CDM outperforms comparative methods, such as U-Net, CycleGAN, Pix2Pix, MUNIT, and MPPCA, in terms of noise correction performance. We also demonstrate that Cycle-CDM generalizes to DWIs with head motion acquired on different MRI scanners. Importantly, the denoised DWI data produced by Cycle-CDM accurately preserve the underlying tissue microstructure, substantially improving their medical applicability.
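The cycle-consistent pairing described above can be illustrated with a toy PyTorch training step in which two small networks stand in for the paper's two conditional diffusion models; the tiny architectures, loss, and batch shapes are assumptions made for brevity.

```python
import torch
import torch.nn as nn

def tiny_net():
    return nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 1, 3, padding=1))

G = tiny_net()   # stands in for the noisy -> clean conditional model
F = tiny_net()   # stands in for the clean -> noisy conditional model
opt = torch.optim.Adam(list(G.parameters()) + list(F.parameters()), lr=1e-4)
l1 = nn.L1Loss()

noisy = torch.randn(4, 1, 32, 32)   # unpaired batches: no slice-wise match
clean = torch.randn(4, 1, 32, 32)

# Cycle consistency: translating there and back should recover the input.
loss = l1(F(G(noisy)), noisy) + l1(G(F(clean)), clean)
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```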
Affiliation(s)
- Pengli Zhu: Department of Health Technology and Informatics, Hong Kong Polytechnic University, Hong Kong
- Chaoqiang Liu: Department of Biomedical Engineering, National University of Singapore, Singapore
- Yingji Fu: Department of Health Technology and Informatics, Hong Kong Polytechnic University, Hong Kong
- Nanguang Chen: Department of Biomedical Engineering, National University of Singapore, Singapore
- Anqi Qiu: Department of Health Technology and Informatics, Hong Kong Polytechnic University, Hong Kong; Department of Biomedical Engineering, National University of Singapore, Singapore; Department of Biomedical Engineering, the Johns Hopkins University, USA
143
Wang Z, Yu X, Wang C, Chen W, Wang J, Chu YH, Sun H, Li R, Li P, Yang F, Han H, Kang T, Lin J, Yang C, Chang S, Shi Z, Hua S, Li Y, Hu J, Zhu L, Zhou J, Lin M, Guo J, Cai C, Chen Z, Guo D, Yang G, Qu X. One for multiple: Physics-informed synthetic data boosts generalizable deep learning for fast MRI reconstruction. Med Image Anal 2025; 103:103616. [PMID: 40279827 DOI: 10.1016/j.media.2025.103616]
Abstract
Magnetic resonance imaging (MRI) is a widely used radiological modality renowned for its radiation-free, comprehensive insights into the human body, facilitating medical diagnoses. However, the drawback of prolonged scan times hinders its accessibility. K-space undersampling offers a solution, yet the resultant artifacts necessitate meticulous removal during image reconstruction. Although deep learning (DL) has proven effective for fast MRI image reconstruction, its broader applicability across various imaging scenarios has been constrained. Challenges include the high cost and privacy restrictions associated with acquiring large-scale, diverse training data, coupled with the inherent difficulty of addressing mismatches between training and target data in existing DL methodologies. Here, we present a novel Physics-Informed Synthetic data learning Framework for fast MRI, called PISF. PISF marks a breakthrough by enabling generalizable DL for multi-scenario MRI reconstruction through a single trained model. Our approach separates the reconstruction of a 2D image into many 1D basic problems, commencing with 1D data synthesis to facilitate generalization. We demonstrate that training DL models on synthetic data, coupled with enhanced learning techniques, yields in vivo MRI reconstructions comparable to or surpassing those of models trained on matched realistic datasets, reducing the reliance on real-world MRI data by up to 96%. With a single trained model, PISF supports high-quality reconstruction under 4 sampling patterns, 5 anatomies, 6 contrasts, 5 vendors, and 7 centers, exhibiting remarkable generalizability. Its adaptability to 2 neuro and 2 cardiovascular patient populations has been validated through evaluations by 10 experienced medical professionals. In summary, PISF presents a feasible and cost-effective way to significantly boost the widespread adoption of DL in various fast MRI applications.
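The 1D decomposition at the heart of PISF is easy to visualize with NumPy: undersampling 1D k-space and reconstructing by zero-filling produces exactly the aliasing a learned model must undo. The signal shape, acceleration factor, and fully sampled center width below are illustrative assumptions.

```python
import numpy as np

n = 256
x = np.zeros(n)
x[96:160] = 1.0                              # synthetic 1D "anatomy"
k = np.fft.fft(x)                            # fully sampled 1D k-space

mask = np.zeros(n, dtype=bool)
mask[::4] = True                             # keep every 4th sample (4x acceleration)
mask[:16] = True                             # fully sample low frequencies
mask[-16:] = True                            # (DC sits at index 0 for np.fft)

x_zf = np.fft.ifft(np.where(mask, k, 0))     # zero-filled reconstruction
print("aliasing error:", np.linalg.norm(np.abs(x_zf) - x))
```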
Affiliation(s)
- Zi Wang: Department of Electronic Science, Xiamen University-Neusoft Medical Magnetic Resonance Imaging Joint Research and Development Center, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Xiamen University, China; Department of Bioengineering and Imperial-X, Imperial College London, United Kingdom
- Xiaotong Yu: Department of Electronic Science, Xiamen University-Neusoft Medical Magnetic Resonance Imaging Joint Research and Development Center, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Xiamen University, China
- Chengyan Wang: Shanghai Pudong Hospital and Human Phenome Institute, Fudan University, China; International Human Phenome Institute (Shanghai), China
- Hongwei Sun: United Imaging Research Institute of Intelligent Imaging, China
- Rushuai Li: Department of Nuclear Medicine, Nanjing First Hospital, China
- Peiyong Li: Shandong Aoxin Medical Technology Company, China
- Fan Yang: Department of Radiology, The First Affiliated Hospital of Xiamen University, China
- Haiwei Han: Department of Radiology, The First Affiliated Hospital of Xiamen University, China
- Taishan Kang: Department of Radiology, Zhongshan Hospital Affiliated to Xiamen University, China
- Jianzhong Lin: Department of Radiology, Zhongshan Hospital Affiliated to Xiamen University, China
- Chen Yang: Department of Neurosurgery, Zhongshan Hospital, Fudan University (Xiamen Branch), China
- Shufu Chang: Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, China
- Zhang Shi: Department of Radiology, Zhongshan Hospital, Fudan University, China
- Sha Hua: Department of Cardiovascular Medicine, Heart Failure Center, Ruijin Hospital Lu Wan Branch, Shanghai Jiaotong University School of Medicine, China
- Yan Li: Department of Radiology, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, China
- Juan Hu: Medical Imaging Department, The First Affiliated Hospital of Kunming Medical University, China
- Liuhong Zhu: Department of Radiology, Zhongshan Hospital, Fudan University (Xiamen Branch), Fujian Province Key Clinical Specialty Construction Project (Medical Imaging Department), Xiamen Key Laboratory of Clinical Transformation of Imaging Big Data and Artificial Intelligence, China
- Jianjun Zhou: Department of Radiology, Zhongshan Hospital, Fudan University (Xiamen Branch), Fujian Province Key Clinical Specialty Construction Project (Medical Imaging Department), Xiamen Key Laboratory of Clinical Transformation of Imaging Big Data and Artificial Intelligence, China
- Meijing Lin: Department of Applied Marine Physics and Engineering, Xiamen University, China
- Jiefeng Guo: Department of Microelectronics and Integrated Circuit, Xiamen University, China
- Congbo Cai: Department of Electronic Science, Xiamen University-Neusoft Medical Magnetic Resonance Imaging Joint Research and Development Center, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Xiamen University, China
- Zhong Chen: Department of Electronic Science, Xiamen University-Neusoft Medical Magnetic Resonance Imaging Joint Research and Development Center, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Xiamen University, China
- Di Guo: School of Computer and Information Engineering, Xiamen University of Technology, China
- Guang Yang: Department of Bioengineering and Imperial-X, Imperial College London, United Kingdom; National Heart and Lung Institute, Imperial College London, United Kingdom; Cardiovascular Research Centre, Royal Brompton Hospital, United Kingdom; School of Biomedical Engineering & Imaging Sciences, King's College London, United Kingdom
- Xiaobo Qu: Department of Electronic Science, Xiamen University-Neusoft Medical Magnetic Resonance Imaging Joint Research and Development Center, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Xiamen University, China
144
Wen W, Zhao Q, Shao X. MambaOSR: Leveraging Spatial-Frequency Mamba for Distortion-Guided Omnidirectional Image Super-Resolution. Entropy (Basel) 2025; 27:446. [PMID: 40282681 PMCID: PMC12025934 DOI: 10.3390/e27040446]
Abstract
Omnidirectional image super-resolution (ODISR) is critical for VR/AR applications, as high-quality 360° visual content significantly enhances immersive experiences. However, existing ODISR methods suffer from limited receptive fields and high computational complexity, which restricts their ability to model long-range dependencies and extract global structural features. Consequently, these limitations hinder the effective reconstruction of high-frequency details. To address these issues, we propose a novel Mamba-based ODISR network, termed MambaOSR, which consists of three key modules working collaboratively for accurate reconstruction. Specifically, we first introduce a spatial-frequency visual state space model (SF-VSSM) to capture global contextual information via dual-domain representation learning, thereby enhancing the preservation of high-frequency details. Subsequently, we design a distortion-guided module (DGM) that leverages distortion map priors to adaptively model geometric distortions, effectively suppressing artifacts resulting from equirectangular projections. Finally, we develop a multi-scale feature fusion module (MFFM) that integrates complementary features across multiple scales, further improving reconstruction quality. Extensive experiments conducted on the SUN360 dataset demonstrate that our proposed MambaOSR achieves a 0.16 dB improvement in WS-PSNR and increases the mutual information by 1.99% compared with state-of-the-art methods, significantly enhancing both visual quality and the information richness of omnidirectional images.
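The distortion prior that equirectangular projection (ERP) images carry is purely geometric: pixel area shrinks with latitude as cos(phi), the same weighting used by the WS-PSNR metric reported above. A NumPy sketch of the weight map and metric follows; how MambaOSR injects the map into its network is simplified away here.

```python
import numpy as np

def erp_distortion_map(h, w):
    """Per-pixel ERP weights: 1 at the equator, approaching 0 at the poles."""
    rows = np.arange(h)
    weights = np.cos((rows + 0.5 - h / 2) * np.pi / h)
    return np.tile(weights[:, None], (1, w))

def ws_psnr(ref, test, peak=255.0):
    wmap = erp_distortion_map(*ref.shape)
    wmse = np.sum(wmap * (ref - test) ** 2) / np.sum(wmap)
    return 10 * np.log10(peak ** 2 / wmse)

ref = np.random.rand(512, 1024) * 255        # stand-in ERP luminance image
noisy = ref + np.random.randn(512, 1024)
print(f"WS-PSNR: {ws_psnr(ref, noisy):.2f} dB")
```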
Affiliation(s)
- Weilei Wen: VCIP, College of Computer Science, Nankai University, Tianjin 300350, China
- Xiuli Shao: VCIP, College of Computer Science, Nankai University, Tianjin 300350, China
145
Lyoo YW, Lee H, Lee J, Park JH, Hwang I, Chung JW, Choi SH, Yoo J, Choi KS. Deep learning enhances reliability of dynamic contrast-enhanced MRI in diffuse gliomas: bypassing post-processing and providing uncertainty maps. Eur Radiol 2025. [PMID: 40252095 DOI: 10.1007/s00330-025-11588-z]
Abstract
OBJECTIVES To propose and evaluate a novel deep learning model for directly estimating pharmacokinetic (PK) parameter maps, with uncertainty estimation, from DCE-MRI. METHODS In this single-center study, patients with adult-type diffuse gliomas who underwent preoperative DCE-MRI from April 2010 to February 2020 were retrospectively enrolled. A spatiotemporal probabilistic model was used to create synthetic PK maps. The Structural Similarity Index Measure (SSIM) relative to ground truth (GT) maps was calculated. Reliability was evaluated using the intraclass correlation coefficient (ICC) for synthetic and GT PK maps. For clinical validation, the area under the receiver operating characteristic curve (AUROC) was obtained for predicting WHO low versus high grade and IDH-wildtype versus mutant status. RESULTS A total of 329 patients (mean age, 55 ± 15 years; 197 men) were eligible. Synthetic Ktrans, Vp, and Ve maps showed high SSIM (0.961, 0.962, and 0.890, respectively) compared to the GT maps. The ICC was significantly higher for synthetic PK maps than for the conventional approach: 1.00 vs 0.68 (p < 0.001) for Ktrans, 1.00 vs 0.59 (p < 0.001) for Vp, and 1.00 vs 0.64 (p < 0.001) for Ve. PK values of the enhancing tumor portion obtained from synthetic and GT maps were comparable in AUROC: (1) Ktrans, 0.857 vs 0.842 (p = 0.57); Vp, 0.864 vs 0.835 (p = 0.31); and Ve, 0.835 vs 0.830 (p = 0.88) for mutation prediction; and (2) Ktrans, 0.934 vs 0.907 (p = 0.50); Vp, 0.927 vs 0.899 (p = 0.24); and Ve, 0.945 vs 0.910 (p = 0.24) for glioma grading. CONCLUSION Synthetic PK maps generated from DCE-MRI using a spatiotemporal probabilistic deep-learning model showed improved reliability without compromising diagnostic performance in glioma grading. KEY POINTS Question Can a deep learning model enhance the reliability of dynamic contrast-enhanced MRI (DCE-MRI) for more consistent and clinically acceptable glioma imaging? Findings A spatiotemporal deep learning model outperformed the Tofts model in Ktrans reliability and preserved diagnostic performance for IDH mutation and glioma grade, bypassing arterial input function estimation. Clinical relevance Enhancing DCE-MRI reliability with deep learning improves imaging consistency, supports molecular tumor characterization through reproducible pharmacokinetic maps, and enables personalized treatment planning, which may lead to better clinical outcomes for patients with diffuse gliomas.
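For readers unfamiliar with the Ktrans/Vp/Ve parameters above, they come from the (extended) Tofts model, whose forward form is Ct(t) = vp·Cp(t) + Ktrans ∫ Cp(τ)·exp(−kep·(t−τ)) dτ with kep = Ktrans/ve. A NumPy sketch of that forward model follows; the arterial input function shape and parameter values are illustrative assumptions, not values from this study.

```python
import numpy as np

t = np.linspace(0, 300, 301)                 # seconds, 1 s resolution
dt = t[1] - t[0]
cp = (t / 10.0) * np.exp(-t / 30.0)          # toy arterial input function Cp(t)

def extended_tofts(ktrans, ve, vp):
    """Tissue concentration Ct(t) for given PK parameters (units: 1/s)."""
    kep = ktrans / ve
    irf = ktrans * np.exp(-kep * t)          # impulse response of the leakage term
    return vp * cp + np.convolve(cp, irf)[: len(t)] * dt

ct = extended_tofts(ktrans=0.1 / 60, ve=0.2, vp=0.05)   # Ktrans = 0.1 /min
print(f"peak tissue concentration: {ct.max():.3f}")
```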
Affiliation(s)
- Young Wook Lyoo: Department of Radiology, Seoul National University Hospital, Seoul, Republic of Korea
- Haneol Lee: Graduate School of Artificial Intelligence, Ulsan National Institute of Science and Technology (UNIST), Ulsan, Republic of Korea
- Junhyeok Lee: Interdisciplinary Programs in Cancer Biology Major, Seoul National University Graduate School, Seoul, Republic of Korea
- Jung Hyun Park: Department of Radiology, Seoul Metropolitan Government Seoul National University Boramae Medical Center, Seoul, Republic of Korea
- Inpyeong Hwang: Department of Radiology, Seoul National University Hospital, Seoul, Republic of Korea; Department of Radiology, Seoul National University College of Medicine, Seoul, Republic of Korea
- Jin Wook Chung: Department of Radiology, Seoul National University Hospital, Seoul, Republic of Korea; Department of Radiology, Seoul National University College of Medicine, Seoul, Republic of Korea
- Seung Hong Choi: Department of Radiology, Seoul National University Hospital, Seoul, Republic of Korea; Department of Radiology, Seoul National University College of Medicine, Seoul, Republic of Korea
- Jaejun Yoo: Graduate School of Artificial Intelligence, Ulsan National Institute of Science and Technology (UNIST), Ulsan, Republic of Korea
- Kyu Sung Choi: Department of Radiology, Seoul National University Hospital, Seoul, Republic of Korea; Department of Radiology, Seoul National University College of Medicine, Seoul, Republic of Korea
146
Li Q, Yang X, Li B, Wang J. Self-Supervised Multiscale Contrastive and Attention-Guided Gradient Projection Network for Pansharpening. Sensors (Basel) 2025; 25:2560. [PMID: 40285249 PMCID: PMC12031081 DOI: 10.3390/s25082560]
Abstract
Pansharpening techniques are crucial in remote sensing image processing, with deep learning emerging as the mainstream solution. In this paper, the pansharpening problem is formulated as two optimization subproblems, and a solution is proposed based on multiscale contrastive learning combined with an attention-guided gradient projection network. First, an efficient and generalized Spectral-Spatial Universal Module (SSUM) is designed and applied to the spectral and spatial enhancement modules (SpeEB and SpaEB). Then, the multiscale high-frequency features of PAN and MS images are extracted using the discrete wavelet transform (DWT). These features are combined with contrastive learning and residual connections to progressively balance spectral and spatial information. Finally, high-resolution multispectral images are generated through multiple iterations. Experimental results verify that the proposed method outperforms existing approaches in both visual quality and quantitative evaluation metrics.
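The DWT step used above to pull multiscale high-frequency detail out of the PAN image is a one-liner with PyWavelets; the Haar wavelet and two decomposition levels here are assumptions of this sketch, not necessarily the paper's settings.

```python
import numpy as np
import pywt

pan = np.random.rand(256, 256)               # stand-in panchromatic band

def highfreq_pyramid(img, levels=2):
    """Return (horizontal, vertical, diagonal) detail bands per scale, coarsest first."""
    coeffs = pywt.wavedec2(img, wavelet="haar", level=levels)
    return coeffs[1:]                         # drop the low-pass approximation

for scale, (ch, cv, cd) in enumerate(highfreq_pyramid(pan), start=1):
    print(f"scale {scale}: detail bands of shape {ch.shape}")
```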
Affiliation(s)
- Xiaomin Yang: College of Electronic Information, Sichuan University, Chengdu 610017, China
147
Chen Z, Wang J, Venkataraman A. QID 2: An Image-Conditioned Diffusion Model for Q-space Up-sampling of DWI Data. Computational Diffusion MRI: MICCAI Workshop 2025; 15171:119-131. [PMID: 40444168 PMCID: PMC12122016 DOI: 10.1007/978-3-031-86920-4_11]
Abstract
We propose an image-conditioned diffusion model to estimate high angular resolution diffusion weighted imaging (DWI) from a low angular resolution acquisition. Our model, which we call QID2, takes as input a set of low angular resolution DWI data and uses this information to estimate the DWI data associated with a target gradient direction. We leverage a U-Net architecture with cross-attention to preserve the positional information of the reference images, further guiding the target image generation. We train and evaluate QID2 on single-shell DWI samples curated from the Human Connectome Project (HCP) dataset. Specifically, we sub-sample the HCP gradient directions to produce low angular resolution DWI data and train QID2 to reconstruct the missing high angular resolution samples. We compare QID2 with two state-of-the-art GAN models. Our results demonstrate that QID2 not only generates higher-quality images but also consistently outperforms the state-of-the-art baselines in downstream tensor estimation across multiple metrics, as well as in generalizing to downsampling scenarios during testing. Taken together, this study highlights the potential of diffusion models, and QID2 in particular, for q-space up-sampling, offering a promising toolkit for both clinical and research applications.
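One q-space detail worth making concrete is how reference images for a target gradient direction can be chosen: by angular distance on the sphere, with antipodal symmetry since b and −b encode the same diffusion contrast. The sketch below shows only that selection step; the number of references and the random directions are assumptions, and the diffusion model itself is omitted.

```python
import numpy as np

def nearest_directions(target, bvecs, k=4):
    """Indices of the k acquired directions angularly closest to `target`."""
    target = target / np.linalg.norm(target)
    bvecs = bvecs / np.linalg.norm(bvecs, axis=1, keepdims=True)
    ang = np.arccos(np.clip(np.abs(bvecs @ target), 0.0, 1.0))  # abs: antipodal
    return np.argsort(ang)[:k]

rng = np.random.default_rng(0)
low_res_bvecs = rng.normal(size=(30, 3))     # 30 acquired gradient directions
target_dir = np.array([0.0, 0.0, 1.0])       # missing direction to synthesize
print(nearest_directions(target_dir, low_res_bvecs))
```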
Affiliation(s)
- Zijian Chen: Department of Electrical and Computer Engineering, Boston University
- Jueqi Wang: Department of Electrical and Computer Engineering, Boston University
148
Zhang C, Gao X, Zheng X, Xie J, Feng G, Bao Y, Gu P, He C, Wang R, Tian J. A fully automated, expert-perceptive image quality assessment system for whole-body [18F]FDG PET/CT. EJNMMI Res 2025; 15:42. [PMID: 40249445 PMCID: PMC12008089 DOI: 10.1186/s13550-025-01238-2]
Abstract
BACKGROUND The quality of clinical PET/CT images is critical for both accurate diagnosis and image-based research. However, current image quality assessment (IQA) methods predominantly rely on handcrafted features and region-specific analyses, thereby limiting automation in whole-body and multicenter evaluations. This study aims to develop an expert-perceptive deep learning-based IQA system for [18F]FDG PET/CT to tackle the lack of automated, interpretable assessments of clinical whole-body PET/CT image quality. METHODS This retrospective multicenter study included clinical whole-body [18F]FDG PET/CT scans from 718 patients. Automated identification and localization algorithms were applied to select predefined pairs of PET and CT slices from whole-body images. Fifteen experienced experts, trained to conduct blinded slice-level subjective assessments, provided average visual scores as reference standards. Using the MANIQA framework, the developed IQA model integrates the Vision Transformer, Transposed Attention, and Scale Swin Transformer Blocks to categorize PET and CT images into five quality classes. The model's correlation, consistency, and accuracy with expert evaluations on both PET and CT test sets were statistically analysed to assess the system's IQA performance. Additionally, the model's ability to distinguish high-quality images was evaluated using receiver operating characteristic (ROC) curves. RESULTS The IQA model demonstrated high accuracy in predicting image quality categories and showed strong concordance with expert evaluations of PET/CT image quality. In predicting slice-level image quality across all body regions, the model achieved an average accuracy of 0.832 for PET and 0.902 for CT. The model's scores showed substantial agreement with expert assessments, achieving average Spearman coefficients (ρ) of 0.891 for PET and 0.624 for CT, while the average Intraclass Correlation Coefficient (ICC) reached 0.953 for PET and 0.92 for CT. The PET IQA model demonstrated strong discriminative performance, achieving an area under the curve (AUC) of ≥ 0.88 for both the thoracic and abdominal regions. CONCLUSIONS This fully automated IQA system provides a robust and comprehensive framework for the objective evaluation of clinical image quality. Furthermore, it demonstrates significant potential as an impartial, expert-level tool for standardised multicenter clinical IQA.
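The agreement statistics reported above (accuracy against expert classes plus Spearman correlation) are straightforward to compute; the toy arrays below merely stand in for real model and expert scores on a 1-5 quality scale.

```python
import numpy as np
from scipy.stats import spearmanr

expert = np.array([4.2, 3.1, 2.0, 4.8, 1.4, 3.6])   # mean expert scores (1-5)
model = np.array([4.0, 3.3, 2.2, 4.9, 1.1, 3.2])    # model-predicted scores

rho, p = spearmanr(model, expert)                    # rank correlation
acc = np.mean(np.round(model) == np.round(expert))   # 5-class agreement
print(f"Spearman rho = {rho:.3f} (p = {p:.3g}), class accuracy = {acc:.2f}")
```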
Affiliation(s)
- Cong Zhang: Medical School of Chinese PLA, Beijing, China; Department of Nuclear Medicine, The First Medical Center of Chinese PLA General Hospital, Beijing, China
- Xin Gao: Shanghai Universal Medical Imaging Diagnostic Center, Shanghai, China
- Xuebin Zheng: Department of Scientific Research, Shanghai Aitrox Technology Corporation Limited, Shanghai, China
- Jun Xie: Department of Scientific Research, Shanghai Aitrox Technology Corporation Limited, Shanghai, China
- Gang Feng: Shanghai Universal Medical Imaging Diagnostic Center, Shanghai, China
- Yunchao Bao: Department of Scientific Research, Shanghai Aitrox Technology Corporation Limited, Shanghai, China
- Pengchen Gu: Department of Scientific Research, Shanghai Aitrox Technology Corporation Limited, Shanghai, China
- Chuan He: Department of Scientific Research, Shanghai Aitrox Technology Corporation Limited, Shanghai, China
- Ruimin Wang: Medical School of Chinese PLA, Beijing, China
- Jiahe Tian: Medical School of Chinese PLA, Beijing, China
149
Xu C, Sun Y, Zhang Y, Liu T, Wang X, Hu D, Huang S, Li J, Zhang F, Li G. Stain Normalization of Histopathological Images Based on Deep Learning: A Review. Diagnostics (Basel) 2025; 15:1032. [PMID: 40310413 PMCID: PMC12077256 DOI: 10.3390/diagnostics15081032]
Abstract
Histopathological images stained with hematoxylin and eosin (H&E) are crucial for cancer diagnosis and prognosis. However, color variations caused by differences in tissue preparation and scanning devices can lead to data distribution discrepancies, adversely affecting the performance of downstream algorithms in tasks like classification, segmentation, and detection. To address these issues, stain normalization methods have been developed to standardize color distributions across images from various sources. Recent advancements in deep learning-based stain normalization methods have shown significant promise due to their minimal preprocessing requirements, independence from reference templates, and robustness. This review examines 115 publications to explore the latest developments in this field. We first outline the evaluation metrics and publicly available datasets used for assessing stain normalization methods. Next, we systematically review deep learning-based approaches, including supervised, unsupervised, and self-supervised methods, categorizing them by core technologies and analyzing their contributions and limitations. Finally, we discuss current challenges and future directions, aiming to provide researchers with a comprehensive understanding of the field, promote further development, and accelerate the progress of intelligent cancer diagnosis.
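As a baseline for what the deep methods in this review are compared against, the classical Reinhard approach matches per-channel mean and standard deviation to a template slide. The sketch below uses OpenCV's LAB space rather than Reinhard's original l-alpha-beta space, and the file names are hypothetical.

```python
import cv2
import numpy as np

def reinhard_normalize(src_bgr, tgt_bgr):
    """Match src's per-channel LAB statistics to the target template."""
    src = cv2.cvtColor(src_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    tgt = cv2.cvtColor(tgt_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    out = np.empty_like(src)
    for c in range(3):
        mu_s, sd_s = src[..., c].mean(), src[..., c].std() + 1e-6
        mu_t, sd_t = tgt[..., c].mean(), tgt[..., c].std()
        out[..., c] = (src[..., c] - mu_s) / sd_s * sd_t + mu_t
    out = np.clip(out, 0, 255).astype(np.uint8)
    return cv2.cvtColor(out, cv2.COLOR_LAB2BGR)

src = cv2.imread("slide_to_normalize.png")   # hypothetical input slides
tgt = cv2.imread("template_slide.png")
cv2.imwrite("normalized.png", reinhard_normalize(src, tgt))
```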
Affiliation(s)
- Chuanyun Xu: School of Computer & Information Science, Chongqing Normal University, Chongqing 401331, China
- Yisha Sun: School of Computer & Information Science, Chongqing Normal University, Chongqing 401331, China
- Yang Zhang: School of Computer & Information Science, Chongqing Normal University, Chongqing 401331, China
- Tianqi Liu: School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, China
- Xiao Wang: School of Computer & Information Science, Chongqing Normal University, Chongqing 401331, China
- Die Hu: School of Computer & Information Science, Chongqing Normal University, Chongqing 401331, China
- Shuaiye Huang: School of Computer & Information Science, Chongqing Normal University, Chongqing 401331, China
- Junjie Li: School of Computer & Information Science, Chongqing Normal University, Chongqing 401331, China
- Fanghong Zhang: National Center for Applied Mathematics, Chongqing Normal University, Chongqing 401331, China
- Gang Li: School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, China
150
Dai S, Wang S. HR-NeRF: advancing realism and accuracy in highlight scene representation. Front Neurorobot 2025; 19:1558948. [PMID: 40308477 PMCID: PMC12041011 DOI: 10.3389/fnbot.2025.1558948]
Abstract
NeRF and its variants excel in novel view synthesis but struggle with scenes featuring specular highlights. To address this limitation, we introduce the Highlight Recovery Network (HRNet), a new architecture that enhances NeRF's ability to capture specular scenes. HRNet incorporates Swish activation functions, affine transformations, multilayer perceptrons (MLPs), and residual blocks, which collectively enable smooth non-linear transformations, adaptive feature scaling, and hierarchical feature extraction. The residual connections help mitigate the vanishing gradient problem, ensuring stable training. Despite the simplicity of HRNet's components, it achieves impressive results in recovering specular highlights. Additionally, a density voxel grid enhances model efficiency. Evaluations on four inward-facing benchmarks demonstrate that our approach outperforms NeRF and its variants, achieving a 3-5 dB PSNR improvement on each dataset while accurately capturing scene details. Furthermore, our method effectively preserves image details without requiring positional encoding, rendering a single scene in ~18 min on an NVIDIA RTX 3090 Ti GPU.
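The ingredients HRNet combines (Swish activations, affine feature scaling, residual MLP blocks) fit in a few lines of PyTorch; the block below is an illustrative composition under those assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class AffineResidualBlock(nn.Module):
    """Residual MLP block with Swish (SiLU) and a learned affine rescaling."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(),
                                 nn.Linear(dim, dim))
        self.scale = nn.Parameter(torch.ones(dim))   # adaptive feature scaling
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        return x + self.scale * self.net(x) + self.shift

feats = torch.randn(1024, 128)     # per-sample point features along rays
print(AffineResidualBlock(128)(feats).shape)  # torch.Size([1024, 128])
```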