251
Oguzhan A, Peskersoy C, Devrimci EE, Kemaloglu H, Onder TK. Implementation of machine learning models as a quantitative evaluation tool for preclinical studies in dental education. J Dent Educ 2025; 89:383-397. [PMID: 39327675] [DOI: 10.1002/jdd.13722]
Abstract
PURPOSE AND OBJECTIVE Objective, valid, and reliable evaluations are needed to develop haptic skills in dental education. The aim of this study is to investigate the validity and reliability of a machine learning method for evaluating the haptic skills of dentistry students. MATERIALS AND METHODS One hundred fifty sixth-semester dental students performed Class II amalgam (C2A) and composite resin restorations (C2CR), and all stages were evaluated with Direct Observation Practical Skills forms. The final phase was graded separately by three trainers and supervisors. Standardized photographs of the restorations in the final stage were taken from different angles in a dedicated setup and transferred to a Python program, which used the Structural Similarity algorithm to calculate both the quantitative (numerical) and qualitative (visual) differences of each restoration. The validity and reliability of the inter-examiner evaluation were tested with Cronbach's alpha and kappa statistics (significance level: 0.05). RESULTS Reliability between the Structural Similarity Index (SSIM) and the examiners was high in both C2A (α = 0.961) and C2CR (α = 0.856). The difference between the final grades given by SSIM (53.07) and the examiners (56.85) was statistically insignificant (p > 0.05). A significant difference was found between the examiners and SSIM when grading the occlusal surfaces in C2A and the palatal surfaces in C2CR (p < 0.05). The concordance of observer assessments was almost perfect in C2A (κ = 0.806) and acceptable in C2CR (κ = 0.769). CONCLUSION Although deep machine learning is a promising tool for evaluating haptic skills, further improvement and alignment are required for fully objective and reliable validation in all cases of dental training in restorative dentistry.
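As a concrete illustration of the comparison step the abstract describes, the sketch below scores a student restoration photograph against a reference with scikit-image's SSIM; the file names and the naive rescaling of the score to a 0-100 grade are illustrative assumptions, not the study's actual pipeline.

```python
# Hedged sketch of SSIM-based grading with scikit-image; file names and the
# 0-100 rescale are hypothetical, not taken from the study.
import numpy as np
from skimage.io import imread
from skimage.color import rgb2gray
from skimage.transform import resize
from skimage.metrics import structural_similarity

reference = rgb2gray(imread("reference_restoration.png"))  # instructor exemplar (hypothetical file)
student = rgb2gray(imread("student_restoration.png"))      # student's work (hypothetical file)
student = resize(student, reference.shape)                 # align dimensions before comparison

# score lies in [-1, 1]; full=True also returns the per-pixel similarity map
score, ssim_map = structural_similarity(reference, student, data_range=1.0, full=True)
grade = max(score, 0.0) * 100                              # naive rescale to a 0-100 grade
print(f"SSIM = {score:.3f}, mapped grade = {grade:.1f}")
```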
Affiliation(s)
- Aybeniz Oguzhan
  - Department of Restorative Dentistry, Faculty of Dentistry, Ege University, Izmir, Turkey
- Cem Peskersoy
  - Department of Restorative Dentistry, Faculty of Dentistry, Ege University, Izmir, Turkey
- Elif Ercan Devrimci
  - Department of Restorative Dentistry, Faculty of Dentistry, Ege University, Izmir, Turkey
- Hande Kemaloglu
  - Department of Restorative Dentistry, Faculty of Dentistry, Ege University, Izmir, Turkey
- Tolga Kagan Onder
  - Department of Mechanical Engineering, ARQUQ Project Partnership, Izmir, Turkey
252
Wu X, Cao ZH, Huang TZ, Deng LJ, Chanussot J, Vivone G. Fully-Connected Transformer for Multi-Source Image Fusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 2025; 47:2071-2088. [PMID: 40031431] [DOI: 10.1109/tpami.2024.3523364]
Abstract
Multi-source image fusion combines information from multiple images into a single output, thus improving imaging quality, and has aroused great interest in the community. Although existing self-attention-based transformer methods can capture spatial and channel similarities, how to integrate information from different sources remains a major challenge. In this paper, we first discuss the mathematical concepts behind the proposed generalized self-attention mechanism, in which existing self-attention mechanisms are considered basic forms. The proposed mechanism employs multilinear algebra to drive the development of a novel fully-connected self-attention (FCSA) method that fully exploits local and non-local domain-specific correlations among multi-source images. Moreover, we propose a multi-source image representation and embed it into the FCSA framework as a non-local prior within an optimization problem. Several different fusion problems are unfolded into the proposed fully-connected transformer fusion network (FC-Former). More specifically, the concept of generalized self-attention can promote the further development of self-attention; hence, the FC-Former can be viewed as a network model unifying different fusion tasks. Compared with state-of-the-art methods, the proposed FC-Former exhibits robust and superior performance, showing its capability of faithfully preserving information.
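For readers unfamiliar with the "basic form" the paper generalizes, here is a minimal NumPy sketch of standard scaled dot-product self-attention; the FCSA mechanism itself is not reproduced, and all shapes are toy values.

```python
# Hedged sketch of the standard self-attention the paper treats as a basic form
# of its generalized mechanism; shapes and weights are illustrative only.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (tokens, dim); Wq/Wk/Wv: (dim, d_head) projection matrices."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise token similarities
    return softmax(scores) @ V                # similarity-weighted mixing

rng = np.random.default_rng(0)
X = rng.standard_normal((16, 32))             # 16 tokens, 32-dim features
W = [rng.standard_normal((32, 8)) for _ in range(3)]
print(self_attention(X, *W).shape)            # (16, 8)
```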
253
Dian R, Liu Y, Li S. Spectral Super-Resolution via Deep Low-Rank Tensor Representation. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:5140-5150. [PMID: 38466604] [DOI: 10.1109/tnnls.2024.3359852]
Abstract
Spectral super-resolution has attracted increasing attention from researchers as a simpler and cheaper way to obtain hyperspectral images (HSIs). Although many convolutional neural network (CNN)-based approaches have yielded impressive results, most of them ignore the low-rank prior of HSIs, resulting in huge computational and storage costs. In addition, the ability of CNN-based methods to capture correlations in global information is limited by the receptive field. To surmount these problems, we design a novel low-rank tensor reconstruction network (LTRN) for spectral super-resolution. Specifically, we treat the features of HSIs as 3-D tensors with low-rank properties due to their spectral similarity and spatial sparsity. Then, we combine canonical-polyadic (CP) decomposition with neural networks to design an adaptive low-rank prior learning (ALPL) module that enables feature learning in a 1-D space. This module has two core components: the adaptive vector learning (AVL) module and the multidimensionwise multihead self-attention (MMSA) module. The AVL module is designed to compress an HSI into a 1-D space by using a vector to represent its information. The MMSA module is introduced to improve the ability to capture long-range dependencies in the row, column, and spectral dimensions, respectively. Finally, our LTRN, mainly cascaded by several ALPL modules and feedforward networks (FFNs), achieves high-quality spectral super-resolution with fewer parameters. To evaluate our method, we conduct experiments on two datasets: the CAVE dataset and the Harvard dataset. Experimental results show that our LTRN not only matches state-of-the-art methods in effectiveness but also uses fewer parameters. The code is available at https://github.com/renweidian/LTRN.
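A minimal sketch of the CP decomposition underlying the ALPL module, using TensorLy on a toy hyperspectral cube; the tensor size and rank are illustrative, not the paper's settings.

```python
# Hedged sketch: rank-R canonical-polyadic (CP) decomposition of a toy HSI
# cube, the low-rank prior the ALPL module builds on. Size and rank are
# illustrative assumptions.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

hsi = tl.tensor(np.random.rand(32, 32, 31))       # (height, width, bands) toy cube
weights, factors = parafac(hsi, rank=8)           # factors: one matrix per mode
approx = tl.cp_to_tensor((weights, factors))      # low-rank reconstruction
err = np.linalg.norm(hsi - approx) / np.linalg.norm(hsi)
print(f"relative reconstruction error: {err:.3f}")
```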
254
Zhang Z, Wang R, Zhang H, Zuo W. Self-Supervised Learning for Real-World Super-Resolution From Dual and Multiple Zoomed Observations. IEEE Transactions on Pattern Analysis and Machine Intelligence 2025; 47:1348-1361. [PMID: 38507386] [DOI: 10.1109/tpami.2024.3379736]
Abstract
In this paper, we consider two challenging issues in reference-based super-resolution (RefSR) on smartphones: (i) how to choose a proper reference image, and (ii) how to learn RefSR in a self-supervised manner. In particular, we propose a novel self-supervised learning approach for real-world RefSR from observations at dual and multiple camera zooms. First, considering the popularity of multiple cameras in modern smartphones, the more zoomed (telephoto) image can be naturally leveraged as the reference to guide the super-resolution (SR) of the less zoomed (ultra-wide) image, which gives us a chance to learn a deep network that performs SR from the dual zoomed observations (DZSR). Second, for self-supervised learning of DZSR, we take the telephoto image instead of an additional high-resolution image as the supervision information and select a center patch from it as the reference to super-resolve the corresponding ultra-wide image patch. To mitigate the effect of the misalignment between the ultra-wide low-resolution (LR) patch and the telephoto ground-truth (GT) image during training, we propose a two-stage alignment method comprising patch-based optical flow alignment and auxiliary-LR guiding alignment. To generate visually pleasing results, we present a local overlapped sliced Wasserstein loss. Furthermore, we take multiple zoomed observations to explore self-supervised RefSR and present a progressive fusion scheme for the effective utilization of reference images. Experiments show that our methods achieve better quantitative and qualitative performance against the state of the art.
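A hedged sketch of a plain sliced Wasserstein distance between two flattened patches, the idea behind the paper's local overlapped sliced Wasserstein loss; the local-overlap windowing and any weighting are omitted, and the patches are random stand-ins.

```python
# Hedged sketch of a (plain) sliced Wasserstein distance; the paper's local
# overlapped variant adds windowing that is omitted here for brevity.
import numpy as np

def sliced_wasserstein(a, b, n_proj=64, seed=0):
    """a, b: (n_pixels, channels) arrays treated as point sets."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_proj):
        d = rng.standard_normal(a.shape[1])
        d /= np.linalg.norm(d)                    # random unit projection direction
        pa, pb = np.sort(a @ d), np.sort(b @ d)   # 1D optimal transport = sorting
        total += np.mean((pa - pb) ** 2)
    return total / n_proj

patch_sr = np.random.rand(256, 3)    # super-resolved patch, flattened (toy)
patch_ref = np.random.rand(256, 3)   # reference patch, flattened (toy)
print(sliced_wasserstein(patch_sr, patch_ref))
```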
255
Jiang Y, Wu F, Fang X, Wang H, Xie Y, Yu C. Effective palynological diversity indices for reconstructing angiosperm diversity in China. Plant Diversity 2025; 47:244-254. [PMID: 40182490] [PMCID: PMC11963190] [DOI: 10.1016/j.pld.2025.01.004]
Abstract
The utilization of palynological data for plant diversity reconstructions offers notable advantages in addressing the discontinuity of plant fossils in the stratigraphic record. However, additional studies of modern processes are required to validate or refine the accuracy of diversity results obtained from palynological data. In this study, we used a modern pollen dataset of China to compare the accuracy of plant diversity reconstructions using five different palynological diversity indices (i.e., the pollen species number, Berger-Parker index, Simpson diversity index, Hill index, and Shannon-Wiener index) over a large spatial scale. We then identified the climate factors that are most strongly correlated with these patterns of plant diversity. Our analyses indicated that the most effective indices for reflecting plant diversity are the Shannon-Wiener index and the Berger-Parker index, with the Shannon-Wiener index reflecting it most accurately. Numerical analysis revealed that palynological diversity (measured using the Shannon-Wiener index) was strongly correlated with climatic parameters, in particular the average temperature of the coldest month and annual precipitation, suggesting these factors may be primary determinants of the distribution of plant diversity. We also found that a threshold value of the normalized Shannon-Wiener index (NH = 0.4) approximately aligns with the contour line of 400 mm annual precipitation, serving as a rudimentary indicator for distinguishing plant diversity in arid versus humid climates. This study suggests that pollen diversity indices have remarkable potential for quantitatively reconstructing paleoclimatic parameters.
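The five indices compared in the study can be computed directly from pollen counts; the sketch below shows one plausible set of formulas on toy data. Hill-number conventions vary, so order q = 1 (the exponential of the Shannon index) is assumed here.

```python
# Hedged sketch of the five palynological diversity indices for one sample;
# the count vector is toy data and the Hill order q = 1 is an assumption.
import numpy as np

counts = np.array([120, 80, 40, 25, 10, 5])       # pollen grains per taxon (toy)
p = counts / counts.sum()                          # relative abundances

species_number = np.count_nonzero(counts)          # pollen species number (richness)
berger_parker = p.max()                            # dominance of most abundant taxon
simpson = 1.0 - np.sum(p ** 2)                     # Simpson diversity (1 - dominance)
shannon = -np.sum(p * np.log(p))                   # Shannon-Wiener index
hill_q1 = np.exp(shannon)                          # Hill number of order q = 1 (assumed)

print(species_number, berger_parker, simpson, shannon, hill_q1)
```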
Affiliation(s)
- Yuxuan Jiang
  - State Key Laboratory of Tibetan Plateau Earth System, Environment and Resources (TPESER), Institute of Tibetan Plateau Research, Chinese Academy of Sciences, Lincui Road 16, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100101, China
- Fuli Wu
  - State Key Laboratory of Tibetan Plateau Earth System, Environment and Resources (TPESER), Institute of Tibetan Plateau Research, Chinese Academy of Sciences, Lincui Road 16, Beijing 100101, China
- Xiaomin Fang
  - State Key Laboratory of Tibetan Plateau Earth System, Environment and Resources (TPESER), Institute of Tibetan Plateau Research, Chinese Academy of Sciences, Lincui Road 16, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100101, China
- Haitao Wang
  - State Key Laboratory of Tibetan Plateau Earth System, Environment and Resources (TPESER), Institute of Tibetan Plateau Research, Chinese Academy of Sciences, Lincui Road 16, Beijing 100101, China
  - Key Laboratory of Ecological Safety and Sustainable Development in Arid Lands, Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou 730000, China
- Yulong Xie
  - State Key Laboratory of Tibetan Plateau Earth System, Environment and Resources (TPESER), Institute of Tibetan Plateau Research, Chinese Academy of Sciences, Lincui Road 16, Beijing 100101, China
  - Key Laboratory of Palaeobiology and Petroleum Stratigraphy, Nanjing Institute of Geology and Palaeontology, Chinese Academy of Sciences, Nanjing 210008, China
- Cuirong Yu
  - State Key Laboratory of Tibetan Plateau Earth System, Environment and Resources (TPESER), Institute of Tibetan Plateau Research, Chinese Academy of Sciences, Lincui Road 16, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100101, China
256
Gong C, Song Q. Orthogonal limited-angle CT reconstruction method based on anisotropic self-guided image filtering. Journal of X-Ray Science and Technology 2025; 33:325-339. [PMID: 39973796] [DOI: 10.1177/08953996241300013]
Abstract
Computed tomography (CT) reconstruction from incomplete projection data is significant for reducing radiation dose or scanning time. In this work, we investigate a special sampling strategy that performs two limited-angle scans, which we call orthogonal limited-angle sampling. The X-ray source trajectory covers two limited-angle ranges, and the angle bisectors of the two angular ranges are orthogonal. This sampling method avoids the rapid switching of tube voltage in few-view sampling and reduces the data correlation of projections in limited-angle sampling, so it has the potential to become a practical imaging strategy. We then propose a new reconstruction model based on anisotropic self-guided image filtering (ASGIF) and present an algorithm to solve this model. We construct adaptive weights to guide image reconstruction using the gradient information of the reconstructed image itself. Additionally, since the shading artifacts are related to the scanning angular ranges and distributed in two orthogonal directions, anisotropic image filtering is used to preserve image edges. Experiments on a digital phantom and real CT data demonstrate that the ASGIF method can effectively suppress shading artifacts and preserve image edges, outperforming other competing methods.
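A loose sketch of gradient-based adaptive weights of the kind ASGIF uses to guide reconstruction with the image's own edges; the specific weight formula below is a generic illustration and not the paper's exact definition.

```python
# Hedged sketch: self-guided adaptive weights from the current reconstruction's
# gradients; the weight formula is a generic edge-stopping function, assumed
# here for illustration rather than taken from the paper.
import numpy as np

def adaptive_weights(img, delta=0.01):
    gy, gx = np.gradient(img)                       # gradients of current reconstruction
    wy = 1.0 / (1.0 + (np.abs(gy) / delta) ** 2)    # small weight across strong edges
    wx = 1.0 / (1.0 + (np.abs(gx) / delta) ** 2)    # -> smooth flat areas, keep edges
    return wx, wy

phantom = np.zeros((64, 64))
phantom[16:48, 16:48] = 1.0                          # toy digital phantom with sharp edges
wx, wy = adaptive_weights(phantom)
print(wx.min(), wx.max())                            # weights drop near edges, stay 1 elsewhere
```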
Affiliation(s)
- Gong Changcheng
  - School of Mathematics and Statistics, Chongqing Technology and Business University, Chongqing, China
  - Chongqing Key Laboratory of Statistical Intelligent Computing and Monitoring, Chongqing Technology and Business University, Chongqing, China
- Song Qiang
  - School of Mathematics and Statistics, Chongqing Technology and Business University, Chongqing, China
257
Guo R, Wang J, Miao Y, Zhang X, Xue S, Zhang Y, Shi K, Li B, Zheng G. 3D full-dose brain-PET volume recovery from low-dose data through deep learning: quantitative assessment and clinical evaluation. Eur Radiol 2025; 35:1133-1145. [PMID: 39609283] [DOI: 10.1007/s00330-024-11225-1]
Abstract
OBJECTIVES Low-dose (LD) PET imaging leads to reduced image quality and diagnostic efficacy. We propose a deep learning (DL) method to reduce radiotracer dosage for PET studies while maintaining diagnostic quality. METHODS This retrospective study was performed on 456 participants scanned on three different PET scanners with two different tracers. A DL method called spatially aware noise reduction network (SANR) was proposed to recover 3D full-dose (FD) PET volumes from LD data. The performance of SANR was compared with that of a 2D DL method, taking regular FD PET volumes as the reference. The Wilcoxon signed-rank test was conducted to compare image quality metrics across the DL denoising methods. For clinical evaluation, two nuclear medicine physicians examined the recovered FD PET volumes using a 5-point grading scheme (5 = excellent) and gave a binary decision (negative or positive) for diagnostic quality assessment. RESULTS Statistically significant differences (p < 0.05) in image quality metrics were found when SANR was compared with the 2D DL method. For clinical evaluation, SANR achieved a lesion detection accuracy of 95.3% (95% CI: 90.1%, 100%), while the reference full-dose PET volumes obtained a lesion detection accuracy of 98.4% (95% CI: 95.4%, 100%). In Alzheimer's disease diagnosis, the FD PET volumes recovered by SANR exhibited the same accuracy as the reference FD PET volumes. CONCLUSION Compared with reference FD PET, LD PET denoised by the proposed approach significantly reduced radiotracer dosage and showed noninferior diagnostic performance in brain lesion detection and Alzheimer's disease diagnosis. KEY POINTS Question: The current trend in PET imaging is to reduce the injected dosage, which lowers image quality and diagnostic efficacy. Findings: The proposed deep learning method can recover diagnostic-quality PET images from data acquired with a markedly reduced radiotracer dosage. Clinical relevance: The proposed method would enhance the utility of PET scanning at lower radiotracer dosage and inform future workflows for brain lesion detection and Alzheimer's disease diagnosis, especially for patients who need multiple examinations.
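A minimal sketch of the paired Wilcoxon signed-rank comparison reported in the study, applied to per-subject image quality metrics from two denoising methods; the PSNR values are synthetic placeholders.

```python
# Hedged sketch of the paired Wilcoxon signed-rank test used to contrast two
# denoising methods on the same subjects; values are synthetic placeholders.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
psnr_sanr = 34 + rng.normal(0, 1.0, size=30)        # per-subject PSNR, method A (toy)
psnr_2d = 33 + rng.normal(0, 1.0, size=30)          # per-subject PSNR, method B (toy)

stat, p = wilcoxon(psnr_sanr, psnr_2d)              # paired, non-parametric test
print(f"W = {stat:.1f}, p = {p:.4f}")               # p < 0.05 -> significant difference
```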
Affiliation(s)
- Rui Guo
  - Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Collaborative Innovation Center for Molecular Imaging of Precision Medicine, Shanxi Medical University, Taiyuan, Shanxi, China
- Jiale Wang
  - Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Ying Miao
  - Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Collaborative Innovation Center for Molecular Imaging of Precision Medicine, Shanxi Medical University, Taiyuan, Shanxi, China
- Xinyu Zhang
  - Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Collaborative Innovation Center for Molecular Imaging of Precision Medicine, Shanxi Medical University, Taiyuan, Shanxi, China
- Song Xue
  - Department of Nuclear Medicine, University of Bern, Bern, Switzerland
- Yu Zhang
  - Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Collaborative Innovation Center for Molecular Imaging of Precision Medicine, Shanxi Medical University, Taiyuan, Shanxi, China
- Kuangyu Shi
  - Department of Nuclear Medicine, University of Bern, Bern, Switzerland
  - Department of Informatics, Technical University of Munich, Munich, Germany
- Biao Li
  - Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Collaborative Innovation Center for Molecular Imaging of Precision Medicine, Shanxi Medical University, Taiyuan, Shanxi, China
- Guoyan Zheng
  - Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
258
Wang Z, Wang F, Qin C, Lyu J, Ouyang C, Wang S, Li Y, Yu M, Zhang H, Guo K, Shi Z, Li Q, Xu Z, Zhang Y, Li H, Hua S, Chen B, Sun L, Sun M, Li Q, Chu YH, Bai W, Qin J, Zhuang X, Prieto C, Young A, Markl M, Wang H, Wu LM, Yang G, Qu X, Wang C. CMRxRecon2024: A Multimodality, Multiview k-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI. Radiol Artif Intell 2025; 7:e240443. [PMID: 39878610] [PMCID: PMC11950877] [DOI: 10.1148/ryai.240443]
Abstract
The released CMRxRecon2024 dataset is currently the largest and most protocol-diverse publicly available k-space dataset, comprising multimodality, multiview cardiac MRI data from 330 healthy volunteers, each covering standardized and commonly used clinical protocols.
Affiliation(s)
- Zi Wang
  - Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
  - Department of Bioengineering and Imperial-X, Imperial College London, London, United Kingdom
- Fanwen Wang
  - Department of Bioengineering and Imperial-X, Imperial College London, London, United Kingdom
  - Cardiovascular Research Centre, Royal Brompton Hospital, London, United Kingdom
- Chen Qin
  - Department of Electrical and Electronic Engineering & Imperial-X, Imperial College London, London, United Kingdom
- Jun Lyu
  - School of Computer and Control Engineering, Yantai University, Yantai, China
- Cheng Ouyang
  - Department of Computing & Department of Brain Sciences, Imperial College London, London, United Kingdom
- Shuo Wang
  - Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Yan Li
  - Department of Radiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Mengyao Yu
  - Human Phenome Institute, Fudan University, Shanghai, China
- Haoyu Zhang
  - Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
- Kunyuan Guo
  - Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
- Zhang Shi
  - Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China
- Qirong Li
  - Human Phenome Institute, Fudan University, Shanghai, China
- Ziqiang Xu
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Hao Li
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Sha Hua
  - Department of Cardiovascular Medicine, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Binghua Chen
  - Department of Radiology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Longyu Sun
  - Human Phenome Institute, Fudan University, Shanghai, China
- Mengting Sun
  - Human Phenome Institute, Fudan University, Shanghai, China
- Qing Li
  - Human Phenome Institute, Fudan University, Shanghai, China
- Wenjia Bai
  - Department of Computing & Department of Brain Sciences, Imperial College London, London, United Kingdom
- Jing Qin
  - School of Nursing, The Hong Kong Polytechnic University, Hong Kong, China
- Xiahai Zhuang
  - School of Data Science, Fudan University, Shanghai, China
- Claudia Prieto
  - School of Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile
  - School of Biomedical Engineering and Imaging Sciences, King’s College London, London, United Kingdom
  - Millennium Institute for Intelligent Healthcare Engineering, Santiago, Chile
- Alistair Young
  - School of Biomedical Engineering and Imaging Sciences, King’s College London, London, United Kingdom
- Michael Markl
  - Department of Radiology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois
- He Wang
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Lian-Ming Wu
  - Department of Radiology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Guang Yang
  - Department of Bioengineering and Imperial-X, Imperial College London, London, United Kingdom
  - Cardiovascular Research Centre, Royal Brompton Hospital, London, United Kingdom
  - School of Biomedical Engineering and Imaging Sciences, King’s College London, London, United Kingdom
- Xiaobo Qu
  - Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
  - Department of Radiology, the First Affiliated Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, China
- Chengyan Wang
  - Shanghai Pudong Hospital and Human Phenome Institute, Fudan University, Shanghai, China
  - International Human Phenome Institute (Shanghai), Shanghai, China
259
Huang H, Balaji S, Aslan B, Wen Y, Selim M, Thomas AJ, Filippidis A, Spincemaille P, Wang Y, Soman S. Quantitative Susceptibility Mapping MRI with Computer Vision Metrics to Reduce Scan Time for Brain Hemorrhage Assessment. International Journal of Imaging Systems and Technology 2025; 35:e70070. [PMID: 40161447] [PMCID: PMC11951294] [DOI: 10.1002/ima.70070]
Abstract
Purpose: Optimizing clinical imaging parameters balances scan time and image quality. Quantitative susceptibility mapping (QSM) MRI, particularly for detecting intracranial hemorrhage (ICH), involves multiple echo times (TEs), leading to longer scan durations that can impact patient comfort and imaging efficiency. This study evaluates the necessity of specific TEs for QSM MRI in ICH patients and identifies shorter scan protocols that use computer vision metrics (CVMs) to maintain diagnostic accuracy. Approach: Fifty-four patients with suspected ICH were retrospectively recruited. Multi-echo gradient recalled echo (mGRE) sequences with 11 TEs were used for the reference QSM MRI. Subsets of TEs compatible with producing QSM images were generated, yielding 71 subgroups per patient. QSM images from each subgroup were compared to the reference images using 14 CVMs. Linear regression and Wilcoxon signed-rank tests identified optimal subgroups that minimize scan time while preserving image quality, as part of the computer vision optimized rapid imaging (CORI) method described. Results: CVM-based analysis identified subgroup 1 (TE1-3) as optimal by several CVMs, supporting a reduction in scan time from 4.5 to 1.23 minutes (73% reduction). Other CVMs suggested subgroups with longer maximum TEs as optimal, achieving scan time reductions of 9% to 37%. Visual assessments by a neuroradiologist and a trained research assistant confirmed no significant difference in ICH area measurements between the reference QSM and QSM derived from the CORI-identified optimal subgroups, whereas QSM derived from the CORI-identified worst subgroups differed significantly (p < 0.05). Conclusions: The findings support using shorter QSM MRI protocols for ICH evaluation and suggest that CVMs may aid optimization efforts for other clinical imaging protocols.
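A minimal sketch of scoring a reduced-TE QSM volume against the 11-echo reference, assuming CVMs along the lines of SSIM and MSE (only two of the study's 14 CVMs are shown); the volumes are random stand-ins.

```python
# Hedged sketch: compare a TE-subgroup QSM reconstruction with the full
# 11-echo reference using two example CVMs; data are synthetic stand-ins.
import numpy as np
from skimage.metrics import structural_similarity, mean_squared_error

reference_qsm = np.random.rand(64, 64, 32)          # full 11-TE reconstruction (toy)
subgroup_qsm = reference_qsm + 0.02 * np.random.randn(64, 64, 32)

ssim = structural_similarity(reference_qsm, subgroup_qsm, data_range=1.0)
mse = mean_squared_error(reference_qsm, subgroup_qsm)
print(f"SSIM = {ssim:.3f}, MSE = {mse:.5f}")        # rank TE subgroups by such scores
```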
Affiliation(s)
- Huiyu Huang
  - Department of Radiology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
  - Department of Mechanical and Industrial Engineering, Northeastern University, Boston, Massachusetts, USA
- Shreyas Balaji
  - Department of Radiology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Bulent Aslan
  - Department of Radiology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Yan Wen
  - GE Healthcare, Lincoln Medical Center, New York, NY, USA
- Magdy Selim
  - Department of Neurology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Ajith J Thomas
  - Department of Neurosurgery, Cooper University Health Care, Cooper Medical School of Rowan University, Camden, NJ, USA
- Pascal Spincemaille
  - Department of Radiology, Weill Cornell Medicine, New York, NY, USA
- Yi Wang
  - Department of Radiology, Weill Cornell Medicine, New York, NY, USA
- Salil Soman
  - Department of Radiology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
260
Shi S, Wang C, Xiao S, Li H, Zhao X, Guo F, Shi L, Zhou X. Magnetic resonance image denoising for Rician noise using a novel hybrid transformer-CNN network (HTC-net) and self-supervised pretraining. Med Phys 2025; 52:1643-1660. [PMID: 39641989] [DOI: 10.1002/mp.17562]
Abstract
BACKGROUND Magnetic resonance imaging (MRI) is a crucial technique for both scientific research and clinical diagnosis. However, noise generated during MR data acquisition degrades image quality, particularly in hyperpolarized (HP) gas MRI. While deep learning (DL) methods have shown promise for MR image denoising, most of them fail to adequately utilize long-range information, which is important for improving denoising performance. Furthermore, the sample size of paired noisy and noise-free MR images also limits denoising performance. PURPOSE To develop an effective DL method that enhances denoising performance and reduces the requirement for paired MR images by utilizing long-range information and pretraining. METHODS In this work, a hybrid Transformer-convolutional neural network (CNN) network (HTC-net) and a self-supervised pretraining strategy are proposed, which effectively enhance the denoising performance. In HTC-net, a CNN branch is exploited to extract local features. A Transformer-CNN branch with two parallel encoders is then designed to capture long-range information. Within this branch, a residual fusion block (RFB) with a residual feature processing module and a feature fusion module is proposed to aggregate features at different resolutions extracted by the two parallel encoders. After that, HTC-net exploits the comprehensive features from the CNN branch and the Transformer-CNN branch to accurately predict noise-free MR images through a reconstruction module. To further enhance performance on limited MRI datasets, a self-supervised pretraining strategy is proposed. This strategy employs self-supervised denoising to equip HTC-net with denoising capabilities during pretraining, and the pretrained parameters are then transferred to facilitate subsequent supervised training. RESULTS Experimental results on the pulmonary HP 129Xe MRI dataset (1059 images) and the IXI dataset (5000 images) both demonstrate that the proposed method outperforms the state-of-the-art methods, exhibiting superior preservation of edges and structures. Quantitatively, on the pulmonary HP 129Xe MRI dataset, the proposed method outperforms the state-of-the-art methods by 0.254-0.597 dB in PSNR and 0.007-0.013 in SSIM. On the IXI dataset, the proposed method outperforms the state-of-the-art methods by 0.3-0.927 dB in PSNR and 0.003-0.016 in SSIM. CONCLUSIONS The proposed method can effectively enhance the quality of MR images, which helps improve diagnostic accuracy in clinical practice.
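For reference, the two reported quality metrics can be computed as below with scikit-image; the slices are synthetic and a [0, 1] intensity range is assumed.

```python
# Hedged sketch of the PSNR (in dB) and SSIM computations behind the reported
# gains; arrays are synthetic and a [0, 1] value range is assumed.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

clean = np.random.rand(128, 128)                    # noise-free MR slice (toy)
denoised = np.clip(clean + 0.01 * np.random.randn(128, 128), 0, 1)

psnr = peak_signal_noise_ratio(clean, denoised, data_range=1.0)   # in dB
ssim = structural_similarity(clean, denoised, data_range=1.0)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```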
Affiliation(s)
- Shengjie Shi
  - Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences-Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
- Cheng Wang
  - Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences-Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
  - School of Physics and Optoelectronic Engineering, Yangtze University, Jingzhou, China
- Sa Xiao
  - Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences-Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Haidong Li
  - Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences-Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Xiuchao Zhao
  - Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences-Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Fumin Guo
  - Wuhan National Laboratory for Optoelectronics, Department of Biomedical Engineering, Huazhong University of Science and Technology, Wuhan, China
- Lei Shi
  - Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences-Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Xin Zhou
  - Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences-Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Key Laboratory of Biomedical Engineering of Hainan Province, School of Biomedical Engineering, Hainan University, Haikou, China
261
Aumente-Maestro C, Díez J, Remeseiro B. A multi-task framework for breast cancer segmentation and classification in ultrasound imaging. Computer Methods and Programs in Biomedicine 2025; 260:108540. [PMID: 39647406] [DOI: 10.1016/j.cmpb.2024.108540]
Abstract
BACKGROUND Ultrasound (US) is a medical imaging modality that plays a crucial role in the early detection of breast cancer. The emergence of numerous deep learning systems has offered promising avenues for the segmentation and classification of breast cancer tumors in US images. However, challenges such as the absence of data standardization, the exclusion of non-tumor images during training, and the narrow view of single-task methodologies have hindered the practical applicability of these systems, often resulting in biased outcomes. This study aims to explore the potential of multi-task systems in enhancing the detection of breast cancer lesions. METHODS To address these limitations, our research introduces an end-to-end multi-task framework designed to leverage the inherent correlations between breast cancer lesion classification and segmentation tasks. Additionally, a comprehensive analysis of a widely utilized public breast cancer ultrasound dataset named BUSI was carried out, identifying its irregularities and devising an algorithm tailored to detecting duplicated images in it. RESULTS Experiments are conducted on the curated dataset to minimize potential biases in outcomes. Our multi-task framework outperforms single-task approaches in breast cancer analysis, achieving improvements close to 15% in segmentation and classification. Moreover, a comparative analysis against the state of the art reveals statistically significant enhancements across both tasks. CONCLUSION The experimental findings underscore the efficacy of multi-task techniques, showcasing better generalization capabilities when considering all image types: benign, malignant, and non-tumor images. Consequently, our methodology represents an advance towards more general architectures with real clinical applications in the breast cancer field.
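A hedged sketch of a joint objective that couples segmentation and classification, the kind of correlation a multi-task framework exploits; the loss weighting and tensor shapes are illustrative assumptions, not the paper's architecture.

```python
# Hedged sketch of a joint multi-task loss; the 0.5 weighting, shapes, and
# three-class setup (benign / malignant / non-tumor) are assumptions.
import torch
import torch.nn.functional as F

def multitask_loss(seg_logits, seg_target, cls_logits, cls_target, w=0.5):
    seg_loss = F.binary_cross_entropy_with_logits(seg_logits, seg_target)
    cls_loss = F.cross_entropy(cls_logits, cls_target)
    return seg_loss + w * cls_loss                  # a shared encoder gets both gradients

seg_logits = torch.randn(4, 1, 64, 64)              # predicted lesion masks (toy batch)
seg_target = torch.randint(0, 2, (4, 1, 64, 64)).float()
cls_logits = torch.randn(4, 3)                      # benign / malignant / non-tumor
cls_target = torch.randint(0, 3, (4,))
print(multitask_loss(seg_logits, seg_target, cls_logits, cls_target))
```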
Affiliation(s)
- Jorge Díez
  - Artificial Intelligence Center, Universidad de Oviedo, Gijón, 33204, Spain
- Beatriz Remeseiro
  - Artificial Intelligence Center, Universidad de Oviedo, Gijón, 33204, Spain
262
Arun PS, Francis B, Gopi VP. NL-CoWNet: A Deep Convolutional Encoder–Decoder Architecture for OCT Speckle Elimination Using Nonlocal and Subband Modulated DT-CWT Blocks. IEEE Transactions on Artificial Intelligence 2025; 6:700-709. [DOI: 10.1109/tai.2024.3491935]
Affiliation(s)
- P. S. Arun
  - Department of Electronics and Communication Engineering, National Institute of Technology Tiruchirappalli, Tiruchirappalli, Tamil Nadu, India
- Bibin Francis
  - Department of Electronics and Communication Engineering, National Institute of Technology Tiruchirappalli, Tiruchirappalli, Tamil Nadu, India
- Varun P. Gopi
  - Department of Electronics and Communication Engineering, National Institute of Technology Tiruchirappalli, Tiruchirappalli, Tamil Nadu, India
263
Balmez R, Brateanu A, Orhei C, Ancuti CO, Ancuti C. DepthLux: Employing Depthwise Separable Convolutions for Low-Light Image Enhancement. Sensors (Basel) 2025; 25:1530. [PMID: 40096403] [PMCID: PMC11902424] [DOI: 10.3390/s25051530]
Abstract
Low-light image enhancement is an important task in computer vision, often made challenging by the limitations of image sensors, such as noise, low contrast, and color distortion. These challenges are further exacerbated by the computational demands of processing spatial dependencies under such conditions. We present a novel transformer-based framework that enhances efficiency by utilizing depthwise separable convolutions instead of conventional approaches. Additionally, an original feed-forward network design reduces the computational overhead while maintaining high performance. Experimental results demonstrate that this method achieves competitive performance, providing a practical and effective solution for enhancing images captured in low-light environments.
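A minimal PyTorch sketch of the depthwise separable convolution the framework substitutes for standard convolution: a per-channel depthwise convolution followed by a 1x1 pointwise convolution; channel sizes are illustrative.

```python
# Hedged sketch of a depthwise separable convolution block; channel counts
# and kernel size are illustrative, not the paper's configuration.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        # groups=in_ch -> one filter per channel (depthwise, spatial mixing only)
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)   # 1x1 conv mixes channels cheaply

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 32, 64, 64)
print(DepthwiseSeparableConv(32, 64)(x).shape)       # torch.Size([1, 64, 64, 64])
```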
Affiliation(s)
- Raul Balmez
  - Department of Computer Science, University of Manchester, Manchester M13 9PL, UK
- Alexandru Brateanu
  - Department of Computer Science, University of Manchester, Manchester M13 9PL, UK
- Ciprian Orhei
  - Faculty of Electronics, Telecommunications and Information Technologies, Polytechnic University Timisoara, 300223 Timisoara, Romania
- Codruta O. Ancuti
  - Faculty of Electronics, Telecommunications and Information Technologies, Polytechnic University Timisoara, 300223 Timisoara, Romania
- Cosmin Ancuti
  - Faculty of Electronics, Telecommunications and Information Technologies, Polytechnic University Timisoara, 300223 Timisoara, Romania
264
Hoyoshi K, Sato K, Homma N, Mori I. Noise-related inaccuracies in the quantitative evaluation of CT artifacts. Radiol Phys Technol 2025; 18:157-171. [PMID: 39776374] [DOI: 10.1007/s12194-024-00869-9]
Abstract
The accuracy of measuring the artifact index (AI), a quantitative index for evaluating artifacts in X-ray CT images, was investigated. The AI is calculated based not only on the standard deviation (SD) of the artifact area in the image but also on the SD of the noise components, to account for the influence of noise. However, conventional measurement methods may not honor this consideration; for example, the non-uniformity of the noise distribution is not taken into account, which reduces the accuracy of the AI. To address this problem, this study aims to clarify the impact of noise SD measurement (NSDM) error on AI accuracy and to improve that accuracy by reducing the NSDM error. Experimental results demonstrated that conventional noise measurement methods reduce the accuracy of the AI. Specifically, AI inaccuracy due to NSDM error is severe in the case of weak artifacts and under high-noise conditions. Furthermore, AI accuracy can be improved by reducing the influence of the NSDM error through image smoothing or by correcting the NSDM through noise distribution estimation. These results show that the AI can be affected by NSDM errors in practice even though it is robust against noise in principle. The impact of NSDM errors must be avoided for reliable artifact evaluation.
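A loose sketch of the noise-compensated artifact index; the common definition AI = sqrt(SD_artifact^2 - SD_noise^2) is assumed here rather than taken from the paper, and the clamped subtraction shows how an NSDM error directly biases the result.

```python
# Hedged sketch of the artifact index with noise-variance subtraction; the
# AI = sqrt(SD_artifact^2 - SD_noise^2) form is an assumed common definition.
import numpy as np

def artifact_index(artifact_roi, noise_sd):
    diff = np.var(artifact_roi) - noise_sd ** 2     # remove the noise contribution
    return np.sqrt(max(diff, 0.0))                  # clamp: NSDM error can push it < 0

rng = np.random.default_rng(2)
roi = 0.3 * rng.standard_normal(10_000) + 0.2 * np.sin(np.linspace(0, 40, 10_000))
print(artifact_index(roi, noise_sd=0.3))            # overestimating noise_sd biases AI low
```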
Affiliation(s)
- Kazutaka Hoyoshi
  - Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, 2-1 Seiryo-Machi, Aoba-Ku, Sendai, Miyagi, 980-8575, Japan
  - Department of Radiology, Yamagata University Hospital, 2-2-2 Iidanishi, Yamagata, Yamagata, 990-9585, Japan
- Kazuhiro Sato
  - Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, 2-1 Seiryo-Machi, Aoba-Ku, Sendai, Miyagi, 980-8575, Japan
  - Faculty of Health Sciences, Department of Radiological Technology, Hokkaido University of Science, 15-4-1, 7-Jo, Teine-Ku, Sapporo, Hokkaido, 006-8585, Japan
- Noriyasu Homma
  - Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, 2-1 Seiryo-Machi, Aoba-Ku, Sendai, Miyagi, 980-8575, Japan
265
Wang Z, Chien JH, He C. The effect of bilateral knee osteoarthritis on gait symmetry during walking on different heights of staircases. J Biomech 2025; 182:112583. [PMID: 39955796] [DOI: 10.1016/j.jbiomech.2025.112583]
Abstract
Knee osteoarthritis (KOA) can lead to asymmetric gait, which is one of many potential risk factors for falls. In particular, those working in industrial environments are often required to navigate stairs, yet there is limited understanding of how KOA impacts gait symmetry during stair negotiation. The goal of this study was to determine how negotiating stairs affects gait symmetry in people with bilateral KOA by measuring ground reaction forces (GRFs). Fifteen patients with bilateral KOA and fifteen healthy controls were recruited. Participants were instructed to perform level-ground walking, as well as ascending and descending stairs at two different heights (180 mm and 210 mm). GRF symmetry was assessed using the symmetry index, cross-correlation (Xcorr), mean square error, root mean square error, maximum error, and mutual information (MI) methods. A significant interaction between the effect of staircase height and the effect of KOA was found in Xcorr in the anterior-posterior (AP; p < 0.001) and medial-lateral (ML; p = 0.044) directions, and in MI in the AP direction (p < 0.001). Xcorr and MI were significantly smaller in the KOA group than in controls while ascending and descending the 210 mm staircase, indicating a significantly asymmetric gait in the AP direction when descending or ascending stairs. However, no significant interactions were found with the other measures. The conclusions were that 1) reducing the height of the staircase may help KOA patients achieve better symmetry and lower the risk of falls in industrial environments, and 2) Xcorr is the suggested measure of gait symmetry.
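A minimal sketch of two of the symmetry measures used, the symmetry index and the peak of the normalized cross-correlation (Xcorr), applied to synthetic left/right GRF curves.

```python
# Hedged sketch of two gait-symmetry measures on toy GRF waveforms; the
# signals are synthetic stand-ins for stance-phase ground reaction forces.
import numpy as np

def symmetry_index(left_peak, right_peak):
    return 100 * abs(left_peak - right_peak) / (0.5 * (left_peak + right_peak))

def xcorr_peak(left, right):
    l = (left - left.mean()) / left.std()
    r = (right - right.mean()) / right.std()
    c = np.correlate(l, r, mode="full") / len(l)    # normalized cross-correlation
    return c.max()                                  # 1.0 = perfectly matched shape

t = np.linspace(0, 1, 200)
left = np.sin(np.pi * t)                            # stance-phase GRF, left limb (toy)
right = 0.9 * np.sin(np.pi * (t - 0.02))            # weaker, slightly delayed right limb
print(symmetry_index(left.max(), right.max()), xcorr_peak(left, right))
```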
Affiliation(s)
- Zhuo Wang
  - Rehabilitation Medicine Center and Institute of Rehabilitation Medicine, West China Hospital, Sichuan University, Chengdu 610041, China
  - Key Laboratory of Rehabilitation Medicine in Sichuan Province, West China Hospital, Sichuan University, Chengdu 610041, China
- Chengqi He
  - Rehabilitation Medicine Center and Institute of Rehabilitation Medicine, West China Hospital, Sichuan University, Chengdu 610041, China
  - Key Laboratory of Rehabilitation Medicine in Sichuan Province, West China Hospital, Sichuan University, Chengdu 610041, China
266
Rashidi HH, Hu B, Pantanowitz J, Tran N, Liu S, Chamanzar A, Gur M, Chang CCH, Wang Y, Tafti A, Pantanowitz L, Hanna MG. Statistics of Generative Artificial Intelligence and Nongenerative Predictive Analytics Machine Learning in Medicine. Mod Pathol 2025; 38:100663. [PMID: 39579984] [DOI: 10.1016/j.modpat.2024.100663]
Abstract
The rapidly evolving landscape of artificial intelligence (AI) and machine learning (ML) in medicine has prompted medical professionals to increasingly familiarize themselves with related topics. This also demands grasping the underlying statistical principles that govern the design, validation, and reproducibility of these methods. Uniquely, the practice of pathology and medicine produces vast amounts of data that can be exploited by AI/ML. The emergence of generative AI, especially in the area of large language models and multimodal frameworks, represents approaches that are starting to transform medicine. Fundamentally, generative and traditional (eg, nongenerative predictive analytics) ML techniques rely on certain common statistical measures to function. However, unique to generative AI are metrics such as, but not limited to, perplexity and the BiLingual Evaluation Understudy (BLEU) score, which provide a means to determine the quality of generated samples and are typically unfamiliar to most medical practitioners. In contrast, nongenerative predictive analytics ML often uses more familiar metrics tailored to specific tasks, as seen in typical classification studies (ie, confusion matrix measures, such as accuracy, sensitivity, F1 score, and receiver operating characteristic area under the curve) or regression studies (ie, root mean square error and R2). To this end, the goal of this review article (part 4 of our AI review series) is to provide a comparative overview of the statistical measures and methodologies used in both generative AI and traditional (ie, nongenerative predictive analytics) ML, along with their strengths and known limitations. By understanding their similarities and differences along with their respective applications, we will become better stewards of this transformative space, which ultimately enables us to better address our current and future needs and challenges in a more responsible and scientifically sound manner.
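As a concrete contrast between the two metric families, the sketch below computes perplexity from toy next-token probabilities alongside a familiar classification F1 score; all values are illustrative.

```python
# Hedged sketch: a generative-AI metric (perplexity, from the mean negative
# log-probability of observed tokens) beside a predictive-analytics metric
# (F1 score); probabilities and labels are toy values.
import numpy as np
from sklearn.metrics import f1_score

token_probs = np.array([0.25, 0.10, 0.60, 0.05])    # model prob. of each observed token
perplexity = np.exp(-np.mean(np.log(token_probs)))  # lower = less "surprised" model
print(f"perplexity = {perplexity:.2f}")

y_true = [1, 0, 1, 1, 0, 1]                         # classification ground truth (toy)
y_pred = [1, 0, 0, 1, 0, 1]
print(f"F1 = {f1_score(y_true, y_pred):.2f}")
```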
Affiliation(s)
- Hooman H Rashidi
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
  - Computational Pathology and AI Center of Excellence, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania
- Bo Hu
  - Department of Quantitative Health Sciences, Cleveland Clinic, Cleveland, Ohio
- Nam Tran
  - Department of Pathology, UC Davis School of Medicine, Sacramento, California
- Silvia Liu
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Alireza Chamanzar
  - Computational Pathology and AI Center of Excellence, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Mert Gur
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Department of Mechanical Engineering, Istanbul Technical University, Istanbul, Turkey
- Chung-Chou H Chang
  - Department of Medicine and Biostatistics, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania
- Yanshan Wang
  - Computational Pathology and AI Center of Excellence, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Department of Health Information Management, University of Pittsburgh, Pittsburgh, Pennsylvania
- Ahmad Tafti
  - Computational Pathology and AI Center of Excellence, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Department of Health Information Management, University of Pittsburgh, Pittsburgh, Pennsylvania
- Liron Pantanowitz
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
  - Computational Pathology and AI Center of Excellence, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania
- Matthew G Hanna
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
  - Computational Pathology and AI Center of Excellence, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania
267
Huang W, Dai Y, Fei J, Huang F. MNet: A multi-scale network for visible watermark removal. Neural Netw 2025; 183:106961. [PMID: 39647319] [DOI: 10.1016/j.neunet.2024.106961]
Abstract
Superimposing visible watermarks on images is an efficient way to indicate ownership and prevent potential unauthorized use. Visible watermark removal technology is receiving increasing attention from researchers due to its ability to enhance the robustness of visible watermarks. In this paper, we propose MNet, a novel multi-scale network for visible watermark removal. In MNet, a variable number of simple U-Nets are stacked at each scale. MNet has two branches: the background restoration branch and the mask prediction branch. In the background restoration branch, we take a different approach from current methods: instead of directly reconstructing the background image, we focus on predicting the anti-watermark image. In the watermark mask prediction branch, we adopt the Dice loss, which further supervises the predicted mask for better prediction accuracy. To make information flow more effective, we employ cross-layer feature fusion and intra-layer feature fusion among the U-Nets. Moreover, a scale reduction module is employed to capture multi-scale information effectively. Our approach is evaluated on three different datasets, and the experimental results show that it achieves better performance than other state-of-the-art methods. Code will be available at https://github.com/Aitchson-Hwang/MNet.
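A minimal sketch of the Dice loss used to supervise the mask prediction branch; the smoothing constant is a common convention, not a value taken from the paper.

```python
# Hedged sketch of a Dice loss for mask supervision; eps is a conventional
# smoothing term assumed here, not the paper's setting.
import torch

def dice_loss(pred_mask, true_mask, eps=1.0):
    """pred_mask: probabilities in [0, 1]; true_mask: binary ground truth."""
    inter = (pred_mask * true_mask).sum()
    union = pred_mask.sum() + true_mask.sum()
    return 1.0 - (2.0 * inter + eps) / (union + eps)   # 0 = perfect overlap

pred = torch.sigmoid(torch.randn(1, 1, 64, 64))        # predicted watermark mask (toy)
target = (torch.rand(1, 1, 64, 64) > 0.8).float()      # ground-truth mask (toy)
print(dice_loss(pred, target))
```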
Affiliation(s)
- Wenhong Huang
  - School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
- Yunshu Dai
  - School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
- Jianwei Fei
  - School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
- Fangjun Huang
  - School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
268
Giannakopoulos II, Carluccio G, Keerthivasan MB, Koerzdoerfer G, Lakshmanan K, De Moura HL, Serrallés JEC, Lattanzi R. MR electrical properties mapping using vision transformers and canny edge detectors. Magn Reson Med 2025; 93:1117-1131. [PMID: 39415436] [PMCID: PMC11955224] [DOI: 10.1002/mrm.30338]
Abstract
PURPOSE We developed a 3D vision transformer-based neural network to reconstruct electrical properties (EP) from magnetic resonance measurements. THEORY AND METHODS Our network uses the magnitude of the transmit magnetic field of a birdcage coil, the associated transceive phase, and a Canny edge mask that identifies the object boundaries as inputs to compute the EP maps. We trained our network on a dataset of 10 000 synthetic tissue-mimicking phantoms and fine-tuned it on a dataset of 11 000 realistic head models. We assessed performance on in-distribution simulated data and out-of-distribution head models, with and without synthetic lesions. We further evaluated our network in experiments on an inhomogeneous phantom and a volunteer. RESULTS The conductivity and permittivity maps had an average peak normalized absolute error (PNAE) of 1.3% and 1.7% for the synthetic phantoms, respectively. For the realistic heads, the average PNAE for the conductivity and permittivity was 1.8% and 2.7%, respectively. The location of synthetic lesions was accurately identified, with reconstructed conductivity and permittivity values within 15% and 25% of the ground truth, respectively. The conductivity and permittivity for the phantom experiment yielded 2.7% and 2.1% average PNAEs with respect to probe-measured values, respectively. The in vivo EP reconstruction faithfully preserved the subject's anatomy, with average values over the entire head similar to the expected literature values. CONCLUSION We introduced a new learning-based approach for reconstructing EP from MR measurements obtained with a birdcage coil, marking an important step towards the development of clinically usable in vivo EP reconstruction protocols.
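A minimal sketch of building a Canny edge mask of the kind the network takes as input; scikit-image's Canny and a toy disk phantom stand in for the actual transmit-field magnitude images.

```python
# Hedged sketch: Canny edge mask over a toy "head" phantom; sigma and the
# disk phantom are illustrative stand-ins for the real |B1+| images.
import numpy as np
from skimage.feature import canny
from skimage.draw import disk

b1_magnitude = np.zeros((128, 128))
rr, cc = disk((64, 64), 40)
b1_magnitude[rr, cc] = 1.0                           # toy object region

edge_mask = canny(b1_magnitude, sigma=2.0)           # boolean boundary map
print(edge_mask.sum(), "edge pixels")                # fed to the network with magnitude and phase
```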
Affiliation(s)
- Ilias I. Giannakopoulos
  - The Bernard and Irene Schwartz Center for Biomedical Imaging and Center for Advanced Imaging Innovation and Research (CAIR), Department of Radiology, New York University Grossman School of Medicine, New York, New York, USA
- Karthik Lakshmanan
  - The Bernard and Irene Schwartz Center for Biomedical Imaging and Center for Advanced Imaging Innovation and Research (CAIR), Department of Radiology, New York University Grossman School of Medicine, New York, New York, USA
- Hector L. De Moura
  - The Bernard and Irene Schwartz Center for Biomedical Imaging and Center for Advanced Imaging Innovation and Research (CAIR), Department of Radiology, New York University Grossman School of Medicine, New York, New York, USA
- José E. Cruz Serrallés
  - The Bernard and Irene Schwartz Center for Biomedical Imaging and Center for Advanced Imaging Innovation and Research (CAIR), Department of Radiology, New York University Grossman School of Medicine, New York, New York, USA
- Riccardo Lattanzi
  - The Bernard and Irene Schwartz Center for Biomedical Imaging and Center for Advanced Imaging Innovation and Research (CAIR), Department of Radiology, New York University Grossman School of Medicine, New York, New York, USA
Collapse
|
269
|
Shaffer BD, Vorenberg JR, Hsieh MA. Spectrally informed learning of fluid flows. CHAOS (WOODBURY, N.Y.) 2025; 35:033126. [PMID: 40085662 DOI: 10.1063/5.0235257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/26/2024] [Accepted: 02/26/2025] [Indexed: 03/16/2025]
Abstract
Accurate and efficient fluid flow models are essential for applications involving many physical systems, including geophysical, aerodynamic, and biological ones. While these flows may exhibit rich and multiscale dynamics, in many cases underlying low-rank structures exist that describe the bulk of the motion. These structures tend to be spatially large and temporally slow and may contain most of the energy in a given flow. The extraction and parsimonious representation of these low-rank dynamics from high-dimensional data is a key challenge. Inspired by the success of physics-informed machine learning methods, we propose a spectrally informed approach that extracts low-rank models of fluid flows by leveraging known spectral properties in the learning process. We incorporate this knowledge by imposing regularizations on the learned dynamics, which bias the training process toward learning low-frequency structures with correspondingly higher power. We demonstrate the effectiveness of this method in improving prediction and in producing learned models that better match the underlying spectral properties of prototypical fluid flows.
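One way to realize such a spectrally informed regularization is to penalize high-frequency power in rollouts of the learned dynamics; the PyTorch sketch below illustrates the idea, with the cutoff frequency and weight as assumed hyperparameters.

```python
import torch

def spectral_regularizer(x_pred, dt=0.01, cutoff=5.0, weight=1e-3):
    """Penalize power above `cutoff` (Hz) in a predicted trajectory
    x_pred of shape (T, d), biasing training toward the slow,
    high-power structures typical of low-rank flows."""
    X = torch.fft.rfft(x_pred, dim=0)  # one-sided spectrum, (T//2+1, d)
    freqs = torch.fft.rfftfreq(x_pred.shape[0], d=dt, device=x_pred.device)
    mask = (freqs > cutoff).float().unsqueeze(-1)  # 1 on high frequencies
    return weight * (mask * X.abs() ** 2).mean()
```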
Collapse
Affiliation(s)
- Benjamin D Shaffer
- Department of Mechanical Engineering and Applied Mechanics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | | | - M Ani Hsieh
- Department of Mechanical Engineering and Applied Mechanics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| |
Collapse
|
270
|
Kang Y, Zhu D, Zhang H, Shi E, Yu S, Wu J, Wang R, Chen G, Jiang X, Zhang T, Zhang S. Identifying influential nodes in brain networks via self-supervised graph-transformer. Comput Biol Med 2025; 186:109629. [PMID: 39731922 DOI: 10.1016/j.compbiomed.2024.109629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2023] [Revised: 12/24/2024] [Accepted: 12/24/2024] [Indexed: 12/30/2024]
Abstract
BACKGROUND Studying influential nodes (I-nodes) in brain networks is of great significance in the field of brain imaging. Most existing studies treat brain connectivity hubs, such as regions of high centrality or rich-club organization, as I-nodes. However, this approach relies heavily on prior knowledge from graph theory, which may overlook the intrinsic characteristics of the brain network, especially when its architecture is not fully understood. In contrast, self-supervised deep learning dispenses with manual features, allowing it to learn meaningful representations directly from the data. This approach enables the exploration of I-nodes in brain networks, which is lacking in current studies. METHOD This paper proposes a Self-Supervised Graph Reconstruction framework based on Graph-Transformer (SSGR-GT) to identify I-nodes, which has three main characteristics. First, as a self-supervised model, SSGR-GT extracts the importance of brain nodes to the reconstruction. Second, SSGR-GT uses a Graph-Transformer, which is well-suited to extracting features from brain graphs, combining both local and global characteristics. Third, multimodal analysis of I-nodes uses graph-based fusion technology, combining functional and structural brain information. RESULTS The I-nodes we obtained are distributed in critical areas such as the superior frontal lobe, lateral parietal lobe, and lateral occipital lobe, with a total of 56 identified across different experiments. These I-nodes are involved in more brain networks than other regions, have longer fiber connections, and occupy more central positions in structural connectivity. They also exhibit strong connectivity and high node efficiency in both functional and structural networks. Furthermore, there is a significant overlap between the I-nodes and both the structural and functional rich-club. CONCLUSIONS Experimental results verify the effectiveness of the proposed method, and the identified I-nodes are discussed. These findings enhance our understanding of I-nodes in the brain network and provide new insights for future research into the brain's working mechanisms.
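For contrast with the self-supervised approach, the classic prior-driven definition of hub nodes is easy to state in code; the sketch below ranks regions by weighted degree ("node strength") with NetworkX, with `top_k` an arbitrary illustrative choice.

```python
import networkx as nx

def hub_nodes(adj, top_k=10):
    """Rank brain regions by node strength in a weighted connectome `adj`
    (symmetric matrix). This is the kind of graph-theoretic I-node prior
    that SSGR-GT aims to move beyond."""
    G = nx.from_numpy_array(adj)
    strength = dict(G.degree(weight="weight"))  # weighted degree per node
    return sorted(strength, key=strength.get, reverse=True)[:top_k]
```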
Collapse
Affiliation(s)
- Yanqing Kang
- Center for Brain and Brain-Inspired Computing Research, School of Computer Science, Northwestern Polytechnical University, Xi'an, China
| | - Di Zhu
- Center for Brain and Brain-Inspired Computing Research, School of Computer Science, Northwestern Polytechnical University, Xi'an, China
| | - Haiyang Zhang
- Center for Brain and Brain-Inspired Computing Research, School of Computer Science, Northwestern Polytechnical University, Xi'an, China
| | - Enze Shi
- Center for Brain and Brain-Inspired Computing Research, School of Computer Science, Northwestern Polytechnical University, Xi'an, China
| | - Sigang Yu
- Center for Brain and Brain-Inspired Computing Research, School of Computer Science, Northwestern Polytechnical University, Xi'an, China
| | - Jinru Wu
- Center for Brain and Brain-Inspired Computing Research, School of Computer Science, Northwestern Polytechnical University, Xi'an, China
| | - Ruoyang Wang
- Center for Brain and Brain-Inspired Computing Research, School of Computer Science, Northwestern Polytechnical University, Xi'an, China
| | - Geng Chen
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
| | - Xi Jiang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
| | - Tuo Zhang
- School of Automation, Northwestern Polytechnical University, Xi'an, China
| | - Shu Zhang
- Center for Brain and Brain-Inspired Computing Research, School of Computer Science, Northwestern Polytechnical University, Xi'an, China.
| |
Collapse
|
271
|
Zhao X, Liang H, Liang R. Position Fusing and Refining for Clear Salient Object Detection. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:4019-4028. [PMID: 36331652 DOI: 10.1109/tnnls.2022.3213557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Multilevel feature fusion plays a pivotal role in salient object detection (SOD). High-level features present rich semantic information but lack object position information, whereas low-level features contain object position information but are mixed with noise such as backgrounds. Appropriately addressing the gap between low- and high-level features is important in SOD. In this article, we first propose a global position embedding attention (GPEA) module to minimize the discrepancy between multilevel features. We extract position information by utilizing the semantic information in high-level features to resist noise in low-level features. An object refine attention (ORA) module is introduced to further refine the features used to predict saliency maps, without any additional supervision, and to heighten discriminative regions near the salient object, such as boundaries. Moreover, we find that saliency maps generated by previous methods contain some blurry regions, and we design a pixel value (PV) loss to help the model generate saliency maps with improved clarity. Experimental results on five commonly used SOD datasets demonstrate that the proposed method is effective and outperforms state-of-the-art approaches on multiple metrics.
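The pixel value (PV) loss is described only at a high level; one plausible form, sketched below, penalizes mid-gray saliency values so the predicted map is pushed toward crisp 0/1 decisions. This is an assumed formulation for illustration, not the paper's exact definition.

```python
import torch

def pixel_value_loss(saliency):
    """saliency: (B, 1, H, W) map in [0, 1]. The penalty s * (1 - s) peaks
    at s = 0.5, so minimizing it sharpens blurry, uncertain regions."""
    return (saliency * (1.0 - saliency)).mean()
```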
Collapse
|
272
|
Han Q, Jung C. Deep Selective Fusion of Visible and Near-Infrared Images Using Unsupervised U-Net. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:4172-4183. [PMID: 35100123 DOI: 10.1109/tnnls.2022.3142780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
In low-light conditions, visible (VIS) images have a low dynamic range (low contrast) with severe noise but carry color, while near-infrared (NIR) images contain clear, noise-free textures but no color. Multispectral fusion of VIS and NIR images produces color images of high quality, rich texture, and little noise by taking advantage of both. In this article, we propose deep selective fusion of VIS and NIR images using an unsupervised U-Net. Existing image fusion methods are afflicted by the low contrast of VIS images and the flash-like effect of NIR images. Thus, we adopt an unsupervised U-Net to achieve deep selective fusion of features at multiple scales. Due to the absence of ground truth, we use unsupervised learning by formulating an energy function as the loss function. To deal with insufficient training data, we perform data augmentation by rotating images and adjusting their intensity. We synthesize training data by degrading clean VIS images and masking clean NIR images with a circle. First, we utilize a pretrained visual geometry group (VGG) network to extract features from VIS images. Second, we build an encoding network to obtain edge information from NIR images. Finally, we combine all features and feed them into a decoding network for fusion. Experimental results demonstrate that the proposed fusion network produces visually pleasing results with fine details, little noise, and natural color, and that it is superior to state-of-the-art methods in terms of visual quality and quantitative measurements.
Collapse
|
273
|
Yan Y, Kim JP, Nejad-Davarani SP, Dong M, Hurst NJ, Zhao J, Glide-Hurst CK. Deep Learning-Based Synthetic Computed Tomography for Low-Field Brain Magnetic Resonance-Guided Radiation Therapy. Int J Radiat Oncol Biol Phys 2025; 121:832-843. [PMID: 39357787 PMCID: PMC11875202 DOI: 10.1016/j.ijrobp.2024.09.046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2023] [Revised: 08/27/2024] [Accepted: 09/18/2024] [Indexed: 10/04/2024]
Abstract
PURPOSE Magnetic resonance (MR)-guided radiation therapy enables online adaptation to address intra- and interfractional changes. To address the need of high-fidelity synthetic computed tomography (synCT) required for dose calculation, we developed a conditional generative adversarial network for synCT generation from low-field MR imaging in the brain. METHODS AND MATERIALS Simulation MR-CT pairs from 12 patients with glioma imaged with a head and neck surface coil and treated on a 0.35T MR-linac were prospectively included to train the model consisting of a 9-block residual network generator and a PatchGAN discriminator. Four-fold cross-validation was implemented. SynCT was quantitatively evaluated against real CT using mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM). Dose was calculated on synCT applying original treatment plan. Dosimetric performance was evaluated by dose-volume histogram metric comparison and local 3-dimensional gamma analysis. To demonstrate utilization in treatment adaptation, longitudinal synCTs were generated for qualitative evaluation, and 1 offline adaptation case underwent 2 comparative plan evaluations. Secondary validation was conducted with 9 patients on a different MR-linac using a high-resolution brain coil. RESULTS Our model generated high-quality synCTs with MAE, PSNR, and SSIM of 70.9 ± 10.4 HU, 28.4 ± 1.5 dB, and 0.87 ± 0.02 within the field of view, respectively. Underrepresented postsurgical anomalies challenged model performance. Nevertheless, excellent dosimetric agreement was observed with the mean difference between real and synCT dose-volume histogram metrics of -0.07 ± 0.29 Gy for target D95 and within [-0.14, 0.02] Gy for organs at risk. Significant differences were only observed in the right lens D0.01cc with negligible overall difference (<0.13 Gy). Mean gamma analysis pass rates were 92.2% ± 3.0%, 99.2% ± 0.7%, and 99.9% ± 0.1% at 1%/1 mm, 2%/2 mm, and 3%/3 mm, respectively. Secondary validation yielded no significant differences in synCT performance for whole-brain MAE, PSNR, and SSIM with comparable dosimetric results. CONCLUSIONS Our conditional generative adversarial network model generated high-fidelity brain synCTs from low-field MR imaging with excellent dosimetric performance. Secondary validation suggests great promise of implementing synCTs to facilitate robust dose calculation for online adaptive brain MR-guided radiation therapy.
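The image-quality metrics reported here (MAE in HU, PSNR, SSIM) can be reproduced with scikit-image; a minimal sketch follows, assuming co-registered real and synthetic CT volumes as NumPy arrays.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_synct(ct, synct):
    """Return MAE (HU), PSNR (dB), and SSIM for a real/synthetic CT pair."""
    ct, synct = ct.astype(np.float64), synct.astype(np.float64)
    data_range = ct.max() - ct.min()
    mae = np.abs(ct - synct).mean()
    psnr = peak_signal_noise_ratio(ct, synct, data_range=data_range)
    ssim = structural_similarity(ct, synct, data_range=data_range)
    return mae, psnr, ssim
```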
Collapse
Affiliation(s)
- Yuhao Yan
- Department of Human Oncology, University of Wisconsin-Madison, Madison, Wisconsin; Department of Medical Physics, University of Wisconsin-Madison, Madison, Wisconsin
| | - Joshua P Kim
- Department of Radiation Oncology, Henry Ford Health, Detroit, Michigan
| | | | - Ming Dong
- Department of Computer Science, Wayne State University, Detroit, Michigan
| | - Newton J Hurst
- Department of Human Oncology, University of Wisconsin-Madison, Madison, Wisconsin
| | - Jiwei Zhao
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin
| | - Carri K Glide-Hurst
- Department of Human Oncology, University of Wisconsin-Madison, Madison, Wisconsin; Department of Medical Physics, University of Wisconsin-Madison, Madison, Wisconsin.
| |
Collapse
|
274
|
Moro F, Giudice MT, Ciancia M, Zace D, Baldassari G, Vagni M, Tran HE, Scambia G, Testa AC. Application of artificial intelligence to ultrasound imaging for benign gynecological disorders: systematic review. ULTRASOUND IN OBSTETRICS & GYNECOLOGY : THE OFFICIAL JOURNAL OF THE INTERNATIONAL SOCIETY OF ULTRASOUND IN OBSTETRICS AND GYNECOLOGY 2025; 65:295-302. [PMID: 39888598 PMCID: PMC11872345 DOI: 10.1002/uog.29171] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/21/2024] [Revised: 12/05/2024] [Accepted: 12/05/2024] [Indexed: 02/01/2025]
Abstract
OBJECTIVE Although artificial intelligence (AI) is increasingly being applied to ultrasound imaging in gynecology, efforts to synthesize the available evidence have been inadequate. The aim of this systematic review was to summarize and evaluate the literature on the role of AI applied to ultrasound imaging in benign gynecological disorders. METHODS Web of Science, PubMed and Scopus databases were searched from inception until August 2024. Inclusion criteria were studies applying AI to ultrasound imaging in the diagnosis and management of benign gynecological disorders. Studies retrieved from the literature search were imported into Rayyan software and quality assessment was performed using the Quality Assessment Tool for Artificial Intelligence-Centered Diagnostic Test Accuracy Studies (QUADAS-AI). RESULTS Of the 59 studies included, 12 were on polycystic ovary syndrome (PCOS), 11 were on infertility and assisted reproductive technology, 11 were on benign ovarian pathology (i.e. ovarian cysts, ovarian torsion, premature ovarian failure), 10 were on endometrial or myometrial pathology, nine were on pelvic floor disorder and six were on endometriosis. China was the most highly represented country (22/59 (37.3%)). According to QUADAS-AI, most studies were at high risk of bias for the subject selection domain (because the sample size, source or scanner model was not specified, data were not derived from open-source datasets and/or imaging preprocessing was not performed) and the index test domain (AI models were not validated externally), and at low risk of bias for the reference standard domain (the reference standard classified the target condition correctly) and the workflow domain (the time between the index test and the reference standard was reasonable). Most studies (40/59) developed and internally validated AI classification models for distinguishing between normal and pathological cases (i.e. presence vs absence of PCOS, pelvic endometriosis, urinary incontinence, ovarian cyst or ovarian torsion), whereas 19/59 studies aimed to automatically segment or measure ovarian follicles, ovarian volume, endometrial thickness, uterine fibroids or pelvic floor structures. CONCLUSION The published literature on AI applied to ultrasound in benign gynecological disorders is focused mainly on creating classification models to distinguish between normal and pathological cases, and on developing models to automatically segment or measure ovarian volume or follicles. © 2025 The Author(s). Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of International Society of Ultrasound in Obstetrics and Gynecology.
Collapse
Affiliation(s)
- F. Moro
- Dipartimento Scienze della Salute della Donna, del Bambino e di Sanità Pubblica, Fondazione Policlinico Universitario Agostino Gemelli, IRCCS, Rome, Italy
- UniCamillus International Medical University, Rome, Italy
| | - M. T. Giudice
- Dipartimento Scienze della Salute della Donna, del Bambino e di Sanità Pubblica, Fondazione Policlinico Universitario Agostino Gemelli, IRCCS, Rome, Italy
| | - M. Ciancia
- Dipartimento Scienze della Salute della Donna, del Bambino e di Sanità Pubblica, Fondazione Policlinico Universitario Agostino Gemelli, IRCCS, Rome, Italy
| | - D. Zace
- Infectious Disease Clinic, Department of Systems Medicine, Tor Vergata University, Rome, Italy
| | - G. Baldassari
- Radiomics G‐STeP Research Core Facility, Fondazione Policlinico Universitario A. Gemelli, IRCCS, Rome, Italy
| | - M. Vagni
- Istituto di Radiologia, Università Cattolica del Sacro Cuore, Rome, Italy
| | - H. E. Tran
- Radiomics G‐STeP Research Core Facility, Fondazione Policlinico Universitario A. Gemelli, IRCCS, Rome, Italy
| | - G. Scambia
- Dipartimento Scienze della Salute della Donna, del Bambino e di Sanità Pubblica, Fondazione Policlinico Universitario Agostino Gemelli, IRCCS, Rome, Italy
- Dipartimento Universitario Scienze della Vita e Sanità Pubblica, Università Cattolica del Sacro Cuore, Rome, Italy
| | - A. C. Testa
- Dipartimento Scienze della Salute della Donna, del Bambino e di Sanità Pubblica, Fondazione Policlinico Universitario Agostino Gemelli, IRCCS, Rome, Italy
- Dipartimento Universitario Scienze della Vita e Sanità Pubblica, Università Cattolica del Sacro Cuore, Rome, Italy
| |
Collapse
|
275
|
Ma Y, Wang J, Xu C, Huang Y, Chu M, Fan Z, Xu Y, Wu D. CDAF-Net: A Contextual Contrast Detail Attention Feature Fusion Network for Low-Dose CT Denoising. IEEE J Biomed Health Inform 2025; 29:2048-2060. [PMID: 40030295 DOI: 10.1109/jbhi.2024.3506785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/08/2025]
Abstract
Low-dose computed tomography (LDCT) is a specialized CT scan with a lower radiation dose than normal-dose CT. However, the reduced radiation dose can introduce noise and artifacts, affecting diagnostic accuracy. To enhance the LDCT image quality, we propose a Contextual Contrast Detail Attention Feature Fusion Network (CDAF-Net) for LDCT denoising. Firstly, the LDCT image, with dimensions 1 × H × W, is mapped to a feature map with dimensions C × H × W, and it is processed through the Contextual Contrast Detail Attention (CCDA) module and the Selective Kernel Feature Fusion (SKFF) module. The CCDA module combines a global contextual attention mechanism with detail-enhanced differential convolutions to better understand the overall semantics and structure of the LDCT image, capturing subtle changes and details. The SKFF module effectively merges shallow features extracted by the encoder with deep features from the decoder, integrating feature representations from different levels. This process is repeated across four different resolution feature maps, and the denoised LDCT image is output through a skip connection. We conduct experiments on the Mayo dataset, the LDCT-and-Projection-Data dataset, and the Piglet dataset. Specifically, the CDAF-Net achieves the optimal metrics with a PSNR of 33.7262 dB, an SSIM of 0.9254, and an RMSE of 5.3731 on the Mayo dataset. Improvements are also observed in head CT and ultra-low-dose chest CT images of the LDCT-and-Projection-Data dataset and the Piglet dataset. Experimental results show that the proposed CDAF-Net algorithm provides superior denoising performance compared with the state-of-the-art (SOTA) algorithms.
Collapse
|
276
|
Fu L, Li L, Lu B, Guo X, Shi X, Tian J, Hu Z. Deep Equilibrium Unfolding Learning for Noise Estimation and Removal in Optical Molecular Imaging. Comput Med Imaging Graph 2025; 120:102492. [PMID: 39823663 DOI: 10.1016/j.compmedimag.2025.102492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2024] [Revised: 01/03/2025] [Accepted: 01/03/2025] [Indexed: 01/19/2025]
Abstract
In clinical optical molecular imaging, the need for real-time high frame rates and low excitation doses to ensure patient safety inherently increases susceptibility to detection noise. Faced with the challenge of image degradation caused by severe noise, image denoising is essential for mitigating the trade-off between acquisition cost and image quality. However, prevailing deep learning methods exhibit uncontrollable and suboptimal performance with limited interpretability, primarily because they neglect the underlying physical model and frequency information. In this work, we introduce an end-to-end model-driven Deep Equilibrium Unfolding Mamba (DEQ-UMamba) that integrates a proximal gradient descent technique and learnt spatial-frequency characteristics to decouple complex noise structures into statistical distributions, enabling effective noise estimation and suppression in fluorescent images. Moreover, to address the computational limitations of unfolding networks, DEQ-UMamba trains an implicit mapping by directly differentiating the equilibrium point of the convergent solution, thereby ensuring stability and avoiding non-convergent behavior. With each network module aligned to a corresponding operation in the iterative optimization process, the proposed method achieves clear structural interpretability and strong performance. Comprehensive experiments conducted on both clinical and in vivo datasets demonstrate that DEQ-UMamba outperforms current state-of-the-art alternatives while using fewer parameters, facilitating the advancement of cost-effective and high-quality clinical molecular imaging.
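The deep-equilibrium idea, training through the fixed point of an iteration rather than through every unrolled step, can be summarized in a few lines; the sketch below uses plain fixed-point iteration, with the tolerance and iteration budget as assumed settings.

```python
import torch

def deq_forward(f, x, z0, max_iter=50, tol=1e-4):
    """Iterate z <- f(z, x) to an equilibrium z* = f(z*, x). In a deep
    equilibrium model, gradients are then taken implicitly at z* alone,
    avoiding the memory cost of backpropagating through all iterations."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if torch.norm(z_next - z) / (torch.norm(z) + 1e-8) < tol:
            return z_next
        z = z_next
    return z
```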
Collapse
Affiliation(s)
- Lidan Fu
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Lingbing Li
- Interventional Radiology Department, Chinese PLA General Hospital, Beijing 100039, China
| | - Binchun Lu
- Department of Precision Instrument, Tsinghua University, Beijing 100084, China
| | - Xiaoyong Guo
- Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Department of Gastrointestinal Cancer Center, Ward I, Peking University Cancer Hospital & Institute, Beijing 100142, China
| | - Xiaojing Shi
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jie Tian
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Big Data-Based Precision Medicine of Ministry of Industry and Information Technology, School of Engineering Medicine, Beihang University, Beijing 100191, China; Engineering Research Center of Molecular and Neuro Imaging of Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an 710071, China; National Key Laboratory of Kidney Diseases, Beijing 100853, China.
| | - Zhenhua Hu
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; National Key Laboratory of Kidney Diseases, Beijing 100853, China.
| |
Collapse
|
277
|
Lee S, Heo S, Lee S. DMESH: A Structure-Preserving Diffusion Model for 3-D Mesh Denoising. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:4385-4399. [PMID: 38412085 DOI: 10.1109/tnnls.2024.3367327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/29/2024]
Abstract
Denoising diffusion models have shown a powerful capacity for generating high-quality image samples by progressively removing noise. Inspired by this, we present a diffusion-based mesh denoiser that progressively removes noise from a mesh. In general, the iterative algorithm of diffusion models attempts to manipulate the overall structure and fine details of target meshes simultaneously. For this reason, it is difficult to apply the diffusion process to a mesh denoising task that removes artifacts while maintaining the structure. To address this, we formulate a structure-preserving diffusion process. Instead of diffusing the mesh vertices to be distributed as a zero-centered isotropic Gaussian distribution, we diffuse each vertex into a specific noise distribution under which the entire structure can be preserved. In addition, we propose a topology-agnostic mesh diffusion model by projecting the vertices onto multiple 2-D viewpoints to efficiently learn the diffusion using a deep network. This enables the proposed method to learn the diffusion of arbitrary meshes with irregular topology. Finally, the denoised mesh can be obtained via refinement based on 2-D projections obtained from reverse diffusion. Through extensive experiments, we demonstrate that our method outperforms state-of-the-art mesh denoising methods in both quantitative and qualitative evaluations.
Collapse
|
278
|
Zhao R, Zhu M, Wang N, Gao X. Few-Shot Face Stylization via GAN Prior Distillation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:4492-4503. [PMID: 38536698 DOI: 10.1109/tnnls.2024.3377609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/05/2025]
Abstract
Face stylization has made notable progress in recent years. However, when trained on limited data, the performance of existing approaches declines significantly. Although some studies have attempted to tackle this problem, they either failed to achieve the few-shot setting (fewer than 10 samples) or achieved only suboptimal results. In this article, we propose GAN Prior Distillation (GPD) to enable effective few-shot face stylization. GPD contains two models: a teacher network with a GAN prior and a student network that fulfills end-to-end translation. Specifically, we adapt the teacher network, trained on large-scale data in the source domain, to the target domain using a handful of samples, where it can learn the target domain's knowledge. Then, we achieve few-shot augmentation by generating source-domain and target-domain images simultaneously with the same latent codes. We propose an anchor-based knowledge distillation module that can fully use the difference between the training and the augmented data to distill the knowledge of the teacher network into the student network. The trained student network achieves excellent generalization performance by absorbing this additional knowledge. Qualitative and quantitative experiments demonstrate that our method achieves superior results compared with state-of-the-art approaches in the few-shot setting.
Collapse
|
279
|
Liao J, Zhang T, Shepherd S, Macluskey M, Li C, Huang Z. Semi-supervised assisted multi-task learning for oral optical coherence tomography image segmentation and denoising. BIOMEDICAL OPTICS EXPRESS 2025; 16:1197-1215. [PMID: 40109516 PMCID: PMC11919357 DOI: 10.1364/boe.545377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/16/2024] [Revised: 12/05/2024] [Accepted: 12/05/2024] [Indexed: 03/22/2025]
Abstract
Optical coherence tomography (OCT) is promising as an essential imaging tool for non-invasive oral mucosal tissue assessment, but it faces challenges such as speckle noise and motion artifacts. In addition, it is difficult to distinguish different layers of oral mucosal tissue in gray-level OCT images due to the similarity of optical properties between layers. We introduce the Efficient Segmentation-Denoising Model (ESDM), a multi-task deep learning framework designed to enhance OCT imaging by reducing scan time from ∼8 s to ∼2 s and improving oral epithelium layer segmentation. ESDM integrates the local feature extraction capabilities of convolutional layers and the long-term information processing advantages of the transformer, achieving better denoising and segmentation performance than existing models. Our evaluation shows that ESDM outperforms state-of-the-art models with a PSNR of 26.272, SSIM of 0.737, mDice of 0.972, and mIoU of 0.948. Ablation studies confirm the effectiveness of our design, such as the feature fusion methods, which enhance performance with minimal increase in model complexity. ESDM also measures oral epithelium thickness with high accuracy, achieving mean absolute errors as low as 5 µm compared with manual measurements. This research shows that ESDM can notably improve OCT imaging and reduce the cost of accurate oral epithelial segmentation, improving diagnostic capabilities in clinical settings.
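Quantifying epithelium thickness from a segmentation mask reduces to counting labeled pixels per A-line; a minimal sketch follows, where the axial pixel size is an assumed calibration value rather than the paper's scanner setting.

```python
import numpy as np

def epithelium_thickness_um(mask, axial_pixel_um=5.0):
    """Mean epithelium thickness from a binary B-scan segmentation.

    mask: (H, W) array, 1 inside the epithelium. Thickness is counted per
    A-line (column), converted with the axial pixel size, then averaged."""
    counts = mask.sum(axis=0).astype(np.float64)
    counts = counts[counts > 0]  # skip columns without epithelium
    return float(counts.mean() * axial_pixel_um)
```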
Collapse
Affiliation(s)
- Jinpeng Liao
- School of Science and Engineering, University of Dundee, DD1 4HN, Scotland, UK
- Healthcare Engineering, School of Physics and Engineering Technology, University of York, UK
| | - Tianyu Zhang
- School of Science and Engineering, University of Dundee, DD1 4HN, Scotland, UK
| | - Simon Shepherd
- School of Dentistry, University of Dundee, Dundee, DD1 4HN, Scotland, UK
| | | | - Chunhui Li
- School of Science and Engineering, University of Dundee, DD1 4HN, Scotland, UK
| | - Zhihong Huang
- Healthcare Engineering, School of Physics and Engineering Technology, University of York, UK
| |
Collapse
|
280
|
Biglarbeigi P, Morelli A, Bhattacharya G, Ward J, Finlay D, Bhalla N, Payam AF. Incongruous Harmonics of Vibrating Solid-Solid Interface. SMALL (WEINHEIM AN DER BERGSTRASSE, GERMANY) 2025; 21:e2409410. [PMID: 39552010 PMCID: PMC11899492 DOI: 10.1002/smll.202409410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/12/2024] [Indexed: 11/19/2024]
Abstract
Deconvoluting the vibrations and harmonics at solid-solid interfaces is crucial for designing materials with improved performance, durability, and functionality. The measured vibrating-microcantilever signal in dynamic atomic force microscopy (AFM) encompasses a multitude of distinct signatures reflecting a diverse array of material properties. Nevertheless, uncertainties persist in decoding these signatures, primarily arising from the interplay between attractive and repulsive forces. Consequently, it is challenging to correlate the harmonics generated at solid-solid interfaces with the imaged phase and topography of materials, as well as with the occasionally observed contrast reversal. In this study, the vibration harmonics produced at solid-solid interfaces are correlated with short-range nano-mechanical characteristics through a comprehensive blend of theory, simulation, and experiment. These findings shed light on the origins of harmonic generation and contrast reversals, opening avenues for designing innovative materials with customized properties.
Collapse
Affiliation(s)
- Pardis Biglarbeigi
- Department of Pharmacology & Therapeutics, University of Liverpool, Whelan Building, Liverpool, England, L69 3GE, UK
| | - Alessio Morelli
- Nanotechnology and Integrated Bioengineering Centre, School of Engineering, Ulster University, Belfast, BT15 1AP, UK
| | - Gourav Bhattacharya
- Nanotechnology and Integrated Bioengineering Centre, School of Engineering, Ulster University, Belfast, BT15 1AP, UK
| | - Joanna Ward
- Nanotechnology and Integrated Bioengineering Centre, School of Engineering, Ulster University, Belfast, BT15 1AP, UK
| | - Dewar Finlay
- Nanotechnology and Integrated Bioengineering Centre, School of Engineering, Ulster University, Belfast, BT15 1AP, UK
| | - Nikhil Bhalla
- Nanotechnology and Integrated Bioengineering Centre, School of Engineering, Ulster University, Belfast, BT15 1AP, UK
| | - Amir Farokh Payam
- Nanotechnology and Integrated Bioengineering Centre, School of Engineering, Ulster University, Belfast, BT15 1AP, UK
| |
Collapse
|
281
|
Lau GE, Mortenson MC, Neilsen TB, Van Komen DF, Hodgkiss WS, Knobles DP. Ensemble approach to deep learning seabed classification using multichannel ship noise. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2025; 157:2127-2149. [PMID: 40135961 DOI: 10.1121/10.0036221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/24/2024] [Accepted: 03/01/2025] [Indexed: 03/27/2025]
Abstract
In shallow-water downward-refracting ocean environments, hydrophone measurements of shipping noise encode information about the seabed. In this study, neural networks are trained on synthetic data to predict seabed classes from multichannel hydrophone spectrograms of shipping noise. Specifically, ResNet-18 networks are trained on different combinations of synthetic inputs from one, two, four, and eight channels. The trained networks are then applied to measured ship spectrograms from the Seabed Characterization Experiment 2017 (SBCEX 2017) to obtain an effective seabed class for the area. Data preprocessing techniques and ensemble modeling are leveraged to improve performance over previous studies. The results showcase the predictive capability of the trained networks; the seabed predictions from the measured ship spectrograms tend toward two seabed classes that share similarities in the upper few meters of sediment and are consistent with geoacoustic inversion results from SBCEX 2017. This work also demonstrates how ensemble modeling yields a measure of precision and confidence in the predicted results. Furthermore, the impact of using data from multiple hydrophone channels is quantified. While the water sound-speed profile in this experiment was only slightly upward refracting, we anticipate greater advantages of using multiple channels to train neural networks for more varied sound-speed profiles.
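The ensemble step, averaging per-model class probabilities and reading across-model disagreement as a confidence measure, is straightforward; a minimal NumPy sketch under those assumptions follows.

```python
import numpy as np

def ensemble_predict(prob_stack):
    """prob_stack: (n_models, n_classes) softmax outputs for one spectrogram.
    Returns the consensus seabed class, its mean probability, and the
    across-model spread as a simple precision/confidence measure."""
    mean_prob = prob_stack.mean(axis=0)
    pred = int(mean_prob.argmax())
    spread = float(prob_stack[:, pred].std())  # model disagreement
    return pred, float(mean_prob[pred]), spread
```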
Collapse
Affiliation(s)
- Ginger E Lau
- Department of Physics, Emory University, Atlanta, Georgia 30322, USA
| | - Michael C Mortenson
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah 84602, USA
| | - Tracianne B Neilsen
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah 84602, USA
| | - David F Van Komen
- Kahlert School of Computing, University of Utah, Salt Lake City, Utah, 84112, USA
| | - William S Hodgkiss
- Marine Physical Laboratory, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, USA
| | | |
Collapse
|
282
|
Shi J, Pelt DM, Batenburg KJ. Multi-stage deep learning artifact reduction for parallel-beam computed tomography. JOURNAL OF SYNCHROTRON RADIATION 2025; 32:442-456. [PMID: 39960472 DOI: 10.1107/s1600577525000359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/19/2024] [Accepted: 01/14/2025] [Indexed: 03/11/2025]
Abstract
Computed tomography (CT) using synchrotron radiation is a powerful technique that, compared with laboratory CT techniques, boasts high spatial and temporal resolution while also providing access to a range of contrast-formation mechanisms. The acquired projection data are typically processed by a computational pipeline composed of multiple stages. Artifacts introduced during data acquisition can propagate through the pipeline and degrade image quality in the reconstructed images. Recently, deep learning has shown significant promise in enhancing the quality of images representing scientific data, and this success has driven increasing adoption of deep learning techniques in CT imaging. Various approaches have been proposed to incorporate deep learning into computational pipelines, but each has limitations for synchrotron CT, either in properly addressing the specific artifacts or in computational efficiency. Recognizing these challenges, we introduce a novel method that incorporates separate deep learning models at each stage of the tomography pipeline - projection, sinogram and reconstruction - to address specific artifacts locally in a data-driven way. Our approach includes bypass connections that feed both the outputs from previous stages and raw data to subsequent stages, minimizing the risk of error propagation. Extensive evaluations on both simulated and real-world datasets illustrate that our approach effectively reduces artifacts and outperforms comparison methods.
Collapse
Affiliation(s)
- Jiayang Shi
- Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands
| | - Daniël M Pelt
- Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands
| | | |
Collapse
|
283
|
Li H, Yuan M, Li J, Liu Y, Lu G, Xu Y, Yu Z, Zhang D. Focus Affinity Perception and Super-Resolution Embedding for Multifocus Image Fusion. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:4311-4325. [PMID: 38446648 DOI: 10.1109/tnnls.2024.3367782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/08/2024]
Abstract
Despite remarkable progress in multifocus image fusion, most existing methods generate only a low-resolution image when the given source images suffer from low resolution. Obviously, a naive strategy is to conduct image fusion and image super-resolution independently. However, this two-step approach would inevitably introduce and enlarge artifacts in the final result if the output of the first step contains artifacts. To address this problem, in this article, we propose a novel method that simultaneously achieves image fusion and super-resolution in one framework, avoiding step-by-step processing of fusion and super-resolution. Since a small receptive field can discriminate the focusing characteristics of pixels in detailed regions, while a large receptive field is more robust for pixels in smooth regions, a subnetwork is first proposed to compute the affinity of features under different types of receptive fields, efficiently increasing the discriminability of focused pixels. Simultaneously, to prevent distortion, a gradient embedding-based super-resolution subnetwork is also proposed, in which the features from the shallow layer, the deep layer, and the gradient map are jointly taken into account, allowing us to obtain an upsampled image with high resolution. Compared with existing methods, which implement fusion and super-resolution independently, our proposed method directly achieves these two tasks in a parallel way, avoiding artifacts caused by the inferior output of image fusion or super-resolution. Experiments conducted on real-world datasets substantiate the superiority of our proposed method compared with the state of the art.
Collapse
|
284
|
An W, Liu Y, Shang F, Liu H, Jiao L. DEs-Inspired Accelerated Unfolded Linearized ADMM Networks for Inverse Problems. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:5319-5333. [PMID: 38625778 DOI: 10.1109/tnnls.2024.3382030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/18/2024]
Abstract
Many research works have shown that the traditional alternating direction method of multipliers (ADMM) can be better understood through continuous-time differential equations (DEs). On the other hand, many unfolded algorithms directly inherit the traditional iterations to build deep networks. Although they achieve superior practical performance and a faster convergence rate than their traditional counterparts, there is a lack of clear insight into unfolded network structures. Thus, we explore the unfolded linearized ADMM (LADMM) from the perspective of DEs and design more efficient unfolded networks. First, by proposing an unfolded Euler LADMM scheme and taking inspiration from the trapezoid discretization, we design a new, more accurate Trapezoid LADMM scheme. For convenience of implementation, we provide its explicit version via a prediction-correction strategy. Then, to expand the representation space of unfolded networks, we design an accelerated variant of our Euler LADMM scheme, which can be interpreted as second-order DEs with stronger representation capabilities. To fully explore this representation space, we design an accelerated Trapezoid LADMM scheme. To the best of our knowledge, this is the first work to explore a comprehensive connection, with theoretical guarantees, between unfolded ADMMs and first- (second-) order DEs. Finally, we instantiate our schemes as (A-)ELADMM and (A-)TLADMM with the proximal operators, and (A-)ELADMM-Net and (A-)TLADMM-Net with convolutional neural networks (CNNs). Extensive inverse problem experiments show that our Trapezoid LADMM schemes perform better than well-known methods.
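The contrast between the Euler and Trapezoid schemes is exactly the contrast between first- and second-order ODE discretizations; the generic sketch below shows the prediction-correction structure on an abstract vector field `f`, not the paper's actual LADMM operators.

```python
import numpy as np

def euler_step(f, z, h):
    """Forward-Euler update: first-order accurate in the step size h."""
    return z + h * f(z)

def trapezoid_step(f, z, h):
    """Explicit trapezoid (Heun) update via prediction-correction:
    predict with Euler, then correct with the averaged slope.
    Second-order accurate, mirroring the Trapezoid LADMM scheme."""
    z_pred = z + h * f(z)                    # prediction (Euler)
    return z + 0.5 * h * (f(z) + f(z_pred))  # correction (trapezoid rule)
```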
Collapse
|
285
|
Diaz Moreno RM, Nuñez G, Venencia CD, Isoardi RA, Almada MJ. Use of a virtual phantom to assess the capability of a treatment planning system to perform magnetic resonance image distortion correction. Phys Eng Sci Med 2025; 48:317-327. [PMID: 39760846 DOI: 10.1007/s13246-024-01515-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2024] [Accepted: 12/20/2024] [Indexed: 01/07/2025]
Abstract
Treatment Planning Systems (TPS) offer algorithms for distortion correction (DC) of Magnetic Resonance (MR) images, whose performance demands proper evaluation. This work develops a procedure using a virtual phantom to quantitatively assess a TPS DC algorithm. Variations of the digital Brainweb MR study were created by introducing known distortions and Control Points (CPs). A synthetic Computed Tomography (sCT) study was created based upon the MR study. The Elements TPS (Brainlab, Munich, Germany) was used to apply DC to the MR images, choosing the sCT as the gold standard. Deviations in the CP locations between the original, distorted, and corrected images were calculated. Structural Similarity Metric (SSIM) tests were applied for further assessment of the image corrections. The introduced distortion deviated the CP locations by a median (range) value of 1.8 (0.2-4.4) mm. After DC was applied, these values were reduced to 0.6 (0.1-1.9) mm. Correction of the original image deviated the CP locations by 0.2 (0-1.1) mm. The SSIM comparisons between the original and the distorted images yielded values of 0.23 and 0.67 before and after DC, respectively. The SSIM comparison of the original study, before and after DC, yielded a value of 0.97. The proposed methodology using a virtual phantom with CPs can be used to assess a TPS DC algorithm. The Elements TPS effectively reduced MR distortions below radiosurgery tolerances.
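The control-point analysis reduces to Euclidean distances between matched landmark coordinates; a sketch reporting the same median (range) statistics follows, assuming the CPs are already matched across studies.

```python
import numpy as np

def cp_deviations_mm(cp_ref, cp_test):
    """cp_ref, cp_test: (N, 3) matched control-point coordinates in mm.
    Returns the median deviation and its (min, max) range."""
    d = np.linalg.norm(cp_test - cp_ref, axis=1)
    return float(np.median(d)), (float(d.min()), float(d.max()))
```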
Collapse
Affiliation(s)
- Rogelio Manuel Diaz Moreno
- Physics Department, Instituto Zunino, Obispo Oro 423, X5000BFI, Córdoba, Argentina.
- 9 de julio 2015, 10, X5003CQI, Córdoba, Argentina.
| | - Gonzalo Nuñez
- Physics Department, Instituto Zunino, Obispo Oro 423, X5000BFI, Córdoba, Argentina
| | - C Daniel Venencia
- Physics Department, Instituto Zunino, Obispo Oro 423, X5000BFI, Córdoba, Argentina
| | - Roberto A Isoardi
- FUESMEN - Fundación Escuela de Medicina Nuclear. Garibaldi 405, M5500, Mendoza, Argentina
| | - María José Almada
- Physics Department, Instituto Zunino, Obispo Oro 423, X5000BFI, Córdoba, Argentina
| |
Collapse
|
286
|
Tang S, Bicer T, Sun T, Fezzaa K, Clark SJ. Deep learning-based spatio-temporal fusion for high-fidelity ultra-high-speed X-ray radiography. JOURNAL OF SYNCHROTRON RADIATION 2025; 32:432-441. [PMID: 39937516 DOI: 10.1107/s1600577525000323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/20/2024] [Accepted: 01/14/2025] [Indexed: 02/13/2025]
Abstract
Full-field ultra-high-speed (UHS) X-ray imaging experiments have been well established to characterize various processes and phenomena. However, the potential of UHS experiments through the joint acquisition of X-ray videos with distinct configurations has not been fully exploited. In this paper, we investigate the use of a deep learning-based spatio-temporal fusion (STF) framework to fuse two complementary sequences of X-ray images and reconstruct the target image sequence with high spatial resolution, high frame rate and high fidelity. We applied a transfer learning strategy to train the model and compared the peak signal-to-noise ratio (PSNR), average absolute difference (AAD) and structural similarity (SSIM) of the proposed framework on two independent X-ray data sets with those obtained from a baseline deep learning model, a Bayesian fusion framework and the bicubic interpolation method. The proposed framework outperformed the other methods with various configurations of the input frame separations and image noise levels. With three consecutive images from the low-resolution (LR) sequence with four times lower spatial resolution and another two images from the high-resolution (HR) sequence with a 20 times lower frame rate, the proposed approach achieved average PSNRs of 37.57 dB and 35.15 dB, respectively. When coupled with the appropriate combination of high-speed cameras, the proposed approach will enhance the performance and therefore the scientific value of UHS X-ray imaging experiments.
Collapse
Affiliation(s)
- Songyuan Tang
- Advanced Photon Source, Argonne National Laboratory, Lemont, IL 60439, USA
| | - Tekin Bicer
- Advanced Photon Source, Argonne National Laboratory, Lemont, IL 60439, USA
| | - Tao Sun
- Department of Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Kamel Fezzaa
- Advanced Photon Source, Argonne National Laboratory, Lemont, IL 60439, USA
| | - Samuel J Clark
- Advanced Photon Source, Argonne National Laboratory, Lemont, IL 60439, USA
| |
Collapse
|
287
|
Behrendt F, Bhattacharya D, Mieling R, Maack L, Krüger J, Opfer R, Schlaefer A. Guided reconstruction with conditioned diffusion models for unsupervised anomaly detection in brain MRIs. Comput Biol Med 2025; 186:109660. [PMID: 39847946 DOI: 10.1016/j.compbiomed.2025.109660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2024] [Revised: 11/29/2024] [Accepted: 01/06/2025] [Indexed: 01/25/2025]
Abstract
The application of supervised models to clinical screening tasks is challenging due to the need for annotated data for each considered pathology. Unsupervised Anomaly Detection (UAD) is an alternative approach that aims to identify any anomaly as an outlier from a healthy training distribution. A prevalent strategy for UAD in brain MRI involves using generative models to learn the reconstruction of healthy brain anatomy for a given input image. As these models should fail to reconstruct unhealthy structures, the reconstruction errors indicate anomalies. However, a significant challenge is to balance the accurate reconstruction of healthy anatomy against the undesired replication of abnormal structures. While diffusion models have shown promising results with detailed and accurate reconstructions, they face challenges in preserving intensity characteristics, resulting in false positives. We propose conditioning the denoising process of diffusion models with additional information derived from a latent representation of the input image. We demonstrate that this conditioning allows for accurate and local adaptation to the general input intensity distribution while avoiding the replication of unhealthy structures. We compare the proposed approach with state-of-the-art methods across several data sets. Our results show substantial improvements in segmentation performance, with the Dice score improved by 11.9%, 20.0%, and 44.6% for the BraTS, ATLAS and MSLUB data sets, respectively, while maintaining competitive performance on the WMH data set. Furthermore, our results indicate effective domain adaptation across different MRI acquisitions and simulated contrasts, an important attribute for general anomaly detection methods. The code for our work is available at https://github.com/FinnBehrendt/Conditioned-Diffusion-Models-UAD.
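The residual-based detection principle (anomalies are where the healthy-trained model cannot reconstruct the input) is captured by the short sketch below; the threshold is an assumed free parameter that would be tuned on validation data.

```python
import numpy as np

def anomaly_map(x, x_rec, threshold=0.1):
    """x, x_rec: input MRI slice and its reconstruction, both scaled to [0, 1].
    Large reconstruction errors flag candidate anomalies."""
    err = np.abs(x - x_rec)
    return err, (err > threshold).astype(np.uint8)
```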
Collapse
|
288
|
Benitez‐Aurioles J, Osorio EMV, Aznar MC, Van Herk M, Pan S, Sitch P, France A, Smith E, Davey A. A neural network to create super-resolution MR from multiple 2D brain scans of pediatric patients. Med Phys 2025; 52:1693-1705. [PMID: 39657055 PMCID: PMC11880662 DOI: 10.1002/mp.17563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2024] [Revised: 11/02/2024] [Accepted: 11/24/2024] [Indexed: 12/17/2024] Open
Abstract
BACKGROUND High-resolution (HR) 3D MR images provide detailed soft-tissue information that is useful in assessing long-term side-effects after treatment in childhood cancer survivors, such as morphological changes in brain structures. However, these images require long acquisition times, so routinely acquired follow-up images after treatment often consist of 2D low-resolution (LR) images (with thick slices in multiple planes). PURPOSE In this work, we present a super-resolution convolutional neural network, based on previous single-image MRI super-resolution work, that can reconstruct an HR image from 2D LR slices in multiple planes in order to facilitate the extraction of structural biomarkers from routine scans. METHODS A multilevel densely connected super-resolution convolutional neural network (mDCSRN) was adapted to take two perpendicular LR scans (e.g., coronal and axial) as tensors and reconstruct a 3D HR image. A training set of 90 HR T1 pediatric head scans from the Adolescent Brain Cognitive Development (ABCD) study was used, with 2D LR images simulated through a downsampling pipeline that introduces motion artifacts, blurring, and registration errors to make the LR scans more realistic to routinely acquired ones. The outputs of the model were compared against simple interpolation in two steps. First, the quality of the reconstructed HR images was assessed using the peak signal-to-noise ratio and structural similarity index compared to baseline. Second, the precision of structure segmentation (using the autocontouring software Limbus AI) in the reconstructed versus the baseline HR images was assessed using mean distance-to-agreement (mDTA) and 95% Hausdorff distance. Three datasets were used: 10 new ABCD images (dataset 1), 18 images from the Children's Brain Tumor Network (CBTN) study (dataset 2) and 6 "real-world" follow-up images of a pediatric head and neck cancer patient (dataset 3). RESULTS The proposed mDCSRN outperformed simple interpolation in terms of visual quality. Similarly, structure segmentations were closer to baseline images after 3D reconstruction. The mDTA improved to, on average (95% confidence interval), 0.7 (0.4-1.0) and 0.8 (0.7-0.9) mm for datasets 1 and 3, respectively, from the interpolation performance of 6.5 (3.6-9.5) and 1.2 (1.0-1.3) mm. CONCLUSIONS We demonstrate that deep learning methods can successfully reconstruct 3D HR images from 2D LR ones, potentially unlocking datasets for retrospective study and advancing research in the long-term effects of pediatric cancer. Our model outperforms standard interpolation, both in perceptual quality and for autocontouring. Further work is needed to validate it for additional structural analysis tasks.
Collapse
Affiliation(s)
- Jose Benitez‐Aurioles
- Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester, UK
| | - Eliana M. Vásquez Osorio
- Radiotherapy‐Related Research Group, Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
| | - Marianne C. Aznar
- Radiotherapy‐Related Research Group, Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
| | - Marcel Van Herk
- Radiotherapy‐Related Research Group, Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
| | | | - Peter Sitch
- The Christie NHS Foundation Trust, Manchester, UK
| | - Anna France
- The Christie NHS Foundation Trust, Manchester, UK
| | - Ed Smith
- The Christie NHS Foundation Trust, Manchester, UK
| | - Angela Davey
- Radiotherapy‐Related Research Group, Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
| |
Collapse
|
289
|
Ferreira da Silva M, Nunes Masson JE, dos Santos MF, Rodrigues Silva W, Wladimir Molina I, Martins GMC. Audible Noise Evaluation in Wind Turbines Through Artificial Intelligence Techniques. SENSORS (BASEL, SWITZERLAND) 2025; 25:1492. [PMID: 40096305 PMCID: PMC11902537 DOI: 10.3390/s25051492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/03/2025] [Revised: 02/07/2025] [Accepted: 02/11/2025] [Indexed: 03/19/2025]
Abstract
In recent years, wind power has become a more attractive alternative energy source for overcoming environmental issues. Predictive maintenance is essential for wind power devices to ensure that these systems work reliably and with sufficient availability. This paper presents a method for failure detection in wind turbines using the sound emitted by their components. The proposed Artificial Intelligence (AI) model is based on unsupervised learning and image processing, through which the machine learning model learns to identify spectrograms from wind turbines under healthy conditions. The reconstruction of current data determines whether the input contains uncommon noise, which indicates a possible or actual failure. The uncommon data are sent to a specialist network, which, through supervised learning, identifies the failure event and alerts operators to problems the wind turbine may be experiencing, supporting preventive maintenance. The model produced satisfactory results on five tested wind turbines, where specific faults known to the operators were captured through the low similarity between the reconstructed data and the input. Additionally, this approach could be extended to industrial machinery more broadly, within the scope of audible noise from rotating mechanisms.
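A minimal version of the scoring stage, log-spectrogram in, reconstruction error out, is sketched below; the Keras-style `model.predict` interface, the STFT settings, and the healthy-only training regime are all assumptions about the described system, not details taken from the paper.

```python
import numpy as np
from scipy.signal import spectrogram

def anomaly_score(audio, fs, model):
    """Score one audio window with an autoencoder trained on healthy-condition
    spectrograms: low similarity (high residual) suggests a possible fault."""
    _, _, sxx = spectrogram(audio, fs=fs, nperseg=1024)
    log_sxx = np.log1p(sxx)[None, ..., None]     # add batch/channel dims
    rec = model.predict(log_sxx, verbose=0)      # assumed Keras-style API
    return float(np.mean((log_sxx - rec) ** 2))  # mean squared residual
```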
Collapse
Affiliation(s)
- Mathaus Ferreira da Silva
- Robotictech Technology Services, Juiz de Fora 36036-230, Brazil; (M.F.d.S.); (J.E.N.M.); (W.R.S.); (I.W.M.); (G.M.C.M.)
| | - Juliano Emir Nunes Masson
- Robotictech Technology Services, Juiz de Fora 36036-230, Brazil; (M.F.d.S.); (J.E.N.M.); (W.R.S.); (I.W.M.); (G.M.C.M.)
| | - Murillo Ferreira dos Santos
- Department of Electroelectronics, Federal Center of Technological Education of Minas Gerais (CEFET-MG), Leopoldina 36700-001, Brazil
| | - William Rodrigues Silva
- Robotictech Technology Services, Juiz de Fora 36036-230, Brazil; (M.F.d.S.); (J.E.N.M.); (W.R.S.); (I.W.M.); (G.M.C.M.)
- Department of Electroelectronics, Federal Center of Technological Education of Minas Gerais (CEFET-MG), Leopoldina 36700-001, Brazil
| | - Iuri Wladimir Molina
- Robotictech Technology Services, Juiz de Fora 36036-230, Brazil; (M.F.d.S.); (J.E.N.M.); (W.R.S.); (I.W.M.); (G.M.C.M.)
| | - Gabriel Miguel Castro Martins
- Robotictech Technology Services, Juiz de Fora 36036-230, Brazil; (M.F.d.S.); (J.E.N.M.); (W.R.S.); (I.W.M.); (G.M.C.M.)
- Department of Electroelectronics, Federal Center of Technological Education of Minas Gerais (CEFET-MG), Leopoldina 36700-001, Brazil
| |
Collapse
|
290
|
Zhang J, Peng S, Liu J, Guo A. DCAN: Dynamic Channel Attention Network for Multi-Scale Distortion Correction. SENSORS (BASEL, SWITZERLAND) 2025; 25:1482. [PMID: 40096279 PMCID: PMC11902318 DOI: 10.3390/s25051482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/17/2025] [Revised: 02/17/2025] [Accepted: 02/24/2025] [Indexed: 03/19/2025]
Abstract
Image distortion correction is a fundamental yet challenging task in image restoration, especially in scenarios with complex distortions and fine details. Existing methods often rely on fixed-scale feature extraction, which struggles to capture multi-scale distortions. This limitation makes it difficult to balance global structural consistency and local detail preservation on distorted images of varying complexity, leading to suboptimal restoration quality for highly complex distortions. To address these challenges, this paper proposes a dynamic channel attention network (DCAN) for multi-scale distortion correction. Firstly, DCAN employs a multi-scale design and utilizes an optical flow network for distortion feature extraction, effectively balancing global structural consistency and local detail preservation under varying levels of distortion. Secondly, we present the channel attention and fusion selective module (CAFSM), which dynamically recalibrates feature importance across multi-scale distortions. By embedding CAFSM into the upsampling stage, the network enhances its ability to refine local features while preserving global structural integrity. Moreover, to further improve detail preservation and structural consistency, a comprehensive loss function is designed, incorporating structural similarity loss (SSIM loss) to balance local and global optimization. Experimental results on the widely used Places2 dataset demonstrate that DCAN achieves state-of-the-art performance, with an average improvement of 1.55 dB in PSNR and 0.06 in SSIM compared with existing methods.
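To make the composite objective concrete, the sketch below pairs an L1 term with an SSIM term of the kind the abstract describes. The uniform 11×11 window, the stability constants, and the 0.84/0.16 weighting are common defaults assumed here, not values taken from the paper.

```python
import torch
import torch.nn.functional as F

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2, win=11):
    # Local means/variances/covariance via uniform windows (inputs in [0, 1]).
    mu_x = F.avg_pool2d(x, win, 1, win // 2)
    mu_y = F.avg_pool2d(y, win, 1, win // 2)
    var_x = F.avg_pool2d(x * x, win, 1, win // 2) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, win, 1, win // 2) - mu_y ** 2
    cov = F.avg_pool2d(x * y, win, 1, win // 2) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return (num / den).mean()

def correction_loss(pred, target, alpha=0.84):
    # SSIM term enforces structural agreement; L1 term anchors pixel values.
    return alpha * (1 - ssim(pred, target)) + (1 - alpha) * F.l1_loss(pred, target)

pred = torch.rand(2, 3, 64, 64, requires_grad=True)
target = torch.rand(2, 3, 64, 64)
loss = correction_loss(pred, target)
loss.backward()
print(float(loss))
```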
Collapse
Affiliation(s)
| | | | - Jingjing Liu
- Shanghai Key Laboratory of Chips and Systems for Intelligent Connected Vehicle, School of Microelectronics, Shanghai University, Shanghai 200444, China; (J.Z.); (S.P.)
| | - Aiying Guo
- Shanghai Key Laboratory of Chips and Systems for Intelligent Connected Vehicle, School of Microelectronics, Shanghai University, Shanghai 200444, China; (J.Z.); (S.P.)
| |
Collapse
|
291
|
Du Z, Zhang P, Huang X, Hu Z, Yang G, Xi M, Liu D. Deeply supervised two stage generative adversarial network for stain normalization. Sci Rep 2025; 15:7068. [PMID: 40016308 PMCID: PMC11868385 DOI: 10.1038/s41598-025-91587-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2024] [Accepted: 02/21/2025] [Indexed: 03/01/2025] Open
Abstract
The color variations present in histopathological images pose a significant challenge to computational pathology and, consequently, negatively affect the performance of certain pathological image analysis methods, especially those based on deep learning techniques. To date, several methods have been proposed to mitigate this issue. However, these methods either produce images with low texture retention, perform poorly when trained with small datasets, or have low generalization capabilities. In this paper, we propose a Deeply Supervised Two-stage Generative Adversarial Network, known as DSTGAN, for stain normalization. Specifically, we introduce deep supervision to generative adversarial networks in an innovative way to enhance the learning capacity of the model, benefiting from different model regularization methods. To make fuller use of source-domain images for training the model, we draw upon semi-supervised concepts to design a novel two-stage staining strategy. Additionally, we construct a generator that can capture long-distance semantic relationships, enabling the model to retain more abundant texture information in the generated images. In evaluating the quality of the generated images, we achieve state-of-the-art performance on the TUPAC-2016, MITOS-ATYPIA-14, ICIAR-BACH-2018, and MICCAI-16-GlaS datasets, improving the precision of classification and segmentation by 5.2% and 4.2%, respectively. Not only does our model significantly improve the quality of the stained images compared to existing stain normalization methods, but it also has a positive impact on the execution of downstream classification and segmentation tasks. Our method further reduces the effect that staining differences have on computational pathology, thereby improving the accuracy of histopathological image analysis to some extent.
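Deep supervision, as used here, means attaching auxiliary losses to intermediate layers so that gradients reach early features directly, which acts as a regularizer. The sketch below shows the pattern on a toy two-stage generator; the architecture, loss choice, and 0.4 auxiliary weight are illustrative assumptions, not the DSTGAN design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeeplySupervisedGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Conv2d(3, 16, 3, padding=1)
        self.stage2 = nn.Conv2d(16, 16, 3, padding=1)
        self.aux_head = nn.Conv2d(16, 3, 1)   # auxiliary output from stage 1
        self.out_head = nn.Conv2d(16, 3, 1)   # final output from stage 2

    def forward(self, x):
        f1 = F.relu(self.stage1(x))
        f2 = F.relu(self.stage2(f1))
        return self.aux_head(f1), self.out_head(f2)

gen = DeeplySupervisedGenerator()
src = torch.rand(1, 3, 64, 64)                # source-stain image (stand-in)
target = torch.rand(1, 3, 64, 64)             # target-stain image (stand-in)
aux, out = gen(src)
# The auxiliary loss supervises intermediate features; the final loss dominates.
loss = F.l1_loss(out, target) + 0.4 * F.l1_loss(aux, target)
loss.backward()
```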
Collapse
Affiliation(s)
- Zhe Du
- School of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang, China
- Henan Engineering Research Center of Digital Pathology and Artificial Intelligence Diagnosis, Luoyang, China
| | - Pujing Zhang
- School of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang, China
- Henan Engineering Research Center of Digital Pathology and Artificial Intelligence Diagnosis, Luoyang, China
| | - Xiaodong Huang
- School of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang, China
- Henan Engineering Research Center of Digital Pathology and Artificial Intelligence Diagnosis, Luoyang, China
| | - Zhigang Hu
- School of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang, China
| | - Gege Yang
- School of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang, China
- Henan Engineering Research Center of Digital Pathology and Artificial Intelligence Diagnosis, Luoyang, China
| | - Mengyang Xi
- School of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang, China
| | - Dechun Liu
- Henan Engineering Research Center of Digital Pathology and Artificial Intelligence Diagnosis, Luoyang, China.
- The First Affiliated Hospital of Henan University of Science and Technology, Luoyang, China.
| |
Collapse
|
292
|
Frants V, Agaian S. QRNet: A Quaternion-Based Retinex Framework for Enhanced Wireless Capsule Endoscopy Image Quality. Bioengineering (Basel) 2025; 12:239. [PMID: 40150703 PMCID: PMC11939397 DOI: 10.3390/bioengineering12030239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2025] [Revised: 02/21/2025] [Accepted: 02/25/2025] [Indexed: 03/29/2025] Open
Abstract
Wireless capsule endoscopy (WCE) offers a non-invasive diagnostic alternative for the gastrointestinal tract using a battery-powered capsule. Despite its advantages, WCE encounters issues with video quality and diagnostic accuracy, often resulting in miss rates of 1-20%. These challenges stem from weak texture characteristics due to non-Lambertian tissue reflections, uneven illumination, and the necessity of color fidelity. Traditional Retinex-based methods used for image enhancement are suboptimal for endoscopy, as they frequently compromise anatomical detail while distorting color. To address these limitations, we introduce QRNet, a novel quaternion-based Retinex framework. QRNet decomposes images into reflectance and illumination components within hypercomplex space, maintaining the inter-channel relationships that preserve color fidelity. A quaternion wavelet attention mechanism refines essential features while suppressing noise, balancing enhancement and fidelity through an innovative loss function. Experiments on the Kvasir-Capsule and Red Lesion Endoscopy datasets demonstrate notable improvements in metrics such as PSNR (+2.3 dB), SSIM (+0.089), and LPIPS (-0.126). Moreover, lesion segmentation accuracy increases by up to 5%, indicating the framework's potential for improving early-stage lesion detection. Ablation studies highlight the quaternion representation's pivotal role in maintaining color consistency, confirming the promise of this advanced approach for clinical settings.
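The core of the quaternion representation is that an RGB pixel becomes a single pure quaternion (0, R, G, B), so transformations act on all three channels jointly rather than per channel. A small numeric sketch of this idea follows, using the standard Hamilton product; it illustrates the representation only and is not QRNet code.

```python
import numpy as np

def hamilton(p, q):
    """Hamilton product of two quaternions (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

pixel = np.array([0.0, 0.55, 0.30, 0.20])          # pure quaternion (0, R, G, B)
# A unit quaternion rotates the color vector, mixing channels coherently
# instead of scaling each channel independently.
theta = np.pi / 8
axis = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
rot = np.concatenate([[np.cos(theta / 2)], np.sin(theta / 2) * axis])
rot_conj = rot * np.array([1, -1, -1, -1])
rotated = hamilton(hamilton(rot, pixel), rot_conj)
print(rotated)   # w stays ~0; (x, y, z) is the jointly transformed color
```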
Collapse
Affiliation(s)
- Vladimir Frants
- Graduate Center, City University of New York, New York, NY 10016, USA
| | - Sos Agaian
- Department of Computer Science, College of Staten Island, and the Graduate Center, The City University of New York, New York, NY 10314, USA;
| |
Collapse
|
293
|
Wang J, Zhang X, Miao Y, Xue S, Zhang Y, Shi K, Guo R, Li B, Zheng G. Data-efficient generalization of AI transformers for noise reduction in ultra-fast lung PET scans. Eur J Nucl Med Mol Imaging 2025:10.1007/s00259-025-07165-7. [PMID: 40009163 DOI: 10.1007/s00259-025-07165-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2024] [Accepted: 02/13/2025] [Indexed: 02/27/2025]
Abstract
PURPOSE Respiratory motion during PET acquisition may produce lesion blurring. Ultra-fast 20-second breath-hold (U2BH) PET reduces respiratory motion artifacts, but the shortened scanning time increases statistical noise and may affect diagnostic quality. This study aims to denoise U2BH PET images using a deep learning (DL)-based method. METHODS The study was conducted on two datasets collected from five scanners, where the first dataset included 1272 retrospectively collected full-time PET scans and the second dataset contained 46 prospectively collected U2BH and corresponding full-time PET/CT images. A robust and data-efficient DL method called mask vision transformer (Mask-ViT) was proposed which, after being fine-tuned on a limited amount of training data from a target scanner, was directly applied to unseen testing data from new scanners. The performance of Mask-ViT was compared with state-of-the-art DL methods including U-Net and C-Gan, taking the full-time PET images as the reference. Statistical analysis of image quality metrics was carried out with the Wilcoxon signed-rank test. For clinical evaluation, two readers scored image quality on a 5-point scale (5 = excellent) and provided a binary assessment of diagnostic quality. RESULTS The U2BH PET images denoised by Mask-ViT showed statistically significant improvement over U-Net and C-Gan on image quality metrics (p < 0.05). For clinical evaluation, Mask-ViT exhibited a lesion detection accuracy of 91.3%, 90.4%, and 91.7% when evaluated on three different scanners. CONCLUSION Mask-ViT can effectively enhance the quality of U2BH PET images in a data-efficient generalization setup. The denoised images meet the clinical diagnostic requirements for lesion detectability.
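The paired comparison reported above can be reproduced in outline with SciPy's Wilcoxon signed-rank test; the per-case PSNR values below are synthetic placeholders, not study data.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
psnr_maskvit = 32 + rng.normal(0, 1.0, 20)            # assumed per-case PSNRs
psnr_unet = psnr_maskvit - rng.normal(0.8, 0.4, 20)   # baseline slightly worse

# Paired, non-parametric test on the per-case differences.
stat, p = wilcoxon(psnr_maskvit, psnr_unet)
print(f"Wilcoxon statistic = {stat:.1f}, p = {p:.4f}")
if p < 0.05:
    print("difference is statistically significant at alpha = 0.05")
```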
Collapse
Affiliation(s)
- Jiale Wang
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Xinyu Zhang
- Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Institute for Medical Imaging Technology, Ruijin Hospital, Shanghai Jiao Tong University, Shanghai, China
| | - Ying Miao
- Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Institute for Medical Imaging Technology, Ruijin Hospital, Shanghai Jiao Tong University, Shanghai, China
| | - Song Xue
- Department of Nuclear Medicine, University of Bern, Bern, Switzerland
| | - Yu Zhang
- Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Institute for Medical Imaging Technology, Ruijin Hospital, Shanghai Jiao Tong University, Shanghai, China
| | - Kuangyu Shi
- Department of Nuclear Medicine, University of Bern, Bern, Switzerland
- Department of Informatics, Technical University of Munich, Munich, Germany
| | - Rui Guo
- Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Institute for Medical Imaging Technology, Ruijin Hospital, Shanghai Jiao Tong University, Shanghai, China
| | - Biao Li
- Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
- Institute for Medical Imaging Technology, Ruijin Hospital, Shanghai Jiao Tong University, Shanghai, China.
| | - Guoyan Zheng
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China.
| |
Collapse
|
294
|
Ottesen JA, Storas T, Vatnehol SAS, Løvland G, Vik-Mo EO, Schellhorn T, Skogen K, Larsson C, Bjørnerud A, Groote-Eindbaas IR, Caan MWA. Deep learning-based Intraoperative MRI reconstruction. Eur Radiol Exp 2025; 9:29. [PMID: 39998750 PMCID: PMC11861787 DOI: 10.1186/s41747-024-00548-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2024] [Accepted: 12/27/2024] [Indexed: 02/27/2025] Open
Abstract
BACKGROUND We retrospectively evaluated the quality of deep learning (DL) reconstructions of on-scanner accelerated intraoperative MRI (iMRI) during resective brain tumor surgery. METHODS Accelerated iMRI was performed using dual surface coils positioned around the area of resection. A DL model was trained on the fastMRI neuro dataset to mimic the data from the iMRI protocol. The evaluation was performed on imaging material from 40 patients imaged from Nov 1, 2021, to June 1, 2023, who underwent iMRI during tumor resection surgery. A comparative analysis was conducted between the conventional compressed sense (CS) method and the trained DL reconstruction method. Two neuroradiologists and one neurosurgeon performed a blinded evaluation of multiple image quality metrics using a 1-to-5 Likert scale (1, nondiagnostic; 2, poor; 3, acceptable; 4, good; and 5, excellent) and indicated the favored reconstruction variant. RESULTS The DL reconstruction was strongly favored or favored over the CS reconstruction in 33/40, 39/40, and 8/40 cases by readers 1, 2, and 3, respectively. For the evaluation metrics, the DL reconstructions scored higher than their CS counterparts in 72%, 72%, and 14% of cases for readers 1, 2, and 3, respectively. Still, the DL reconstructions exhibited shortcomings such as a striping artifact and reduced signal. CONCLUSION DL shows promise in allowing for high-quality reconstructions of iMRI. The neuroradiologists noted an improvement in perceived spatial resolution, signal-to-noise ratio, diagnostic confidence, and diagnostic conspicuity compared to CS, while the neurosurgeon preferred the CS reconstructions across all metrics. RELEVANCE STATEMENT DL shows promise for high-quality reconstructions of iMRI; however, due to the challenging setting of iMRI, further optimization is needed. KEY POINTS iMRI is a surgical tool with a challenging imaging setting. DL allowed for high-quality reconstructions of iMRI. Additional optimization is needed due to the challenging intraoperative setting.
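A quick summary of the reader preferences quoted above, using only the counts given in the abstract:

```python
# Fraction of cases in which each reader favored the DL reconstruction.
favored_dl = {"reader 1": 33, "reader 2": 39, "reader 3": 8}
total = 40
for reader, n in favored_dl.items():
    print(f"{reader}: {n}/{total} = {n / total:.0%} favored DL")
```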
Collapse
Affiliation(s)
- Jon André Ottesen
- Computational Radiology & Artificial Intelligence (CRAI) Research Group, Division of Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway.
- Department of Physics, Faculty of Mathematics and Natural Sciences, University of Oslo, Oslo, Norway.
- Division of Radiology and Nuclear Medicine, Department of Physics and Computational Radiology, Oslo University Hospital, Oslo, Norway.
| | - Tryggve Storas
- Division of Radiology and Nuclear Medicine, Department of Physics and Computational Radiology, Oslo University Hospital, Oslo, Norway
| | - Svein Are Sirirud Vatnehol
- The Intervention Centre, Oslo University Hospital, Oslo, Norway
- Department of Optometry, Radiography and Lighting Design, University of South-Eastern Norway, Drammen, Norway
- Department of Health Sciences Gjøvik, Faculty of Medicine and Health Sciences, NTNU, Gjøvik, Norway
| | - Grethe Løvland
- The Intervention Centre, Oslo University Hospital, Oslo, Norway
| | - Einar Osland Vik-Mo
- Vilhelm Magnus Laboratory, Department of Neurosurgery, Oslo University Hospital, Oslo, Norway
- Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
| | - Till Schellhorn
- Computational Radiology & Artificial Intelligence (CRAI) Research Group, Division of Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway
| | - Karoline Skogen
- Computational Radiology & Artificial Intelligence (CRAI) Research Group, Division of Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway
| | - Christopher Larsson
- Computational Radiology & Artificial Intelligence (CRAI) Research Group, Division of Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway
- Department of Neurosurgery, Oslo University Hospital, Oslo, Norway
| | - Atle Bjørnerud
- Computational Radiology & Artificial Intelligence (CRAI) Research Group, Division of Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway
- Department of Physics, Faculty of Mathematics and Natural Sciences, University of Oslo, Oslo, Norway
- Division of Radiology and Nuclear Medicine, Department of Physics and Computational Radiology, Oslo University Hospital, Oslo, Norway
| | - Inge Rasmus Groote-Eindbaas
- Computational Radiology & Artificial Intelligence (CRAI) Research Group, Division of Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway
- Department of Radiology, Vestfold Hospital Trust, Tønsberg, Norway
| | - Matthan W A Caan
- Computational Radiology & Artificial Intelligence (CRAI) Research Group, Division of Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway
- Biomedical Engineering and Physics, Amsterdam UMC, University of Amsterdam, Amsterdam, Netherlands
| |
Collapse
|
295
|
Zhang M, Li R, Fu S, Kumar S, Mcginty J, Qin Y, Chen L. Deep learning enhanced light sheet fluorescence microscopy for in vivo 4D imaging of zebrafish heart beating. LIGHT, SCIENCE & APPLICATIONS 2025; 14:92. [PMID: 39994185 PMCID: PMC11850918 DOI: 10.1038/s41377-024-01710-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/29/2024] [Revised: 10/09/2024] [Accepted: 12/02/2024] [Indexed: 02/26/2025]
Abstract
Time-resolved volumetric fluorescence imaging over an extended duration with high spatial/temporal resolution is a key driving force in biomedical research for investigating spatial-temporal dynamics in organism-level systems, yet it remains a major challenge due to the trade-off among imaging speed, light exposure, illumination power, and image quality. Here, we present a deep-learning enhanced light sheet fluorescence microscopy (LSFM) approach that addresses the restoration of rapid volumetric time-lapse imaging with less than 0.03% of the light exposure and 3.3% of the acquisition time of a typical standard acquisition. We demonstrate that the convolutional neural network (CNN)-transformer network developed here, namely the U-net integrated transformer (UI-Trans), successfully mitigates complex noise-scattering-coupled degradation and outperforms state-of-the-art deep learning networks, due to its capability of faithfully learning fine details while comprehending complex global features. With the fast generation of appropriate training data via flexible switching between confocal line-scanning LSFM (LS-LSFM) and conventional LSFM, this method achieves a three- to five-fold signal-to-noise ratio (SNR) improvement and a ~1.8 times contrast improvement in ex vivo zebrafish heart imaging and long-term in vivo 4D (3D morphology + time) imaging of heartbeat dynamics at different developmental stages, with ultra-economical acquisitions in terms of light dosage and acquisition time.
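For orientation, the two reported gains can be quantified as sketched below: SNR as mean signal over background standard deviation, and Michelson contrast from foreground/background means. The image, masks, and metric definitions are common conventions assumed for illustration, not the paper's exact protocol.

```python
import numpy as np

def snr(img, signal_mask, background_mask):
    return img[signal_mask].mean() / img[background_mask].std()

def michelson_contrast(img, signal_mask, background_mask):
    s, b = img[signal_mask].mean(), img[background_mask].mean()
    return (s - b) / (s + b)

rng = np.random.default_rng(2)
img = rng.random((128, 128))
img[40:80, 40:80] += 2.0                       # stand-in fluorescent structure
sig = np.zeros_like(img, bool)
sig[40:80, 40:80] = True
bg = ~sig
print(f"SNR = {snr(img, sig, bg):.1f}, "
      f"contrast = {michelson_contrast(img, sig, bg):.2f}")
```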
Collapse
Affiliation(s)
- Meng Zhang
- School of Electronic and Information Engineering, Beihang University, Beijing, 100191, China
| | - Renjian Li
- School of Electronic and Information Engineering, Beihang University, Beijing, 100191, China
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, 518118, China
| | - Songnian Fu
- Institute of Advanced Photonics Technology, School of Information Engineering, Guangdong University of Technology, Guangzhou, 51006, China
| | - Sunil Kumar
- Photonics Group, Department of Physics, Imperial College London, London, SW7 2AZ, UK
| | - James Mcginty
- Photonics Group, Department of Physics, Imperial College London, London, SW7 2AZ, UK
| | - Yuwen Qin
- Institute of Advanced Photonics Technology, School of Information Engineering, Guangdong University of Technology, Guangzhou, 51006, China.
| | - Lingling Chen
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, 518118, China.
| |
Collapse
|
296
|
Cai S, Mai J, Hong W, Fraser SE, Cutrale F. Rapid diffused optical imaging for accurate 3D estimation of subcutaneous tissue features. iScience 2025; 28:111818. [PMID: 39991548 PMCID: PMC11847144 DOI: 10.1016/j.isci.2025.111818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2024] [Revised: 07/20/2024] [Accepted: 01/13/2025] [Indexed: 02/25/2025] Open
Abstract
Conventional light imaging in living tissues is limited to depths under 100 μm by significant tissue scattering. Consequently, few commercial imaging devices can image tissue lesions beneath the surface or measure their invasion depth, which is critical in dermatology. We present 3D-multisite diffused optical imaging (3D-mDOI), an approach that combines photon migration techniques from diffuse optical tomography with automated controls and image analysis techniques for estimating a lesion's depth via its optical coefficients. 3D-mDOI is a non-invasive, low-cost, fast, and contact-free instrument capable of estimating subcutaneous tissue structure volumes through multisite acquisition of re-emitted light diffusion on the sample surface. It offers rapid estimation of Breslow depth, essential for staging melanoma. To standardize performance, 3D-mDOI employs customized calibrations using physical tissue phantoms to explore the system's 3D reconstruction capabilities. We find that 3D-mDOI can reconstruct lesions up to 5 mm below the surface, requiring ∼300 s of computation time.
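Depth estimation from re-emitted light builds on diffusion theory, where the effective attenuation coefficient sets how quickly diffuse light decays with depth. A worked numeric sketch follows; the tissue coefficients are textbook-order values assumed for illustration, not 3D-mDOI calibration data.

```python
import numpy as np

mu_a = 0.1        # absorption coefficient, 1/mm (assumed)
mu_s_prime = 1.0  # reduced scattering coefficient, 1/mm (assumed)

# Diffusion approximation: mu_eff = sqrt(3 * mu_a * (mu_a + mu_s')).
mu_eff = np.sqrt(3 * mu_a * (mu_a + mu_s_prime))
depth_1e = 1 / mu_eff   # 1/e penetration depth of the diffuse fluence
print(f"mu_eff = {mu_eff:.2f} 1/mm, 1/e penetration depth = {depth_1e:.2f} mm")
```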
Collapse
Affiliation(s)
- Shanshan Cai
- Department of Biomedical Engineering, University of Southern California, Los Angeles, CA 90089, USA
- Translational Imaging Center, University of Southern California, Los Angeles, CA 90007, USA
- Alfred E. Mann Institute for Biomedical Engineering, University of Southern California, Los Angeles, CA 90089, USA
| | - John Mai
- Alfred E. Mann Institute for Biomedical Engineering, University of Southern California, Los Angeles, CA 90089, USA
| | - Winn Hong
- Alfred E. Mann Institute for Biomedical Engineering, University of Southern California, Los Angeles, CA 90089, USA
| | - Scott E. Fraser
- Department of Biomedical Engineering, University of Southern California, Los Angeles, CA 90089, USA
- Translational Imaging Center, University of Southern California, Los Angeles, CA 90007, USA
- Molecular and Computational Biology Department, University of Southern California, Los Angeles, CA 90089, USA
| | - Francesco Cutrale
- Department of Biomedical Engineering, University of Southern California, Los Angeles, CA 90089, USA
- Translational Imaging Center, University of Southern California, Los Angeles, CA 90007, USA
| |
Collapse
|
297
|
Liu X, Ma X, Song W, Zhang Y, Zhang Y. High fidelity zero shot speaker adaptation in text to speech synthesis with denoising diffusion GAN. Sci Rep 2025; 15:6269. [PMID: 39979408 PMCID: PMC11842752 DOI: 10.1038/s41598-025-90507-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2024] [Accepted: 02/13/2025] [Indexed: 02/22/2025] Open
Abstract
Zero-shot speaker adaptation seeks to enable the cloning of voices of previously unseen speakers by leveraging only a few seconds of their speech samples. Nevertheless, existing zero-shot multi-speaker text-to-speech (TTS) systems continue to exhibit significant disparities in synthesized speech quality and speaker similarity between unseen and seen speakers. To address these challenges and improve synthesized speech quality and speaker similarity for unseen speakers, this study introduces an efficient zero-shot speaker-adaptive TTS model, DiffGAN-ZSTTS. The model is built on the FastSpeech2 framework and utilizes a diffusion-based decoder to enhance the model's generalization to unseen speaker samples in zero-shot settings. We present the SE-Res2FFT module, which refines the encoder's FFT block by incorporating SE-Res2Net modules in parallel with the multi-head self-attention mechanism, thereby achieving balanced extraction of local and global features. Furthermore, we introduce the MHSE module, which employs multi-head attention mechanisms to augment the model's capability to represent speaker reference audio features. The model was trained and evaluated on both the AISHELL3 and LibriTTS datasets, providing a comprehensive evaluation of speech synthesis performance across both seen and unseen speaker conditions in Chinese and English. Experimental results indicate that DiffGAN-ZSTTS substantially improves both synthesized speech quality and speaker similarity. Additionally, we assessed the model's performance on the Baker and VCTK datasets, which are outside the training domain, and the results reveal that the model can successfully perform zero-shot speech synthesis for unseen speakers with only a few seconds of speech, outperforming state-of-the-art models in both speaker similarity and audio quality.
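Both named modules build on squeeze-and-excitation (SE) style channel recalibration: global pooling produces per-channel weights that rescale the feature map. A minimal SE block over frame-level features is sketched below; the sizes are illustrative, and this is not the DiffGAN-ZSTTS implementation.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                 # x: (batch, channels, time)
        w = self.fc(x.mean(dim=-1))       # squeeze over time, excite per channel
        return x * w.unsqueeze(-1)        # recalibrated feature map

feats = torch.rand(4, 256, 100)           # e.g., encoder features over 100 frames
out = SEBlock(256)(feats)
print(out.shape)                          # torch.Size([4, 256, 100])
```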
Collapse
Affiliation(s)
- Xiangchun Liu
- School of Information Engineering, Minzu University of China, Beijing, 100081, China
| | - Xuan Ma
- School of Information Engineering, Minzu University of China, Beijing, 100081, China
| | - Wei Song
- School of Information Engineering, Minzu University of China, Beijing, 100081, China.
- Language Information Security Research Center, Institute of National Security MUC, Minzu University of China, Beijing, 100081, China.
- National Language Resource Monitoring and Research Center of Minority Languages, Minzu University of China, Beijing, 100081, China.
- Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE, Minzu University of China, Beijing, 100081, China.
| | - Yanghao Zhang
- School of Information Engineering, Minzu University of China, Beijing, 100081, China
| | - Yi Zhang
- School of Information Engineering, Minzu University of China, Beijing, 100081, China
| |
Collapse
|
298
|
Ye S, Zhu L, Zhao Z, Wu F, Li Z, Wang B, Zhong K, Sun C, Mukamel S, Jiang J. AI protocol for retrieving protein dynamic structures from two-dimensional infrared spectra. Proc Natl Acad Sci U S A 2025; 122:e2424078122. [PMID: 39951500 PMCID: PMC11848431 DOI: 10.1073/pnas.2424078122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2024] [Accepted: 01/16/2025] [Indexed: 02/16/2025] Open
Abstract
Understanding the dynamic evolution of protein structures is crucial for uncovering their biological functions. Yet, real-time prediction of these dynamic structures remains a significant challenge. Two-dimensional infrared (2DIR) spectroscopy is a powerful tool for analyzing protein dynamics. However, translating its complex, low-dimensional signals into detailed three-dimensional structures is a daunting task. In this study, we introduce a machine learning-based approach that accurately predicts dynamic three-dimensional protein structures from 2DIR descriptors. Our method establishes a robust "spectrum-structure" relationship, enabling the recovery of three-dimensional structures across a wide variety of proteins. It demonstrates broad applicability in predicting dynamic structures along different protein folding trajectories, spanning timescales from microseconds to milliseconds. This approach also shows promise in identifying the structures of previously uncharacterized proteins based solely on their spectral descriptors. The integration of AI with 2DIR spectroscopy offers insights and represents a significant advancement in the real-time analysis of dynamic protein structures.
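Schematically, the "spectrum-structure" relationship is a regression from a fixed-length 2DIR descriptor vector to per-residue 3D coordinates. The toy model below shows only the shape of that mapping; the descriptor length, protein size, and MLP are placeholders, not the architecture used in the paper.

```python
import torch
import torch.nn as nn

n_residues = 76                          # assumed protein size
model = nn.Sequential(
    nn.Linear(512, 1024), nn.ReLU(),     # 512-dim spectral descriptor (assumed)
    nn.Linear(1024, n_residues * 3),     # x, y, z per residue
)
descriptor = torch.rand(1, 512)          # stand-in 2DIR descriptor vector
coords = model(descriptor).view(1, n_residues, 3)
print(coords.shape)                      # torch.Size([1, 76, 3])
```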
Collapse
Affiliation(s)
- Sheng Ye
- Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Anhui Provincial Engineering Research Center for Unmanned System and Intelligent Technology, School of AI, Anhui University, Hefei 230601, China
| | - Lvshuai Zhu
- Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Anhui Provincial Engineering Research Center for Unmanned System and Intelligent Technology, School of AI, Anhui University, Hefei 230601, China
| | - Zhicheng Zhao
- Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Anhui Provincial Engineering Research Center for Unmanned System and Intelligent Technology, School of AI, Anhui University, Hefei 230601, China
| | - Fan Wu
- State Key Laboratory of Precision and Intelligent Chemistry, Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
| | - Zhipeng Li
- Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Anhui Provincial Engineering Research Center for Unmanned System and Intelligent Technology, School of AI, Anhui University, Hefei 230601, China
| | - BinBin Wang
- Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Anhui Provincial Engineering Research Center for Unmanned System and Intelligent Technology, School of AI, Anhui University, Hefei 230601, China
| | - Kai Zhong
- Zernike Institute for Advanced Materials, Department of Nanoscience and Materials Science, University of Groningen, Groningen 9747 AG, Netherlands
| | - Changyin Sun
- Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Anhui Provincial Engineering Research Center for Unmanned System and Intelligent Technology, School of AI, Anhui University, Hefei 230601, China
| | - Shaul Mukamel
- Department of Chemistry and Department of Physics & Astronomy, University of California, Irvine, CA 92697
| | - Jun Jiang
- State Key Laboratory of Precision and Intelligent Chemistry, Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
| |
Collapse
|
299
|
Wu L, Ran Y, Yan L, Liu Y, Song Y, Han D. A dataset for surface defect detection on complex structured parts based on photometric stereo. Sci Data 2025; 12:276. [PMID: 39956811 PMCID: PMC11830798 DOI: 10.1038/s41597-025-04454-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2024] [Accepted: 01/13/2025] [Indexed: 02/18/2025] Open
Abstract
Automated Optical Inspection (AOI) technology is crucial for industrial defect detection but struggles with shadows and surface reflectivity, resulting in false positives and missed detections, especially on non-planar parts. To address these issues, a novel defect detection technique based on deep learning and photometric stereo vision was proposed, along with the creation of the Metal Surface Defect Dataset (MSDD). The proposed Stroboscopic Illuminant Image Acquisition (SIIA) method uses a specially arranged illuminant setup and a Taylor Series Channel Mixer (TSCM) to blend multi-angle illumination images into pseudo-color images. This approach enables end-to-end defect detection using universal object detectors. The method involves mapping color space transformations to spatial domain transformations and utilizing hue randomization for data augmentation. Four object detection methods (FCOS, YOLOv5, YOLOv8, and RT-DETR) were validated on the MSDD, achieving an mAP of 86.1%, surpassing traditional methods. The MSDD includes 138,585 single-channel images and 9,239 mixed images, covering eight defect types. This dataset is essential for automated visual inspection of metal surfaces and is freely accessible for research purposes.
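The channel-mixing idea behind the acquisition scheme is that several single-channel images taken under different illumination angles are blended into one pseudo-color image an off-the-shelf RGB detector can consume. The simple weighted mix and hue jitter below are illustrative stand-ins for the Taylor Series Channel Mixer and hue randomization described in the abstract, not the published method.

```python
import numpy as np

rng = np.random.default_rng(3)
illum = rng.random((4, 256, 256)).astype(np.float32)  # 4 illumination angles

# Blend 4 single-channel images into 3 pseudo-color channels with a
# (possibly randomized) mixing matrix whose rows sum to 1.
mix = rng.dirichlet(np.ones(4), size=3)               # shape (3, 4)
pseudo_rgb = np.einsum("ck,khw->hwc", mix, illum)     # shape (256, 256, 3)

# Hue randomization as data augmentation: permute the color channels.
augmented = np.roll(pseudo_rgb, shift=int(rng.integers(3)), axis=-1)
print(pseudo_rgb.shape, augmented.shape)
```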
Collapse
Affiliation(s)
- Lin Wu
- School of Life Sciences, Beijing University of Chinese Medicine, Beijing, 102488, China
| | - Yu Ran
- School of Life Sciences, Beijing University of Chinese Medicine, Beijing, 102488, China
| | - Li Yan
- School of Humanities, Beijing University of Chinese Medicine, Beijing, 102488, China
| | - Yixing Liu
- School of Management, Beijing University of Chinese Medicine, Beijing, 102488, China
| | - You Song
- School of Software, Beihang University, Beijing, 100191, China.
| | - Dongran Han
- School of Life Sciences, Beijing University of Chinese Medicine, Beijing, 102488, China.
| |
Collapse
|
300
|
Kim K, Yang J, Almaslamani M, Kang CS, Lee I, Lim I, Woo SK. Deep learning-based organ-wise dosimetry of 64Cu-DOTA-rituximab through only one scanning. Sci Rep 2025; 15:5627. [PMID: 39955298 PMCID: PMC11829985 DOI: 10.1038/s41598-025-88498-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2023] [Accepted: 01/28/2025] [Indexed: 02/17/2025] Open
Abstract
This study aimed to generate a delayed 64Cu-DOTA-rituximab positron emission tomography (PET) image from its early-scanned image by deep learning, to mitigate the inconvenience and cost of estimating absorbed radiopharmaceutical doses. We acquired PET images from six patients with malignancies at 1, 24, and 48 h post-injection (p.i.) of 8 mCi 64Cu-DOTA-rituximab to fit a time-activity curve for dosimetry. We used a paired image-to-image translation (I2I) model based on a generative adversarial network to generate delayed images from early PET images. The image similarity function between the generated image and its ground truth was selected by comparing L1 and perceptual losses. We also applied organ-wise dosimetry to the acquired and generated images using OLINDA/EXM. The quality of the generated images was good, even for tumors, when the L1 loss function was used as an additional loss to the adversarial loss function. The organ-wise cumulative uptake and corresponding equivalent dose were estimated. Although the absorbed dose in some organs was accurately measured, predictions for organs associated with body clearance were relatively inaccurate. These results suggest that paired I2I can be used to alleviate burdensome dosimetry for radioimmunoconjugates.
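The dosimetry step the generated delayed images feed into is a time-activity curve fit: organ uptake at the scan time points is fit to a decaying exponential and integrated to obtain cumulated activity, the per-organ input to software such as OLINDA/EXM. A minimal sketch under those assumptions follows; the uptake values are synthetic, and the mono-exponential model folds all clearance into a single effective rate.

```python
import numpy as np
from scipy.optimize import curve_fit

t = np.array([1.0, 24.0, 48.0])     # h post-injection (the study's time points)
uptake = np.array([5.0, 3.1, 2.0])  # %IA in an organ (synthetic values)

def mono_exp(t, a0, lam):
    return a0 * np.exp(-lam * t)

(a0, lam), _ = curve_fit(mono_exp, t, uptake, p0=(5.0, 0.02))
# Cumulated activity: integral of a0 * exp(-lam * t) from 0 to infinity.
cumulated = a0 / lam
print(f"A0 = {a0:.2f} %IA, lambda = {lam:.4f} 1/h, "
      f"cumulated activity = {cumulated:.1f} %IA*h")
```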
Collapse
Affiliation(s)
- Kangsan Kim
- Division of Applied RI, Korea Institute of Radiological and Medical Sciences, Seoul, 01812, Republic of Korea
| | - Jingyu Yang
- Division of Applied RI, Korea Institute of Radiological and Medical Sciences, Seoul, 01812, Republic of Korea
| | - Muath Almaslamani
- Division of Applied RI, Korea Institute of Radiological and Medical Sciences, Seoul, 01812, Republic of Korea
- Radiological and Medico-Oncological Sciences, University of Science and Technology, Daejeon, Republic of Korea
| | - Chi Soo Kang
- Division of Applied RI, Korea Institute of Radiological and Medical Sciences, Seoul, 01812, Republic of Korea
- Radiological and Medico-Oncological Sciences, University of Science and Technology, Daejeon, Republic of Korea
| | - Inki Lee
- Department of Nuclear Medicine, Korea Institute of Radiological and Medical Sciences, Seoul, 01812, Republic of Korea
| | - Ilhan Lim
- Department of Nuclear Medicine, Korea Institute of Radiological and Medical Sciences, Seoul, 01812, Republic of Korea
- Radiological and Medico-Oncological Sciences, University of Science and Technology, Daejeon, Republic of Korea
| | - Sang-Keun Woo
- Division of Applied RI, Korea Institute of Radiological and Medical Sciences, Seoul, 01812, Republic of Korea.
- Radiological and Medico-Oncological Sciences, University of Science and Technology, Daejeon, Republic of Korea.
| |
Collapse
|