1. Xu J, Moyer D, Gagoski B, Iglesias JE, Grant PE, Golland P, Adalsteinsson E. NeSVoR: Implicit Neural Representation for Slice-to-Volume Reconstruction in MRI. IEEE Trans Med Imaging 2023;42:1707-1719. PMID: 37018704; PMCID: PMC10287191; DOI: 10.1109/tmi.2023.3236216.
Abstract
Reconstructing 3D MR volumes from multiple motion-corrupted stacks of 2D slices has shown promise in imaging of moving subjects, e.g., fetal MRI. However, existing slice-to-volume reconstruction methods are time-consuming, especially when a high-resolution volume is desired, and they remain vulnerable to severe subject motion and to image artifacts in the acquired slices. In this work, we present NeSVoR, a resolution-agnostic slice-to-volume reconstruction method that models the underlying volume as a continuous function of spatial coordinates with implicit neural representation. To improve robustness to subject motion and other image artifacts, we adopt a continuous and comprehensive slice acquisition model that takes into account rigid inter-slice motion, the point spread function, and bias fields. NeSVoR also estimates pixel-wise and slice-wise variances of image noise, enabling removal of outliers during reconstruction and visualization of uncertainty. Extensive experiments on both simulated and in vivo data show that NeSVoR achieves state-of-the-art reconstruction quality while providing a two- to ten-fold speedup in reconstruction time over state-of-the-art algorithms.
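For orientation, the core construction (a coordinate MLP for the volume plus a learnable rigid transform per acquired slice) can be sketched as below. This is a minimal illustration with random stand-in data and hypothetical sizes, not the authors' implementation, which additionally models the PSF, bias fields, and noise variances:

```python
# Minimal sketch: volume as coordinate MLP, one rigid transform per slice.
import torch
import torch.nn as nn

class INRVolume(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))            # intensity at a 3D point

    def forward(self, xyz):                  # xyz: (N, 3) world coordinates
        return self.net(xyz).squeeze(-1)

volume = INRVolume()
n_slices = 24
rot = nn.Parameter(torch.zeros(n_slices, 3))    # axis-angle per slice
trans = nn.Parameter(torch.zeros(n_slices, 3))  # translation per slice

def rigid(points, r, t):
    # Rodrigues' rotation formula applied to a batch of points.
    theta = r.norm() + 1e-8
    k = r / theta
    cos, sin = torch.cos(theta), torch.sin(theta)
    cross = torch.cross(k.expand_as(points), points, dim=-1)
    dot = (points * k).sum(-1, keepdim=True)
    return points * cos + cross * sin + k * dot * (1 - cos) + t

# Stand-in slice data: in-plane pixel coordinates and observed intensities.
slice_coords = torch.rand(n_slices, 128, 3)
slice_vals = torch.rand(n_slices, 128)

opt = torch.optim.Adam(list(volume.parameters()) + [rot, trans], lr=1e-3)
for step in range(200):
    i = torch.randint(n_slices, (1,)).item()
    pts = rigid(slice_coords[i], rot[i], trans[i])   # slice -> world frame
    loss = ((volume(pts) - slice_vals[i]) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
# Being resolution-agnostic, the trained MLP can be evaluated on any grid.
```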
2. Zhang Y, Shao HC, Pan T, Mengke T. Dynamic cone-beam CT reconstruction using spatial and temporal implicit neural representation learning (STINR). Phys Med Biol 2023;68:045005. PMID: 36638543; PMCID: PMC10087494; DOI: 10.1088/1361-6560/acb30d.
Abstract
Objective. Dynamic cone-beam CT (CBCT) imaging is highly desired in image-guided radiation therapy to provide volumetric images with high spatial and temporal resolution, enabling applications such as tumor motion tracking/prediction and intra-delivery dose calculation/accumulation. However, dynamic CBCT reconstruction is a substantially challenging spatiotemporal inverse problem, due to the extremely limited projection samples available for each CBCT reconstruction (one projection per CBCT volume). Approach. We developed a simultaneous spatial and temporal implicit neural representation (STINR) method for dynamic CBCT reconstruction. STINR maps the unknown image and the evolution of its motion into spatial and temporal multi-layer perceptrons (MLPs), and iteratively optimizes the MLP weights via the acquired projections to represent the dynamic CBCT series. In addition to the MLPs, we introduce prior knowledge, in the form of principal component analysis (PCA)-based patient-specific motion models, to reduce the complexity of the temporal mapping and address the ill-conditioned dynamic CBCT reconstruction problem. We used the extended-cardiac-torso (XCAT) phantom and a patient 4D-CBCT dataset to simulate different lung motion scenarios to evaluate STINR. The scenarios contain motion variations including baseline shifts, amplitude/frequency variations, and non-periodicity; the XCAT scenarios also contain inter-scan anatomical variations, including tumor shrinkage and tumor position change. Main results. STINR shows consistently higher image reconstruction and motion tracking accuracy than a traditional PCA-based method and a polynomial-fitting-based neural representation method. STINR tracks the lung target to an average center-of-mass error of 1-2 mm, with corresponding relative errors of the reconstructed dynamic CBCTs around 10%. Significance. STINR offers a general framework for accurate dynamic CBCT reconstruction in image-guided radiotherapy. It is a one-shot learning method that does not rely on pre-training, is not susceptible to generalizability issues, and naturally allows super-resolution. It can readily be applied to other imaging modalities as well.
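A hedged sketch of the central decomposition follows: a spatial MLP represents the reference volume, and a temporal MLP maps time to PCA coefficients that blend precomputed basis deformation fields. The grid sizes, network widths, and random basis fields are illustrative stand-ins, not the paper's configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Spatial INR: 3D coordinate -> intensity; temporal INR: time -> PCA weights.
spatial = nn.Sequential(nn.Linear(3, 256), nn.ReLU(),
                        nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 1))
temporal = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 3))

# Three PCA basis deformation fields on a coarse grid (from prior 4D data).
basis_dvf = torch.randn(3, 16, 16, 16, 3) * 0.01    # (basis, D, H, W, xyz)

def sample_basis(coords):
    # coords: (N, 3) in [-1, 1]; returns (3, N, 3): each basis DVF at coords.
    grid = coords.view(1, -1, 1, 1, 3).expand(3, -1, -1, -1, -1)
    field = basis_dvf.permute(0, 4, 1, 2, 3)               # (3, 3, D, H, W)
    out = F.grid_sample(field, grid, align_corners=True)   # (3, 3, N, 1, 1)
    return out.squeeze(-1).squeeze(-1).permute(0, 2, 1)    # (3, N, 3)

def intensity_at(coords, t):
    w = temporal(t.view(1, 1)).view(3, 1, 1)        # time -> PCA weights
    displacement = (w * sample_basis(coords)).sum(0)       # (N, 3)
    return spatial(coords + displacement)           # warped reference volume

coords = torch.rand(4096, 3) * 2 - 1
vals = intensity_at(coords, torch.tensor(0.3))      # volume snapshot at t=0.3
# Training would compare projections of such snapshots against measurements.
```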
3. Ye S, Shen L, Islam MT, Xing L. Super-resolution biomedical imaging via reference-free statistical implicit neural representation. Phys Med Biol 2023;68. PMID: 37757838; PMCID: PMC10615136; DOI: 10.1088/1361-6560/acfdf1.
Abstract
Objective. Supervised deep learning for image super-resolution (SR) has limitations in biomedical imaging due to the lack of large numbers of low- and high-resolution image pairs for model training. In this work, we propose a reference-free statistical implicit neural representation (INR) framework, which needs only a single or a few observed low-resolution (LR) images, to generate high-quality SR images. Approach. The framework models the statistics of the observed LR images via maximum likelihood estimation and trains the INR network to represent the latent high-resolution (HR) image as a continuous function in the spatial domain. The INR network is constructed as a coordinate-based multi-layer perceptron whose inputs are image spatial coordinates and whose outputs are the corresponding pixel intensities. The trained INR not only constrains functional smoothness but also allows arbitrary scales in SR imaging. Main results. We demonstrate the efficacy of the proposed framework on various biomedical images, including computed tomography (CT), magnetic resonance imaging (MRI), fluorescence microscopy, and ultrasound images, across SR magnification scales of 2×, 4×, and 8×. A limited number of LR images were used for each SR imaging task to show the potential of the proposed statistical INR framework. Significance. The proposed method provides an urgently needed unsupervised deep learning framework for the many biomedical SR applications that lack HR reference images.
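The heart of the approach, fitting a coordinate MLP to the latent HR image through a known downsampling forward model (with the Gaussian maximum-likelihood data term reducing to an MSE), can be sketched as a 2D toy. The network sizes, average-pooling forward model, and random "observation" are assumptions for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

inr = nn.Sequential(nn.Linear(2, 256), nn.ReLU(),
                    nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 1))

lr_img = torch.rand(1, 1, 32, 32)           # single observed LR image (stub)
scale = 4                                   # target 4x super-resolution
H = W = 32 * scale

ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                        torch.linspace(-1, 1, W), indexing="ij")
hr_coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)

opt = torch.optim.Adam(inr.parameters(), lr=1e-4)
for step in range(500):
    hr = inr(hr_coords).reshape(1, 1, H, W)
    # Forward model: average-pool the HR estimate down to the LR grid;
    # under Gaussian noise, maximum likelihood reduces to this MSE term.
    pred_lr = F.avg_pool2d(hr, scale)
    loss = F.mse_loss(pred_lr, lr_img)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    sr_img = inr(hr_coords).reshape(H, W)   # continuous function, any scale
```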
4. Wang S, Wang L, Cao Y, Deng Z, Ye C, Wang R, Zhu Y, Wei H. Self-supervised arbitrary-scale super-angular resolution diffusion MRI reconstruction. Med Phys 2025. PMID: 39976309; DOI: 10.1002/mp.17691.
Abstract
BACKGROUND Diffusion magnetic resonance imaging (dMRI) is currently the only noninvasive imaging technique for investigating the microstructure of in vivo tissues. To fully explore the complex tissue microstructure at the sub-voxel scale, diffusion-weighted (DW) images along many diffusion gradient directions are usually acquired, which is undoubtedly time-consuming and inhibits clinical application. How to estimate tissue microstructure from DW images acquired with only a few diffusion directions remains a challenge. PURPOSE To address this challenge, we propose a self-supervised arbitrary-scale super-angular resolution diffusion MRI reconstruction network (SARDI-nn), which can generate DW images along any direction from few acquisitions, overcoming the limits that the number of diffusion directions places on exploring tissue microstructure. METHODS SARDI-nn is mainly composed of a diffusion direction-specific DW image feature extraction (DWFE) module and a physics-driven implicit representation and reconstruction (IRR) module. During training, dual downsampling is applied: the first downsampling produces the low-angular-resolution (LAR) DW images, and the second constructs the input and learning target of SARDI-nn. The input LAR DW images pass through the DWFE module (composed of several residual blocks) to extract feature representations of the DW images along the input directions; these features, together with the difference between any query diffusion direction and the input directions, are fed into the IRR module to derive the implicit representation and the DW image along the query direction. Finally, based on the principles of dMRI, an adaptive weighting method refines the DW image quality. During testing, given any diffusion directions, the corresponding DW images can simply be inferred, so SARDI-nn realizes arbitrary-scale angular super-resolution. To test the effectiveness of the proposed method, we compare it with several existing methods in terms of the peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and root mean square error (RMSE) of DW images and of microstructure metrics derived from the diffusion kurtosis imaging (DKI) and neurite orientation dispersion and density imaging (NODDI) models, at different upsampling scales, on Human Connectome Project (HCP) and several in-house datasets. RESULTS Our method achieves almost the best performance at all scales, with the SSIM of reconstructed DW images improved by 10.04% at an upscale of 3 and 5.9% at an upscale of 15. For the microstructure metrics derived from the DKI and NODDI models, our method outperforms the best supervised learning method when the upscale is no larger than 6. In addition, test results on external datasets show the good generalizability of our method. CONCLUSIONS SARDI-nn is currently the only method that can reconstruct high-angular-resolution DW images at any upscale; because both the number of input diffusion directions and the upscale may vary, it can be extended to unseen test datasets without retraining. SARDI-nn provides a promising means of exploring tissue microstructure from DW images acquired along few diffusion gradient directions.
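As a hedged toy of the angular-conditioning idea (not the paper's DWFE/IRR architecture or its dual-downsampling training), a per-voxel network can take the signals along acquired directions plus their angular relationship to a query direction and predict the signal along that direction:

```python
import torch
import torch.nn as nn

K = 6                                        # acquired directions per voxel
net = nn.Sequential(nn.Linear(2 * K, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))

def predict_dw(signals, acq_dirs, query_dir):
    # signals: (N, K) DW values; acq_dirs: (K, 3) unit b-vectors;
    # query_dir: (3,) unit b-vector for the direction to synthesize.
    # DW signals are antipodally symmetric, so encode geometry as |cos|.
    ang = (acq_dirs @ query_dir).abs().expand(signals.shape[0], K)
    return net(torch.cat([signals, ang], dim=-1)).squeeze(-1)

signals = torch.rand(1024, K)                # stand-in voxel signals
acq_dirs = nn.functional.normalize(torch.randn(K, 3), dim=-1)
query = nn.functional.normalize(torch.randn(3), dim=0)
pred = predict_dw(signals, acq_dirs, query)  # (1024,) synthesized DW values
# Self-supervision: downsample the directions again and hold some out as
# targets, so no high-angular-resolution ground truth is ever required.
```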
5. Gundogdu B, Medved M, Chatterjee A, Engelmann R, Rosado A, Lee G, Oren NC, Oto A, Karczmar GS. Self-supervised multicontrast super-resolution for diffusion-weighted prostate MRI. Magn Reson Med 2024;92:319-331. PMID: 38308149; PMCID: PMC11288973; DOI: 10.1002/mrm.30047.
Abstract
PURPOSE This study addresses the challenge of low resolution and signal-to-noise ratio (SNR) in diffusion-weighted images (DWI), which are pivotal for cancer detection. Traditional methods increase SNR at high b-values through multiple acquisitions, but this diminishes image resolution due to motion-induced variations. Our research aims to enhance spatial resolution by exploiting the global structure within multicontrast DWI scans and the millimetric motion between acquisitions. METHODS We introduce a novel approach employing a "Perturbation Network" to learn subvoxel-size motions between scans, trained jointly with an implicit neural representation (INR) network. The INR encodes the DWI as a continuous volumetric function, treating the voxel intensities of low-resolution acquisitions as discrete samples. By evaluating this function on a finer grid, our model predicts higher-resolution signal intensities at intermediate voxel locations. The Perturbation Network's motion-correction efficacy was validated through experiments on biological phantoms and in vivo prostate scans. RESULTS Quantitative analyses revealed significantly higher structural similarity of super-resolution images to ground-truth high-resolution images compared with high-order interpolation (p < 0.005). In blind qualitative experiments, 96.1% of super-resolution images were assessed to have superior diagnostic quality compared with interpolated images. CONCLUSION High-resolution details in DWI can be obtained without high-resolution training data: the proposed method does not require a super-resolution training set. This is important in clinical practice because the method can easily be adapted to images with different scanner settings or body parts, whereas supervised methods do not offer such an option.
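A minimal sketch of the joint optimization follows, with the paper's Perturbation Network simplified to one learnable subvoxel translation per acquisition; the 2D setting, sizes, and random data are illustrative assumptions:

```python
import torch
import torch.nn as nn

inr = nn.Sequential(nn.Linear(2, 128), nn.ReLU(),
                    nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 1))
n_acq = 8
shifts = nn.Parameter(torch.zeros(n_acq, 2))     # subvoxel motion per scan

acq_coords = torch.rand(n_acq, 1024, 2) * 2 - 1  # voxel-center coordinates
acq_vals = torch.rand(n_acq, 1024)               # observed intensities (stub)

opt = torch.optim.Adam(list(inr.parameters()) + [shifts], lr=1e-3)
for step in range(300):
    i = torch.randint(n_acq, (1,)).item()
    pts = acq_coords[i] + shifts[i]              # motion-corrected sampling
    loss = ((inr(pts).squeeze(-1) - acq_vals[i]) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
# Evaluating `inr` on a grid finer than any acquisition yields the SR image;
# the learned shifts are exactly what makes the repeats mutually informative.
```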
6. Shao HC, Mengke T, Deng J, Zhang Y. 3D cine-magnetic resonance imaging using spatial and temporal implicit neural representation learning (STINR-MR). Phys Med Biol 2024;69:095007. PMID: 38479004; PMCID: PMC11017162; DOI: 10.1088/1361-6560/ad33b7.
Abstract
Objective. 3D cine-magnetic resonance imaging (cine-MRI) can capture images of the human body volume with high spatial and temporal resolutions to study anatomical dynamics. However, the reconstruction of 3D cine-MRI is challenged by highly under-sampled k-space data in each dynamic (cine) frame, due to the slow speed of MR signal acquisition. We proposed a machine learning-based framework, spatial and temporal implicit neural representation learning (STINR-MR), for accurate 3D cine-MRI reconstruction from highly under-sampled data. Approach. STINR-MR used a joint reconstruction and deformable registration approach to achieve a high acceleration factor for cine volumetric imaging. It addressed the ill-posed spatiotemporal reconstruction problem by solving a reference-frame 3D MR image and a corresponding motion model that deforms the reference frame to each cine frame. The reference-frame 3D MR image was reconstructed as a spatial implicit neural representation (INR) network, which learns the mapping from input 3D spatial coordinates to corresponding MR values. The dynamic motion model was constructed via a temporal INR, as well as basis deformation vector fields (DVFs) extracted from prior/onboard 4D-MRIs using principal component analysis. The learned temporal INR encodes input time points and outputs corresponding weighting factors to combine the basis DVFs into time-resolved motion fields that represent cine-frame-specific dynamics. STINR-MR was evaluated using MR data simulated from the 4D extended cardiac-torso (XCAT) digital phantom, as well as two MR datasets acquired clinically from human subjects. Its reconstruction accuracy was also compared with that of the model-based non-rigid motion estimation method (MR-MOTUS) and a deep learning-based method (TEMPEST). Main results. STINR-MR can reconstruct 3D cine-MR images with high temporal (<100 ms) and spatial (3 mm) resolutions. Compared with MR-MOTUS and TEMPEST, STINR-MR consistently reconstructed images with better image quality and fewer artifacts and achieved superior tumor localization accuracy via the solved dynamic DVFs. For the XCAT study, STINR reconstructed the tumors to a mean ± SD center-of-mass error of 0.9 ± 0.4 mm, compared to 3.4 ± 1.0 mm for the MR-MOTUS method. The high-frame-rate reconstruction capability of STINR-MR allows different irregular motion patterns to be accurately captured. Significance. STINR-MR provides a lightweight and efficient framework for accurate 3D cine-MRI reconstruction. It is a 'one-shot' method that does not require external data for pre-training, allowing it to avoid generalizability issues typically encountered in deep learning-based methods.
7. Lee J, Baek J. Iterative reconstruction for limited-angle CT using implicit neural representation. Phys Med Biol 2024;69:105008. PMID: 38593820; DOI: 10.1088/1361-6560/ad3c8e.
Abstract
Objective. Limited-angle computed tomography (CT) presents a challenge due to its ill-posed nature; in such scenarios, analytical reconstruction methods often exhibit severe artifacts. To tackle this inverse problem, several supervised deep learning-based approaches have been proposed, but they are constrained by limitations such as generalization issues and the difficulty of acquiring large amounts of paired CT images. Approach. In this work, we propose an iterative neural reconstruction framework designed for limited-angle CT. Leveraging a coordinate-based neural representation, we formulate tomographic reconstruction as a convex optimization problem involving a deep neural network, and we employ a differentiable projection layer to optimize the network by minimizing the discrepancy between the predicted and measured projection data. In addition, we introduce a prior-based weight initialization method to ensure that the network starts optimization from an informed initial guess. This strategic initialization significantly improves the quality of iterative reconstruction by stabilizing the divergent behavior of ill-posed neural fields. Our method operates in a self-supervised manner, eliminating the need for extensive data. Main results. The proposed method outperforms other iterative and learning-based methods. Experimental results on the XCAT and Mayo Clinic datasets demonstrate the effectiveness of our approach in restoring anatomical features as well as structures, substantiated by visual inspection and quantitative evaluation using NRMSE, PSNR, and SSIM. Moreover, we conduct a comprehensive investigation into the divergent behavior of iterative neural reconstruction, revealing its suboptimal convergence when starting from scratch; in contrast, our method consistently produces accurate images by incorporating an initial estimate as informed initialization. Significance. This work demonstrates the feasibility of reconstructing high-fidelity CT images from limited-angle x-ray projections. The proposed methodology introduces a novel data-free approach to enhance medical imaging, holding promise across various clinical applications.
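The differentiable-projection idea can be sketched with a toy parallel-beam geometry: integrate the INR along rays and fit the result to the measured sinogram over a limited arc. The geometry, sampling densities, and random "measurements" below are illustrative assumptions, and the prior-based initialization from the paper is omitted:

```python
import math
import torch
import torch.nn as nn

inr = nn.Sequential(nn.Linear(2, 128), nn.ReLU(),
                    nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 1))

def project(theta, n_det=64, n_samp=64):
    """Differentiable parallel-beam projection of the INR at angle theta."""
    s = torch.linspace(-1, 1, n_det)             # detector positions
    t = torch.linspace(-1, 1, n_samp)            # positions along each ray
    d = torch.tensor([math.cos(theta), math.sin(theta)])    # ray direction
    n = torch.tensor([-math.sin(theta), math.cos(theta)])   # detector axis
    pts = s[:, None, None] * n + t[None, :, None] * d       # (n_det, n_samp, 2)
    vals = inr(pts.reshape(-1, 2)).reshape(n_det, n_samp)
    return vals.sum(dim=1) * (2.0 / n_samp)      # approximate line integrals

angles = torch.linspace(0, 2 * math.pi / 3, 40)  # limited 120-degree arc
measured = [torch.rand(64) for _ in angles]      # stand-in for a real sinogram

opt = torch.optim.Adam(inr.parameters(), lr=1e-4)
for step in range(200):
    k = torch.randint(len(angles), (1,)).item()
    loss = ((project(angles[k].item()) - measured[k]) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```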
8. Park HS, Seo JK, Jeon K. Implicit neural representation-based method for metal-induced beam hardening artifact reduction in X-ray CT imaging. Med Phys 2025;52:2201-2211. PMID: 39888006; DOI: 10.1002/mp.17649.
Abstract
BACKGROUND In X-ray computed tomography (CT), metal-induced beam hardening artifacts arise from the complex interactions between polychromatic X-ray beams and metallic objects, degrading image quality and impeding accurate diagnosis. A previously proposed metal-induced beam hardening correction (MBHC) method provides a theoretical framework for addressing nonlinear artifacts through mathematical analysis, with effectiveness demonstrated in numerical simulations and phantom experiments. In practice, however, the method relies on precise segmentation of highly attenuating materials and on parameter estimation, which limits its ability to fully correct artifacts caused by the intricate interactions between metals and other dense materials, such as bone or teeth. PURPOSE This study aims to develop a parameter-free MBHC method that eliminates the need for accurate segmentation and parameter estimation, thereby addressing the limitations of the original MBHC approach. METHODS The proposed method employs implicit neural representations (INR) to generate two tomographic images: one representing the monochromatic attenuation distribution at a specific energy level, and another capturing the nonlinear beam hardening effects caused by the polychromatic nature of X-ray beams. A loss function drives the generation of these images, with the predicted projection data modeled nonlinearly as a combination of the two images. This approach eliminates the need for geometric and parametric estimation of metals, providing a more generalized solution. RESULTS Numerical and phantom experiments demonstrate that the proposed method effectively reduces beam hardening artifacts caused by interactions between highly attenuating materials such as metals, bone, and teeth. The proposed INR-based method also shows potential in addressing data insufficiencies such as photon starvation and truncated fields of view in CT imaging. CONCLUSIONS The proposed generalized MBHC method provides high-quality image reconstructions without parameter estimation or segmentation, offering a robust solution for reducing metal-induced beam hardening artifacts in CT imaging.
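A hedged sketch of the two-image idea: one INR head outputs the monochromatic attenuation and a second head outputs a beam-hardening component, and the projection model combines their line integrals nonlinearly. The specific quadratic coupling below is purely illustrative; the paper derives its own nonlinear projection model:

```python
import torch
import torch.nn as nn

class TwoHeadINR(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                                  nn.Linear(hidden, hidden), nn.ReLU())
        self.mono = nn.Linear(hidden, 1)    # monochromatic attenuation
        self.bh = nn.Linear(hidden, 1)      # beam-hardening component

    def forward(self, xy):
        h = self.body(xy)
        return self.mono(h).squeeze(-1), self.bh(h).squeeze(-1)

net = TwoHeadINR()
# Sample points along one ray and a stand-in measured log projection value.
ray_pts = torch.rand(128, 2) * 2 - 1
measured = torch.tensor(3.1)

mono, bh = net(ray_pts)
p_mono = mono.sum() * (2.0 / len(ray_pts))   # linear line integral
p_bh = bh.sum() * (2.0 / len(ray_pts))
pred = p_mono + p_bh ** 2                    # illustrative nonlinear coupling
loss = (pred - measured) ** 2                # no segmentation, no parameters
loss.backward()
```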
9. Shao HC, Mengke T, Pan T, Zhang Y. Dynamic CBCT imaging using prior model-free spatiotemporal implicit neural representation (PMF-STINR). Phys Med Biol 2024;69:115030. PMID: 38697195; PMCID: PMC11133878; DOI: 10.1088/1361-6560/ad46dc.
Abstract
Objective. Dynamic cone-beam computed tomography (CBCT) can capture high-spatial-resolution, time-varying images for motion monitoring, patient setup, and adaptive planning of radiotherapy. However, dynamic CBCT reconstruction is an extremely ill-posed spatiotemporal inverse problem, as each CBCT volume in the dynamic sequence is only captured by one or a few x-ray projections, due to the slow gantry rotation speed and the fast anatomical motion (e.g. breathing). Approach. We developed a machine learning-based technique, prior-model-free spatiotemporal implicit neural representation (PMF-STINR), to reconstruct dynamic CBCTs from sequentially acquired x-ray projections. PMF-STINR employs a joint image reconstruction and registration approach to address the under-sampling challenge, enabling dynamic CBCT reconstruction from singular x-ray projections. Specifically, PMF-STINR uses a spatial implicit neural representation (INR) to reconstruct a reference CBCT volume, and it applies a temporal INR to represent the intra-scan dynamic motion of the reference CBCT to yield dynamic CBCTs. PMF-STINR couples the temporal INR with a learning-based B-spline motion model to capture time-varying deformable motion during the reconstruction. Compared with previous methods, the spatial INR, the temporal INR, and the B-spline model of PMF-STINR are all learned on the fly during reconstruction in a one-shot fashion, without using any patient-specific prior knowledge or motion sorting/binning. Main results. PMF-STINR was evaluated via digital phantom simulations, physical phantom measurements, and a multi-institutional patient dataset featuring various imaging protocols (half-fan/full-fan, full sampling/sparse sampling, different energy and mAs settings, etc.). The results showed that the one-shot learning-based PMF-STINR can accurately and robustly reconstruct dynamic CBCTs and capture highly irregular motion with high temporal (∼0.1 s) resolution and sub-millimeter accuracy. Significance. PMF-STINR can reconstruct dynamic CBCTs and solve the intra-scan motion from conventional 3D CBCT scans without using any prior anatomical/motion model or motion sorting/binning. It can be a promising tool for motion management by offering richer motion information than traditional 4D-CBCTs.
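In contrast to the PCA-based STINR above, the prior-model-free variant learns its motion model on the fly. A hedged sketch: a learnable coarse control-point grid plus a temporal MLP produce a time-resolved deformation, with trilinear grid interpolation standing in for the cubic B-spline interpolation used in the paper; all sizes are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

spatial = nn.Sequential(nn.Linear(3, 128), nn.ReLU(),
                        nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 1))
temporal = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 4))
control = nn.Parameter(torch.zeros(4, 3, 8, 8, 8))  # 4 learnable control grids

def dvf_at(coords, t):
    # coords: (N, 3) in [-1, 1]. Blend control grids by temporal weights,
    # then interpolate the blended grid at the query coordinates.
    w = temporal(t.view(1, 1)).view(4, 1, 1, 1, 1)
    grid3d = (w * control).sum(0, keepdim=True)      # (1, 3, 8, 8, 8)
    g = coords.view(1, -1, 1, 1, 3)
    out = F.grid_sample(grid3d, g, align_corners=True)
    return out.view(3, -1).t()                       # (N, 3) displacements

def frame_intensity(coords, t):
    return spatial(coords + dvf_at(coords, t))       # reference warped to t

coords = torch.rand(2048, 3) * 2 - 1
frame = frame_intensity(coords, torch.tensor(0.5))   # (2048, 1) at t = 0.5
# Both the control grids and the INRs are fit jointly to the projections,
# one-shot, with no prior 4D motion model.
```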
10. Shao HC, Mengke T, Deng J, Zhang Y. 3D cine-magnetic resonance imaging using spatial and temporal implicit neural representation learning (STINR-MR). Preprint. arXiv 2023:arXiv:2308.09771v1. PMID: 37645038; PMCID: PMC10462175.
Abstract
Objective 3D cine-magnetic resonance imaging (cine-MRI) can capture images of the human body volume with high spatial and temporal resolutions to study the anatomical dynamics. However, the reconstruction of 3D cine-MRI is challenged by highly undersampled k-space data in each dynamic (cine) frame, due to the slow speed of MR signal acquisition. We proposed a machine learning-based framework, spatial and temporal implicit neural representation learning (STINR-MR), for accurate 3D cine-MRI reconstruction from highly undersampled data. Approach STINR-MR used a joint reconstruction and deformable registration approach to achieve a high acceleration factor for cine volumetric imaging. It addressed the ill-posed spatiotemporal reconstruction problem by solving a reference-frame 3D MR image and a corresponding motion model which deforms the reference frame to each cine frame. The reference-frame 3D MR image was reconstructed as a spatial implicit neural representation (INR) network, which learns the mapping from input 3D spatial coordinates to corresponding MR values. The dynamic motion model was constructed via a temporal INR, as well as basis deformation vector fields (DVFs) extracted from prior/onboard 4D-MRIs using principal component analysis (PCA). The learned temporal INR encodes input time points and outputs corresponding weighting factors to combine the basis DVFs into time-resolved motion fields that represent cine-frame-specific dynamics. STINR-MR was evaluated using MR data simulated from the 4D extended cardiac-torso (XCAT) digital phantom, as well as MR data acquired clinically from a healthy human subject. Its reconstruction accuracy was also compared with that of the model-based non-rigid motion estimation method (MR-MOTUS). Main results STINR-MR can reconstruct 3D cine-MR images with high temporal (<100 ms) and spatial (3 mm) resolutions. Compared with MR-MOTUS, STINR-MR consistently reconstructed images with better image quality and fewer artifacts and achieved superior tumor localization accuracy via the solved dynamic DVFs. For the XCAT study, STINR reconstructed the tumors to a mean ± SD center-of-mass error of 1.0 ± 0.4 mm, compared to 3.4 ± 1.0 mm of the MR-MOTUS method. The high-frame-rate reconstruction capability of STINR-MR allows different irregular motion patterns to be accurately captured. Significance STINR-MR provides a lightweight and efficient framework for accurate 3D cine-MRI reconstruction. It is a 'one-shot' method that does not require external data for pre-training, allowing it to avoid generalizability issues typically encountered in deep learning-based methods.
11. Zhao Y, Wang L, Zhai X, Han J, Ma WWS, Ding J, Gu Y, Fu X. Near-Isotropic, Extreme-Stiffness, Continuous 3D Mechanical Metamaterial Sequences Using Implicit Neural Representation. Adv Sci (Weinh) 2025;12:e2410428. PMID: 39601118; PMCID: PMC11744521; DOI: 10.1002/advs.202410428.
Abstract
Mechanical metamaterials are a distinct category of engineered materials whose tailored density distributions give rise to unique properties. It is challenging to create continuous density distributions that yield a smooth mechanical metamaterial sequence in which each metamaterial possesses stiffness close to the theoretical limit in all directions. This study proposes three near-isotropic, extreme-stiffness, continuous 3D mechanical metamaterial sequences by combining topology optimization and data-driven design. Through innovative structural design, the sequences achieve over 98% of the Hashin-Shtrikman upper bound in the most unfavorable direction. This performance spans a relative density range of 0.2-1, surpassing previous designs, which fall short at medium and higher densities. Moreover, the metamaterial sequence is represented by an implicit neural function and is therefore resolution-free, exhibiting continuously varying densities. Experimental validation demonstrates the manufacturability and high stiffness of the three sequences.
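A hedged sketch of what a resolution-free sequence representation can look like: a network maps a spatial coordinate plus the target relative density to a scalar field whose sign defines solid versus void, so one model encodes a continuous family of designs at any resolution. All names and sizes here are assumptions for illustration, not the paper's architecture:

```python
import torch
import torch.nn as nn

# (x, y, z, rho) -> signed field; rho indexes position along the sequence.
geom = nn.Sequential(nn.Linear(4, 128), nn.ReLU(),
                     nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 1))

def occupancy(n, rho):
    """Voxelize the design at relative density rho on an n^3 grid."""
    axes = torch.linspace(-1, 1, n)
    zs, ys, xs = torch.meshgrid(axes, axes, axes, indexing="ij")
    pts = torch.stack([xs, ys, zs], dim=-1).reshape(-1, 3)
    inp = torch.cat([pts, torch.full((pts.shape[0], 1), rho)], dim=-1)
    return (geom(inp) > 0).reshape(n, n, n)      # solid/void voxels

voxels_coarse = occupancy(32, rho=0.4)           # same design, any resolution
voxels_fine = occupancy(64, rho=0.4)
```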
12. Shao HC, Mengke T, Pan T, Zhang Y. Dynamic CBCT Imaging using Prior Model-Free Spatiotemporal Implicit Neural Representation (PMF-STINR). Preprint. arXiv 2023:arXiv:2311.10036v2. PMID: 38013886; PMCID: PMC10680908.
Abstract
Objective Dynamic cone-beam computed tomography (CBCT) can capture high-spatial-resolution, time-varying images for motion monitoring, patient setup, and adaptive planning of radiotherapy. However, dynamic CBCT reconstruction is an extremely ill-posed spatiotemporal inverse problem, as each CBCT volume in the dynamic sequence is only captured by one or a few X-ray projections, due to the slow gantry rotation speed and the fast anatomical motion (e.g., breathing). Approach We developed a machine learning-based technique, prior-model-free spatiotemporal implicit neural representation (PMF-STINR), to reconstruct dynamic CBCTs from sequentially acquired X-ray projections. PMF-STINR employs a joint image reconstruction and registration approach to address the under-sampling challenge, enabling dynamic CBCT reconstruction from singular X-ray projections. Specifically, PMF-STINR uses spatial implicit neural representation to reconstruct a reference CBCT volume, and it applies temporal INR to represent the intra-scan dynamic motion with respect to the reference CBCT to yield dynamic CBCTs. PMF-STINR couples the temporal INR with a learning-based B-spline motion model to capture time-varying deformable motion during the reconstruction. Compared with the previous methods, the spatial INR, the temporal INR, and the B-spline model of PMF-STINR are all learned on the fly during reconstruction in a one-shot fashion, without using any patient-specific prior knowledge or motion sorting/binning. Main results PMF-STINR was evaluated via digital phantom simulations, physical phantom measurements, and a multi-institutional patient dataset featuring various imaging protocols (half-fan/full-fan, full sampling/sparse sampling, different energy and mAs settings, etc.). The results showed that the one-shot learning-based PMF-STINR can accurately and robustly reconstruct dynamic CBCTs and capture highly irregular motion with high temporal (~0.1 s) resolution and sub-millimeter accuracy. Significance PMF-STINR can reconstruct dynamic CBCTs and solve the intra-scan motion from conventional 3D CBCT scans without using any prior anatomical/motion model or motion sorting/binning. It can be a promising tool for motion management by offering richer motion information than traditional 4D-CBCTs.
13. Liu S, Cao P, Feng Y, Ji Y, Chen J, Xie X, Wu L. NRVC: Neural Representation for Video Compression with Implicit Multiscale Fusion Network. Entropy (Basel) 2023;25:1167. PMID: 37628197; PMCID: PMC10453668; DOI: 10.3390/e25081167.
Abstract
Recently, end-to-end deep models for video compression have made steady advances, but they have resulted in lengthy and complex pipelines containing numerous redundant parameters. Video compression approaches based on implicit neural representation (INR) allow videos to be represented directly as a function approximated by a neural network, resulting in a more lightweight model; however, a single feature-extraction pipeline limits the network's ability to fit the mapping function for video frames. Hence, we propose a neural representation approach for video compression with an implicit multiscale fusion network (NRVC), utilizing normalized residual networks to improve the effectiveness of INR in fitting the target function. We propose the multiscale representations for video compression (MSRVC) network, which effectively extracts features from the input video sequence to enhance the degree of overfitting in the mapping function. Additionally, we propose the feature extraction channel attention (FECA) block to capture interaction information between different feature-extraction channels, further improving the effectiveness of feature extraction. The results show that, compared to the NeRV method at similar bits per pixel (BPP), NRVC achieves a 2.16% increase in decoded peak signal-to-noise ratio (PSNR). Moreover, NRVC outperforms conventional HEVC in terms of PSNR.
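The base representation NRVC builds on can be sketched in a few lines: a network maps a frame-index embedding to a whole frame, so "compression" amounts to storing the trained weights. This hedged NeRV-style toy omits NRVC's multiscale fusion and channel attention, and all sizes are illustrative:

```python
import math
import torch
import torch.nn as nn

class TinyNeRV(nn.Module):
    def __init__(self, freqs=8):
        super().__init__()
        self.freqs = freqs
        self.fc = nn.Linear(2 * freqs, 16 * 9 * 16)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(16, 16, 4, stride=2, padding=1), nn.GELU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, t):               # t: scalar frame index in [0, 1]
        t = t.reshape(1)
        # Fourier embedding of the frame index.
        emb = torch.cat([torch.sin(2 ** k * math.pi * t)
                         for k in range(self.freqs)] +
                        [torch.cos(2 ** k * math.pi * t)
                         for k in range(self.freqs)])
        x = self.fc(emb).view(1, 16, 9, 16)
        return self.up(x)               # (1, 3, 36, 64) RGB frame

net = TinyNeRV()
frame = net(torch.tensor(0.25))         # decode the frame at t = 0.25
# Training overfits `net` to one video; quantizing/pruning the weights then
# sets the bitrate, which is the INR view of video compression.
```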
14. Du W, Cui H, He L, Chen H, Zhang Y, Yang H. Structure-aware diffusion for low-dose CT imaging. Phys Med Biol 2024;69:155008. PMID: 38942004; DOI: 10.1088/1361-6560/ad5d47.
Abstract
Reducing the radiation dose leaves x-ray computed tomography (CT) images suffering from heavy noise and artifacts, which inevitably interferes with subsequent clinical diagnosis and analysis. Leading works have explored diffusion models for low-dose CT imaging to avoid the structure degeneration and blurring effects of earlier deep denoising models. However, most of them begin their generative processes with Gaussian noise, which carries little or no structural prior from the clean data distribution, leading to long inference times and unsatisfying reconstruction quality. To alleviate these problems, this paper presents a Structure-Aware Diffusion model (SAD), an end-to-end self-guided learning framework for high-fidelity CT image reconstruction. First, SAD builds a nonlinear diffusion bridge between the clean and degraded data distributions, which can directly learn the implicit physical degradation prior from observed measurements. Second, SAD integrates a prompt-learning mechanism and implicit neural representation into the diffusion process, where rich and diverse structure representations extracted from the degraded inputs are exploited as prompts, providing global and local structure priors to guide CT image reconstruction. Finally, we devise an efficient self-guided diffusion architecture with an iterative update strategy, which further refines the structural prompts during each generative step to drive finer image reconstruction. Extensive experiments on the AAPM-Mayo and LoDoPaB-CT datasets demonstrate that SAD achieves superior performance in terms of noise removal, structure preservation, and blind-dose generalization, with few generative steps, even a single step.
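To make the "bridge" notion concrete, here is a heavily hedged, generic sketch of one training step for a diffusion bridge between clean and degraded images (a simple linear interpolation bridge, which is NOT the paper's exact nonlinear formulation or its prompt mechanism):

```python
import torch
import torch.nn as nn

# Predict the clean image from an interpolated state x_t plus the time map.
denoiser = nn.Sequential(nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(32, 1, 3, padding=1))

clean = torch.rand(4, 1, 64, 64)                   # normal-dose CT (stub)
degraded = clean + 0.2 * torch.randn_like(clean)   # low-dose observation

t = torch.rand(4, 1, 1, 1)                 # random position along the bridge
x_t = (1 - t) * clean + t * degraded       # state on the clean->degraded path
t_map = t.expand(-1, 1, 64, 64)            # broadcast t as an input channel
pred = denoiser(torch.cat([x_t, t_map], dim=1))
loss = ((pred - clean) ** 2).mean()        # learn to step back toward clean
loss.backward()
# Starting generation from the degraded image (t = 1) rather than Gaussian
# noise is what shortens the sampling trajectory.
```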
15. Hendriks T, Vilanova A, Chamberland M. Implicit neural representation of multi-shell constrained spherical deconvolution for continuous modeling of diffusion MRI. Imaging Neurosci (Camb) 2025;3:imag_a_00501. PMID: 40078536; PMCID: PMC11894815; DOI: 10.1162/imag_a_00501.
Abstract
Diffusion magnetic resonance imaging (dMRI) provides insight into the micro- and macrostructure of the brain. Multi-shell multi-tissue constrained spherical deconvolution (MSMT-CSD) models the underlying local fiber orientation distributions (FODs) using the dMRI signal. While generally producing high-quality FODs, MSMT-CSD is a voxel-wise method that can be impacted by noise and produce erroneous FODs. Local models also do not use the spatial correlation between neighboring voxels to increase parameter-estimation power. Additionally, voxel-wise methods require interpolation at arbitrary locations outside voxel centers, and these interpolations can be computationally costly or inaccurate, depending on the method of choice. Expanding upon previous work, we apply the implicit neural representation (INR) methodology to the MSMT-CSD model. The result is an unsupervised machine-learning framework that generates a continuous representation of a given dMRI dataset. The INR takes coordinates in the volume as input and produces the spherical harmonics coefficients parameterizing an FOD at any desired location. A key characteristic of our model is its ability to leverage spatial correlations in the volume, which acts as a form of regularization. We evaluate the output FODs quantitatively and qualitatively on synthetic and real dMRI datasets and compare them to existing methods.
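The input/output contract described above is compact enough to sketch directly: a coordinate MLP emits spherical harmonic (SH) coefficients at any continuous location. The 45 coefficients assume even-order SH up to lmax = 8, a common MSMT-CSD choice; the CSD fitting loss against the dMRI signal itself is not shown:

```python
import torch
import torch.nn as nn

N_SH = 45                                  # even-order SH up to lmax = 8
fod_inr = nn.Sequential(nn.Linear(3, 256), nn.ReLU(),
                        nn.Linear(256, 256), nn.ReLU(),
                        nn.Linear(256, N_SH))

coords = torch.rand(10, 3) * 2 - 1         # arbitrary, off-grid locations
sh_coeffs = fod_inr(coords)                # (10, 45): an FOD anywhere
# Because one network covers the whole volume, neighboring queries share
# weights, which is the spatial regularization the abstract refers to, and
# no separate interpolation scheme is ever needed.
```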
16. Li Y, Liao YP, Wang J, Lu W, Zhang Y. Patient-specific MRI super-resolution via implicit neural representations and knowledge transfer. Phys Med Biol 2025;70:075021. PMID: 40064110; DOI: 10.1088/1361-6560/adbed4.
Abstract
Objective. Magnetic resonance imaging (MRI) is a non-invasive imaging technique that provides high soft-tissue contrast, playing a vital role in disease diagnosis and treatment planning. However, due to limitations in imaging hardware, scan time, and patient compliance, the resolution of MRI images is often insufficient. Super-resolution (SR) techniques can enhance MRI resolution, reveal more detailed anatomical information, and improve the identification of complex structures, while also reducing scan time and patient discomfort. However, traditional population-based models trained on large datasets may introduce artifacts or hallucinated structures, which compromise their reliability in clinical applications. Approach. To address these challenges, we propose a patient-specific knowledge-transfer implicit neural representation (KT-INR) SR model. The KT-INR model integrates a dual-head INR with a generative adversarial network (GAN) model pre-trained on a large-scale dataset. Anatomical information from different MRI sequences of the same patient, combined with the SR mappings learned by the GAN on a population-based dataset, is transferred as prior knowledge to the INR. This integration enhances both the performance and the reliability of the SR model. Main results. We validated the effectiveness of the KT-INR model on three distinct clinical SR tasks on the brain tumor segmentation dataset. For task 1, KT-INR achieved an average structural similarity index, peak signal-to-noise ratio, and learned perceptual image patch similarity of 0.9813, 36.845, and 0.0186, respectively. In comparison, a state-of-the-art SR technique, ArSSR, attained average values of 0.9689, 33.4557, and 0.0309 on the same metrics. KT-INR outperformed all other methods across all tasks and evaluation metrics, with particularly remarkable performance in resolving fine anatomical details. Significance. The KT-INR model significantly enhances the reliability of SR results, effectively addressing the hallucination effects commonly seen in traditional models, and provides a robust solution for patient-specific MRI SR.
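One plausible reading of the dual-head design can be sketched as follows: a shared trunk feeds one head supervised by the patient's own acquired data (data fidelity) and a second head supervised by a population model's prediction (the transferred prior). The loss weighting, the GAN stub, and the head layout are assumptions for illustration, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

trunk = nn.Sequential(nn.Linear(3, 128), nn.ReLU(),
                      nn.Linear(128, 128), nn.ReLU())
head_data = nn.Linear(128, 1)      # fits the patient's acquired sequence
head_prior = nn.Linear(128, 1)     # fits the population-model prediction

coords = torch.rand(2048, 3)
lr_target = torch.rand(2048)       # patient data at these coordinates (stub)
gan_prior = torch.rand(2048)       # pre-trained population GAN output (stub)

params = (list(trunk.parameters()) + list(head_data.parameters())
          + list(head_prior.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)
for step in range(100):
    h = trunk(coords)
    # The shared trunk is where the prior constrains the patient-specific fit.
    loss = ((head_data(h).squeeze(-1) - lr_target) ** 2).mean() \
         + 0.1 * ((head_prior(h).squeeze(-1) - gan_prior) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```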
17. Luo J, Han L, Gao X, Liu X, Wang W. SR-FEINR: Continuous Remote Sensing Image Super-Resolution Using Feature-Enhanced Implicit Neural Representation. Sensors (Basel) 2023;23:3573. PMID: 37050632; PMCID: PMC10098664; DOI: 10.3390/s23073573.
Abstract
Remote sensing images often have limited resolution, which can hinder their effectiveness in various applications. Super-resolution techniques can enhance the resolution of remote sensing images, and arbitrary-resolution super-resolution provides additional flexibility in choosing appropriate image resolutions for different tasks, since the required input resolution for subsequent processing, such as detection and classification, may vary greatly between methods. In this paper, we propose a method for continuous remote sensing image super-resolution using feature-enhanced implicit neural representation (SR-FEINR). Continuous super-resolution means users can scale a low-resolution image to an image of arbitrary resolution. Our algorithm is composed of three main components: a low-resolution image feature extraction module, a positional encoding module, and a feature-enhanced multi-layer perceptron module. We are the first to apply implicit neural representation to a continuous remote sensing image super-resolution task. Through extensive experiments on two popular remote sensing image datasets, we have shown that SR-FEINR outperforms state-of-the-art algorithms in terms of accuracy. Our algorithm showed an average improvement of 0.05 dB over the existing method at ×30 across three datasets.
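The three-module layout (feature extraction, positional encoding, coordinate MLP) follows the general feature-conditioned INR recipe, which can be sketched in a hedged LIIF-style toy; the encoder, Fourier encoding, and crude offset computation below are illustrative assumptions, not SR-FEINR's enhancement blocks:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 32, 3, padding=1))
decoder = nn.Sequential(nn.Linear(56, 128), nn.ReLU(),   # 32 feat + 24 enc
                        nn.Linear(128, 3))

def posenc(xy, freqs=6):
    # xy: (N, 2) offsets -> (N, 24) Fourier positional features
    out = []
    for k in range(freqs):
        out += [torch.sin(2 ** k * math.pi * xy),
                torch.cos(2 ** k * math.pi * xy)]
    return torch.cat(out, dim=-1)

lr = torch.rand(1, 3, 32, 32)               # low-resolution input
feat = encoder(lr)                          # (1, 32, 32, 32) feature map

def query(xy):                              # xy: (N, 2) in [-1, 1]
    g = xy.view(1, -1, 1, 2)
    f = F.grid_sample(feat, g, mode="nearest", align_corners=False)
    f = f.view(32, -1).t()                  # nearest LR feature per query
    cell = xy - (xy * 16).round() / 16      # crude relative-offset proxy
    return decoder(torch.cat([f, posenc(cell)], dim=-1))

rgb = query(torch.rand(4096, 2) * 2 - 1)    # sample at arbitrary resolution
```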
18. Bhardwaj R, Jothi Balaji J, Lakshminarayanan V. OW-SLR: Overlapping Windows on Semi-Local Region for Image Super-Resolution. J Imaging 2023;9:246. PMID: 37998093; PMCID: PMC10672420; DOI: 10.3390/jimaging9110246.
Abstract
There has been considerable progress in implicit neural representation for upscaling an image to any arbitrary resolution. However, existing methods are based on defining a function that predicts the Red, Green, and Blue (RGB) value from just four specific loci, which is insufficient and loses fine details from the neighboring region(s). We show that taking the semi-local region into account leads to an improvement in performance. In this paper, we propose a new technique, Overlapping Windows on Semi-Local Region (OW-SLR), which obtains any arbitrary resolution by taking the coordinates of the semi-local region around a point in the latent space and using the extracted detail to predict the RGB value of that point. We illustrate the technique by applying the algorithm to optical coherence tomography-angiography (OCT-A) images and show that it can upscale them to arbitrary resolution. The technique outperforms the existing state-of-the-art methods on the OCT500 dataset and provides better results for classifying healthy and diseased retinal images, such as diabetic retinopathy versus normal, from a given set of OCT-A images.
19. Li X, Bellotti R, Bachtiary B, Hrbacek J, Weber DC, Lomax AJ, Buhmann JM, Zhang Y. A unified generation-registration framework for improved MR-based CT synthesis in proton therapy. Med Phys 2024;51:8302-8316. PMID: 39137294; DOI: 10.1002/mp.17338.
Abstract
BACKGROUND The use of magnetic resonance (MR) imaging for proton therapy treatment planning is gaining attention as a highly effective method of guidance. At the core of this approach is the generation of computed tomography (CT) images from MR scans. The critical issue in this process is accurately aligning the MR and CT images, a task that becomes particularly challenging in frequently moving body areas such as the head-and-neck: misalignments result in blurred synthetic CT (sCT) images and adversely affect the precision and effectiveness of treatment planning. PURPOSE This study introduces a novel network that cohesively unifies image generation and registration to enhance the quality and anatomical fidelity of sCTs derived from better-aligned MR images. METHODS The approach synergizes a generation network (G) with a deformable registration network (R), optimizing them jointly in MR-to-CT synthesis by alternately minimizing the discrepancies between the generated/registered CT images and their corresponding reference CT counterparts. The generation network employs a UNet architecture, while the registration network leverages an implicit neural representation (INR) of the displacement vector fields (DVFs). We validated this method on a dataset comprising 60 head-and-neck patients, reserving 12 cases for holdout testing. RESULTS Compared to the baseline Pix2Pix method, with an MAE of 124.95 ± 30.74 HU, the proposed technique achieved 80.98 ± 7.55 HU. The unified translation-registration network produced sharper and more anatomically congruent outputs, showing superior efficacy in converting MR images to sCTs. Additionally, from a dosimetric perspective, plans recalculated on the resulting sCTs showed remarkably reduced discrepancy from the reference proton plans. CONCLUSIONS This study demonstrates that a holistic MR-based CT synthesis approach integrating both image-to-image translation and deformable registration significantly improves the precision and quality of sCT generation, particularly for challenging body areas with varied anatomic changes between corresponding MR and CT.
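The R component (an INR of the DVF) is the less conventional piece, so here is a hedged 2D toy of it: a coordinate MLP outputs displacements that warp one image toward another under an image-similarity loss. The UNet generator G and the alternating G/R schedule from the paper are not reproduced, and the data are random stand-ins:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Coordinate MLP representing the displacement vector field (DVF).
dvf_inr = nn.Sequential(nn.Linear(2, 128), nn.ReLU(),
                        nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 2))

ref_ct = torch.rand(1, 1, 64, 64)           # reference CT (toy 2D slice)
sct = torch.rand(1, 1, 64, 64)              # generated sCT from G (stub)

ys, xs = torch.meshgrid(torch.linspace(-1, 1, 64),
                        torch.linspace(-1, 1, 64), indexing="ij")
base = torch.stack([xs, ys], dim=-1)        # (64, 64, 2) sampling grid

opt = torch.optim.Adam(dvf_inr.parameters(), lr=1e-3)
for step in range(200):
    disp = dvf_inr(base.reshape(-1, 2)).reshape(64, 64, 2)
    warped = F.grid_sample(ref_ct, (base + disp).unsqueeze(0),
                           align_corners=True)
    loss = F.mse_loss(warped, sct)          # alternate with updating G
    opt.zero_grad(); loss.backward(); opt.step()
```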
20. Miao Z, Zhang L, Tian J, Yang G, Hui H. Continuous implicit neural representation for arbitrary super-resolution of system matrix in magnetic particle imaging. Phys Med Biol 2025;70:045012. PMID: 39912345; DOI: 10.1088/1361-6560/ada419.
Abstract
Objective. Magnetic particle imaging (MPI) is a novel imaging technique that uses magnetic fields to detect tracer materials consisting of magnetic nanoparticles. System matrix (SM)-based image reconstruction is essential for achieving high image quality in MPI, but the time-consuming SM calibrations must be repeated whenever the characteristics of the magnetic field or the nanoparticles change, so accelerating this calibration process is crucial. The most common acceleration approach involves undersampling during the SM calibration procedure, followed by super-resolution methods to recover the high-resolution SM. However, such methods typically require training separate models for different undersampling ratios, increasing storage and training-time costs. Approach. We propose an arbitrary-scale SM super-resolution method based on continuous implicit neural representation (INR). Using INR, the SM is modeled as a continuous function in space, enabling arbitrary-scale super-resolution by sampling the function at different densities. A cross-frequency encoder shares SM frequency information and analyzes contextual relationships, resulting in a more intelligent and efficient sampling strategy. Convolutional neural networks (CNNs) are utilized to learn and optimize the grid-sampling process of the INR, leveraging the advantage of CNNs in learning local feature associations while considering surrounding information comprehensively. Main results. Experimental results on OpenMPI demonstrate that our method outperforms existing methods and enables calibration at any scale with a single model, achieving high accuracy and efficiency in SM recovery even at high undersampling rates. Significance. The proposed method significantly reduces the storage and training-time costs associated with SM calibration, making it more practical for real-world applications. By enabling arbitrary-scale super-resolution with a single model, our approach enhances the flexibility and efficiency of MPI systems, paving the way for wider adoption of MPI technology.
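The "single model, any scale" property follows directly from the continuous representation: once an SM component is a function of position, every calibration grid is just a different sampling density of the same network. A hedged sketch, with the paper's cross-frequency encoder and CNN sampling modules omitted and a random stand-in calibration:

```python
import torch
import torch.nn as nn

# Coordinate MLP for one SM frequency component (real and imaginary parts).
sm_inr = nn.Sequential(nn.Linear(2, 128), nn.ReLU(),
                       nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 2))

def sample_grid(n):
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, n),
                            torch.linspace(-1, 1, n), indexing="ij")
    return torch.stack([xs, ys], dim=-1).reshape(-1, 2)

# Fit on a sparse calibration grid, then evaluate at any resolution.
sparse = sample_grid(16)                    # undersampled calibration points
target = torch.rand(16 * 16, 2)             # measured SM values (stub)
opt = torch.optim.Adam(sm_inr.parameters(), lr=1e-3)
for step in range(300):
    loss = ((sm_inr(sparse) - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

dense_sm = sm_inr(sample_grid(64))           # 4x denser grid, same model
```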