1. Zhang W, Zhao L, Gou H, Gong Y, Zhou Y, Feng Q. PRSCS-Net: Progressive 3D/2D rigid Registration network with the guidance of Single-view Cycle Synthesis. Med Image Anal 2024;97:103283. PMID: 39094463; DOI: 10.1016/j.media.2024.103283.
Abstract
The 3D/2D registration of 3D pre-operative images (computed tomography, CT) to 2D intra-operative images (X-ray) plays an important role in image-guided spine surgery. Conventional iterative approaches are time-consuming, while existing learning-based approaches require high computational costs and perform poorly under large misalignment because of projection-induced losses or ill-posed reconstruction. In this paper, we propose a Progressive 3D/2D rigid Registration network with the guidance of Single-view Cycle Synthesis, named PRSCS-Net. Specifically, we first introduce the differentiable backward/forward projection operator into the single-view cycle synthesis network, which reconstructs corresponding 3D geometry features from two 2D intra-operative view images (one from the input, the other from the synthesis), thereby solving the problem of limited views during reconstruction. Subsequently, we employ a self-reconstruction path to extract a latent representation from the pre-operative 3D CT. The subsequent pose estimation is performed in the 3D geometry feature space, which bridges the dimensional gap, greatly reduces computational complexity, and ensures that the features extracted from pre-operative and intra-operative images are as relevant as possible to pose estimation. Furthermore, to enhance the model's ability to handle large misalignment, we develop a progressive registration path with two sub-registration networks that estimate the pose parameters by warping volume features in two steps. Finally, the proposed method was evaluated on the public CTSpine1K dataset and an in-house dataset, C-ArmLSpine. Results demonstrate that PRSCS-Net achieves state-of-the-art registration accuracy, robustness, and generalizability compared with existing methods, and thus has potential for clinical spinal surgical planning and navigation systems.
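To make the two-step idea concrete, here is a minimal sketch (not the authors' code) of how a progressive registration path composes a coarse and a fine rigid update in SE(3); the numeric updates are invented placeholders for network outputs.

```python
# Illustrative only: progressive rigid registration as composition of two
# SE(3) corrections, mimicking PRSCS-Net's two-step warping at a high level.
import numpy as np
from scipy.spatial.transform import Rotation as R

def se3(rotvec_deg, translation_mm):
    """4x4 rigid transform from a rotation vector (degrees) and translation (mm)."""
    T = np.eye(4)
    T[:3, :3] = R.from_rotvec(np.deg2rad(rotvec_deg)).as_matrix()
    T[:3, 3] = translation_mm
    return T

# Hypothetical outputs of the two sub-registration networks.
coarse = se3([8.0, -3.0, 1.5], [12.0, -4.0, 6.0])   # removes most of the misalignment
fine = se3([0.6, 0.2, -0.4], [0.8, 0.3, -0.5])      # removes the residual error

T_init = np.eye(4)                # initial CT-to-X-ray pose guess
T_final = fine @ coarse @ T_init  # progressive refinement by composition
print(np.round(T_final, 3))
```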
Affiliation(s)
- Wencong Zhang, Lei Zhao, Hang Gou, Yanggang Gong, Yujia Zhou, Qianjin Feng: School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University

2. Killeen BD, Chaudhary S, Osgood G, Unberath M. Take a shot! Natural language control of intelligent robotic X-ray systems in surgery. Int J Comput Assist Radiol Surg 2024;19:1165-1173. PMID: 38619790; PMCID: PMC11178437; DOI: 10.1007/s11548-024-03120-3.
Abstract
PURPOSE: The expanding capabilities of surgical systems bring with them increasing complexity in the interfaces that humans use to control them. Robotic C-arm X-ray imaging systems, for instance, often require manipulation of independent axes via joysticks, while higher-level control options hide inside device-specific menus. The complexity of these interfaces hinders "ready-to-hand" use of high-level functions. Natural language offers a flexible, familiar interface for surgeons to express their desired outcome rather than remember the steps necessary to achieve it, enabling direct access to task-aware, patient-specific C-arm functionality. METHODS: We present an English-language voice interface for controlling a robotic X-ray imaging system with task-aware functions for pelvic trauma surgery. Our fully integrated system uses a large language model (LLM) to convert natural spoken commands into machine-readable instructions, enabling low-level commands like "Tilt back a bit" to increase the angular tilt, or patient-specific directions like "Go to the obturator oblique view of the right ramus," based on automated image analysis. RESULTS: We evaluated our system with 212 prompts provided by an attending physician, for which the system performed satisfactory actions 97% of the time. To test the fully integrated system, we conducted a real-time study in which an attending physician placed orthopedic hardware along desired trajectories through an anthropomorphic phantom, interacting with the X-ray system solely via voice. CONCLUSION: Voice interfaces offer a convenient, flexible way for surgeons to manipulate C-arms based on desired outcomes rather than device-specific processes. As LLMs grow increasingly capable, so too will their applications in supporting higher-level interactions with surgical assistance systems.
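As a rough illustration of the command representation such a system targets, the following toy parser maps two of the quoted utterances to structured instructions. This is an assumption on our part, not the published pipeline: the paper uses an LLM where these regex rules stand in, and the schema and step size are invented.

```python
# Toy stand-in for LLM-based command conversion: natural-language C-arm
# commands -> structured, machine-readable instructions.
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class CArmCommand:
    action: str                 # e.g. "tilt" or "goto_view"
    argument: Optional[str]     # direction or named view
    magnitude_deg: float        # angular step; 0.0 for named views

RULES = [
    (r"tilt (back|forward)( a bit)?", lambda m: CArmCommand("tilt", m.group(1), 5.0)),
    (r"go to the (.+) view", lambda m: CArmCommand("goto_view", m.group(1), 0.0)),
]

def parse(utterance: str) -> Optional[CArmCommand]:
    for pattern, build in RULES:
        m = re.search(pattern, utterance.lower())
        if m:
            return build(m)
    return None  # unrecognized: ask the user to rephrase

print(parse("Tilt back a bit"))
print(parse("Go to the obturator oblique view of the right ramus"))
```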
Affiliation(s)
- Benjamin D Killeen, Shreayan Chaudhary, Mathias Unberath: Laboratory for Computational Sensing and Robotics, Johns Hopkins University, Baltimore, MD 21218, USA
- Greg Osgood: Department of Orthopaedic Surgery, Johns Hopkins University, Baltimore, MD, USA

3. Burton W, Myers C, Stefanovic M, Shelburne K, Rullkoetter P. Scan-Free and Fully Automatic Tracking of Native Knee Anatomy from Dynamic Stereo-Radiography with Statistical Shape and Intensity Models. Ann Biomed Eng 2024;52:1591-1603. PMID: 38558356; DOI: 10.1007/s10439-024-03473-5.
Abstract
Kinematic tracking of native anatomy from stereo-radiography provides a quantitative basis for evaluating human movement. Conventional tracking procedures require significant manual effort and call for acquisition and annotation of subject-specific volumetric medical images. The current work introduces a framework for fully automatic tracking of native knee anatomy from dynamic stereo-radiography which forgoes reliance on volumetric scans. The method consists of three computational steps. First, captured radiographs are annotated with segmentation maps and anatomic landmarks using a convolutional neural network. Next, a non-convex polynomial optimization problem formulated from annotated landmarks is solved to acquire preliminary anatomy and pose estimates. Finally, a global optimization routine concurrently refines anatomy and pose by maximizing an objective function that quantifies similarity between masked radiographs and digitally reconstructed radiographs produced from statistical shape and intensity models. The proposed framework was evaluated against manually tracked trials comprising dynamic activities, as well as additional frames capturing a static knee phantom. Experiments revealed anatomic surface errors routinely below 1.0 mm in both evaluation cohorts. Median absolute errors of individual bone pose estimates were below 1.0° or 1.0 mm for 15 of 18 degrees of freedom in both cohorts. Results indicate that accurate pose estimation of native anatomy from stereo-radiography may be performed with significantly reduced manual effort and without reliance on volumetric scans.
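One ingredient of the final refinement step is an image similarity measure between masked radiographs and DRRs. The sketch below shows a masked normalized cross-correlation of the kind commonly maximized in such pipelines; the arrays and mask are placeholders, and the paper does not prescribe this exact metric.

```python
# Minimal sketch: normalized cross-correlation (NCC) restricted to a mask.
import numpy as np

def masked_ncc(radiograph: np.ndarray, drr: np.ndarray, mask: np.ndarray) -> float:
    a = radiograph[mask > 0].astype(np.float64)
    b = drr[mask > 0].astype(np.float64)
    a -= a.mean(); b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

rng = np.random.default_rng(0)
img = rng.random((128, 128))
score = masked_ncc(img, img * 0.5 + 0.1, np.ones_like(img))  # linear intensity change
print(score)  # ~1.0: NCC is invariant to linear intensity scaling
```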
Affiliation(s)
- William Burton, Casey Myers, Kevin Shelburne, Paul Rullkoetter: Center for Orthopaedic Biomechanics, University of Denver, 2155 E Wesley Ave, Denver, CO 80208, USA
- Margareta Stefanovic: Department of Electrical and Computer Engineering, University of Denver, 2155 E Wesley Ave, Denver, CO 80208, USA

4. Burton W, Myers C, Stefanovic M, Shelburne K, Rullkoetter P. Fully automatic tracking of native knee kinematics from stereo-radiography with digitally reconstructed radiographs. J Biomech 2024;166:112066. PMID: 38574563; DOI: 10.1016/j.jbiomech.2024.112066.
Abstract
Precise measurement of joint-level motion from stereo-radiography facilitates understanding of human movement. Conventional procedures for kinematic tracking require significant manual effort and are time-intensive. The current work introduces a method for fully automatic tracking of native knee kinematics from stereo-radiography sequences. The framework consists of three computational steps. First, biplanar radiograph frames are annotated with segmentation maps and key points using a convolutional neural network. Next, initial bone pose estimates are acquired by solving a polynomial optimization problem constructed from annotated key points and anatomic landmarks from digitized models; a semidefinite relaxation is formulated to realize the global minimum of the non-convex problem. Pose estimates are then refined by registering computed tomography-based digitally reconstructed radiographs to masked radiographs. A novel rendering method is also introduced which enables generating digitally reconstructed radiographs from computed tomography scans with inconsistent slice widths. The automatic tracking framework was evaluated with stereo-radiography trials manually tracked with model-image registration, and with frames capturing a synthetic leg phantom. The tracking method produced pose estimates consistently similar to manually tracked values, and demonstrated pose errors below 1.0 degree or millimeter for all femur and tibia degrees of freedom in phantom trials. Results indicate the described framework may benefit orthopaedics and biomechanics applications through acceleration of kinematic tracking.
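The inconsistent-slice-width problem the paper's renderer addresses can also be handled by a conventional preprocessing workaround: resampling the CT to a uniform z grid before ray casting. The sketch below shows that alternative (not the paper's direct rendering method) under the assumption of linear interpolation between slices.

```python
# Hedged alternative to direct irregular-grid rendering: resample CT slices
# to uniform z spacing so a standard DRR ray caster can be used.
import numpy as np
from scipy.interpolate import interp1d

def resample_uniform_z(volume, z_positions, dz):
    """volume: (nz, ny, nx); z_positions: per-slice z in mm, possibly irregular."""
    z_new = np.arange(z_positions[0], z_positions[-1] + 1e-9, dz)
    return interp1d(z_positions, volume, axis=0)(z_new), z_new

vol = np.random.rand(5, 8, 8)
z = np.array([0.0, 1.0, 2.5, 4.5, 5.0])          # inconsistent slice spacing
uniform, z_new = resample_uniform_z(vol, z, dz=0.5)
print(uniform.shape, z_new[-1])                   # (11, 8, 8) 5.0
```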
Affiliation(s)
- William Burton, Casey Myers, Kevin Shelburne, Paul Rullkoetter: Center for Orthopaedic Biomechanics, University of Denver, 2155 E Wesley Ave, Denver, CO 80208, USA
- Margareta Stefanovic: Department of Electrical and Computer Engineering, University of Denver, 2155 E Wesley Ave, Denver, CO 80208, USA

5. Gao C, Feng A, Liu X, Taylor RH, Armand M, Unberath M. A Fully Differentiable Framework for 2D/3D Registration and the Projective Spatial Transformers. IEEE Trans Med Imaging 2024;43:275-285. PMID: 37549070; PMCID: PMC10879149; DOI: 10.1109/tmi.2023.3299588.
Abstract
Image-based 2D/3D registration is a critical technique for fluoroscopy-guided surgical interventions. Conventional intensity-based 2D/3D registration approaches suffer from a limited capture range due to the presence of local minima in hand-crafted image similarity functions. In this work, we aim to extend the 2D/3D registration capture range with a fully differentiable deep network framework that learns to approximate a convex-shaped similarity function. The network uses a novel Projective Spatial Transformer (ProST) module that is uniquely differentiable with respect to 3D pose parameters and is trained using an innovative double-backward, gradient-driven loss function. We compare against the most popular learning-based pose regression methods in the literature and use the well-established CMA-ES intensity-based registration as a benchmark. We report registration pose error, target registration error (TRE), and success rate (SR) with a threshold of 10 mm mean TRE. For the pelvis anatomy, the median TRE of ProST followed by CMA-ES is 4.4 mm with an SR of 65.6% in simulation, and 2.2 mm with an SR of 73.2% on real data. The CMA-ES SRs without ProST registration are 28.5% and 36.0% in simulation and on real data, respectively. Our results suggest that the proposed ProST network learns a practical similarity function that vastly extends the capture range of conventional intensity-based 2D/3D registration. We believe the unique differentiability of ProST has the potential to benefit related 3D medical imaging research applications. The source code is available at https://github.com/gaocong13/Projective-Spatial-Transformers.
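The core enabling idea, differentiability of a projection with respect to pose, can be shown on a toy problem. The sketch below is emphatically not ProST (which differentiates a full volumetric projection against an image similarity); it optimizes a 6-DoF pose via autograd through a pinhole projection of synthetic landmarks, purely to illustrate gradient-driven pose refinement.

```python
# Toy gradient-driven pose refinement with a differentiable projection.
import torch

def rodrigues(rvec):
    """Axis-angle (3,) -> rotation matrix (3, 3), differentiable in rvec."""
    theta = rvec.norm() + 1e-8
    k = rvec / theta
    K = torch.stack([torch.stack([torch.zeros(()), -k[2], k[1]]),
                     torch.stack([k[2], torch.zeros(()), -k[0]]),
                     torch.stack([-k[1], k[0], torch.zeros(())])])
    return torch.eye(3) + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)

torch.manual_seed(0)
pts = torch.randn(50, 3) * 20.0                    # toy 3D landmarks (mm)
rvec_gt = torch.tensor([0.10, -0.20, 0.05])
t_gt = torch.tensor([5.0, -3.0, 800.0])
cam = pts @ rodrigues(rvec_gt).T + t_gt
target = 1000.0 * cam[:, :2] / cam[:, 2:3]         # pinhole, focal = 1000 px

rvec = torch.full((3,), 1e-3, requires_grad=True)  # start near identity
t = torch.tensor([0.0, 0.0, 795.0], requires_grad=True)
opt = torch.optim.Adam([rvec, t], lr=0.02)
for _ in range(2000):
    cam = pts @ rodrigues(rvec).T + t
    loss = ((1000.0 * cam[:, :2] / cam[:, 2:3] - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))                                  # should shrink toward 0
```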

6. Bobrow TL, Golhar M, Vijayan R, Akshintala VS, Garcia JR, Durr NJ. Colonoscopy 3D video dataset with paired depth from 2D-3D registration. Med Image Anal 2023;90:102956. PMID: 37713764; PMCID: PMC10591895; DOI: 10.1016/j.media.2023.102956.
Abstract
Screening colonoscopy is an important clinical application for several 3D computer vision techniques, including depth estimation, surface reconstruction, and missing-region detection. However, the development, evaluation, and comparison of these techniques in real colonoscopy videos remain largely qualitative due to the difficulty of acquiring ground truth data. In this work, we present a Colonoscopy 3D Video Dataset (C3VD) acquired with a high-definition clinical colonoscope and high-fidelity colon models for benchmarking computer vision methods in colonoscopy. We introduce a novel multimodal 2D-3D registration technique to register optical video sequences with ground truth rendered views of a known 3D model. The different modalities are registered by transforming optical images to depth maps with a generative adversarial network and aligning edge features with an evolutionary optimizer. This registration method achieves an average translation error of 0.321 millimeters and an average rotation error of 0.159 degrees in simulation experiments where error-free ground truth is available. The method also leverages video information, improving registration accuracy by 55.6% for translation and 60.4% for rotation compared to single-frame registration. Twenty-two short video sequences were registered to generate 10,015 total frames with paired ground truth depth, surface normals, optical flow, occlusion, six-degree-of-freedom pose, coverage maps, and 3D models. The dataset also includes screening videos acquired by a gastroenterologist with paired ground truth pose and 3D surface models. The dataset and registration source code are available at https://durr.jhu.edu/C3VD.
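The flavor of the second registration stage, aligning edge features with an evolutionary optimizer, can be sketched in 2D. The point sets, transform model, and cost below are synthetic stand-ins; the paper aligns depth-map edges to rendered-model edges in its own formulation.

```python
# Sketch: rigid 2D edge alignment with an evolutionary optimizer
# (scipy's differential evolution stands in for the paper's optimizer).
import numpy as np
from scipy.optimize import differential_evolution
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
model_edges = rng.random((200, 2)) * 100            # edge pixels from rendered view
theta, shift = 0.1, np.array([4.0, -2.0])           # unknown true offset
Rm = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
image_edges = model_edges @ Rm.T + shift
tree = cKDTree(image_edges)

def cost(p):                                        # p = (angle, tx, ty)
    c, s = np.cos(p[0]), np.sin(p[0])
    moved = model_edges @ np.array([[c, -s], [s, c]]).T + p[1:]
    return tree.query(moved)[0].mean()              # mean nearest-edge distance

res = differential_evolution(cost, bounds=[(-0.5, 0.5), (-10, 10), (-10, 10)], seed=0)
print(res.x)  # should approach (0.1, 4.0, -2.0)
```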
Affiliation(s)
- Taylor L Bobrow, Mayank Golhar, Rohan Vijayan, Nicholas J Durr: Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Venkata S Akshintala: Division of Gastroenterology and Hepatology, Johns Hopkins Medicine, Baltimore, MD 21287, USA
- Juan R Garcia: Department of Art as Applied to Medicine, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA

7. Chiang LH, Chen YC, Huang GS, Huang TF, Sun YC, Chang WC, Hsu YC. Efficacy and reliability of three-dimensional fusion guidance for fluoroscopic navigation in transarterial embolization for refractory musculoskeletal pain. Quant Imaging Med Surg 2023;13:7719-7730. PMID: 38106285; PMCID: PMC10722005; DOI: 10.21037/qims-23-490.
Abstract
Background: This study aimed to evaluate the efficacy and reliability of three-dimensional (3D) fusion guidance in roadmapping for fluoroscopic navigation during trans-arterial embolization for refractory musculoskeletal pain (TAE-MSK pain) in the extremities. Methods: Patients were divided into two groups: group A, TAE-MSK pain performed without 3D fusion guidance; group B, TAE-MSK pain performed with 3D fusion guidance for fluoroscopic navigation. We compared procedure time, radiation dose, visual analogue scale pain scores, and adverse effects (before and 3 months after TAE-MSK pain) between the two groups. In group B, we assessed the reliability of the ideal branch angle between pre-operative non-contrast 3D magnetic resonance angiography (MRA) and intra-operative 3D cone-beam computed tomography (CBCT) angiography. Results: We recruited 65 patients (23 males, 42 females; average age 58.20±12.58 years), with 38 patients in group A and 27 in group B. A total of 247 vessels were defined as target branch vessels. Fluoroscopy time (32.31±12.39 vs. 14.33±3.06 minutes; P<0.001), procedure time (46.45±17.06 vs. 24.67±9.78 minutes; P<0.001), and radiation exposure dose (0.71±0.64 vs. 0.34±0.29 mSv; P<0.01) all differed significantly between groups A and B. Furthermore, the number of target branch vessels successfully catheterized was significantly higher in group B (107, 97%) than in group A (96, 70%) (P<0.001). The ideal branch angle also showed similarly high consistency between pre-operative and intra-operative angiography, with intra-class correlation coefficients (ICC) of 0.994 and 0.990, respectively. Conclusions: 3D fusion guidance for fluoroscopic navigation is not only reliable but also effectively reduces the operation time and radiation dose of TAE-MSK pain procedures.
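The consistency statistic used above can be computed directly. The sketch below implements ICC(2,1) (two-way random effects, absolute agreement, single measurement), one common ICC variant; whether the study used exactly this variant is an assumption, and the angle values are invented.

```python
# Hedged sketch: ICC(2,1) for agreement between two measurement modalities.
import numpy as np

def icc_2_1(x: np.ndarray) -> float:
    """x: (n_targets, k_raters), e.g. branch angle per vessel per modality."""
    n, k = x.shape
    grand = x.mean()
    msr = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)   # between targets
    msc = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)   # between raters
    sse = ((x - x.mean(axis=1, keepdims=True)
              - x.mean(axis=0, keepdims=True) + grand) ** 2).sum()
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

angles = np.array([[35.1, 34.8], [52.0, 51.5], [40.3, 40.9], [61.2, 60.7]])
print(icc_2_1(angles))  # near 1.0 for highly consistent measurements
```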
Affiliation(s)
- Lung-Hui Chiang, Ya-Che Chen, Ting-Fu Huang, Yung-Chih Sun, Wei-Chou Chang, Yi-Chih Hsu: Department of Radiology, Tri-Service General Hospital, and National Defense Medical Center, Taipei, Taiwan
- Guo-Shu Huang: Department of Radiology and Department of Medical Research, Tri-Service General Hospital, and National Defense Medical Center, Taipei, Taiwan

8. Burton W, Crespo IR, Andreassen T, Pryhoda M, Jensen A, Myers C, Shelburne K, Banks S, Rullkoetter P. Fully automatic tracking of native glenohumeral kinematics from stereo-radiography. Comput Biol Med 2023;163:107189. PMID: 37393783; DOI: 10.1016/j.compbiomed.2023.107189.
Abstract
The current work introduces a system for fully automatic tracking of native glenohumeral kinematics in stereo-radiography sequences. The proposed method first applies convolutional neural networks to obtain segmentation and semantic key point predictions in biplanar radiograph frames. Preliminary bone pose estimates are computed by solving a non-convex optimization problem with semidefinite relaxations to register digitized bone landmarks to semantic key points. Initial poses are then refined by registering computed tomography-based digitally reconstructed radiographs to captured scenes, which are masked by segmentation maps to isolate the shoulder joint. A particular neural net architecture which exploits subject-specific geometry is also introduced to improve segmentation predictions and increase the robustness of subsequent pose estimates. The method is evaluated by comparing predicted glenohumeral kinematics to manually tracked values from 17 trials capturing 4 dynamic activities. Median orientation differences between predicted and ground truth poses were 1.7° and 8.6° for the scapula and humerus, respectively. Joint-level kinematics differences were less than 2° in 65%, 13%, and 63% of frames for the X, Y, and Z orientation degrees of freedom based on Euler angle decompositions. Automation of kinematic tracking can increase the scalability of tracking workflows in research, clinical, or surgical applications.
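The joint-level convention referenced above can be shown in a few lines: the humerus pose is expressed relative to the scapula and decomposed into XYZ Euler angles. The bone orientations below are arbitrary stand-ins for tracked poses.

```python
# Sketch: joint-level angles from tracked bone orientations via Euler decomposition.
from scipy.spatial.transform import Rotation as R

R_scapula = R.from_euler("xyz", [10, 5, -3], degrees=True)   # world <- scapula
R_humerus = R.from_euler("xyz", [40, 15, 8], degrees=True)   # world <- humerus

R_joint = R_scapula.inv() * R_humerus                        # scapula <- humerus
print(R_joint.as_euler("xyz", degrees=True))                 # joint-level orientation DoFs
```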
Affiliation(s)
- William Burton, Ignacio Rivero Crespo, Thor Andreassen, Moira Pryhoda, Casey Myers, Kevin Shelburne, Paul Rullkoetter: Center for Orthopaedic Biomechanics, University of Denver, 2155 E. Wesley Ave., Denver, CO 80210, USA
- Andrew Jensen, Scott Banks: Department of Mechanical and Aerospace Engineering, University of Florida, 939 Center Dr., Gainesville, FL 32611, USA

9. Geng H, Xiao D, Yang S, Fan J, Fu T, Lin Y, Bai Y, Ai D, Song H, Wang Y, Duan F, Yang J. CT2X-IRA: CT to x-ray image registration agent using domain-cross multi-scale-stride deep reinforcement learning. Phys Med Biol 2023;68:175024. PMID: 37549676; DOI: 10.1088/1361-6560/acede5.
Abstract
Objective: In computer-assisted minimally invasive surgery, the intraoperative x-ray image is enhanced by overlaying it with a preoperative CT volume to improve visualization of vital anatomical structures. Accurate and robust 3D/2D registration of CT volume and x-ray image is therefore highly desired in clinical practice. However, previous registration methods were prone to initial misalignments and struggled with local minima, leading to low accuracy and poor robustness. Approach: To improve registration performance, we propose a novel CT/x-ray image registration agent (CT2X-IRA) within a task-driven deep reinforcement learning framework, which contains three key strategies: (1) a multi-scale-stride learning mechanism provides multi-scale feature representation and flexible action step size, establishing fast and globally optimal convergence of the registration task; (2) a domain adaptation module reduces the domain gap between the x-ray image and the digitally reconstructed radiograph projected from the CT volume, decreasing the sensitivity and uncertainty of the similarity measurement; and (3) a weighted reward function facilitates CT2X-IRA in searching for the optimal transformation parameters, improving the estimation accuracy of out-of-plane transformation parameters under large initial misalignments. Main results: We evaluate the proposed CT2X-IRA on both public and private clinical datasets, achieving target registration errors of 2.13 mm and 2.33 mm with computation times of 1.5 s and 1.1 s, respectively, demonstrating an accurate and fast workflow for rigid CT/x-ray image registration. Significance: CT2X-IRA achieves accurate and robust 3D/2D registration of CT and x-ray images, suggesting its potential significance in clinical applications.
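A weighted reward of the kind described in strategy (3) can be illustrated with a toy function: parameters that are hard to observe in a single projection (out-of-plane) get larger weights, so the agent is rewarded more for correcting them. The weight values and parameter ordering below are illustrative assumptions, not the paper's.

```python
# Toy weighted reward for a registration agent.
import numpy as np

# Order assumed [rx, ry, rz, tx, ty, tz] for projection along z:
# rx, ry and tz are "out-of-plane" and weighted higher.
WEIGHTS = np.array([2.0, 2.0, 1.0, 1.0, 1.0, 2.0])

def reward(err_before: np.ndarray, err_after: np.ndarray) -> float:
    """Positive when an action reduces the weighted parameter error."""
    return float((WEIGHTS * (np.abs(err_before) - np.abs(err_after))).sum())

print(reward(np.array([5, 4, 3, 8, 6, 10.0]), np.array([4, 2, 3, 7, 6, 6.0])))  # 15.0
```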
Affiliation(s)
- Haixiao Geng, Deqiang Xiao, Shuo Yang, Jingfan Fan, Danni Ai, Yongtian Wang, Jian Yang: School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, People's Republic of China
- Tianyu Fu, Yucong Lin: School of Medical Engineering, Beijing Institute of Technology, Beijing 100081, People's Republic of China
- Hong Song: School of Computer Science, Beijing Institute of Technology, Beijing 100081, People's Republic of China
- Yanhua Bai, Feng Duan: Department of Interventional Radiology, The First Medical Center of Chinese PLA General Hospital, Beijing 100853, People's Republic of China

10. Killeen BD, Gao C, Oguine KJ, Darcy S, Armand M, Taylor RH, Osgood G, Unberath M. An autonomous X-ray image acquisition and interpretation system for assisting percutaneous pelvic fracture fixation. Int J Comput Assist Radiol Surg 2023;18:1201-1208. PMID: 37213057; PMCID: PMC11002911; DOI: 10.1007/s11548-023-02941-y.
Abstract
PURPOSE: Percutaneous fracture fixation involves multiple X-ray acquisitions to determine adequate tool trajectories in bony anatomy. To reduce time spent adjusting the X-ray imager's gantry, avoid excess acquisitions, and anticipate inadequate trajectories before penetrating bone, we propose an autonomous system for intra-operative feedback that combines robotic X-ray imaging and machine learning for automated image acquisition and interpretation, respectively. METHODS: Our approach reconstructs an appropriate trajectory in a two-image sequence, where the optimal second viewpoint is determined based on analysis of the first image. A deep neural network detects the tool and corridor (here, a K-wire and the superior pubic ramus, respectively) in these radiographs. The reconstructed corridor and K-wire pose are compared to determine the likelihood of cortical breach, and both are visualized for the clinician in a mixed reality environment that is spatially registered to the patient and delivered by an optical see-through head-mounted display. RESULTS: We assess the upper bounds on system performance through in silico evaluation across 11 CTs with fractures present, in which the corridor and K-wire are adequately reconstructed. In post hoc analysis of radiographs across 3 cadaveric specimens, our system determines the appropriate trajectory to within 2.8 ± 1.3 mm and 2.7 ± 1.8°. CONCLUSION: An expert user study with an anthropomorphic phantom demonstrates how our autonomous, integrated system requires fewer images and less movement to guide and confirm adequate placement compared to current clinical practice. Code and data are available.
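The breach check reduces to a geometry question once corridor and wire are reconstructed: does the wire stay within the corridor's radius along its length? The sketch below shows that test with a cylindrical corridor model; all values, including the 5 mm radius, are invented for illustration and are not the paper's parameters.

```python
# Geometry sketch: does a K-wire line stay inside a cylindrical bone corridor?
import numpy as np

def max_radial_deviation(wire_p, wire_d, axis_p, axis_d, length, n=100):
    """Max distance from points along the wire to the corridor axis."""
    wire_d = wire_d / np.linalg.norm(wire_d)
    axis_d = axis_d / np.linalg.norm(axis_d)
    pts = wire_p + np.linspace(0.0, length, n)[:, None] * wire_d  # along the wire
    v = pts - axis_p
    radial = np.linalg.norm(v - (v @ axis_d)[:, None] * axis_d, axis=1)
    return radial.max()

corridor_radius = 5.0  # mm, hypothetical ramus corridor radius
dev = max_radial_deviation(np.array([1.0, 0.5, 0.0]), np.array([0.02, 0.01, 1.0]),
                           np.zeros(3), np.array([0.0, 0.0, 1.0]), length=80.0)
print("possible cortical breach" if dev > corridor_radius else "within corridor",
      round(dev, 2))
```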
Affiliation(s)
- Cong Gao, Sean Darcy: Johns Hopkins University, Baltimore, MD 21210, USA
- Mehran Armand: Johns Hopkins University, Baltimore, MD 21210, USA; Department of Orthopaedic Surgery, Johns Hopkins University, Baltimore, USA
- Greg Osgood: Department of Orthopaedic Surgery, Johns Hopkins University, Baltimore, USA

11. Zhang J, Mazurowski MA, Allen BC, Wildman-Tobriner B. Multistep Automated Data Labelling Procedure (MADLaP) for thyroid nodules on ultrasound: An artificial intelligence approach for automating image annotation. Artif Intell Med 2023;141:102553. PMID: 37295897; DOI: 10.1016/j.artmed.2023.102553.
Abstract
Machine learning (ML) for diagnosis of thyroid nodules on ultrasound is an active area of research. However, ML tools require large, well-labeled datasets, the curation of which is time-consuming and labor-intensive. The purpose of our study was to develop and test a deep-learning-based tool to facilitate and automate the data annotation process for thyroid nodules; we named our tool Multistep Automated Data Labelling Procedure (MADLaP). MADLaP was designed to take multiple inputs, including pathology reports, ultrasound images, and radiology reports. Using multiple step-wise 'modules', including rule-based natural language processing, deep-learning-based image segmentation, and optical character recognition, MADLaP automatically identified images of a specific thyroid nodule and correctly assigned a pathology label. The model was developed using a training set of 378 patients across our health system and tested on a separate set of 93 patients. Ground truths for both sets were selected by an experienced radiologist. Performance metrics including yield (how many labeled images the model produced) and accuracy (percentage correct) were measured using the test set. MADLaP achieved a yield of 63% and an accuracy of 83%. The yield progressively increased as the input data moved through each module, while accuracy peaked partway through. Error analysis showed that inputs from certain examination sites had lower accuracy (40%) than the other sites (90%, 100%). MADLaP successfully created curated datasets of labeled ultrasound images of thyroid nodules. While accurate, MADLaP's relatively suboptimal yield exposed some challenges in automatically labeling radiology images from heterogeneous sources. The complex task of image curation and annotation could be automated, allowing for enrichment of larger datasets for use in machine learning development.
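A miniature analogue of the rule-based NLP module makes the idea concrete: regex patterns map pathology report text to a label, with unmatched reports routed to manual review. The patterns below are illustrative, not the study's actual rule set.

```python
# Toy rule-based pathology label extraction (stand-in for MADLaP's NLP module).
import re
from typing import Optional

RULES = [
    (re.compile(r"papillary (thyroid )?carcinoma", re.I), "malignant"),
    (re.compile(r"benign follicular nodule|adenomatoid nodule", re.I), "benign"),
    (re.compile(r"atypia of undetermined significance", re.I), "indeterminate"),
]

def label_report(text: str) -> Optional[str]:
    for pattern, label in RULES:
        if pattern.search(text):
            return label
    return None  # no rule fired: route to manual review

print(label_report("FINAL DIAGNOSIS: Papillary thyroid carcinoma, left lobe."))
```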
Affiliation(s)
- Jikai Zhang: Department of Electrical and Computer Engineering, Duke University, Room 10070, 2424 Erwin Rd, Durham, NC 27705, United States
- Maciej A Mazurowski: Department of Radiology, Duke University Medical Center; Departments of Electrical and Computer Engineering, Biostatistics and Bioinformatics, and Computer Science, Duke University, Durham, NC 27705, United States
- Brian C Allen, Benjamin Wildman-Tobriner: Department of Radiology, Duke University Medical Center, Box 3808, Durham, NC 27710, United States

12. Sun W, Zhao Y, Liu J, Zheng G. LatentPCN: latent space-constrained point cloud network for reconstruction of 3D patient-specific bone surface models from calibrated biplanar X-ray images. Int J Comput Assist Radiol Surg 2023. PMID: 37027083; DOI: 10.1007/s11548-023-02877-3.
Abstract
PURPOSE: Accurate three-dimensional (3D) models play crucial roles in computer-assisted planning and interventions. MR or CT images are frequently used to derive 3D models but have the disadvantages of being expensive (MR) or involving ionizing radiation (CT). An alternative method based on calibrated 2D biplanar X-ray images is highly desired. METHODS: A point cloud network, referred to as LatentPCN, is developed for reconstruction of 3D surface models from calibrated biplanar X-ray images. LatentPCN consists of three components: an encoder, a predictor, and a decoder. During training, a latent space is learned to represent shape features. After training, LatentPCN maps sparse silhouettes generated from 2D images to a latent representation, which the decoder takes as input to derive a 3D bone surface model. Additionally, LatentPCN allows for estimation of a patient-specific reconstruction uncertainty. RESULTS: We designed and conducted comprehensive experiments on datasets of 25 simulated cases and 10 cadaveric cases to evaluate the performance of LatentPCN. On these two datasets, the mean reconstruction errors achieved by LatentPCN were 0.83 mm and 0.92 mm, respectively. A correlation between large reconstruction errors and high uncertainty in the reconstruction results was observed. CONCLUSION: LatentPCN can reconstruct patient-specific 3D surface models from calibrated 2D biplanar X-ray images with high accuracy and uncertainty estimation. The sub-millimeter reconstruction accuracy on cadaveric cases demonstrates its potential for surgical navigation applications.
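The encoder/predictor/decoder layout can be sketched as a skeleton network. Layer widths, point counts, and the max-pooling choice below are placeholders, not the published architecture.

```python
# Skeleton (PyTorch) of an encoder/predictor/decoder point cloud network.
import torch
import torch.nn as nn

class LatentPCNSketch(nn.Module):
    def __init__(self, n_out=4096, latent=256):
        super().__init__()
        self.encoder = nn.Sequential(            # per-point silhouette features
            nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 256))
        self.predictor = nn.Sequential(          # pooled features -> latent code
            nn.Linear(256, latent), nn.ReLU(), nn.Linear(latent, latent))
        self.decoder = nn.Sequential(            # latent code -> surface points
            nn.Linear(latent, 1024), nn.ReLU(), nn.Linear(1024, n_out * 3))
        self.n_out = n_out

    def forward(self, sil_pts):                  # (B, n_sil, 3)
        feat = self.encoder(sil_pts).max(dim=1).values  # permutation-invariant pool
        z = self.predictor(feat)
        return self.decoder(z).view(-1, self.n_out, 3)

out = LatentPCNSketch()(torch.randn(2, 2048, 3))
print(out.shape)  # torch.Size([2, 4096, 3])
```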
Affiliation(s)
- Wenyuan Sun, Yuyun Zhao, Jihao Liu, Guoyan Zheng: Institute of Medical Robotics, Shanghai Jiao Tong University, Dongchuan Road, Shanghai 200240, China

13. Gao C, Killeen BD, Hu Y, Grupp RB, Taylor RH, Armand M, Unberath M. Synthetic data accelerates the development of generalizable learning-based algorithms for X-ray image analysis. Nat Mach Intell 2023;5:294-308. PMID: 38523605; PMCID: PMC10959504; DOI: 10.1038/s42256-023-00629-1.
Abstract
Artificial intelligence (AI) now enables automated interpretation of medical images. However, AI's potential use for interventional image analysis remains largely untapped. This is because the post hoc analysis of data collected during live procedures has fundamental and practical limitations, including ethical considerations, expense, scalability, data integrity and a lack of ground truth. Here we demonstrate that creating realistic simulated images from human models is a viable alternative and complement to large-scale in situ data collection. We show that training AI image analysis models on realistically synthesized data, combined with contemporary domain generalization techniques, results in machine learning models that on real data perform comparably to models trained on a precisely matched real data training set. We find that our model transfer paradigm for X-ray image analysis, which we refer to as SyntheX, can even outperform real-data-trained models due to the effectiveness of training on a larger dataset. SyntheX provides an opportunity to markedly accelerate the conception, design and evaluation of X-ray-based intelligent systems. In addition, SyntheX provides the opportunity to test novel instrumentation, design complementary surgical approaches, and envision novel techniques that improve outcomes, save time or mitigate human error, free from the ethical and practical considerations of live human data collection.
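The domain generalization ingredient can be illustrated with a small augmentation routine: heavy, randomized corruptions of synthetic X-rays so a model trained on them tolerates real-image variation. The specific transforms and parameter ranges below are common choices, not SyntheX's exact set.

```python
# Sketch: domain-randomized corruptions of a synthetic radiograph.
import numpy as np
from scipy.ndimage import gaussian_filter

def randomize(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    out = img ** rng.uniform(0.5, 2.0)                              # random gamma
    out = gaussian_filter(out, sigma=rng.uniform(0.0, 1.5))         # random blur
    out = out + rng.normal(0.0, rng.uniform(0.0, 0.05), img.shape)  # sensor noise
    if rng.random() < 0.5:                                          # inversion
        out = out.max() - out
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(42)
drr = rng.random((256, 256))                      # placeholder synthetic image
batch = [randomize(drr, rng) for _ in range(4)]   # four training variants
print(batch[0].shape)
```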
Affiliation(s)
- Cong Gao, Benjamin D. Killeen, Yicheng Hu, Robert B. Grupp, Russell H. Taylor, Mathias Unberath: Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Mehran Armand: Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA; Department of Orthopaedic Surgery, Johns Hopkins Applied Physics Laboratory, Baltimore, MD, USA

14. Liu M, Martin-Gomez A, Oni JK, Mears SC, Armand M. Towards Visualizing Early-stage Osteonecrosis using Intraoperative Imaging Modalities. Comput Methods Biomech Biomed Eng Imaging Vis 2022;11:1234-1242. PMID: 38179232; PMCID: PMC10766436; DOI: 10.1080/21681163.2022.2157329.
Abstract
Osteonecrosis of the Femoral Head (ONFH) is a progressive disease characterized by the death of bone cells due to the loss of blood supply. Early detection and treatment of this disease are vital to avoiding total hip replacement. While early stages of ONFH can be diagnosed using Magnetic Resonance Imaging (MRI), commonly used intra-operative imaging modalities such as fluoroscopy frequently fail to depict the lesion, making intra-operative localization of osteonecrosis difficult. This work introduces a novel framework that enables the localization of necrotic lesions in Computed Tomography (CT) as a step toward localizing and visualizing necrotic lesions in intra-operative images. The proposed framework uses deep learning algorithms for automatic segmentation of the femur, pelvis, and necrotic lesions in MRI. An additional step performs semi-automatic segmentation of these anatomies, excluding the necrotic lesions, in CT. A final step performs pairwise registration of the corresponding anatomies, allowing the necrosis to be localized and visualized in CT. To investigate the feasibility of integrating the proposed framework into the surgical workflow, we conducted experiments on MRIs and CTs containing early-stage ONFH. Our results indicate that the proposed framework segments the anatomical structures of interest and accurately registers the femurs and pelvis of the corresponding volumes, allowing for visualization and localization of ONFH in CT and generated X-rays, which could enable intra-operative visualization of necrotic lesions for surgical procedures such as core decompression of the femur.
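The final pairwise registration step could be prototyped with an off-the-shelf toolkit; the sketch below uses SimpleITK with mutual information for the cross-modality case. This is one common recipe, assumed for illustration; the paper does not prescribe this library, metric, or these optimizer settings, and the file names are hypothetical.

```python
# Hedged sketch: rigid MRI-to-CT registration of a segmented anatomy with SimpleITK.
import SimpleITK as sitk

def register_rigid(fixed, moving):
    fixed = sitk.Cast(fixed, sitk.sitkFloat32)
    moving = sitk.Cast(moving, sitk.sitkFloat32)
    reg = sitk.ImageRegistrationMethod()
    reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)  # cross-modality
    reg.SetMetricSamplingStrategy(reg.RANDOM)
    reg.SetMetricSamplingPercentage(0.1)
    reg.SetInterpolator(sitk.sitkLinear)
    reg.SetOptimizerAsRegularStepGradientDescent(
        learningRate=1.0, minStep=1e-4, numberOfIterations=200)
    init = sitk.CenteredTransformInitializer(
        fixed, moving, sitk.Euler3DTransform(),
        sitk.CenteredTransformInitializerFilter.GEOMETRY)
    reg.SetInitialTransform(init, inPlace=False)
    return reg.Execute(fixed, moving)

# Usage (hypothetical file names):
# tfm = register_rigid(sitk.ReadImage("ct_femur.nii.gz"),
#                      sitk.ReadImage("mri_femur.nii.gz"))
```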
Affiliation(s)
- Mingxu Liu: Biomechanical- and Image-Guided Surgical Systems (BIGSS), Laboratory for Computational Sensing and Robotics, Johns Hopkins University, Baltimore, MD, USA
- Alejandro Martin-Gomez: BIGSS, Laboratory for Computational Sensing and Robotics, and Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Julius K Oni: Department of Orthopaedic Surgery, Johns Hopkins University, Baltimore, MD, USA
- Simon C Mears: Department of Orthopaedic Surgery, University of Arkansas for Medical Sciences, AR, USA
- Mehran Armand: BIGSS, Laboratory for Computational Sensing and Robotics; Departments of Computer Science, Orthopaedic Surgery, and Mechanical Engineering, Johns Hopkins University, Baltimore, MD, USA

15. Killeen BD, Winter J, Gu W, Martin-Gomez A, Taylor RH, Osgood G, Unberath M. Mixed Reality Interfaces for Achieving Desired Views with Robotic X-ray Systems. Comput Methods Biomech Biomed Eng Imaging Vis 2022;11:1130-1135. PMID: 37555199; PMCID: PMC10406465; DOI: 10.1080/21681163.2022.2154272.
Abstract
Robotic X-ray C-arm imaging systems can precisely achieve any position and orientation relative to the patient. Informing the system which pose corresponds to a desired view, however, is challenging. Currently these systems are operated by the surgeon using joysticks, but this interaction paradigm is not necessarily effective because users may be unable to efficiently actuate more than a single axis of the system simultaneously. Moreover, novel robotic imaging systems, such as the Brainlab Loop-X, allow for independent source and detector movements, adding even more complexity. To address this challenge, we consider complementary interfaces for the surgeon to command robotic X-ray systems effectively. Specifically, we consider three interaction paradigms: (1) the use of a pointer to specify the principal ray of the desired view relative to the anatomy; (2) the same pointer, but combined with a mixed reality environment to synchronously render digitally reconstructed radiographs from the tool's pose; and (3) the same mixed reality environment but with a virtual X-ray source instead of the pointer. Initial human-in-the-loop evaluation with an attending trauma surgeon indicates that mixed reality interfaces for robotic X-ray system control are promising and may contribute to substantially reducing the number of X-ray images acquired solely during "fluoro hunting" for the desired view or standard plane.
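Interface (1) reduces to a small geometry problem: turn a pointer-specified principal ray into a viewing orientation whose axis aligns with the ray. The sketch below builds such a rotation by Gram-Schmidt; the up-vector convention is an assumption, and a real system would add reachability and collision constraints.

```python
# Geometry sketch: viewing orientation from a pointer-specified principal ray.
import numpy as np

def view_from_ray(ray_dir, up_hint=np.array([0.0, 0.0, 1.0])):
    """Rotation matrix whose third column (viewing axis) is the principal ray."""
    z = ray_dir / np.linalg.norm(ray_dir)
    x = np.cross(up_hint, z)
    if np.linalg.norm(x) < 1e-6:            # ray parallel to the up hint
        x = np.cross(np.array([1.0, 0.0, 0.0]), z)
    x /= np.linalg.norm(x)
    y = np.cross(z, x)
    return np.column_stack([x, y, z])       # orthonormal, det = +1

Rview = view_from_ray(np.array([0.2, -0.9, 0.4]))
print(np.round(Rview.T @ Rview, 6))         # identity: valid rotation
```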
Affiliation(s)
- Benjamin D Killeen, Jonas Winter, Wenhao Gu, Alejandro Martin-Gomez, Russell H Taylor, Mathias Unberath: Laboratory for Computational Sensing and Robotics, Johns Hopkins University, Baltimore, MD, USA
- Greg Osgood: Department of Orthopaedic Surgery, Johns Hopkins Hospital, Baltimore, MD, USA

16. Gao C, Phalen H, Margalit A, Ma JH, Ku PC, Unberath M, Taylor RH, Jain A, Armand M. Fluoroscopy-Guided Robotic System for Transforaminal Lumbar Epidural Injections. IEEE Trans Med Robot Bionics 2022;4:901-909. PMID: 37790985; PMCID: PMC10544812; DOI: 10.1109/tmrb.2022.3196321.
Abstract
We present an autonomous robotic spine needle injection system using fluoroscopic image-based navigation. Our system includes patient-specific planning, intra-operative image-based 2D/3D registration and navigation, and automatic robot-guided needle injection. We performed extensive simulation studies to validate the registration accuracy, achieving mean registration errors of 0.8 ± 0.3 mm and 0.9 ± 0.7 degrees for the spine vertebrae, and 0.2 ± 0.6 mm and 1.2 ± 1.3 degrees for the injection device, in translation and rotation, respectively. We then conducted cadaveric studies comparing our system to an experienced clinician's free-hand injections. Robotic injections achieved a mean needle tip translational error of 5.1 ± 2.4 mm and a needle orientation error of 3.6 ± 1.9 degrees, compared to 7.6 ± 2.8 mm and 9.9 ± 4.7 degrees for the clinician's free-hand injections. During injections, all needle tips were placed within the safety zones defined for this application. The results suggest the feasibility of using our image-guided robotic injection system for spinal orthopedic applications.
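The two evaluation metrics reported above are straightforward to compute from planned and achieved needle poses; the sketch below shows one such computation with invented numbers.

```python
# Sketch: needle tip translation error (mm) and orientation error (degrees).
import numpy as np

def needle_errors(tip_plan, dir_plan, tip_exec, dir_exec):
    trans_mm = np.linalg.norm(np.asarray(tip_exec) - np.asarray(tip_plan))
    cosang = np.dot(dir_plan, dir_exec) / (
        np.linalg.norm(dir_plan) * np.linalg.norm(dir_exec))
    angle_deg = np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))
    return trans_mm, angle_deg

print(needle_errors([0, 0, 0], [0, 0, 1], [3.2, -2.8, 2.5], [0.05, 0.02, 1.0]))
```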
Affiliation(s)
- Cong Gao, Ping-Cheng Ku, Mathias Unberath, Russell H Taylor: Department of Computer Science, Johns Hopkins University, Baltimore, MD 21211, USA
- Henry Phalen, Justin H Ma: Department of Mechanical Engineering, Johns Hopkins University, Baltimore, MD 21211, USA
- Adam Margalit, Amit Jain: Department of Orthopaedic Surgery, Baltimore, MD 21224, USA
- Mehran Armand: Departments of Computer Science and Mechanical Engineering, Johns Hopkins University, Baltimore, MD 21211, USA; Department of Orthopaedic Surgery, Baltimore, MD 21224, USA; Johns Hopkins Applied Physics Laboratory, Baltimore, MD 21224, USA

17. Kausch L, Thomas S, Kunze H, Norajitra T, Klein A, Ayala L, El Barbari J, Mandelka E, Privalov M, Vetter S, Mahnken A, Maier-Hein L, Maier-Hein K. C-arm positioning for standard projections during spinal implant placement. Med Image Anal 2022;81:102557. DOI: 10.1016/j.media.2022.102557.

18. Gao C, Phalen H, Sefati S, Ma J, Taylor R, Unberath M, Armand M. Fluoroscopic Navigation for a Surgical Robotic System Including a Continuum Manipulator. IEEE Trans Biomed Eng 2022;69:453-464. PMID: 34270412; PMCID: PMC8817231; DOI: 10.1109/tbme.2021.3097631.
Abstract
We present an image-based navigation solution for a surgical robotic system with a Continuum Manipulator (CM). Our navigation system uses only fluoroscopic images from a mobile C-arm to estimate the CM shape and pose with respect to the bone anatomy. The CM pose and shape estimation is achieved using intensity-based 2D/3D image registration. A learning-based framework is used to automatically detect the CM in X-ray images, identifying landmark features that are used to initialize and regularize image registration. We also propose a modified hand-eye calibration method that numerically optimizes the hand-eye matrix during image registration. The proposed navigation system for CM positioning was tested in simulation and cadaveric studies. In simulation, the proposed registration achieved a mean error of 1.10±0.72 mm between the CM tip and a target entry point on the femur. In cadaveric experiments, the mean CM tip position error was 2.86±0.80 mm after registration and repositioning of the CM. The results suggest that the proposed fluoroscopic navigation is feasible for guiding the CM in orthopedic applications.
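For context on the hand-eye step, the conventional baseline that the paper's method refines can be sketched with OpenCV's built-in solver. This is a generic recipe, assumed for illustration; the paper's contribution is the subsequent numerical optimization of the hand-eye matrix inside image registration, which is not shown here.

```python
# Hedged sketch: conventional hand-eye calibration (AX = XB) with OpenCV.
import numpy as np
import cv2

def hand_eye(R_gripper2base, t_gripper2base, R_target2cam, t_target2cam):
    """Inputs: lists of 3x3 rotations and 3x1 translations from >= 3 motion pairs."""
    R_cam2gripper, t_cam2gripper = cv2.calibrateHandEye(
        R_gripper2base, t_gripper2base, R_target2cam, t_target2cam,
        method=cv2.CALIB_HAND_EYE_TSAI)
    X = np.eye(4)
    X[:3, :3], X[:3, 3] = R_cam2gripper, t_cam2gripper.ravel()
    return X  # 4x4 camera-to-gripper ("hand-eye") transform

# Usage: X = hand_eye(Rg, tg, Rc, tc); X would then be refined during
# 2D/3D registration as described above.
```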

19. Zhou C, Cha T, Peng Y, Li G. Transfer learning from an artificial radiograph-landmark dataset for registration of the anatomic skull model to dual fluoroscopic X-ray images. Comput Biol Med 2021;138:104923. PMID: 34638020; DOI: 10.1016/j.compbiomed.2021.104923.
Abstract
Registration of 3D anatomic structures to their 2D dual fluoroscopic X-ray images is a widely used motion tracking technique. However, deep learning implementations are often impeded by a paucity of medical images and ground truths. In this study, we proposed a transfer learning strategy for 3D-to-2D registration using deep neural networks trained on an artificial dataset. Digitally reconstructed radiographs (DRRs) and radiographic skull landmarks were automatically created from craniocervical CT data of a female subject. They were used to train a residual network (ResNet) for landmark detection and a cycle generative adversarial network (GAN) to eliminate the style difference between DRRs and actual X-rays. Landmarks on the X-rays after GAN style translation were detected by the ResNet and used in triangulation optimization for 3D-to-2D registration of the skull in actual dual-fluoroscope images (with a non-orthogonal setup, point X-ray sources, image distortions, and partially captured skull regions). The registration accuracy was evaluated in multiple scenarios of craniocervical motion. In walking, learning-based registration of the skull had angular/position errors of 3.9 ± 2.1°/4.6 ± 2.2 mm. However, the accuracy was lower during functional neck activity, owing to overly small skull regions imaged on the dual fluoroscopic images at end-range positions. This strategy of augmenting artificial training data can tackle the complicated skull registration scenario and has potential to extend to widespread registration scenarios.
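The triangulation at the heart of the dual-fluoroscope setup can be shown with the standard linear (DLT) method: a 3D landmark is recovered from its detections in two calibrated views. The projection matrices below are synthetic stand-ins for the calibrated fluoroscopes.

```python
# Sketch: two-view DLT triangulation of a landmark from calibrated projections.
import numpy as np

def triangulate(P1, P2, x1, x2):
    """P1, P2: 3x4 projection matrices; x1, x2: 2D detections (pixels)."""
    A = np.stack([x1[0] * P1[2] - P1[0], x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0], x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

K = np.array([[1000.0, 0, 256], [0, 1000.0, 256], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
Ry = np.array([[0.0, 0, 1], [0, 1, 0], [-1, 0, 0]])           # second view, 90° apart
P2 = K @ np.hstack([Ry, np.array([[0.0], [0.0], [500.0]])])
X_true = np.array([30.0, -20.0, 500.0])
x1 = P1 @ np.append(X_true, 1); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1); x2 = x2[:2] / x2[2]
print(triangulate(P1, P2, x1, x2))                            # ~ X_true
```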
Affiliation(s)
- Chaochao Zhou, Thomas Cha: Orthopaedic Bioengineering Research Center, Department of Orthopaedic Surgery, Newton-Wellesley Hospital, Newton, MA, USA; Department of Orthopaedic Surgery, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Yun Peng: NuVasive Inc, San Diego, CA, USA
- Guoan Li: Orthopaedic Bioengineering Research Center, Department of Orthopaedic Surgery, Newton-Wellesley Hospital, Newton, MA, USA

20. Grimm M, Esteban J, Unberath M, Navab N. Pose-Dependent Weights and Domain Randomization for Fully Automatic X-Ray to CT Registration. IEEE Trans Med Imaging 2021;40:2221-2232. PMID: 33861701; DOI: 10.1109/tmi.2021.3073815.
Abstract
Fully automatic X-ray to CT registration requires a solid initialization to provide an initial alignment within the capture range of existing intensity-based registrations. This work addresses that need by providing a novel automatic initialization, which enables end-to-end registration. First, a neural network is trained once to detect a set of anatomical landmarks on simulated X-rays. A domain randomization scheme is proposed to enable the network to overcome the challenge of being trained purely on simulated data and run inference on real X-rays. Then, for each patient CT, a fully automatic, patient-specific landmark extraction scheme is used, based on backprojecting and clustering the previously trained network's predictions on a set of simulated X-rays. Next, the network is retrained to detect the new landmarks. Finally, the combination of network and 3D landmark locations is used to compute the initialization using a perspective-n-point algorithm. During the computation of the pose, a weighting scheme is introduced to incorporate the network's confidence in detecting the landmarks. The algorithm is evaluated on the pelvis using both real and simulated X-rays. The mean (± standard deviation) target registration error in millimetres is 4.1 ± 4.3 for simulated X-rays with a success rate of 92% and 4.2 ± 3.9 for real X-rays with a success rate of 86.8%, where a success is defined as a translation error of less than 30 mm.
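The initialization step can be sketched with OpenCV's perspective-n-point solver. Note the hedge: the paper's pose-dependent weighting is approximated here by simply thresholding landmark confidences, which is a crude stand-in rather than the published scheme, and all data are synthetic.

```python
# Sketch: PnP initialization from detected 2D landmarks and their 3D CT locations.
import numpy as np
import cv2

def initialize_pose(pts3d, pts2d, conf, K, conf_min=0.5):
    keep = conf >= conf_min                         # crude proxy for soft weighting
    ok, rvec, tvec = cv2.solvePnP(
        pts3d[keep].astype(np.float64), pts2d[keep].astype(np.float64),
        K, None, flags=cv2.SOLVEPNP_EPNP)
    return ok, rvec, tvec

K = np.array([[1200.0, 0, 320], [0, 1200.0, 240], [0, 0, 1]])
pts3d = np.random.rand(8, 3) * 100
rvec_t = np.array([[0.1], [0.2], [0.0]]); tvec_t = np.array([[10.0], [5.0], [900.0]])
pts2d, _ = cv2.projectPoints(pts3d, rvec_t, tvec_t, K, None)
ok, rvec, tvec = initialize_pose(pts3d, pts2d.reshape(-1, 2), np.ones(8), K)
print(ok, tvec.ravel())                             # ~ (10, 5, 900)
```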
21
Unberath M, Gao C, Hu Y, Judish M, Taylor RH, Armand M, Grupp R. The Impact of Machine Learning on 2D/3D Registration for Image-Guided Interventions: A Systematic Review and Perspective. Front Robot AI 2021; 8:716007. [PMID: 34527706 PMCID: PMC8436154 DOI: 10.3389/frobt.2021.716007] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2021] [Accepted: 07/30/2021] [Indexed: 11/13/2022] Open
Abstract
Image-based navigation is widely considered the next frontier of minimally invasive surgery. It is believed that image-based navigation will increase access to reproducible, safe, and high-precision surgery, as it may then be performed at acceptable cost and effort. This is because image-based techniques avoid the need for specialized equipment and seamlessly integrate with contemporary workflows. Furthermore, it is expected that image-based navigation techniques will play a major role in enabling mixed reality environments, as well as autonomous and robot-assisted workflows. A critical component of image guidance is 2D/3D registration, a technique to estimate the spatial relationships between 3D structures, e.g., preoperative volumetric imagery or models of surgical instruments, and 2D images thereof, such as intraoperative X-ray fluoroscopy or endoscopy. While image-based 2D/3D registration is a mature technique, its transition from the bench to the bedside has been restrained by well-known challenges, including brittleness with respect to the optimization objective, hyperparameter selection, and initialization; difficulties in dealing with inconsistencies or multiple objects; and limited single-view performance. One reason these challenges persist today is that analytical solutions are likely inadequate considering the complexity, variability, and high dimensionality of generic 2D/3D registration problems. The recent advent of machine learning-based approaches to imaging problems that, rather than specifying the desired functional mapping, approximate it using highly expressive parametric models holds promise for solving some of the notorious challenges in 2D/3D registration. In this manuscript, we review the impact of machine learning on 2D/3D registration to systematically summarize the recent advances made by the introduction of this novel technology. Grounded in these insights, we then offer our perspective on the most pressing needs, significant open problems, and possible next steps.
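For orientation, the conventional intensity-based 2D/3D registration that this review takes as its baseline reduces to optimizing pose parameters of a DRR against the fixed X-ray under a similarity metric. The toy sketch below (parallel-beam projection, normalized cross-correlation, Powell optimizer) only illustrates the structure of such a pipeline and its dependence on initialization; all function names are ours, not from any cited system.

```python
import numpy as np
from scipy.ndimage import affine_transform
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

def toy_drr(volume, pose):
    """Toy parallel-beam DRR: rigidly resample the volume, then sum along one axis.
    pose = (rx, ry, rz) in degrees followed by (tx, ty, tz) in voxels.
    The resampling follows affine_transform's output-to-input convention; the
    exact pose parametrization is immaterial for the illustration."""
    R = Rotation.from_euler("xyz", pose[:3], degrees=True).as_matrix()
    center = (np.array(volume.shape) - 1) / 2.0
    offset = center - R @ center - np.asarray(pose[3:])
    moved = affine_transform(volume, R, offset=offset, order=1)
    return moved.sum(axis=0)  # line integrals -> simulated radiograph

def ncc(a, b):
    """Normalized cross-correlation, a common 2D/3D similarity metric."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float((a * b).mean())

def register(volume, fixed_xray, init_pose):
    """Minimize -NCC over the 6 rigid pose parameters. This succeeds only when
    init_pose lies within the (narrow) capture range -- the brittleness with
    respect to initialization that the review discusses."""
    cost = lambda p: -ncc(toy_drr(volume, p), fixed_xray)
    return minimize(cost, init_pose, method="Powell")
```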
Affiliation(s)
- Mathias Unberath
- Advanced Robotics and Computationally Augmented Environments (ARCADE) Lab, Department of Computer Science, Johns Hopkins University, Baltimore, MD, United States
22
D'Isidoro F, Chênes C, Ferguson SJ, Schmid J. A new 2D-3D registration gold-standard dataset for the hip joint based on uncertainty modeling. Med Phys 2021; 48:5991-6006. [PMID: 34287934 PMCID: PMC9290855 DOI: 10.1002/mp.15124] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2020] [Revised: 03/15/2021] [Accepted: 06/28/2021] [Indexed: 12/11/2022] Open
Abstract
Purpose: Estimation of the accuracy of 2D-3D registration is paramount for a correct evaluation of its outcome in both research and clinical studies. Publicly available datasets with a standardized evaluation methodology are necessary for the validation and comparison of 2D-3D registration techniques. Given the widespread use of 2D-3D registration in biomechanics, we introduced the first gold-standard validation dataset for computed tomography (CT)-to-X-ray registration of the hip joint, based on fluoroscopic images with large rotation angles. As the ground truth computed with fiducial markers is affected by localization errors in the image datasets, we proposed a new methodology based on uncertainty propagation to estimate the accuracy of a gold-standard dataset.
Methods: The gold-standard dataset included a 3D CT scan of a female hip phantom and 19 2D fluoroscopic images acquired at different views and voltages. The ground-truth transformations were estimated from the corresponding pairs of extracted 2D and 3D fiducial locations. These were assumed to be corrupted by Gaussian noise, without assuming isotropy. We devised the multiple projective points criterion (MPPC), which jointly optimizes the transformations and the noisy 3D fiducial locations for all views. The accuracy of the transformations obtained with the MPPC was assessed in both synthetic and real experiments using different formulations of the target registration error (TRE), including a novel formulation (uTRE) derived from the uncertainty analysis of the MPPC.
Results: The proposed MPPC method was statistically more accurate than validation methods for 2D-3D registration that did not optimize the 3D fiducial positions or wrongly assumed isotropy of the noise. The reported results were comparable to previously published gold-standard datasets. However, a formulation of the TRE commonly found in these gold-standard datasets was found to significantly miscalculate the true TRE computed in synthetic experiments with known ground truths. In contrast, the uncertainty-based uTRE was statistically closer to the true TRE.
Conclusions: We proposed a new gold-standard dataset for the validation of CT-to-X-ray registration of the hip joint. The gold-standard transformations were derived from a novel method modeling the uncertainty in extracted 2D and 3D fiducials. Results showed that considering possible noise anisotropy and including corrupted 3D fiducials in the optimization improved the accuracy of the gold standard. A new uncertainty-based formulation of the TRE also appeared to be a good alternative to the unknown true TRE, which previous works replaced with an alternative TRE that did not fully reflect the gold-standard accuracy.
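As a point of reference, the target registration error these gold-standard comparisons revolve around is simply the distance between target points mapped by the ground-truth versus the estimated transformation. A minimal sketch in our own notation, assuming 4×4 homogeneous rigid transforms:

```python
import numpy as np

def tre(T_gt, T_est, targets):
    """Mean target registration error in the units of `targets`.

    T_gt, T_est : (4, 4) homogeneous ground-truth / estimated rigid transforms.
    targets     : (N, 3) target points (e.g., points on the hip joint surface).
    """
    p_gt = targets @ T_gt[:3, :3].T + T_gt[:3, 3]
    p_est = targets @ T_est[:3, :3].T + T_est[:3, 3]
    return float(np.linalg.norm(p_gt - p_est, axis=1).mean())
```

The paper's point is that when T_gt is itself estimated from noisy fiducials, this quantity is only a proxy, which is what motivates the uncertainty-propagated uTRE.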
Affiliation(s)
- Christophe Chênes
- Geneva School of Health Sciences, HES-SO University of Applied Sciences and Arts of Western Switzerland, Geneva, Switzerland
- Jérôme Schmid
- Geneva School of Health Sciences, HES-SO University of Applied Sciences and Arts of Western Switzerland, Geneva, Switzerland
23
Grupp RB, Murphy RJ, Hegeman RA, Alexander CP, Unberath M, Otake Y, McArthur BA, Armand M, Taylor RH. Fast and automatic periacetabular osteotomy fragment pose estimation using intraoperatively implanted fiducials and single-view fluoroscopy. Phys Med Biol 2020; 65:245019. [PMID: 32590372 DOI: 10.1088/1361-6560/aba089] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
Accurate and consistent mental interpretation of fluoroscopy to determine the position and orientation of acetabular bone fragments in 3D space is difficult. We propose a computer-assisted approach that uses a single fluoroscopic view and quickly reports the pose of an acetabular fragment without any user input or initialization. Intraoperatively, but prior to any osteotomies, two constellations of metallic ball-bearings (BBs) are injected into the wing of a patient's ilium and lateral superior pubic ramus. One constellation is located on the expected acetabular fragment, and the other is located on the remaining, larger, pelvis fragment. The 3D locations of each BB are reconstructed using three fluoroscopic views and 2D/3D registrations to a preoperative CT scan of the pelvis. The relative pose of the fragment is established by estimating the movement of the two BB constellations using a single fluoroscopic view taken after osteotomy and fragment relocation. BB detection and inter-view correspondences are automatically computed throughout the processing pipeline. The proposed method was evaluated on fluoroscopic images collected from six cadaveric surgeries performed bilaterally on three specimens. Mean fragment rotation error was 2.4 ± 1.0 degrees, mean translation error was 2.1 ± 0.6 mm, and mean 3D lateral center edge angle error was 1.0 ± 0.5 degrees. The average runtime of the single-view pose estimation was 0.7 ± 0.2 s. The proposed method demonstrates accuracy similar to other state-of-the-art systems, which require optical tracking systems or multiple-view 2D/3D registrations with manual input. The errors reported on fragment poses and lateral center edge angles are within the margins required for accurate intraoperative evaluation of femoral head coverage.
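The fragment pose is recovered from the rigid motion between the pre- and post-osteotomy BB constellations; the standard least-squares tool for that step is the Kabsch/Procrustes algorithm, sketched here under the simplifying assumption of known one-to-one BB correspondences (the paper additionally automates detection and correspondence).

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ≈ src @ R.T + t.

    src, dst : (N, 3) corresponding 3D BB positions before/after relocation.
    """
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)          # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t
```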
Affiliation(s)
- R B Grupp
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, United States of America
24
Gao C, Liu X, Gu W, Killeen B, Armand M, Taylor R, Unberath M. Generalizing Spatial Transformers to Projective Geometry with Applications to 2D/3D Registration. MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION : MICCAI ... INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION 2020; 12263:329-339. [PMID: 33135014 DOI: 10.1007/978-3-030-59716-0_32] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
Abstract
Differentiable rendering is a technique to connect 3D scenes with corresponding 2D images. Since it is differentiable, processes during image formation can be learned. Previous approaches to differentiable rendering focus on mesh-based representations of 3D scenes, which are inappropriate for medical applications where volumetric, voxelized models are used to represent anatomy. We propose a novel Projective Spatial Transformer module that generalizes spatial transformers to projective geometry, thus enabling differentiable volume rendering. We demonstrate the usefulness of this architecture on the example of 2D/3D registration between radiographs and CT scans. Specifically, we show that our transformer enables end-to-end learning of an image processing and projection model that approximates an image similarity function that is convex with respect to the pose parameters, and can thus be optimized effectively using conventional gradient descent. To the best of our knowledge, we are the first to describe spatial transformers in the context of projective transmission imaging, including rendering and pose estimation. We hope that our developments will benefit related 3D research applications. The source code is available at https://github.com/gaocong13/Projective-Spatial-Transformers.
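The core idea, sampling a voxel volume along projective rays so the whole rendering stays differentiable, can be sketched in a few lines of PyTorch with grid_sample. The released code at the linked repository is the authoritative implementation; the geometry conventions below are simplified assumptions of ours.

```python
import torch
import torch.nn.functional as F

def cone_beam_drr(volume, source, det_origin, det_u, det_v,
                  n_u=128, n_v=128, n_steps=96):
    """Differentiable cone-beam projection of a voxel volume.

    volume     : (1, 1, D, H, W) tensor; coordinates are assumed pre-normalized
                 so the volume spans [-1, 1]^3 in grid_sample's (x, y, z) order.
    source     : (3,) X-ray source position in the same normalized coordinates.
    det_origin, det_u, det_v : (3,) detector corner and in-plane axis vectors.
    """
    u = torch.linspace(0.0, 1.0, n_u)
    v = torch.linspace(0.0, 1.0, n_v)
    vv, uu = torch.meshgrid(v, u, indexing="ij")
    # Pixel centers on the detector plane: (n_v, n_u, 3)
    pixels = det_origin + uu[..., None] * det_u + vv[..., None] * det_v
    # Sample points along each source-to-pixel ray: (n_steps, n_v, n_u, 3)
    t = torch.linspace(0.0, 1.0, n_steps)[:, None, None, None]
    pts = source + t * (pixels - source)
    # grid_sample interpolates the volume at the ray samples, differentiably.
    samples = F.grid_sample(volume, pts[None], align_corners=True)
    # Summing along each ray approximates the line integral: (1, 1, n_v, n_u)
    return samples.sum(dim=2)
```

Because the grid construction is plain tensor arithmetic, gradients flow from the rendered DRR back to any pose parameters that move `source` and the detector, which is what enables the end-to-end similarity learning described above.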
Affiliation(s)
- Cong Gao
- Johns Hopkins University, Baltimore MD 21218, USA
- Xingtong Liu
- Johns Hopkins University, Baltimore MD 21218, USA
- Wenhao Gu
- Johns Hopkins University, Baltimore MD 21218, USA
25
Gu W, Gao C, Grupp R, Fotouhi J, Unberath M. Extended Capture Range of Rigid 2D/3D Registration by Estimating Riemannian Pose Gradients. MACHINE LEARNING IN MEDICAL IMAGING. MLMI (WORKSHOP) 2020; 12436:281-291. [PMID: 33145587 PMCID: PMC7605345 DOI: 10.1007/978-3-030-59861-7_29] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Abstract
Traditional intensity-based 2D/3D registration requires near-perfect initialization in order for image similarity metrics to yield meaningful updates of X-ray pose and reduce the likelihood of getting trapped in a local minimum. Conventional approaches depend strongly on image appearance rather than content, and therefore fail to recover large pose offsets that substantially alter the appearance of the same structure. We complement traditional similarity metrics with a convolutional neural network-based (CNN-based) registration solution that captures large-range pose relations by extracting both local and contextual information, yielding meaningful X-ray pose updates without the need for accurate initialization. To register a 2D X-ray image and a 3D CT scan, our CNN accepts a target X-ray image and a digitally reconstructed radiograph at the current pose estimate as input and iteratively outputs pose updates in the direction of the pose gradient on the Riemannian manifold. Our approach integrates seamlessly with conventional image-based registration frameworks, where long-range relations are captured primarily by our CNN-based method while short-range offsets are recovered accurately with an image similarity-based method. On both synthetic and real X-ray images of the human pelvis, we demonstrate that the proposed method can successfully recover large rotational and translational offsets, irrespective of initialization.
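An update "in the direction of the pose gradient on the Riemannian manifold" amounts to retracting a predicted 6-vector twist onto SE(3) via the exponential map. A minimal sketch of that iteration follows; the `cnn` and `render` names are placeholders, not the paper's API.

```python
import numpy as np
from scipy.linalg import expm

def se3_exp(xi):
    """Exponential map from a twist xi = (wx, wy, wz, vx, vy, vz) to a 4x4 pose."""
    wx, wy, wz, vx, vy, vz = xi
    twist = np.array([[0.0, -wz,  wy, vx],
                      [ wz, 0.0, -wx, vy],
                      [-wy,  wx, 0.0, vz],
                      [0.0, 0.0, 0.0, 0.0]])
    return expm(twist)

def iterative_registration(T, volume, target_xray, cnn, render,
                           step=1.0, n_iters=10):
    """Apply CNN-predicted manifold pose updates, then hand off to an
    image-similarity refinement once within its capture range."""
    for _ in range(n_iters):
        drr = render(volume, T)                  # DRR at current pose estimate
        xi = cnn(target_xray, drr)               # predicted twist (pose gradient)
        T = se3_exp(step * np.asarray(xi)) @ T   # left-multiplicative retraction
    return T
```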
Affiliation(s)
- Wenhao Gu
- Johns Hopkins University, Baltimore MD 21218, USA
- Cong Gao
- Johns Hopkins University, Baltimore MD 21218, USA
- Robert Grupp
- Johns Hopkins University, Baltimore MD 21218, USA
26
Gao C, Farvardin A, Grupp RB, Bakhtiarinejad M, Ma L, Thies M, Unberath M, Taylor RH, Armand M. Fiducial-Free 2D/3D Registration for Robot-Assisted Femoroplasty. IEEE Trans Med Robot Bionics 2020; 2:437-446. [PMID: 33763632 DOI: 10.1109/tmrb.2020.3012460] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Femoroplasty is a proposed alternative therapeutic method for preventing osteoporotic hip fractures in the elderly. A previously developed navigation system for femoroplasty required the attachment of an external X-ray fiducial to the femur. We propose a fiducial-free 2D/3D registration pipeline using fluoroscopic images for robot-assisted femoroplasty. Intraoperative fluoroscopic images are taken from multiple views to perform registration of the femur and the drilling/injection device. The proposed method was tested through comprehensive simulation and cadaveric studies. Performance was evaluated on the registration error of the femur and the drilling/injection device. In simulations, the proposed approach achieved a mean error of 1.26 ± 0.74 mm for the planned injection entry point, and 0.63 ± 0.21° and 0.17 ± 0.19° for the femur injection path direction and device guide direction, respectively. In the cadaver studies, a mean error of 2.64 ± 1.10 mm was achieved between the planned entry point and the device guide tip. The biomechanical analysis showed that even with a 4 mm translational deviation from the optimal injection path, the yield load prior to fracture increased by 40.7%. These results suggest that fiducial-free 2D/3D registration is sufficiently accurate to guide robot-assisted femoroplasty.
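Fiducial-free multi-view registration of this kind typically aggregates a single-view similarity over all fluoroscopic views into one objective. A schematic sketch under that assumption is shown below; the renderer and similarity are stand-ins (e.g., the toy DRR/NCC pair sketched under reference 21), not the paper's implementation.

```python
def multiview_cost(pose, volume, xrays, geometries, render, similarity):
    """Sum a per-view similarity over all acquired fluoroscopic views.

    xrays      : list of 2D intraoperative images.
    geometries : per-view projection geometry (source/detector parameters).
    render     : callable producing a DRR of `volume` at `pose` for a geometry.
    similarity : callable scoring a DRR against the corresponding X-ray.
    Returns a cost to minimize (negated similarity summed over views).
    """
    return -sum(similarity(render(volume, pose, g), x)
                for x, g in zip(xrays, geometries))
```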
Affiliation(s)
- Cong Gao
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA 21211
- Amirhossein Farvardin
- Department of Mechanical Engineering, Johns Hopkins University, Baltimore, MD, USA 21211
- Robert B Grupp
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA 21211
- Mahsan Bakhtiarinejad
- Department of Mechanical Engineering, Johns Hopkins University, Baltimore, MD, USA 21211
- Liuhong Ma
- Cranio-maxillo-facial Surgery Center, Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China 100144
- Mareike Thies
- Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany 91058
- Mathias Unberath
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA 21211
- Russell H Taylor
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA 21211
- Mehran Armand
- Department of Mechanical Engineering, Johns Hopkins University, Baltimore, MD, USA 21211; Department of Orthopaedic Surgery and Johns Hopkins Applied Physics Laboratory, Baltimore, MD, USA 21224