1. Quan Q, Yao Q, Zhu H, Wang Q, Zhou SK. Which images to label for few-shot medical image analysis? Med Image Anal 2024;96:103200. [PMID: 38801797] [DOI: 10.1016/j.media.2024.103200]
Abstract
The success of deep learning hinges on the availability of meticulously labeled, extensive datasets. For medical images, however, annotating such abundant training data typically requires experienced radiologists, consuming their limited time. To alleviate this burden, few-shot learning approaches have been developed that achieve competitive performance with only a few labeled images. Nevertheless, a crucial yet previously overlooked problem in few-shot learning is the selection of template images for annotation before learning, which affects the final performance. In this study, we propose a novel TEmplate Choosing Policy (TECP) that identifies and selects "the most worthy" images for annotation across multiple few-shot medical tasks, including landmark detection, anatomy detection, and anatomy segmentation. TECP comprises four components: (1) self-supervised training, which trains a pre-existing deep model to extract salient features from radiological images; (2) alternative proposals for localizing informative regions within the images; (3) representative score estimation, which evaluates and identifies the most representative samples or templates; and (4) ranking, which ranks all candidates and selects the one with the highest representative score. The efficacy of TECP is demonstrated through comprehensive experiments on multiple public datasets. Across all three medical tasks, TECP yields noticeable improvements in model performance.
Affiliation(s)
- Quan Quan
- Institute of Computing Technology, Chinese Academy of Sciences (CAS), Beijing, 100080, China; University of Chinese Academy of Sciences (UCAS), Beijing, 101408, China
- Qingsong Yao
- Institute of Computing Technology, Chinese Academy of Sciences (CAS), Beijing, 100080, China; University of Chinese Academy of Sciences (UCAS), Beijing, 101408, China
- Heqin Zhu
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei, 230026, China
- Qiyuan Wang
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei, 230026, China
- S Kevin Zhou
- Institute of Computing Technology, Chinese Academy of Sciences (CAS), Beijing, 100080, China; School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei, 230026, China; Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), Suzhou Institute for Advanced Research, USTC, Suzhou, 215000, China; Key Laboratory of Precision and Intelligent Chemistry, USTC, Hefei, 230026, China
2. Huang Z, Zhao R, Leung FHF, Banerjee S, Lam KM, Zheng YP, Ling SH. Landmark Localization From Medical Images With Generative Distribution Prior. IEEE Trans Med Imaging 2024;43:2679-2692. [PMID: 38421850] [DOI: 10.1109/tmi.2024.3371948]
Abstract
In medical image analysis, anatomical landmarks usually carry strong prior knowledge of their structural information. In this paper, we propose to improve medical landmark localization by modeling the underlying landmark distribution via normalizing flows. Specifically, we introduce the flow-based landmark distribution prior as a learnable objective function into a regression-based landmark localization framework. Moreover, we employ an integral operation that makes the mapping from heatmaps to coordinates differentiable, further enhancing heatmap-based localization with the learned distribution prior. Our proposed Normalizing Flow-based Distribution Prior (NFDP) uses a straightforward, non-problem-tailored backbone (i.e., ResNet18) and delivers high-fidelity outputs across three X-ray-based landmark localization datasets. Remarkably, NFDP achieves this with minimal additional computational burden, as the normalizing flows module is detached from the framework during inference. Compared with existing techniques, NFDP provides a superior balance between prediction accuracy and inference speed, making it a highly efficient and effective approach. The source code is available at https://github.com/jacksonhzx95/NFDP.
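The integral operation mentioned in this abstract is commonly implemented as a soft-argmax: a softmax turns the heatmap into a probability map, and the predicted coordinate is the expectation of the pixel grid under it. A minimal NumPy sketch of that general idea (the temperature `beta` and the plain 2D setup are illustrative assumptions, not details from the paper):

```python
import numpy as np

def soft_argmax_2d(heatmap, beta=10.0):
    """Differentiable heatmap-to-coordinate mapping.

    Softmax over all pixels gives a probability map; the predicted
    (x, y) coordinate is the expectation of the pixel grid under it.
    """
    h, w = heatmap.shape
    logits = beta * heatmap.ravel()
    logits -= logits.max()                      # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()   # softmax weights
    ys, xs = np.mgrid[0:h, 0:w]                 # row (y) and column (x) grids
    return float((p * xs.ravel()).sum()), float((p * ys.ravel()).sum())
```

Because the output is a weighted average rather than a hard argmax, gradients flow from a coordinate-space loss (here, one regularized by a distribution prior) back into the heatmap.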
3. Tan Z, Feng J, Lu W, Yin Y, Yang G, Zhou J. Multi-task global optimization-based method for vascular landmark detection. Comput Med Imaging Graph 2024;114:102364. [PMID: 38432060] [DOI: 10.1016/j.compmedimag.2024.102364]
Abstract
Vascular landmark detection plays an important role in medical analysis and clinical treatment. However, due to the complex topology and the similar local appearance around landmarks, popular heatmap regression-based methods often suffer from landmark confusion. Vascular landmarks are connected by vascular segments and have special spatial correlations, which can be exploited for performance improvement. In this paper, we propose a multi-task global optimization-based framework for accurate and automatic vascular landmark detection. A multi-task deep learning network accomplishes landmark heatmap regression, vascular semantic segmentation, and orientation field regression simultaneously. The two auxiliary objectives are highly correlated with the heatmap regression task and help the network incorporate structural prior knowledge. During inference, instead of a max-voting strategy, we propose a global optimization-based post-processing method for the final landmark decision. The spatial relationships between neighboring landmarks are utilized explicitly to tackle the landmark confusion problem. We evaluated our method on a cerebral MRA dataset with 564 volumes, a cerebral CTA dataset with 510 volumes, and an aorta CTA dataset with 50 volumes. The experiments demonstrate that the proposed method is effective for vascular landmark localization and achieves state-of-the-art performance.
Affiliation(s)
- Zimeng Tan
- Department of Automation, Tsinghua University, Beijing, China
- Jianjiang Feng
- Department of Automation, Tsinghua University, Beijing, China
- Wangsheng Lu
- UnionStrong (Beijing) Technology Co., Ltd, Beijing, China
- Yin Yin
- UnionStrong (Beijing) Technology Co., Ltd, Beijing, China
- Jie Zhou
- Department of Automation, Tsinghua University, Beijing, China
4. Zhou Y, Mao B, Zhang J, Zhou Y, Li J, Rong Q. Orthodontic craniofacial pattern diagnosis: cephalometric geometry and machine learning. Med Biol Eng Comput 2023;61:3345-3361. [PMID: 37672141] [DOI: 10.1007/s11517-023-02919-7]
Abstract
Efficient and reliable diagnosis of craniofacial patterns is critical to orthodontic treatment. Although machine learning (ML) is time-saving and high-precision, prior knowledge should validate its reliability. This study proposed a craniofacial ML diagnostic workflow based on a cephalometric geometric model, with clinical verification. A cephalometric geometric model was established to determine landmark locations by analyzing 408 X-ray lateral cephalograms. Through geometric information and feature engineering, nine supervised ML algorithms were evaluated on sagittal and vertical skeletal patterns. After dimension reduction, plane decision boundaries and landmark contribution contours were depicted to demonstrate diagnostic consistency and agreement with clinical norms. As a result, a multi-layer perceptron achieved 97.56% accuracy for the sagittal patterns, while a linear support vector machine reached 90.24% for the vertical. Sagittal diagnoses showed average superiority (91.60 ± 5.43)% over the vertical (82.25 ± 6.37)%, and discriminative algorithms exhibited steadier performance (93.20 ± 3.29)% than generative ones (85.98 ± 9.48)%. Further, the Kruskal-Wallis H test was carried out to explore statistical differences in the diagnoses. Although the sagittal patterns showed no statistically significant difference in diagnostic accuracy, the vertical patterns did. All aspects of the tests indicated that the proposed craniofacial ML workflow was highly consistent with clinical norms and could supplement practical diagnosis.
Affiliation(s)
- Yuqing Zhou
- Department of Mechanics and Engineering Science, College of Engineering, Peking University, Beijing, 100081, China
- Bochun Mao
- Department of Orthodontics, Peking University School of Stomatology & National Center of Stomatology & National Clinical Research Center for Oral Diseases & National Engineering Research Center of Oral Biomaterials and Digital Medical Devices & Beijing Key Laboratory of Digital Stomatology, Beijing, 100081, China
- Jiwu Zhang
- Department of Mechanics and Engineering Science, College of Engineering, Peking University, Beijing, 100081, China
- Yanheng Zhou
- Department of Orthodontics, Peking University School of Stomatology & National Center of Stomatology & National Clinical Research Center for Oral Diseases & National Engineering Research Center of Oral Biomaterials and Digital Medical Devices & Beijing Key Laboratory of Digital Stomatology, Beijing, 100081, China
- Jing Li
- Department of Orthodontics, Peking University School of Stomatology & National Center of Stomatology & National Clinical Research Center for Oral Diseases & National Engineering Research Center of Oral Biomaterials and Digital Medical Devices & Beijing Key Laboratory of Digital Stomatology, Beijing, 100081, China
- Qiguo Rong
- Department of Mechanics and Engineering Science, College of Engineering, Peking University, Beijing, 100081, China
5. Hong W, Kim SM, Choi J, Ahn J, Paeng JY, Kim H. Automated Cephalometric Landmark Detection Using Deep Reinforcement Learning. J Craniofac Surg 2023;34:2336-2342. [PMID: 37622568] [DOI: 10.1097/scs.0000000000009685]
Abstract
Accurate cephalometric landmark detection leads to accurate analysis, diagnosis, and surgical planning. Many studies on automated landmark detection have been conducted; however, reinforcement learning-based networks have not yet been applied. To the best of our knowledge, this is the first study to apply the deep Q-network (DQN) and double deep Q-network (DDQN) to automated cephalometric landmark detection. The performance of the DQN-based network was evaluated using the IEEE International Symposium on Biomedical Imaging (ISBI) 2015 Challenge dataset and compared with previously proposed methods. Furthermore, the clinical applicability of DQN-based automated cephalometric landmark detection was confirmed by testing the DQN-based and DDQN-based networks on data from 500 patients collected in a clinic. The DQN-based network achieved an average mean radial error over 19 landmarks of less than 2 mm, the clinically accepted level, without data augmentation or additional preprocessing. On the 500-patient dataset, our DQN-based and DDQN-based approaches achieved average successful detection rates of 67.33% and 66.04% within 2 mm, respectively, indicating the feasibility and potential of clinical application.
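In DQN-style landmark detection, an agent iteratively steps a point (or crop) across the image toward the landmark, and a common reward design is the decrease in distance to the ground truth after each move. A toy 2D environment step under that common design (the four-action set and the reward form are illustrative assumptions, not this paper's exact formulation):

```python
import numpy as np

# Four pixel-step actions: right, left, down, up
ACTIONS = {0: (1, 0), 1: (-1, 0), 2: (0, 1), 3: (0, -1)}

def step(pos, action, target):
    """Move the agent one pixel; the reward is the reduction in
    Euclidean distance to the target landmark."""
    dx, dy = ACTIONS[action]
    new_pos = (pos[0] + dx, pos[1] + dy)
    reward = (np.linalg.norm(np.subtract(pos, target))
              - np.linalg.norm(np.subtract(new_pos, target)))
    return new_pos, float(reward)
```

At training time the Q-network learns to predict these rewards from image appearance alone; at test time the target is unknown, and the agent simply follows the action with the highest predicted Q-value until it converges on the landmark.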
Affiliation(s)
- Woojae Hong
- Department of Biomechatronic Engineering, Sungkyunkwan University, Suwon, Gyeonggi
- Seong-Min Kim
- Department of Biomechatronic Engineering, Sungkyunkwan University, Suwon, Gyeonggi
- Joongyeon Choi
- Department of Biomechatronic Engineering, Sungkyunkwan University, Suwon, Gyeonggi
- Jaemyung Ahn
- Department of Oral and Maxillofacial Surgery, Samsung Medical Center, Seoul, Republic of Korea
- Jun-Young Paeng
- Department of Oral and Maxillofacial Surgery, Samsung Medical Center, Seoul, Republic of Korea
- Hyunggun Kim
- Department of Biomechatronic Engineering, Sungkyunkwan University, Suwon, Gyeonggi
6. Ao Y, Wu H. Feature Aggregation and Refinement Network for 2D Anatomical Landmark Detection. J Digit Imaging 2023;36:547-561. [PMID: 36401132] [PMCID: PMC10039137] [DOI: 10.1007/s10278-022-00718-4]
Abstract
Localization of anatomical landmarks is essential for clinical diagnosis, treatment planning, and research. This paper proposes a novel deep network, the feature aggregation and refinement network (FARNet), for automatically detecting anatomical landmarks. FARNet employs an encoder-decoder architecture. To alleviate the problem of limited training data in the medical domain, we adopt a backbone network pre-trained on natural images as the encoder. The decoder includes a multi-scale feature aggregation module for multi-scale feature fusion and a feature refinement module for high-resolution heatmap regression. Coarse-to-fine supervision is applied to the two modules to facilitate end-to-end training. We further propose a novel loss function, the Exponential Weighted Center loss, for accurate heatmap regression; it focuses on the losses from pixels near landmarks and suppresses those from pixels far away. We evaluate FARNet on three publicly available anatomical landmark detection datasets, comprising cephalometric, hand, and spine radiographs. Our network achieves state-of-the-art performance on all three datasets. Code is available at https://github.com/JuvenileInWind/FARNet.
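The idea behind a center-weighted heatmap loss can be sketched as a per-pixel squared error whose weight grows exponentially with the target heatmap value, so the few pixels near the landmark dominate while the vast background is suppressed. A hedged NumPy sketch of that idea (the exact weighting form and the `alpha` hyperparameter are assumptions for illustration; see the paper for the precise definition):

```python
import numpy as np

def exp_weighted_loss(pred, target, alpha=10.0):
    """Per-pixel squared error, exponentially up-weighted by the
    target heatmap so that errors near the landmark dominate."""
    weight = np.exp(alpha * target)   # ~1 on background, large at the peak
    return float((weight * (pred - target) ** 2).mean())
```

Without such weighting, a plain MSE over a mostly-zero heatmap is dominated by easy background pixels, which is precisely the imbalance this family of losses addresses.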
Affiliation(s)
- Yueyuan Ao
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
- Hong Wu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
7. Jiang F, Guo Y, Yang C, Zhou Y, Lin Y, Cheng F, Quan S, Feng Q, Li J. Artificial intelligence system for automated landmark localization and analysis of cephalometry. Dentomaxillofac Radiol 2023;52:20220081. [PMID: 36279185] [PMCID: PMC9793451] [DOI: 10.1259/dmfr.20220081]
Abstract
OBJECTIVES: Cephalometric analysis is essential for the diagnosis, treatment planning, and outcome assessment of orthodontics and orthognathic surgery. Utilizing artificial intelligence (AI) to automate landmark localization has proved feasible and convenient. However, current systems remain insufficient for clinical application: patients exhibit various malocclusions in cephalograms produced by different manufacturers, while only limited cephalograms were used to train the AI in these systems.
METHODS: A robust and clinically applicable AI system was proposed for automatic cephalometric analysis. First, 9870 cephalograms taken by different radiography machines, covering various malocclusions, were collected from 20 medical institutions. Then 30 landmarks in all of these cephalograms were manually annotated to train an AI system composed of a two-stage convolutional neural network and a software-as-a-service system. Further, more than 100 orthodontists participated in refining the AI-output landmark localizations and retraining the system.
RESULTS: The average landmark prediction error of this system was as low as 0.94 ± 0.74 mm, and the system achieved an average classification accuracy of 89.33%.
CONCLUSIONS: An automatic cephalometric analysis system based on a convolutional neural network was proposed, which realizes automatic landmark localization and cephalometric measurement classification. This system shows promise for improving diagnostic efficiency in clinical practice.
Affiliation(s)
- Fulin Jiang
- Department of Orthodontics, State Key Laboratory of Oral Diseases, West China School of Stomatology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Chengdu Boltzmann Intelligence Technology Co., Ltd, Chengdu, China
- Yutong Guo
- Department of Orthodontics, State Key Laboratory of Oral Diseases, West China School of Stomatology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Cai Yang
- Department of Orthodontics, State Key Laboratory of Oral Diseases, West China School of Stomatology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Yimei Zhou
- Department of Orthodontics, State Key Laboratory of Oral Diseases, West China School of Stomatology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Yucheng Lin
- Chengdu Boltzmann Intelligence Technology Co., Ltd, Chengdu, China
- Fangyuan Cheng
- Chengdu Boltzmann Intelligence Technology Co., Ltd, Chengdu, China
- Shuqi Quan
- Department of Orthodontics, State Key Laboratory of Oral Diseases, West China School of Stomatology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Qingchen Feng
- Department of Orthodontics, State Key Laboratory of Oral Diseases, West China School of Stomatology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Juan Li
- Department of Orthodontics, State Key Laboratory of Oral Diseases, West China School of Stomatology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
8. Xu J, Zeng B, Egger J, Wang C, Smedby Ö, Jiang X, Chen X. A review on AI-based medical image computing in head and neck surgery. Phys Med Biol 2022;67. [DOI: 10.1088/1361-6560/ac840f]
Abstract
Head and neck surgery is a delicate procedure involving complex anatomy, difficult operations, and high risk. Medical image computing (MIC) that enables accurate and reliable preoperative planning is often needed to reduce the operational difficulty of surgery and to improve patient survival. At present, artificial intelligence, especially deep learning, has become an intense focus of research in MIC. In this study, the application of deep learning-based MIC in head and neck surgery is reviewed. Relevant literature was retrieved from the Web of Science database from January 2015 to May 2022, and papers were selected for review from mainstream journals and conferences, such as IEEE Transactions on Medical Imaging, Medical Image Analysis, Physics in Medicine and Biology, Medical Physics, and MICCAI. Among them, 65 references concern automatic segmentation, 15 automatic landmark detection, and eight automatic registration. The review first presents an overview of deep learning in MIC. The applications of deep learning methods are then systematically summarized according to clinical needs and grouped into segmentation, landmark detection, and registration of head and neck medical images. Segmentation focuses on the automatic segmentation of high-risk organs, head and neck tumors, skull structures, and teeth, including an analysis of their advantages, differences, and shortcomings. Landmark detection focuses on cephalometric and craniomaxillofacial images, with an analysis of their advantages and disadvantages. Registration presents deep learning networks for multimodal registration of head and neck images. Finally, shortcomings and future development directions are systematically discussed. The study aims to serve as a reference and guide for researchers, engineers, and doctors engaged in medical image analysis of head and neck surgery.
9. Zhu H, Yao Q, Xiao L, Zhou SK. Learning to Localize Cross-Anatomy Landmarks in X-Ray Images with a Universal Model. BME Front 2022;2022:9765095. [PMID: 37850187] [PMCID: PMC10521670] [DOI: 10.34133/2022/9765095]
Abstract
Objective and Impact Statement. In this work, we develop a universal anatomical landmark detection model that learns once from multiple datasets corresponding to different anatomical regions. Compared with a conventional model trained on a single dataset, this universal model is not only more lightweight and easier to train but also improves the accuracy of anatomical landmark localization. Introduction. The accurate and automatic localization of anatomical landmarks plays an essential role in medical image analysis. However, recent deep learning-based methods utilize only limited data from a single dataset. It is promising and desirable to build a model that learns from different regions and harnesses the power of big data. Methods. Our model consists of a local network and a global network, which capture local and global features, respectively. The local network is a fully convolutional network built from depth-wise separable convolutions, and the global network uses dilated convolutions to enlarge the receptive field and model global dependencies. Results. We evaluate our model on four 2D X-ray image datasets totaling 1710 images and 72 landmarks across four anatomical regions. Extensive experimental results show that our model improves detection accuracy compared with state-of-the-art methods. Conclusion. Our model makes the first attempt to train a single network on multiple datasets for landmark detection. Experimental results show, both qualitatively and quantitatively, that our model outperforms other models trained on multiple datasets, and even models trained separately on a single dataset.
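Dilated convolutions enlarge the receptive field without extra parameters: with stride 1, each layer adds `dilation * (kernel - 1)` pixels to the field of view. A small helper illustrating the growth for an assumed exponential dilation schedule (the schedule is an example for illustration, not this paper's configuration):

```python
def receptive_field(kernel=3, dilations=(1, 2, 4, 8)):
    """Receptive field of a stack of stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += d * (kernel - 1)  # each layer widens the view by d*(k-1)
    return rf
```

Four 3x3 layers with dilations 1, 2, 4, 8 see a 31-pixel window, versus only 9 pixels for four plain 3x3 layers, which is how a global network can model long-range dependencies cheaply.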
Affiliation(s)
- Heqin Zhu
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Qingsong Yao
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Li Xiao
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- S. Kevin Zhou
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), School of Biomedical Engineering & Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou 215123, China
10. Schürer-Waldheim S, Seeböck P, Bogunović H, Gerendas BS, Schmidt-Erfurth U. Robust Fovea Detection in Retinal OCT Imaging using Deep Learning. IEEE J Biomed Health Inform 2022;26:3927-3937. [PMID: 35394920] [DOI: 10.1109/jbhi.2022.3166068]
Abstract
The fovea centralis is an essential landmark in the retina where the photoreceptor layer is composed entirely of cones, responsible for sharp central vision. Localization of this anatomical landmark in optical coherence tomography (OCT) volumes is important for assessing visual function correlates and guiding treatment in macular disease. In this study, the "PRE U-net" is introduced as a novel approach for fully automated fovea centralis detection, addressing localization as a pixel-wise regression task. 2D B-scans are sampled from each image volume and concatenated with spatial location information to train the deep network. A total of 5586 OCT volumes from 1541 eyes was used to train, validate, and test the method. The test data comprise healthy subjects and patients affected by neovascular age-related macular degeneration (nAMD), diabetic macular edema (DME), and macular edema from retinal vein occlusion (RVO), covering the three major retinal diseases responsible for blindness. Our experiments demonstrate that the PRE U-net significantly outperforms state-of-the-art methods and improves the robustness of automated localization, which is of value for clinical practice.
11. Lu G, Zhang Y, Kong Y, Zhang C, Coatrieux JL, Shu H. Landmark Localization for Cephalometric Analysis using Multiscale Image Patch-based Graph Convolutional Networks. IEEE J Biomed Health Inform 2022;26:3015-3024. [PMID: 35259123] [DOI: 10.1109/jbhi.2022.3157722]
Abstract
Accurate and robust cephalometric image analysis plays an essential role in orthodontic diagnosis, treatment assessment, and surgical planning. This paper proposes a novel landmark localization method for cephalometric analysis using multiscale image patch-based graph convolutional networks. In detail, image patches of the same size are hierarchically sampled from the Gaussian pyramid to preserve multiscale context information. We combine local appearance and shape information into spatialized features with an attention module to enrich node representations in the graph. The spatial relationships of landmarks are modeled by three-layer graph convolutional networks, and multiple landmarks are simultaneously updated and moved toward their targets in a cascaded coarse-to-fine process. Quantitative results on publicly available cephalometric X-ray images exhibit superior performance compared with other state-of-the-art methods in terms of mean radial error and successful detection rate within various precision ranges. Our approach performs significantly better, especially within the clinically accepted range of 2 mm, making it suitable for cephalometric analysis and orthognathic surgery.
12. Le VNT, Kang J, Oh IS, Kim JG, Yang YM, Lee DW. Effectiveness of Human–Artificial Intelligence Collaboration in Cephalometric Landmark Detection. J Pers Med 2022;12:387. [PMID: 35330386] [PMCID: PMC8954049] [DOI: 10.3390/jpm12030387]
Abstract
Detection of cephalometric landmarks has contributed to the analysis of malocclusion during orthodontic diagnosis. Many recent studies involving deep learning have focused on head-to-head comparisons of accuracy in landmark identification between artificial intelligence (AI) and humans. However, a human–AI collaboration for the identification of cephalometric landmarks has not been evaluated. We selected 1193 cephalograms and used them to train the deep anatomical context feature learning (DACFL) model. The number of target landmarks was 41. To evaluate the effect of human–AI collaboration on landmark detection, 10 images were extracted randomly from 100 test images. The experiment included 20 dental students as beginners in landmark localization. The outcomes were determined by measuring the mean radial error (MRE), successful detection rate (SDR), and successful classification rate (SCR). On the dataset, the DACFL model exhibited an average MRE of 1.87 ± 2.04 mm and an average SDR of 73.17% within a 2 mm threshold. Compared with the beginner group, beginner–AI collaboration improved the SDR by 5.33% within a 2 mm threshold and also improved the SCR by 8.38%. Thus, the beginner–AI collaboration was effective in the detection of cephalometric landmarks. Further studies should be performed to demonstrate the benefits of an orthodontist–AI collaboration.
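The MRE and SDR figures reported throughout these studies follow directly from predicted and ground-truth coordinates. A minimal sketch of the standard metric computation (the pixel-spacing default is a placeholder assumption; real datasets specify their own calibration):

```python
import numpy as np

def mre_sdr(pred, gt, spacing_mm=0.1, thresh_mm=2.0):
    """Mean radial error (mm) and successful detection rate (%)
    within a distance threshold, for landmark arrays of shape (N, 2)."""
    dist_mm = np.linalg.norm(pred - gt, axis=-1) * spacing_mm
    mre = float(dist_mm.mean())
    sdr = float((dist_mm <= thresh_mm).mean() * 100.0)
    return mre, sdr
```

SDR at several thresholds (commonly 2, 2.5, 3, and 4 mm) alongside MRE is the conventional way cephalometric benchmarks such as the ISBI 2015 challenge report accuracy.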
Affiliation(s)
- Van Nhat Thang Le
- Department of Pediatric Dentistry and Institute of Oral Bioscience, School of Dentistry, Jeonbuk National University, Jeonju 54896, Korea
- Research Institute of Clinical Medicine, Jeonbuk National University, Jeonju 54907, Korea
- Biomedical Research Institute, Jeonbuk National University Hospital, Jeonju 54907, Korea
- Faculty of Odonto-Stomatology, Hue University of Medicine and Pharmacy, Hue University, Hue 49120, Vietnam
- Junhyeok Kang
- Division of Computer Science and Engineering, Jeonbuk National University, Jeonju 54907, Korea
- Il-Seok Oh
- Division of Computer Science and Engineering, Jeonbuk National University, Jeonju 54907, Korea
- Jae-Gon Kim
- Department of Pediatric Dentistry and Institute of Oral Bioscience, School of Dentistry, Jeonbuk National University, Jeonju 54896, Korea
- Research Institute of Clinical Medicine, Jeonbuk National University, Jeonju 54907, Korea
- Biomedical Research Institute, Jeonbuk National University Hospital, Jeonju 54907, Korea
- Yeon-Mi Yang
- Department of Pediatric Dentistry and Institute of Oral Bioscience, School of Dentistry, Jeonbuk National University, Jeonju 54896, Korea
- Research Institute of Clinical Medicine, Jeonbuk National University, Jeonju 54907, Korea
- Biomedical Research Institute, Jeonbuk National University Hospital, Jeonju 54907, Korea
- Dae-Woo Lee
- Department of Pediatric Dentistry and Institute of Oral Bioscience, School of Dentistry, Jeonbuk National University, Jeonju 54896, Korea
- Research Institute of Clinical Medicine, Jeonbuk National University, Jeonju 54907, Korea
- Biomedical Research Institute, Jeonbuk National University Hospital, Jeonju 54907, Korea
- Correspondence: ; Tel.: +82-63-250-2826
13
Li S, Gong Q, Li H, Chen S, Liu Y, Ruan G, Zhu L, Liu L, Chen H. Automatic location scheme of anatomical landmarks in 3D head MRI based on the scale attention hourglass network. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 214:106564. [PMID: 34894558 DOI: 10.1016/j.cmpb.2021.106564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/15/2021] [Revised: 11/04/2021] [Accepted: 11/27/2021] [Indexed: 06/14/2023]
Abstract
BACKGROUND AND OBJECTIVE An anatomical landmark is a biologically meaningful point in medical images and is often used for medical image registration. The purpose of this study is to automatically locate anatomical landmarks in 3D medical images. METHODS A two-step automatic location scheme for anatomical landmarks in 3D medical images was designed in this study. In the first step, a fully convolutional neural network was used for slice detection from a 3D medical image. In the second step, the scale attention hourglass network was used for landmark location in the detected slice, overcoming the difficulty posed by similar anatomical structures and differing image parameters. The method was implemented and tested on four stable anatomical landmarks in 3D head MRI. RESULTS A total of 500 and 300 3D head volumes were used for training and testing, respectively. Results showed that the slice detection accuracy reached 85.7% and that the maximum location error was less than one slice. The average accuracy of the four anatomical landmarks in the detected slice reached 87.2%, and the spatial distance was 2.4 ± 2.4, outperforming the hourglass network and feature pyramid networks. CONCLUSIONS This method can be useful for locating anatomical landmarks in 3D head MRI and provides technical support for medical image registration and big data analysis.
Affiliation(s)
- Sai Li
- School of Life & Environmental Science, Guilin University of Electronic Technology, Guilin 541004, China
- Qiong Gong
- School of Life & Environmental Science, Guilin University of Electronic Technology, Guilin 541004, China
- Haojiang Li
- Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
- Shuchao Chen
- School of Life & Environmental Science, Guilin University of Electronic Technology, Guilin 541004, China
- Yifei Liu
- Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
- Guangying Ruan
- Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
- Lin Zhu
- School of Life & Environmental Science, Guilin University of Electronic Technology, Guilin 541004, China
- Lizhi Liu
- Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
- Hongbo Chen
- School of Life & Environmental Science, Guilin University of Electronic Technology, Guilin 541004, China
14
15
Chen X, Lian C, Deng HH, Kuang T, Lin HY, Xiao D, Gateno J, Shen D, Xia JJ, Yap PT. Fast and Accurate Craniomaxillofacial Landmark Detection via 3D Faster R-CNN. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:3867-3878. [PMID: 34310293 PMCID: PMC8686670 DOI: 10.1109/tmi.2021.3099509] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
Automatic craniomaxillofacial (CMF) landmark localization from cone-beam computed tomography (CBCT) images is challenging, considering that 1) the number of landmarks in the images may change due to varying deformities and traumatic defects, and 2) the CBCT images used in clinical practice are typically large. In this paper, we propose a two-stage, coarse-to-fine deep learning method to tackle these challenges with both speed and accuracy in mind. Specifically, we first use a 3D faster R-CNN to roughly locate landmarks in down-sampled CBCT images that have varying numbers of landmarks. By converting the landmark point detection problem to a generic object detection problem, our 3D faster R-CNN is formulated to detect virtual, fixed-size objects in small boxes with centers indicating the approximate locations of the landmarks. Based on the rough landmark locations, we then crop 3D patches from the high-resolution images and send them to a multi-scale UNet for the regression of heatmaps, from which the refined landmark locations are finally derived. We evaluated the proposed approach by detecting up to 18 landmarks on a real clinical dataset of CMF CBCT images with various conditions. Experiments show that our approach achieves state-of-the-art accuracy of 0.89 ± 0.64mm in an average time of 26.2 seconds per volume.
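The two conversions at the heart of this coarse-to-fine pipeline — wrapping each landmark in a fixed-size virtual box so a generic detector can find it, and reading a refined location off a regressed heatmap — can be sketched as follows (box size, 2D heatmap, and all values are illustrative, not the paper's settings):

```python
def landmark_to_box(center, size=16.0):
    """Coarse stage: wrap a 3D landmark point in a fixed-size axis-aligned box
    so a generic object detector (e.g. a 3D faster R-CNN) can localize it."""
    half = size / 2
    return (tuple(c - half for c in center), tuple(c + half for c in center))

def box_to_landmark(box):
    """Recover the approximate landmark location as the box center."""
    lo, hi = box
    return tuple((a + b) / 2 for a, b in zip(lo, hi))

def refine_from_heatmap(heatmap, patch_origin):
    """Fine stage: refined landmark = patch origin + argmax of the heatmap
    regressed on the cropped high-resolution patch (2D here for brevity)."""
    best, best_idx = float("-inf"), (0, 0)
    for i, row in enumerate(heatmap):
        for j, v in enumerate(row):
            if v > best:
                best, best_idx = v, (i, j)
    return tuple(o + k for o, k in zip(patch_origin, best_idx))

# A box built around a landmark recovers the same point as its center.
box = landmark_to_box((40.0, 100.0, 80.0))
```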
16
Automatic vertebrae localization and segmentation in CT with a two-stage Dense-U-Net. Sci Rep 2021; 11:22156. [PMID: 34772972 PMCID: PMC8589948 DOI: 10.1038/s41598-021-01296-1] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2021] [Accepted: 10/26/2021] [Indexed: 11/09/2022] Open
Abstract
Automatic vertebrae localization and segmentation in computed tomography (CT) are fundamental for spinal image analysis and for spine surgery with computer-assisted surgery systems. However, they remain challenging due to the high variation in spinal anatomy among patients. In this paper, we propose a deep-learning approach for automatic CT vertebrae localization and segmentation with a two-stage Dense-U-Net. The first stage used a 2D-Dense-U-Net to localize vertebrae by detecting the vertebrae centroids with dense labels and 2D slices. The second stage segmented the specific vertebra within a region of interest identified from the centroid using a 3D-Dense-U-Net. Finally, each segmented vertebra was merged into a complete spine and resampled to the original resolution. We evaluated our method on the dataset from the CSI 2014 Workshop with six metrics: location error (1.69 ± 0.78 mm) and detection rate (100%) for vertebrae localization; Dice coefficient (0.953 ± 0.014), intersection over union (0.911 ± 0.025), Hausdorff distance (4.013 ± 2.128 mm), and pixel accuracy (0.998 ± 0.001) for vertebrae segmentation. The experimental results demonstrated the efficiency of the proposed method. Furthermore, evaluation on the dataset from the xVertSeg challenge, with location error (4.12 ± 2.31), detection rate (100%), and Dice coefficient (0.877 ± 0.035), shows the generalizability of our method. In summary, our solution successfully localized the vertebrae by detecting their centroids and implemented instance segmentation of vertebrae across the whole spine.
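The Dice coefficient and intersection over union reported above are the standard overlap metrics for segmentation; on binary masks represented as sets of voxel indices they reduce to (toy 2×2 masks for illustration):

```python
def dice_and_iou(pred, truth):
    """Overlap metrics between two binary masks given as sets of voxel indices."""
    inter = len(pred & truth)
    dice = 2 * inter / (len(pred) + len(truth))
    iou = inter / len(pred | truth)
    return dice, iou

# Two 3-voxel masks sharing 2 voxels: intersection 2, union 4.
pred = {(0, 0), (0, 1), (1, 0)}
truth = {(0, 1), (1, 0), (1, 1)}
dice, iou = dice_and_iou(pred, truth)
```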
17
Reddy PK, Kanakatte A, Gubbi J, Poduval M, Ghose A, Purushothaman B. Anatomical Landmark Detection using Deep Appearance-Context Network. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:3569-3572. [PMID: 34892010 DOI: 10.1109/embc46164.2021.9630457] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Accurate identification of anatomical landmarks is a crucial step in medical image analysis. While deep neural networks have shown impressive performance on computer vision tasks, they rely on a large amount of data, which is often not available. In this work, we propose an attention-driven end-to-end deep learning architecture, which learns the local appearance and global context separately that helps in stable training under limited data. The experiments conducted demonstrate the effectiveness of the proposed approach with impressive results in localizing landmarks when evaluated on cephalometric and spine X-ray image data. The predicted landmarks are further utilized in biomedical applications to demonstrate the impact.
18
Deep Reinforcement Learning with Explicit Spatio-Sequential Encoding Network for Coronary Ostia Identification in CT Images. SENSORS 2021; 21:s21186187. [PMID: 34577391 PMCID: PMC8469841 DOI: 10.3390/s21186187] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/21/2021] [Revised: 08/31/2021] [Accepted: 09/13/2021] [Indexed: 11/16/2022]
Abstract
Accurate identification of the coronary ostia from 3D coronary computed tomography angiography (CCTA) is an essential prerequisite for automatically tracking and segmenting the three main coronary arteries. In this paper, we propose a novel deep reinforcement learning (DRL) framework to localize the two coronary ostia from 3D CCTA. An optimal action policy is determined using a fully explicit spatial-sequential encoding policy network applying 2.5D Markovian states with three past histories. The proposed network is trained using a dueling DRL framework on the CAT08 dataset. The experimental results show that our method is more efficient and accurate than the other methods. Floating-point operations (FLOPs) are calculated to measure computational efficiency; the proposed method requires 2.5M FLOPs, about 10 times fewer than 3D box-based methods. In terms of accuracy, the proposed method yields errors of 2.22 ± 1.12 mm and 1.94 ± 0.83 mm on the left and right coronary ostia, respectively. The proposed method can be applied to tasks that identify other target objects by changing the target locations in the ground-truth data. Furthermore, it can be utilized as a pre-processing step for coronary artery tracking methods.
19
Fu Z, Jiao J, Suttie M, Noble JA. Facial Anatomical Landmark Detection using Regularized Transfer Learning with Application to Fetal Alcohol Syndrome Recognition. IEEE J Biomed Health Inform 2021; 26:1591-1601. [PMID: 34495853 PMCID: PMC9209878 DOI: 10.1109/jbhi.2021.3110680] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Fetal alcohol syndrome (FAS) caused by prenatal alcohol exposure can result in a series of cranio-facial anomalies, and behavioral and neurocognitive problems. Current diagnosis of FAS is typically done by identifying a set of facial characteristics, which are often obtained by manual examination. Anatomical landmark detection, which provides rich geometric information, is important to detect the presence of FAS-associated facial anomalies. This imaging application is characterized by large variations in data appearance and limited availability of labeled data. Current deep learning-based heatmap regression methods designed for facial landmark detection in natural images assume the availability of large datasets and are therefore not well-suited for this application. To address this restriction, we develop a new regularized transfer learning approach that exploits the knowledge of a network learned on large facial recognition datasets. In contrast to standard transfer learning, which focuses on adjusting the pre-trained weights, the proposed learning approach regularizes the model behavior. It explicitly reuses the rich visual semantics of a domain-similar source model on the target task data as an additional supervisory signal for regularizing landmark detection optimization. Specifically, we develop four regularization constraints for the proposed transfer learning, including constraining the feature outputs from classification and intermediate layers, as well as matching activation attention maps at both the spatial and channel levels. Experimental evaluation on a collected clinical imaging dataset demonstrates that the proposed approach can effectively improve model generalizability under limited training samples, and is advantageous compared with other approaches in the literature.
20
Oh K, Oh IS, Le VNT, Lee DW. Deep Anatomical Context Feature Learning for Cephalometric Landmark Detection. IEEE J Biomed Health Inform 2021; 25:806-817. [PMID: 32750939 DOI: 10.1109/jbhi.2020.3002582] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
In the past decade, anatomical context features have been widely used for cephalometric landmark detection, and significant progress is still being made. However, most existing methods rely on handcrafted graphical models rather than incorporating anatomical context during training, leading to suboptimal performance. In this study, we present a novel framework that allows a Convolutional Neural Network (CNN) to learn richer anatomical context features during training. Our key idea consists of the Local Feature Perturbator (LFP) and the Anatomical Context loss (AC loss). When training the CNN, the LFP perturbs a cephalometric image based on the prior anatomical distribution, forcing the CNN to attend to relevant features more globally. The AC loss then helps the CNN learn the anatomical context based on spatial relationships between the landmarks. The experimental results demonstrate that the proposed framework makes the CNN learn richer anatomical representations, leading to increased performance. In the performance comparisons, the proposed scheme outperforms state-of-the-art methods on the ISBI 2015 Cephalometric X-ray Image Analysis Challenge.
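The abstract does not give the AC loss formula. One plausible reading — penalizing deviations of the predicted pairwise landmark offsets from offsets derived from a prior anatomical distribution — can be sketched as follows (the prior offsets, landmark values, and the exact loss form are our illustrative assumptions, not the paper's definition):

```python
def anatomical_context_loss(pred, prior_offsets):
    """Mean squared deviation of predicted pairwise offsets from prior offsets.

    pred: list of (x, y) predicted landmark positions.
    prior_offsets: {(i, j): (dx, dy)} expected offset from landmark i to j,
    e.g. estimated from the training-set landmark configurations.
    """
    total = 0.0
    for (i, j), (dx, dy) in prior_offsets.items():
        px = pred[j][0] - pred[i][0]
        py = pred[j][1] - pred[i][1]
        total += (px - dx) ** 2 + (py - dy) ** 2
    return total / len(prior_offsets)

# A prediction matching the prior spatial configuration incurs zero loss.
prior = {(0, 1): (3.0, 4.0)}
```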
21
Noothout JMH, De Vos BD, Wolterink JM, Postma EM, Smeets PAM, Takx RAP, Leiner T, Viergever MA, Isgum I. Deep Learning-Based Regression and Classification for Automatic Landmark Localization in Medical Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:4011-4022. [PMID: 32746142 DOI: 10.1109/tmi.2020.3009002] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/14/2023]
Abstract
In this study, we propose a fast and accurate method to automatically localize anatomical landmarks in medical images. We employ a global-to-local localization approach using fully convolutional neural networks (FCNNs). First, a global FCNN localizes multiple landmarks through the analysis of image patches, performing regression and classification simultaneously. In regression, displacement vectors pointing from the center of image patches towards landmark locations are determined. In classification, presence of landmarks of interest in the patch is established. Global landmark locations are obtained by averaging the predicted displacement vectors, where the contribution of each displacement vector is weighted by the posterior classification probability of the patch that it is pointing from. Subsequently, for each landmark localized with global localization, local analysis is performed. Specialized FCNNs refine the global landmark locations by analyzing local sub-images in a similar manner, i.e. by performing regression and classification simultaneously and combining the results. Evaluation was performed through localization of 8 anatomical landmarks in CCTA scans, 2 landmarks in olfactory MR scans, and 19 landmarks in cephalometric X-rays. We demonstrate that the method performs similarly to a second observer and is able to localize landmarks in a diverse set of medical images, differing in image modality, image dimensionality, and anatomical coverage.
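The global-stage voting scheme described above — patch centers plus regressed displacement vectors, averaged with the posterior classification probabilities as weights — is compact enough to sketch directly (2D, with illustrative values):

```python
def weighted_landmark_vote(patch_centers, displacements, probs):
    """Each patch votes center + displacement; votes are weighted by the
    patch's posterior probability that the landmark is present in it."""
    wsum = sum(probs)
    votes = [(c[0] + d[0], c[1] + d[1]) for c, d in zip(patch_centers, displacements)]
    x = sum(p * v[0] for p, v in zip(probs, votes)) / wsum
    y = sum(p * v[1] for p, v in zip(probs, votes)) / wsum
    return (x, y)

# Two patches vote (4, 5) and (5, 6); the second is 3x more confident.
loc = weighted_landmark_vote([(0, 0), (10, 0)], [(4, 5), (-5, 6)], [1.0, 3.0])
```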
22
Zeng M, Yan Z, Liu S, Zhou Y, Qiu L. Cascaded convolutional networks for automatic cephalometric landmark detection. Med Image Anal 2020; 68:101904. [PMID: 33290934 DOI: 10.1016/j.media.2020.101904] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2019] [Revised: 06/15/2020] [Accepted: 11/11/2020] [Indexed: 11/17/2022]
Abstract
Cephalometric analysis is a fundamental examination widely used in orthodontic diagnosis and treatment planning. Its key step is to detect the anatomical landmarks in lateral cephalograms, which is time-consuming when done manually. To solve this problem, we propose a novel approach with a cascade of three-stage convolutional neural networks to predict cephalometric landmarks automatically. In the first stage, high-level features of the craniofacial structures are extracted to locate the lateral face area, which helps to overcome appearance variations. Next, we process the aligned face area to estimate the locations of all landmarks simultaneously. In the last stage, each landmark is refined through a dedicated network using high-resolution image data around the initial position to achieve a more accurate result. We evaluate the proposed method on several anatomical landmark datasets, and the experimental results show that our method achieves competitive performance compared with other methods.
Affiliation(s)
- Minmin Zeng
- Fourth Clinical Division, School and Hospital of Stomatology, Peking University, Beijing, China.
- Shuai Liu
- Second Clinical Division, School and Hospital of Stomatology, Peking University, Beijing, China
- Yanheng Zhou
- Department of orthodontics, School and Hospital of Stomatology, Peking University, Beijing, China
- Lixin Qiu
- Fourth Clinical Division, School and Hospital of Stomatology, Peking University, Beijing, China
23
Kim H, Shim E, Park J, Kim YJ, Lee U, Kim Y. Web-based fully automated cephalometric analysis by deep learning. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2020; 194:105513. [PMID: 32403052 DOI: 10.1016/j.cmpb.2020.105513] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/07/2019] [Revised: 04/17/2020] [Accepted: 04/18/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND AND OBJECTIVE An accurate lateral cephalometric analysis is vital in orthodontic diagnosis. Identification of anatomic landmarks on lateral cephalograms is tedious, and errors may occur depending on the doctor's experience. Several attempts have been made to reduce this time-consuming process by automating it through machine learning; however, they dealt only with a small amount of data from a single institute. This study aims to develop a fully automated cephalometric analysis method using deep learning and a corresponding web-based application that can be used without high-specification hardware. METHODS We built our own dataset comprising 2,075 lateral cephalograms and ground-truth positions of 23 landmarks from two institutes and trained a two-stage automated algorithm with a stacked hourglass deep learning model specialized for detecting landmarks in images. Additionally, a web-based application with the proposed algorithm for fully automated cephalometric analysis was developed for better accessibility regardless of the user's computer hardware, which is essential for a deep learning-based method. RESULTS The algorithm was evaluated on datasets from various devices and institutes, including a widely used open dataset, and achieved point-to-point errors of 1.37 ± 1.79 mm against the ground-truth positions of the 23 cephalometric landmarks. Based on the predicted positions, the anatomical types of the subjects were automatically classified and compared with the ground truth, and the automated algorithm achieved a successful classification rate of 88.43%. CONCLUSIONS We expect that this fully automated cephalometric analysis algorithm and the web-based application can be widely used in various medical environments to save the time and effort of manual marking and diagnosis.
Affiliation(s)
- Hannah Kim
- Center for Bionics, Korea Institute of Science and Technology, 5, Hwarang-ro 14-gil, Seongbuk-gu, Seoul, 02792, Republic of Korea; Division of Bio-Medical Science & Technology, KIST School, Korea University of Science and Technology, 5, Hwarang-ro 14-gil, Seongbuk-gu, Seoul, 02792, Republic of Korea.
- Eungjune Shim
- Center for Bionics, Korea Institute of Science and Technology, 5, Hwarang-ro 14-gil, Seongbuk-gu, Seoul, 02792, Republic of Korea
- Jungeun Park
- Department of Orthodontics, Graduate School, Yonsei University College of Dentistry, 50-1, Yonseiro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Yoon-Ji Kim
- Department of Orthodontics, Korea University Anam Hospital, 73 Inchon-ro, Seongbuk-gu, Seoul, 02841, Republic of Korea
- Uilyong Lee
- Department of Oral and Maxillofacial Surgery, Chungang University Hospital, 102, Heukseok-ro, Dongjak-gu, Seoul, 06973, Republic of Korea; Tooth Bioengineering National Research Laboratory, BK21, School of Dentistry, Seoul National University, Daehak-ro 101, Jongno-gu, Seoul, 03080, Republic of Korea
- Youngjun Kim
- Center for Bionics, Korea Institute of Science and Technology, 5, Hwarang-ro 14-gil, Seongbuk-gu, Seoul, 02792, Republic of Korea; Division of Bio-Medical Science & Technology, KIST School, Korea University of Science and Technology, 5, Hwarang-ro 14-gil, Seongbuk-gu, Seoul, 02792, Republic of Korea
24
Augmented reality for inner ear procedures: visualization of the cochlear central axis in microscopic videos. Int J Comput Assist Radiol Surg 2020; 15:1703-1711. [PMID: 32737858 DOI: 10.1007/s11548-020-02240-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2019] [Accepted: 07/20/2020] [Indexed: 10/23/2022]
Abstract
PURPOSE Visualization of the cochlea is impossible due to the delicate and intricate ear anatomy. Augmented reality may be used to perform auditory nerve implantation by a transmodiolar approach in patients with profound hearing loss. METHODS We present an augmented reality system for the visualization of the cochlear axis in surgical videos. The system starts with automatic anatomical landmark detection in preoperative computed tomography images based on deep reinforcement learning. These landmarks are used to register the preoperative geometry with the real-time microscopic video captured inside the auditory canal. The three-dimensional pose of the cochlear axis is determined using the registration projection matrices. In addition, the patient microscope movements are tracked using an image feature-based tracking process. RESULTS The landmark detection stage yielded an average localization error of [Formula: see text] mm ([Formula: see text]). The target registration error was [Formula: see text] mm for the cochlear apex and [Formula: see text] for the cochlear axis. CONCLUSION We developed an augmented reality system to visualize the cochlear axis in intraoperative videos. The system yielded millimetric accuracy and remained stable despite camera movements throughout the procedure under experimental conditions.
25
Wang X, Zhai S, Niu Y. Left ventricle landmark localization and identification in cardiac MRI by deep metric learning-assisted CNN regression. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.02.069] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
26
Štern D, Payer C, Urschler M. Automated age estimation from MRI volumes of the hand. Med Image Anal 2019; 58:101538. [PMID: 31400620 DOI: 10.1016/j.media.2019.101538] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2018] [Revised: 02/21/2019] [Indexed: 10/26/2022]
Abstract
Highly relevant for both clinical and legal medicine applications, the established radiological methods for estimating unknown age in children and adolescents are based on visual examination of bone ossification in X-ray images of the hand. Our group has initiated the development of fully automatic age estimation methods from 3D MRI scans of the hand, in order to simultaneously overcome the problems of the radiological methods, including (1) exposure to ionizing radiation, (2) the necessity of defining new, MRI-specific staging systems, and (3) the subjective influence of the examiner. The present work provides a theoretical background for understanding the nonlinear regression problem of biological age estimation and chronological age approximation. Based on this theoretical background, we comprehensively evaluate machine learning methods (random forests, deep convolutional neural networks) with different simplifications of the image information used as input for learning. Training on a large dataset of 328 MR images, we compare the performance of the different input strategies and demonstrate unprecedented results. For estimating biological age, we obtain a mean absolute error of 0.37 ± 0.51 years for subjects aged ≤ 18 years, i.e. where bone ossification has not yet saturated. Finally, we validate our findings by adapting our best-performing method to 2D images and applying it to a publicly available dataset of X-ray images, showing that we are in line with the state-of-the-art automatic methods for this task.
Affiliation(s)
- Darko Štern
- Ludwig Boltzmann Institute for Clinical Forensic Imaging, Graz, Austria; BioTechMed-Graz, Medical University Graz, Graz, Austria
- Christian Payer
- Ludwig Boltzmann Institute for Clinical Forensic Imaging, Graz, Austria; Institute of Computer Graphics and Vision, Graz University of Technology, Graz, Austria
- Martin Urschler
- Ludwig Boltzmann Institute for Clinical Forensic Imaging, Graz, Austria; School of Computer Science, The University of Auckland, Auckland, New Zealand
27
Integrating spatial configuration into heatmap regression based CNNs for landmark localization. Med Image Anal 2019; 54:207-219. [DOI: 10.1016/j.media.2019.03.007] [Citation(s) in RCA: 105] [Impact Index Per Article: 21.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2018] [Revised: 01/11/2019] [Accepted: 03/21/2019] [Indexed: 11/23/2022]
28
Bier B, Goldmann F, Zaech JN, Fotouhi J, Hegeman R, Grupp R, Armand M, Osgood G, Navab N, Maier A, Unberath M. Learning to detect anatomical landmarks of the pelvis in X-rays from arbitrary views. Int J Comput Assist Radiol Surg 2019; 14:1463-1473. [PMID: 31006106 DOI: 10.1007/s11548-019-01975-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2019] [Accepted: 04/09/2019] [Indexed: 10/27/2022]
Abstract
PURPOSE Minimally invasive alternatives are now available for many complex surgeries. These approaches are enabled by the increasing availability of intra-operative image guidance. Yet, fluoroscopic X-rays suffer from projective transformation and thus cannot provide direct views onto anatomy. Surgeons could highly benefit from additional information, such as the anatomical landmark locations in the projections, to support intra-operative decision making. However, detecting landmarks is challenging since the viewing direction changes substantially between views, leading to varying appearance of the same landmark. Therefore, and to the best of our knowledge, view-independent anatomical landmark detection has not been investigated yet. METHODS In this work, we propose a novel approach to detect multiple anatomical landmarks in X-ray images from arbitrary viewing directions. To this end, a sequential prediction framework based on convolutional neural networks is employed to simultaneously regress all landmark locations. For training, synthetic X-rays are generated with a physically accurate forward model that allows direct application of the trained model to real X-ray images of the pelvis. View invariance is achieved via data augmentation by sampling viewing angles on a spherical segment of [Formula: see text]. RESULTS On synthetic data, a mean prediction error of 5.6 ± 4.5 mm is achieved. Further, we demonstrate that the trained model can be directly applied to real X-rays and show that these detections define correspondences to a respective CT volume, which allows for analytic estimation of the 11-degree-of-freedom projective mapping. CONCLUSION We present the first tool to detect anatomical landmarks in X-ray images independently of their viewing direction. Access to this information during surgery may benefit decision making and constitutes a first step toward global initialization of 2D/3D registration without the need for calibration. As such, the proposed concept has a strong prospect to facilitate and enhance applications and methods in the realm of image-guided surgery.
Affiliation(s)
- Bastian Bier
- Computer Aided Medical Procedures, Johns Hopkins University, Baltimore, USA; Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Florian Goldmann
- Computer Aided Medical Procedures, Johns Hopkins University, Baltimore, USA; Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Jan-Nico Zaech
- Computer Aided Medical Procedures, Johns Hopkins University, Baltimore, USA; Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Javad Fotouhi
- Computer Aided Medical Procedures, Johns Hopkins University, Baltimore, USA; Department of Computer Science, Johns Hopkins University, Baltimore, USA
- Rachel Hegeman
- Applied Physics Laboratory, Johns Hopkins University, Baltimore, USA
- Robert Grupp
- Department of Computer Science, Johns Hopkins University, Baltimore, USA
- Mehran Armand
- Applied Physics Laboratory, Johns Hopkins University, Baltimore, USA; Department of Orthopedic Surgery, Johns Hopkins Hospital, Baltimore, USA
- Greg Osgood
- Department of Orthopedic Surgery, Johns Hopkins Hospital, Baltimore, USA
- Nassir Navab
- Computer Aided Medical Procedures, Johns Hopkins University, Baltimore, USA; Department of Computer Science, Johns Hopkins University, Baltimore, USA
- Andreas Maier
- Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Mathias Unberath
- Computer Aided Medical Procedures, Johns Hopkins University, Baltimore, USA; Department of Computer Science, Johns Hopkins University, Baltimore, USA
29
Torosdagli N, Liberton DK, Verma P, Sincan M, Lee JS, Bagci U. Deep Geodesic Learning for Segmentation and Anatomical Landmarking. IEEE TRANSACTIONS ON MEDICAL IMAGING 2019; 38:919-931. [PMID: 30334750 PMCID: PMC6475529 DOI: 10.1109/tmi.2018.2875814] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
In this paper, we propose a novel deep learning framework for anatomy segmentation and automatic landmarking. Specifically, we focus on the challenging problem of mandible segmentation from cone-beam computed tomography (CBCT) scans and identification of 9 anatomical landmarks of the mandible on the geodesic space. The overall approach employs three inter-related steps. In the first step, we propose a deep neural network architecture with carefully designed regularization, and network hyper-parameters to perform image segmentation without the need for data augmentation and complex post-processing refinement. In the second step, we formulate the landmark localization problem directly on the geodesic space for sparsely-spaced anatomical landmarks. In the third step, we utilize a long short-term memory network to identify the closely-spaced landmarks, which is rather difficult to obtain using other standard networks. The proposed fully automated method showed superior efficacy compared to the state-of-the-art mandible segmentation and landmarking approaches in craniofacial anomalies and diseased states. We used a very challenging CBCT data set of 50 patients with a high-degree of craniomaxillofacial variability that is realistic in clinical practice. The qualitative visual inspection was conducted for distinct CBCT scans from 250 patients with high anatomical variability. We have also shown the state-of-the-art performance in an independent data set from the MICCAI Head-Neck Challenge (2015).
30
Alansary A, Oktay O, Li Y, Folgoc LL, Hou B, Vaillant G, Kamnitsas K, Vlontzos A, Glocker B, Kainz B, Rueckert D. Evaluating reinforcement learning agents for anatomical landmark detection. Med Image Anal 2019; 53:156-164. [PMID: 30784956] [PMCID: PMC7610752] [DOI: 10.1016/j.media.2019.02.007]
Abstract
Automatic detection of anatomical landmarks is an important step for a wide range of applications in medical image analysis. Manual annotation of landmarks is a tedious task and prone to observer errors. In this paper, we evaluate novel deep reinforcement learning (RL) strategies to train agents that can precisely and robustly localize target landmarks in medical scans. An artificial RL agent learns to identify the optimal path to the landmark by interacting with an environment, in our case 3D images. Furthermore, we investigate the use of fixed- and multi-scale search strategies with novel hierarchical action steps in a coarse-to-fine manner. Several deep Q-network (DQN) architectures are evaluated for detecting multiple landmarks using three different medical imaging datasets: fetal head ultrasound (US), and adult brain and cardiac magnetic resonance imaging (MRI). The performance of our agents surpasses state-of-the-art supervised and RL methods. Our experiments also show that multi-scale search strategies perform significantly better than fixed-scale agents in images with a large field of view and noisy background, such as cardiac MRI. Moreover, the novel hierarchical steps can significantly speed up the search process, by a factor of 4-5.
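The coarse-to-fine search described above can be sketched without a trained network by substituting a distance oracle for the learned Q-values. Everything below (`multiscale_search`, the step-size schedule) is a hypothetical toy illustrating the hierarchical action steps, not the paper's agent:

```python
def multiscale_search(start, target, step_sizes=(8, 4, 2, 1)):
    """Coarse-to-fine landmark search: at each scale the agent repeatedly
    takes the axis-aligned step that most reduces the distance to the
    target; when no step at the current scale helps, it refines to the
    next smaller step size. A trained DQN would supply the action choice;
    here a squared-distance oracle stands in for the learned Q-values."""
    pos = list(start)
    moves = 0  # total number of actions taken
    for step in step_sizes:
        improved = True
        while improved:
            improved = False
            best = None
            d0 = sum((p - t) ** 2 for p, t in zip(pos, target))
            for axis in range(len(pos)):
                for sign in (-1, 1):
                    cand = pos[:]
                    cand[axis] += sign * step
                    d = sum((c - t) ** 2 for c, t in zip(cand, target))
                    if d < d0:
                        d0, best = d, cand
            if best is not None:
                pos, improved = best, True
                moves += 1
    return tuple(pos), moves
```

The large early steps cover most of the volume cheaply, which is the intuition behind the reported 4-5x speed-up over fixed-scale agents.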
Affiliation(s)
- Amir Alansary
- Biomedical Image Analysis Group (BioMedIA), Imperial College London, London, UK.
- Ozan Oktay
- Biomedical Image Analysis Group (BioMedIA), Imperial College London, London, UK
- Yuanwei Li
- Biomedical Image Analysis Group (BioMedIA), Imperial College London, London, UK
- Loic Le Folgoc
- Biomedical Image Analysis Group (BioMedIA), Imperial College London, London, UK
- Benjamin Hou
- Biomedical Image Analysis Group (BioMedIA), Imperial College London, London, UK
- Ghislain Vaillant
- Biomedical Image Analysis Group (BioMedIA), Imperial College London, London, UK
- Athanasios Vlontzos
- Biomedical Image Analysis Group (BioMedIA), Imperial College London, London, UK
- Ben Glocker
- Biomedical Image Analysis Group (BioMedIA), Imperial College London, London, UK
- Bernhard Kainz
- Biomedical Image Analysis Group (BioMedIA), Imperial College London, London, UK
- Daniel Rueckert
- Biomedical Image Analysis Group (BioMedIA), Imperial College London, London, UK
31
Stern D, Payer C, Giuliani N, Urschler M. Automatic Age Estimation and Majority Age Classification From Multi-Factorial MRI Data. IEEE J Biomed Health Inform 2018; 23:1392-1403. [PMID: 31059459] [DOI: 10.1109/jbhi.2018.2869606]
Abstract
Age estimation from radiologic data is an important topic in clinical medicine as well as in forensic applications, where it is used to assess unknown chronological age or to discriminate minors from adults. In this paper, we propose an automatic multi-factorial age estimation method based on MRI data of hand, clavicle, and teeth to extend the maximal age range from up to 19 years, as commonly used for age assessment based on hand bones, to up to 25 years, when combined with clavicle bones and wisdom teeth. Fusing age-relevant information from all three anatomical sites, our method utilizes a deep convolutional neural network that is trained on a dataset of 322 subjects in the age range between 13 and 25 years, and achieves a mean absolute prediction error in regressing chronological age of 1.01±0.74 years. Furthermore, when used for majority age classification, we show that a classifier derived by thresholding our regression-based predictor is better suited than a classifier directly trained with a classification loss, especially given that cases of minors wrongly classified as adults must be minimized. In conclusion, we overcome the limitations of the multi-factorial methods currently used in forensic practice, i.e., dependence on ionizing radiation, subjectivity in quantifying age-relevant information, and lack of an established approach to fuse this information from individual anatomical sites.
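The regression-then-threshold idea above can be sketched in a few lines; the 18-year threshold and the `safety_margin` knob are illustrative assumptions, not values from the paper:

```python
def classify_adult(predicted_ages, threshold=18.0, safety_margin=0.0):
    """Derive a majority-age (adult/minor) decision by thresholding a
    regression output of predicted chronological age. Raising
    `safety_margin` trades more adults misclassified as minors for fewer
    minors misclassified as adults, the error the abstract argues must be
    minimized."""
    return [age >= threshold + safety_margin for age in predicted_ages]
```

Because the decision boundary is a post-hoc parameter rather than baked into a classification loss, it can be shifted to any desired operating point without retraining the regressor.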
32
Ghesu FC, Georgescu B, Grbic S, Maier A, Hornegger J, Comaniciu D. Towards intelligent robust detection of anatomical structures in incomplete volumetric data. Med Image Anal 2018; 48:203-213. [PMID: 29966940] [DOI: 10.1016/j.media.2018.06.007]
Abstract
Robust and fast detection of anatomical structures represents an important component of medical image analysis technologies. Current solutions for anatomy detection are based on machine learning, and are generally driven by suboptimal and exhaustive search strategies. In particular, these techniques do not effectively address cases of incomplete data, i.e., scans acquired with a partial field-of-view. We address these challenges by following a new paradigm, which reformulates the detection task as teaching an intelligent artificial agent how to actively search for an anatomical structure. Using the principles of deep reinforcement learning with multi-scale image analysis, artificial agents are taught optimal navigation paths in the scale-space representation of an image, while accounting for structures that are missing from the field-of-view. The spatial coherence of the observed anatomical landmarks is ensured using elements from statistical shape modeling and robust estimation theory. Experiments show that our solution outperforms marginal space deep learning, a powerful deep learning method, at detecting different anatomical structures without any failure. The dataset contains 5043 3D-CT volumes from over 2000 patients, totaling over 2,500,000 image slices. In particular, our solution achieves 0% false-positive and 0% false-negative rates at detecting whether the landmarks are captured in the field-of-view of the scan (excluding all border cases), with an average detection accuracy of 2.78 mm. In terms of runtime, we reduce the detection time of the marginal space deep learning method by 20-30 times to under 40 ms, an unmatched performance for high-resolution incomplete 3D-CT data.
Affiliation(s)
- Florin C Ghesu
- Siemens Healthineers, Medical Imaging Technologies, Princeton, NJ, USA; Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany.
- Bogdan Georgescu
- Siemens Healthineers, Medical Imaging Technologies, Princeton, NJ, USA
- Sasa Grbic
- Siemens Healthineers, Medical Imaging Technologies, Princeton, NJ, USA
- Andreas Maier
- Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
- Joachim Hornegger
- Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
- Dorin Comaniciu
- Siemens Healthineers, Medical Imaging Technologies, Princeton, NJ, USA
33
Heinrich MP, Blendowski M, Oktay O. TernaryNet: faster deep model inference without GPUs for medical 3D segmentation using sparse and binary convolutions. Int J Comput Assist Radiol Surg 2018; 13:1311-1320. [PMID: 29850978] [DOI: 10.1007/s11548-018-1797-4]
Abstract
PURPOSE Deep convolutional neural networks (DCNN) are currently ubiquitous in medical imaging. While their versatility and high-quality results for common image analysis tasks including segmentation, localisation and prediction are astonishing, the large representational power comes at the cost of highly demanding computational effort. This limits their practical applications for image-guided interventions and diagnostic (point-of-care) support using mobile devices without graphics processing units (GPU). METHODS We propose a new scheme that approximates both trainable weights and neural activations in deep networks by ternary values and tackles the open question of backpropagation when dealing with non-differentiable functions. Our solution enables the removal of the expensive floating-point matrix multiplications throughout any convolutional neural network and replaces them by energy- and time-preserving binary operators and population counts. RESULTS We evaluate our approach for the segmentation of the pancreas in CT. Here, our ternary approximation within a fully convolutional network leads to more than 90% memory reductions and high accuracy (without any post-processing) with a Dice overlap of 71.0% that comes close to the one obtained when using networks with high-precision weights and activations. We further provide a concept for sub-second inference without GPUs and demonstrate significant improvements over binary quantisation and over an ablation without our proposed ternary hyperbolic tangent continuation. CONCLUSIONS We present a key enabling technique for highly efficient DCNN inference without GPUs that will help to bring the advances of deep learning to practical clinical applications. It also holds great promise for improving accuracies in large-scale medical data retrieval.
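A minimal sketch of weight ternarization conveys the core idea; the 0.7 threshold factor follows the common ternary-weight-network heuristic and is an assumption here, as the paper's exact quantisation scheme for weights and activations may differ:

```python
def ternarize(weights, delta_factor=0.7):
    """Quantize a list of float weights to {-1, 0, +1} using a symmetric
    threshold delta proportional to the mean absolute weight (the 0.7
    factor is the usual ternary-weight-network heuristic, assumed here).
    The zeros make the tensor sparse, and the surviving +/-1 entries turn
    multiplications into sign flips, so convolutions reduce to binary
    operations and population counts."""
    delta = delta_factor * sum(abs(w) for w in weights) / len(weights)
    return [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]
```

In a full TernaryNet-style pipeline a scaling factor per layer and a ternarized activation would accompany this, but the thresholding step above is where both the memory savings and the float-free arithmetic originate.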
Affiliation(s)
- Mattias P Heinrich
- Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562, Lübeck, Germany.
- Max Blendowski
- Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562, Lübeck, Germany
- Ozan Oktay
- Biomedical Image Analysis Group, Department of Computing, Imperial College London, London, SW7 2AZ, UK