1. Almalioglu Y, Turan M, Trigoni N, Markham A. Deep learning-based robust positioning for all-weather autonomous driving. Nat Mach Intell 2022;4:749-760. [PMID: 37790900] [PMCID: PMC10543073] [DOI: 10.1038/s42256-022-00520-5] [Received: 01/24/2022] [Accepted: 07/12/2022]
Abstract
Interest in autonomous vehicles (AVs) is growing at a rapid pace due to increased convenience, safety benefits and potential environmental gains. Although several leading AV companies predicted that AVs would be on the road by 2020, they are still limited to relatively small-scale trials. Knowing their precise location on the map is a prerequisite for safe and reliable AVs, but sensor imperfections under adverse environmental and weather conditions make this challenging, posing a formidable obstacle to their widespread use. Here we propose a deep learning-based self-supervised approach for ego-motion estimation that is a robust and complementary localization solution under inclement weather conditions. The proposed approach is a geometry-aware method that attentively fuses the rich representation capability of visual sensors and the weather-immune features provided by radars using an attention-based learning technique. Our method predicts reliability masks for the sensor measurements, eliminating the deficiencies in the multimodal data. In various experiments we demonstrate the robust all-weather performance and effective cross-domain generalizability under harsh weather conditions such as rain, fog and snow, as well as day and night conditions. Furthermore, we employ a game-theoretic approach to analyse the interpretability of the model predictions, illustrating the independent and uncorrelated failure modes of the multimodal system. We anticipate our work will bring AVs one step closer to safe and reliable all-weather autonomous driving.
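The attentive multimodal fusion described in this abstract can be sketched in miniature: attention weights derived from per-modality reliability scores decide how much each sensor's features contribute to the fused estimate. The function below and its scalar reliability scores are illustrative stand-ins for the paper's learned reliability masks, not the actual architecture.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attentive_fuse(features, reliability):
    """Fuse per-modality feature vectors using attention weights derived
    from scalar reliability scores (a toy stand-in for learned masks)."""
    features = np.asarray(features, dtype=float)          # (n_modalities, dim)
    weights = softmax(np.asarray(reliability, dtype=float))
    return weights @ features                             # fused (dim,) vector

# Camera features degraded by fog get a low reliability score, so the
# weather-immune radar features dominate the fused representation.
camera = np.ones(4)
radar = np.full(4, 3.0)
fused = attentive_fuse([camera, radar], reliability=[0.0, 10.0])
```

In the paper the masks are predicted per measurement by the network; the scalar-per-modality version here only shows how attention weighting suppresses an unreliable sensor.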
Affiliation(s)
- Yasin Almalioglu, Department of Computer Science, University of Oxford, Oxford, UK
- Mehmet Turan, Department of Computer Engineering, Bogazici University, Istanbul, Turkey
- Niki Trigoni, Department of Computer Science, University of Oxford, Oxford, UK
- Andrew Markham, Department of Computer Science, University of Oxford, Oxford, UK
2. Ozyoruk KB, Gokceler GI, Bobrow TL, Coskun G, Incetan K, Almalioglu Y, Mahmood F, Curto E, Perdigoto L, Oliveira M, Sahin H, Araujo H, Alexandrino H, Durr NJ, Gilbert HB, Turan M. EndoSLAM dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos. Med Image Anal 2021;71:102058. [PMID: 33930829] [DOI: 10.1016/j.media.2021.102058] [Received: 09/15/2020] [Revised: 01/23/2021] [Accepted: 03/29/2021]
Abstract
Deep learning techniques hold promise for developing dense topography reconstruction and pose estimation methods for endoscopic videos. However, currently available datasets do not support effective quantitative benchmarking. In this paper, we introduce a comprehensive endoscopic SLAM dataset consisting of 3D point cloud data for six porcine organs, capsule and standard endoscopy recordings, synthetically generated data, and a recording of a phantom colon made with a conventional endoscope in clinical use, with computed tomography (CT) scan ground truth. A Panda robotic arm, two commercially available capsule endoscopes, three conventional endoscopes with different camera properties, two high-precision 3D scanners, and a CT scanner were employed to collect data from eight ex vivo porcine gastrointestinal (GI) tract organs and a silicone colon phantom model. In total, 35 sub-datasets are provided with 6D pose ground truth for the ex vivo part: 18 sub-datasets for the colon, 12 for the stomach, and 5 for the small intestine; four of these contain polyp-mimicking elevations created by an expert gastroenterologist. To verify the applicability of these data to real clinical systems, we recorded a video sequence with a state-of-the-art colonoscope from a full-representation silicone colon phantom. Synthetic capsule endoscopy frames from the stomach, colon, and small intestine with both depth and pose annotations are included to facilitate the study of simulation-to-real transfer learning algorithms. Additionally, we propose Endo-SfMLearner, an unsupervised monocular depth and pose estimation method that combines residual networks with a spatial attention module to direct the network's focus toward distinguishable and highly textured tissue regions. The proposed approach uses a brightness-aware photometric loss to improve robustness under the fast frame-to-frame illumination changes commonly seen in endoscopic videos.
To exemplify the use case of the EndoSLAM dataset, the performance of Endo-SfMLearner is extensively compared with the state of the art: SC-SfMLearner, Monodepth2, and SfMLearner. The code and a link to the dataset are publicly available at https://github.com/CapsuleEndoscope/EndoSLAM. A video demonstrating the experimental setup and procedure is accessible as Supplementary Video 1.
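The brightness-aware photometric loss can be illustrated with a toy version: fit a global affine brightness change between the warped source and the target frame, then penalize only the residual. The least-squares formulation below is an assumption for illustration; Endo-SfMLearner's actual loss term is defined differently in the paper.

```python
import numpy as np

def brightness_aligned_l1(target, warped):
    """Mean L1 photometric error after compensating a global affine
    brightness change (target ~ a * warped + b) between two frames."""
    t = np.ravel(target).astype(float)
    w = np.ravel(warped).astype(float)
    a, b = np.polyfit(w, t, 1)               # least-squares affine fit
    return float(np.abs(t - (a * w + b)).mean())

# A pure illumination change between frames incurs almost no loss,
# while a structural difference still does.
frame = np.linspace(0.0, 1.0, 100)
brighter = 1.5 * frame + 0.2                 # same content, new lighting
loss_illum = brightness_aligned_l1(brighter, frame)
```

The point of such a loss in self-supervised odometry is that rapid lighting changes between endoscopic frames should not be mistaken for geometric reprojection error.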
Affiliation(s)
- Taylor L Bobrow, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Gulfize Coskun, Institute of Biomedical Engineering, Bogazici University, Turkey
- Kagan Incetan, Institute of Biomedical Engineering, Bogazici University, Turkey
- Faisal Mahmood, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Data Science, Dana Farber Cancer Institute, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Eva Curto, Institute for Systems and Robotics, University of Coimbra, Portugal
- Luis Perdigoto, Institute for Systems and Robotics, University of Coimbra, Portugal
- Marina Oliveira, Institute for Systems and Robotics, University of Coimbra, Portugal
- Hasan Sahin, Institute of Biomedical Engineering, Bogazici University, Turkey
- Helder Araujo, Institute for Systems and Robotics, University of Coimbra, Portugal
- Henrique Alexandrino, Faculty of Medicine, Clinical Academic Center of Coimbra, University of Coimbra, Coimbra, Portugal
- Nicholas J Durr, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Hunter B Gilbert, Department of Mechanical and Industrial Engineering, Louisiana State University, Baton Rouge, LA, USA
- Mehmet Turan, Institute of Biomedical Engineering, Bogazici University, Turkey
3. İncetan K, Celik IO, Obeid A, Gokceler GI, Ozyoruk KB, Almalioglu Y, Chen RJ, Mahmood F, Gilbert H, Durr NJ, Turan M. VR-Caps: A Virtual Environment for Capsule Endoscopy. Med Image Anal 2021;70:101990. [PMID: 33609920] [DOI: 10.1016/j.media.2021.101990] [Received: 08/31/2020] [Revised: 02/01/2021] [Accepted: 02/02/2021]
Abstract
Current capsule endoscopes and next-generation robotic capsules for the diagnosis and treatment of gastrointestinal diseases are complex cyber-physical platforms that must orchestrate sophisticated software and hardware functions. The desired tasks for these systems include visual localization, depth estimation, 3D mapping, disease detection and segmentation, automated navigation, active control, path realization, and optional therapeutic modules such as targeted drug delivery and biopsy sampling. Data-driven algorithms promise to enable many advanced functionalities for capsule endoscopes, but real-world data are challenging to obtain. Physically realistic simulations providing synthetic data have emerged as a solution for the development of data-driven algorithms. In this work, we present a comprehensive simulation platform for capsule endoscopy operations and introduce VR-Caps, a virtual active capsule environment that simulates a range of normal and abnormal tissue conditions (e.g., inflated, dry, wet), varied organ types, capsule endoscope designs (e.g., mono, stereo, dual and 360° camera), and the type, number, strength, and placement of internal and external magnetic sources that enable active locomotion. VR-Caps makes it possible to develop, optimize, and test medical imaging and analysis software, either independently or jointly, for current and next-generation endoscopic capsule systems. To validate this approach, we train state-of-the-art deep neural networks to accomplish various medical image analysis tasks using simulated data from VR-Caps and evaluate the performance of these models on real medical data. Results demonstrate the usefulness and effectiveness of the proposed virtual platform in developing algorithms that quantify fractional coverage, camera trajectory, 3D map reconstruction, and disease classification.
All of the code, pre-trained weights, and 3D organ models of the virtual environment, together with detailed instructions on how to set up and use the environment, are publicly available at https://github.com/CapsuleEndoscope/VirtualCapsuleEndoscopy, and a video demonstration can be seen in the supplementary videos (Video-I).
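Of the metrics listed above, fractional coverage is the simplest to make concrete: once each observed surface element of the organ mesh is identified, coverage reduces to a ratio. The function below is a hypothetical illustration of that idea, not VR-Caps' own implementation.

```python
def fractional_coverage(seen_face_ids, total_faces):
    """Fraction of organ-surface mesh faces observed by the capsule
    camera; duplicate sightings of the same face count only once."""
    if total_faces <= 0:
        raise ValueError("total_faces must be positive")
    return len(set(seen_face_ids)) / total_faces

# Three distinct faces seen out of ten gives 30% coverage.
coverage = fractional_coverage([4, 7, 7, 9], total_faces=10)
```

In a simulator, the set of seen faces would come from rendering face identifiers along the capsule's camera trajectory, which is exactly the kind of ground truth that is cheap in simulation and unavailable in vivo.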
Affiliation(s)
- Kağan İncetan, Institute of Biomedical Engineering, Bogazici University, Istanbul, Turkey
- Ibrahim Omer Celik, Department of Computer Engineering, Bogazici University, Istanbul, Turkey
- Abdulhamid Obeid, Institute of Biomedical Engineering, Bogazici University, Istanbul, Turkey
- Richard J Chen, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Faisal Mahmood, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Data Science, Dana Farber Cancer Institute, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Hunter Gilbert, Department of Mechanical and Industrial Engineering, Louisiana State University, Baton Rouge, LA, USA
- Nicholas J Durr, Department of Biomedical Engineering, Johns Hopkins University (JHU), Baltimore, MD, USA
- Mehmet Turan, Institute of Biomedical Engineering, Bogazici University, Istanbul, Turkey
4. Almalioglu Y, Bengisu Ozyoruk K, Gokce A, Incetan K, Irem Gokceler G, Ali Simsek M, Ararat K, Chen RJ, Durr NJ, Mahmood F, Turan M. EndoL2H: Deep Super-Resolution for Capsule Endoscopy. IEEE Trans Med Imaging 2020;39:4297-4309. [PMID: 32795966] [DOI: 10.1109/tmi.2020.3016744]
Abstract
Although wireless capsule endoscopy is the preferred modality for the diagnosis and assessment of small bowel diseases, the poor camera resolution is a substantial limitation for both subjective and automated diagnostics. Enhanced-resolution endoscopy has been shown to improve the adenoma detection rate for conventional endoscopy and is likely to do the same for capsule endoscopy. In this work, we propose and quantitatively validate a novel framework to learn a mapping from low- to high-resolution endoscopic images. We combine conditional adversarial networks with a spatial attention block to improve the resolution by factors of up to 8×, 10×, and 12×. Quantitative and qualitative studies demonstrate the superiority of EndoL2H over the state-of-the-art deep super-resolution methods Deep Back-Projection Networks (DBPN), Deep Residual Channel Attention Networks (RCAN), and Super-Resolution Generative Adversarial Network (SRGAN). Mean Opinion Score (MOS) tests were performed by 30 gastroenterologists to qualitatively assess and confirm the clinical relevance of the approach. EndoL2H is generally applicable to any endoscopic capsule system and has the potential to improve diagnosis and better harness computational approaches for polyp detection and characterization. Our code and trained models are available at https://github.com/CapsuleEndoscope/EndoL2H.
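The spatial attention idea, reweighting feature maps by a two-dimensional gate so the network emphasizes informative regions, can be sketched without a deep-learning framework. The channel-average pooling and sigmoid gating below are illustrative assumptions; EndoL2H's actual attention block is a trained network component inside a conditional GAN.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feat):
    """Reweight a (C, H, W) feature tensor by a single (H, W) attention
    map pooled across channels (an untrained, illustrative gate)."""
    pooled = feat.mean(axis=0)                # (H, W) channel average
    gate = sigmoid(pooled - pooled.mean())    # gate values in (0, 1)
    return feat * gate[None, :, :]            # broadcast over channels

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))         # toy feature maps
out = spatial_attention(feat)
```

Because the gate lies in (0, 1), every channel is attenuated more strongly at spatial positions the map deems uninformative, which is the mechanism a learned attention block exploits to concentrate super-resolution capacity on textured tissue.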
5. Saputra MRU, de Gusmao PPB, Lu CX, Almalioglu Y, Rosa S, Chen C, Wahlstrom J, Wang W, Markham A, Trigoni N. DeepTIO: A Deep Thermal-Inertial Odometry With Visual Hallucination. IEEE Robot Autom Lett 2020. [DOI: 10.1109/lra.2020.2969170]
6. Turan M, Almalioglu Y, Gilbert HB, Mahmood F, Durr NJ, Araujo H, Sari AE, Ajay A, Sitti M. Learning to Navigate Endoscopic Capsule Robots. IEEE Robot Autom Lett 2019. [DOI: 10.1109/lra.2019.2924846]
7. Turan M, Almalioglu Y, Araujo H, Konukoglu E, Sitti M. Deep EndoVO: A recurrent convolutional neural network (RCNN) based visual odometry approach for endoscopic capsule robots. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.10.014]