101
Lyu X, Cheng L, Zhang S. The RETA Benchmark for Retinal Vascular Tree Analysis. Sci Data 2022; 9:397. PMID: 35817778; PMCID: PMC9273761; DOI: 10.1038/s41597-022-01507-y.
Abstract
Topological and geometrical analysis of retinal blood vessels could be a cost-effective way to detect various common diseases. Automated vessel segmentation and vascular tree analysis models require strong generalization capability in clinical applications. In this work, we constructed a novel benchmark, RETA, with 81 labelled vessel masks to facilitate retinal vessel analysis. A semi-automated coarse-to-fine workflow was proposed for the vessel annotation task. During database construction, we strove to control inter-annotator and intra-annotator variability by means of multi-stage annotation and label disambiguation using self-developed dedicated software. In addition to binary vessel masks, we obtained other types of annotations, including artery/vein masks, vascular skeletons, bifurcations, trees and abnormalities. Subjective and objective quality validation of the annotated vessel masks demonstrated significantly improved quality over existing open datasets. Our annotation software is also made publicly available for pixel-level vessel visualization. Researchers can use RETA to develop vessel segmentation algorithms and evaluate segmentation performance, and it may also promote the study of cross-modality tubular structure segmentation and analysis.
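The skeleton and bifurcation annotations described above can be illustrated with standard morphological operations. Below is a minimal sketch (not the RETA annotation workflow itself) that derives a centreline and candidate bifurcation points from a binary vessel mask; the file name and the 3x3 neighbour-counting rule are assumptions for demonstration.

```python
# Illustrative sketch: skeleton and bifurcation candidates from a binary vessel mask.
import numpy as np
from scipy.ndimage import convolve
from skimage.io import imread
from skimage.morphology import skeletonize

mask = imread("vessel_mask.png") > 0          # binary vessel mask (H, W); assumed file
skeleton = skeletonize(mask)                  # 1-pixel-wide centreline

# Count skeleton neighbours of each skeleton pixel; >= 3 suggests a bifurcation.
kernel = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])
neighbours = convolve(skeleton.astype(np.uint8), kernel, mode="constant")
bifurcations = np.argwhere(skeleton & (neighbours >= 3))
print(f"{skeleton.sum()} skeleton pixels, {len(bifurcations)} bifurcation candidates")
```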
Affiliation(s)
- Xingzheng Lyu: College of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, China.
- Li Cheng: Department of Electrical and Computer Engineering, University of Alberta, Edmonton, T6G 1H9, Canada.
- Sanyuan Zhang: College of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, China.
102
Garcia Santa Cruz B, Sölter J, Gomez-Giro G, Saraiva C, Sabate-Soler S, Modamio J, Barmpa K, Schwamborn JC, Hertel F, Jarazo J, Husch A. Generalising from conventional pipelines using deep learning in high-throughput screening workflows. Sci Rep 2022; 12:11465. PMID: 35794231; PMCID: PMC9259641; DOI: 10.1038/s41598-022-15623-7.
Abstract
The study of complex diseases relies on large amounts of data to build models toward precision medicine. Such data acquisition is feasible in the context of high-throughput screening, in which the quality of the results relies on the accuracy of the image analysis. Although state-of-the-art solutions for image segmentation employ deep learning approaches, the high cost of manually generating ground truth labels for model training hampers day-to-day application in experimental laboratories. Alternatively, traditional computer vision-based solutions do not need expensive labels for their implementation. Our work combines both approaches by training a deep learning network using weak training labels automatically generated with conventional computer vision methods. Our network surpasses the conventional segmentation quality by generalising beyond the noisy labels, providing a 25% increase in mean intersection over union, while simultaneously reducing development and inference times. Our solution was embedded into an easy-to-use graphical user interface that allows researchers to assess the predictions and correct potential inaccuracies with minimal human input. To demonstrate the feasibility of training a deep learning solution on a large dataset of noisy labels automatically generated by a conventional pipeline, we compared our solution against the common approach of training a model on a small dataset manually curated by several experts. Our work suggests that humans perform better at context interpretation, such as error assessment, while computers outperform humans at pixel-by-pixel fine segmentation. Such pipelines are illustrated with a case study on image segmentation for autophagy events. This work aims for better translation of new technologies to real-world settings in microscopy-image analysis.
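Mean intersection over union, the metric behind the reported 25% improvement, is straightforward to compute. A minimal sketch, with toy arrays and an assumed class count:

```python
# Per-class IoU and mean IoU between a predicted segmentation and a reference mask.
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean IoU over classes present in either mask."""
    ious = []
    for c in range(num_classes):
        pred_c, target_c = pred == c, target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:                      # class absent from both masks: skip
            continue
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

pred = np.random.randint(0, 2, size=(256, 256))     # toy binary prediction
target = np.random.randint(0, 2, size=(256, 256))   # toy reference label
print(f"mean IoU: {mean_iou(pred, target, num_classes=2):.3f}")
```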
Affiliation(s)
- Beatriz Garcia Santa Cruz: National Department of Neurosurgery, Centre Hospitalier de Luxembourg, 4, Rue Ernest Barblé, 1210, Luxembourg (City), Luxembourg; Interventional Neuroscience Group, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg.
- Jan Sölter: Interventional Neuroscience Group, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg.
- Gemma Gomez-Giro: Developmental and Cellular Biology, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg.
- Claudia Saraiva: Developmental and Cellular Biology, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg.
- Sonia Sabate-Soler: Developmental and Cellular Biology, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg.
- Jennifer Modamio: Developmental and Cellular Biology, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg.
- Kyriaki Barmpa: Developmental and Cellular Biology, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg.
- Jens Christian Schwamborn: Developmental and Cellular Biology, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg.
- Frank Hertel: National Department of Neurosurgery, Centre Hospitalier de Luxembourg, 4, Rue Ernest Barblé, 1210, Luxembourg (City), Luxembourg; Interventional Neuroscience Group, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg.
- Javier Jarazo: Developmental and Cellular Biology, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg; OrganoTherapeutics SARL, 6A, avenue des Hauts-Fourneaux, 4365, Esch-sur-Alzette, Luxembourg.
- Andreas Husch: Interventional Neuroscience Group, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg; Systems Control Group, Luxembourg Center for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg.
103
Blake N, Gaifulina R, Griffin LD, Bell IM, Thomas GMH. Machine Learning of Raman Spectroscopy Data for Classifying Cancers: A Review of the Recent Literature. Diagnostics (Basel) 2022; 12:1491. PMID: 35741300; PMCID: PMC9222091; DOI: 10.3390/diagnostics12061491.
Abstract
Raman spectroscopy has long been anticipated to augment clinical decision making, such as classifying oncological samples. Unfortunately, the complexity of Raman data has thus far inhibited their routine use in clinical settings. Traditional machine learning models have been used to help exploit this information, but recent advances in deep learning have the potential to improve the field. However, there are a number of potential pitfalls with both traditional and deep learning models. We conduct a literature review to ascertain the recent machine learning methods used to classify cancers using Raman spectral data. We find that while deep learning models are popular and ostensibly outperform traditional learning models, there are many methodological considerations that may lead to an over-estimation of performance, primarily small sample sizes, which compound sub-optimal choices of sampling and validation strategies. Amongst several recommendations is a call to collate large benchmark Raman datasets, similar to those that have helped transform digital pathology, which researchers can use to develop and refine deep learning models.
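One of the validation pitfalls raised in this review, splitting spectra rather than patients across folds, can be avoided with grouped cross-validation. A hedged sketch follows, with toy data and an assumed classifier; it illustrates the general recommendation rather than any specific study's protocol.

```python
# Patient-level cross-validation: spectra from one patient never appear in both
# training and test folds, avoiding over-optimistic performance estimates.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 500))          # 300 spectra x 500 wavenumber bins (toy)
y = rng.integers(0, 2, size=300)         # tumour vs normal label per spectrum (toy)
patients = rng.integers(0, 30, size=300) # patient ID for each spectrum (toy)

cv = GroupKFold(n_splits=5)              # folds split by patient, not by spectrum
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, groups=patients)
print("patient-level CV accuracy:", scores.mean())
```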
Affiliation(s)
- Nathan Blake: Department of Cell and Developmental Biology, University College London, London WC1E 6BT, UK.
- Riana Gaifulina: Department of Cell and Developmental Biology, University College London, London WC1E 6BT, UK.
- Lewis D. Griffin: Department of Computer Science, University College London, London WC1E 6BT, UK.
- Ian M. Bell: Spectroscopy Products Division, Renishaw plc, Wotton-under-Edge GL12 8JR, UK.
- Geraint M. H. Thomas: Department of Cell and Developmental Biology, University College London, London WC1E 6BT, UK. Correspondence: Tel.: +44-20-3549-5456.
104
Xue C, Yu L, Chen P, Dou Q, Heng PA. Robust Medical Image Classification From Noisy Labeled Data With Global and Local Representation Guided Co-Training. IEEE Trans Med Imaging 2022; 41:1371-1382. PMID: 34982680; DOI: 10.1109/tmi.2021.3140140.
Abstract
Deep neural networks have achieved remarkable success in a wide variety of natural image and medical image computing tasks. However, these achievements rely indispensably on accurately annotated training data. When some training images carry noisy labels, the network training procedure suffers, leading to a sub-optimal classifier. This problem is even more severe in the medical image analysis field, as the annotation quality of medical images heavily relies on the expertise and experience of annotators. In this paper, we propose a novel collaborative training paradigm with global and local representation learning for robust medical image classification from noisy-labeled data, to combat the lack of high-quality annotated medical data. Specifically, we employ a self-ensemble model with a noisy label filter to efficiently select the clean and noisy samples. The clean samples are then used in a collaborative training strategy to eliminate the disturbance from imperfectly labeled samples. Notably, we further design a novel global and local representation learning scheme to implicitly regularize the networks to utilize noisy samples in a self-supervised manner. We evaluated our proposed robust learning strategy on four public medical image classification datasets with three types of label noise, i.e., random noise, computer-generated label noise, and inter-observer variability noise. Our method outperforms other learning-from-noisy-labels methods, and we conducted extensive experiments to analyze each component of our method.
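Clean/noisy sample selection of this kind is often implemented with a small-loss criterion. The sketch below shows that generic idea with toy losses and an assumed keep ratio; it is not the authors' exact self-ensemble filter.

```python
# Small-loss selection: samples the network fits easily (low loss) are treated
# as clean; the rest are flagged as potentially noisy-labeled.
import numpy as np

def split_clean_noisy(losses: np.ndarray, keep_ratio: float = 0.7):
    """Keep the smallest-loss fraction of samples as 'clean'."""
    order = np.argsort(losses)                     # ascending per-sample loss
    n_clean = int(keep_ratio * len(losses))
    return order[:n_clean], order[n_clean:]        # clean indices, noisy indices

per_sample_loss = np.random.rand(1000)             # toy cross-entropy losses
clean_idx, noisy_idx = split_clean_noisy(per_sample_loss)
print(len(clean_idx), "samples kept as clean,", len(noisy_idx), "flagged as noisy")
```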
105
Lopez-Almazan H, Pérez-Benito FJ, Larroza A, Perez-Cortes JC, Pollan M, Perez-Gomez B, Salas Trejo D, Casals M, Llobet R. A deep learning framework to classify breast density with noisy labels regularization. Comput Methods Programs Biomed 2022; 221:106885. PMID: 35594581; DOI: 10.1016/j.cmpb.2022.106885.
Abstract
BACKGROUND AND OBJECTIVE Breast density assessed from digital mammograms is a biomarker for higher risk of developing breast cancer. Experienced radiologists assess breast density using the Breast Imaging Reporting and Data System (BI-RADS) categories. Supervised learning algorithms have been developed with this objective in mind; however, the performance of these algorithms depends on the quality of the ground-truth information, which is usually labeled by expert readers. These labels are noisy approximations of the ground truth, as there is often intra- and inter-reader variability among labels. Thus, it is crucial to provide a reliable method for matching digital mammograms to BI-RADS categories. This paper presents RegL (Labels Regularizer), a methodology that includes different image pre-processing steps to allow both correct breast segmentation and enhancement of image quality through intensity adjustment, thus allowing deep learning to be used to classify the mammograms into BI-RADS categories. The Confusion Matrix CNN (CM-CNN) network used implements an architecture that models each radiologist's noisy labels. The final methodology pipeline was determined after comparing the performance of image pre-processing steps combined with different DL architectures. METHODS A multi-center study composed of 1395 women whose mammograms were classified into the four BI-RADS categories by three experienced radiologists is presented. A total of 892 mammograms were used as the training corpus, 224 formed the validation corpus, and 279 the test corpus. RESULTS The combination of five networks implementing the RegL methodology achieved the best results among all the models on the test set. The ensemble model obtained an accuracy of 0.85 and a kappa index of 0.71. CONCLUSIONS The proposed methodology has a performance similar to that of the experienced radiologists in the classification of digital mammograms into BI-RADS categories. This suggests that the pre-processing steps and the modelling of each radiologist's labels allow for a better estimation of the unknown ground-truth labels.
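The confusion-matrix idea behind a CM-CNN can be sketched as a learnable per-annotator matrix that maps the network's estimate of the true BI-RADS class to each radiologist's observed label distribution. The toy backbone, sizes and names below are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class ConfusionMatrixHead(nn.Module):
    def __init__(self, backbone: nn.Module, num_classes: int, num_annotators: int):
        super().__init__()
        self.backbone = backbone
        # One unnormalised confusion matrix per annotator, initialised near identity.
        self.cm = nn.Parameter(torch.eye(num_classes).repeat(num_annotators, 1, 1))

    def forward(self, x):
        p_true = self.backbone(x).softmax(dim=-1)   # (B, C) estimated true-class posterior
        cm = self.cm.softmax(dim=-1)                # rows sum to 1: p(observed | true)
        # (A, C, C) combined with (B, C) -> (B, A, C): each annotator's label distribution
        p_obs = torch.einsum("acd,bc->bad", cm, p_true)
        return p_true, p_obs

backbone = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 4))  # toy stand-in for a CNN
model = ConfusionMatrixHead(backbone, num_classes=4, num_annotators=3)
p_true, p_obs = model(torch.randn(2, 1, 64, 64))
print(p_true.shape, p_obs.shape)  # torch.Size([2, 4]) torch.Size([2, 3, 4])
```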
Affiliation(s)
- Hector Lopez-Almazan: Instituto Tecnológico de la Informática, Universitat Politècnica de València, Camino de Vera, s/n, 46022 València, Spain.
- Francisco Javier Pérez-Benito: Instituto Tecnológico de la Informática, Universitat Politècnica de València, Camino de Vera, s/n, 46022 València, Spain.
- Andrés Larroza: Instituto Tecnológico de la Informática, Universitat Politècnica de València, Camino de Vera, s/n, 46022 València, Spain.
- Juan-Carlos Perez-Cortes: Instituto Tecnológico de la Informática, Universitat Politècnica de València, Camino de Vera, s/n, 46022 València, Spain.
- Marina Pollan: National Center for Epidemiology, Carlos III Institute of Health, Monforte de Lemos 5, 28029 Madrid, Spain; Consortium for Biomedical Research in Epidemiology and Public Health (CIBER en Epidemiología y Salud Pública - CIBERESP), Carlos III Institute of Health, Monforte de Lemos 5, 28029 Madrid, Spain.
- Beatriz Perez-Gomez: National Center for Epidemiology, Carlos III Institute of Health, Monforte de Lemos 5, 28029 Madrid, Spain; Consortium for Biomedical Research in Epidemiology and Public Health (CIBER en Epidemiología y Salud Pública - CIBERESP), Carlos III Institute of Health, Monforte de Lemos 5, 28029 Madrid, Spain.
- Dolores Salas Trejo: Valencian Breast Cancer Screening Program, General Directorate of Public Health, València, Spain; Centro Superior de Investigación en Salud Pública CSISP, FISABIO, València, Spain.
- María Casals: Valencian Breast Cancer Screening Program, General Directorate of Public Health, València, Spain; Centro Superior de Investigación en Salud Pública CSISP, FISABIO, València, Spain.
- Rafael Llobet: Instituto Tecnológico de la Informática, Universitat Politècnica de València, Camino de Vera, s/n, 46022 València, Spain.
106
Ju L, Wang X, Wang L, Mahapatra D, Zhao X, Zhou Q, Liu T, Ge Z. Improving Medical Images Classification With Label Noise Using Dual-Uncertainty Estimation. IEEE Trans Med Imaging 2022; 41:1533-1546. PMID: 34995185; DOI: 10.1109/tmi.2022.3141425.
Abstract
Deep neural networks are known to be data-driven, and label noise can have a marked impact on model performance. Recent studies have shown great robustness in classic image recognition even under high noise rates. In medical applications, learning from datasets with label noise is more challenging since medical imaging datasets tend to have instance-dependent noise (IDN) and suffer from high observer variability. In this paper, we systematically discuss the two common types of label noise in medical images: disagreement label noise from inconsistent expert opinions and single-target label noise from biased aggregation of individual annotations. We then propose an uncertainty estimation-based framework to handle these two types of label noise in the medical image classification task. We design a dual-uncertainty estimation approach to measure the disagreement label noise and single-target label noise via improved Direct Uncertainty Prediction and Monte-Carlo Dropout. A boosting-based curriculum training procedure is then introduced for robust learning. We demonstrate the effectiveness of our method by conducting extensive experiments on three different diseases with synthesized and real-world label noise: skin lesions, prostate cancer, and retinal diseases. We also release a large re-engineered database that consists of annotations from more than ten ophthalmologists with an unbiased gold-standard dataset for evaluation and benchmarking. The dataset is available at https://mmai.group/peoples/julie/.
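Monte-Carlo Dropout, one of the two uncertainty estimators mentioned above, can be sketched in a few lines: keep dropout active at inference, average several stochastic forward passes, and treat the predictive variance as the uncertainty score. The tiny classifier and the number of passes are illustrative assumptions.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.5), nn.Linear(64, 5))

def mc_dropout_predict(model: nn.Module, x: torch.Tensor, passes: int = 20):
    model.train()                      # keeps Dropout active during inference
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=-1) for _ in range(passes)])
    mean_prob = probs.mean(dim=0)                      # predictive distribution
    uncertainty = probs.var(dim=0).sum(dim=-1)         # variance-based uncertainty score
    return mean_prob, uncertainty

x = torch.randn(8, 128)                                # toy batch of 8 feature vectors
mean_prob, uncertainty = mc_dropout_predict(model, x)
print(mean_prob.shape, uncertainty.shape)              # (8, 5) and (8,)
```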
107
Barragán-Montero A, Bibal A, Dastarac MH, Draguet C, Valdés G, Nguyen D, Willems S, Vandewinckele L, Holmström M, Löfman F, Souris K, Sterpin E, Lee JA. Towards a safe and efficient clinical implementation of machine learning in radiation oncology by exploring model interpretability, explainability and data-model dependency. Phys Med Biol 2022; 67. PMID: 35421855; PMCID: PMC9870296; DOI: 10.1088/1361-6560/ac678a.
Abstract
The interest in machine learning (ML) has grown tremendously in recent years, partly due to the performance leap that occurred with new techniques of deep learning, convolutional neural networks for images, increased computational power, and wider availability of large datasets. Most fields of medicine follow that popular trend and, notably, radiation oncology is one of those at the forefront, with a long tradition of using digital images and fully computerized workflows. ML models are driven by data, and in contrast with many statistical or physical models, they can be very large and complex, with countless generic parameters. This inevitably raises two questions, namely, the tight dependence between the models and the datasets that feed them, and the interpretability of the models, which scales with their complexity. Any problems in the data used to train the model will later be reflected in its performance. This, together with the low interpretability of ML models, makes their implementation into the clinical workflow particularly difficult. Building tools for risk assessment and quality assurance of ML models must therefore address two main points: interpretability and data-model dependency. After a joint introduction to both radiation oncology and ML, this paper reviews the main risks and current solutions when applying the latter to workflows in the former. Risks associated with data and models, as well as their interaction, are detailed. Next, the core concepts of interpretability, explainability, and data-model dependency are formally defined and illustrated with examples. Afterwards, a broad discussion goes through key applications of ML in radiation oncology workflows as well as vendors' perspectives on the clinical implementation of ML.
Affiliation(s)
- Ana Barragán-Montero: Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium.
- Adrien Bibal: PReCISE, NaDI Institute, Faculty of Computer Science, UNamur and CENTAL, ILC, UCLouvain, Belgium.
- Margerie Huet Dastarac: Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium.
- Camille Draguet: Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium; Department of Oncology, Laboratory of Experimental Radiotherapy, KU Leuven, Belgium.
- Gilmer Valdés: Department of Radiation Oncology, Department of Epidemiology and Biostatistics, University of California, San Francisco, United States of America.
- Dan Nguyen: Medical Artificial Intelligence and Automation (MAIA) Laboratory, Department of Radiation Oncology, UT Southwestern Medical Center, United States of America.
- Siri Willems: ESAT/PSI, KU Leuven, Belgium & MIRC, UZ Leuven, Belgium.
- Kevin Souris: Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium.
- Edmond Sterpin: Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium; Department of Oncology, Laboratory of Experimental Radiotherapy, KU Leuven, Belgium.
- John A Lee: Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium.
108
Lin AA, Nimgaonkar V, Issadore D, Carpenter EL. Extracellular Vesicle-Based Multianalyte Liquid Biopsy as a Diagnostic for Cancer. Annu Rev Biomed Data Sci 2022; 5:269-292. PMID: 35562850; DOI: 10.1146/annurev-biodatasci-122120-113218.
Abstract
Liquid biopsy is the analysis of materials shed by tumors into circulation, such as circulating tumor cells, nucleic acids, and extracellular vesicles (EVs), for the diagnosis and management of cancer. These assays have rapidly evolved with recent FDA approvals of single biomarkers in patients with advanced metastatic disease. However, they have lacked sensitivity or specificity as a diagnostic in early-stage cancer, primarily due to low concentrations in circulating plasma. EVs, membrane-enclosed nanoscale vesicles shed by tumor and other cells into circulation, are a promising liquid biopsy analyte owing to their protein and nucleic acid cargoes carried from their mother cells, their surface proteins specific to their cells of origin, and their higher concentrations over other noninvasive biomarkers across disease stages. Recently, the combination of EVs with non-EV biomarkers has driven improvements in sensitivity and accuracy; this has been fueled by the use of machine learning (ML) to algorithmically identify and combine multiple biomarkers into a composite biomarker for clinical prediction. This review presents an analysis of EV isolation methods, surveys approaches for and issues with using ML in multianalyte EV datasets, and describes best practices for bringing multianalyte liquid biopsy to clinical implementation.
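The composite-biomarker idea, combining EV-derived and non-EV measurements into a single ML score, can be sketched with a simple classifier. The feature names, sizes and the choice of logistic regression below are illustrative assumptions only.

```python
# Toy multianalyte composite biomarker: concatenate EV and non-EV features and
# score them with a cross-validated logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(42)
ev_features = rng.normal(size=(200, 6))      # e.g. EV surface-protein levels (toy)
cfdna_level = rng.normal(size=(200, 1))      # e.g. a non-EV analyte (toy)
X = np.hstack([ev_features, cfdna_level])    # composite multianalyte feature vector
y = rng.integers(0, 2, size=200)             # cancer vs control (toy labels)

clf = LogisticRegression(max_iter=1000)
auc = cross_val_score(clf, X, y, cv=StratifiedKFold(5), scoring="roc_auc")
print("cross-validated AUC of the composite score:", auc.mean())
```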
Affiliation(s)
- Andrew A Lin: Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA; Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
- Vivek Nimgaonkar: Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
- David Issadore: Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
- Erica L Carpenter: Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
109
Yang S, Wang G, Sun H, Luo X, Sun P, Li K, Wang Q, Zhang S. Learning COVID-19 Pneumonia Lesion Segmentation from Imperfect Annotations via Divergence-Aware Selective Training. IEEE J Biomed Health Inform 2022; 26:3673-3684. PMID: 35522641; DOI: 10.1109/jbhi.2022.3172978.
Abstract
The COVID-19 pandemic has spread across the world like no other crisis in recent history. Automatic segmentation of COVID-19 pneumonia lesions is critical for quantitative measurement in diagnosis and treatment management. For this task, deep learning is the state-of-the-art method but requires a large set of accurately annotated images for training, which is difficult to obtain due to limited access to experts and the time-consuming annotation process. To address this problem, we aim to train the segmentation network from imperfect annotations, where the training set consists of a small clean set of images accurately annotated by experts and a large noisy set of inaccurate annotations by non-experts. To prevent labels of differing quality from corrupting the segmentation model, we propose a new approach to training segmentation networks that deals with noisy labels. We introduce a dual-branch network to learn separately from the accurate and noisy annotations. To fully exploit the imperfect annotations while suppressing the noise, we design a Divergence-Aware Selective Training (DAST) strategy, where a divergence-aware noisiness score is used to distinguish severely noisy annotations from slightly noisy annotations. For severely noisy samples, we use unsupervised regularization through dual-branch consistency between predictions from the two branches. We also refine slightly noisy samples and use them as supplementary data for the clean branch to avoid overfitting. Experimental results show that our method achieves higher performance than the standard training process for COVID-19 pneumonia lesion segmentation when learning from imperfect labels, and our framework significantly outperforms state-of-the-art noise-tolerant methods at various clean-label percentages.
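A divergence-aware noisiness score of this general kind can be illustrated with the symmetric KL divergence between the two branches' predictions: samples on which the branches disagree strongly are treated as more suspect. The sketch below uses toy logits and an arbitrary threshold and is not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def symmetric_kl(logits_a: torch.Tensor, logits_b: torch.Tensor) -> torch.Tensor:
    """Per-pixel symmetric KL divergence between two branches' class predictions."""
    log_p, log_q = F.log_softmax(logits_a, dim=1), F.log_softmax(logits_b, dim=1)
    p, q = log_p.exp(), log_q.exp()
    kl_pq = (p * (log_p - log_q)).sum(dim=1)
    kl_qp = (q * (log_q - log_p)).sum(dim=1)
    return 0.5 * (kl_pq + kl_qp)

logits_clean_branch = torch.randn(4, 2, 64, 64)   # toy per-pixel logits, branch 1
logits_noisy_branch = torch.randn(4, 2, 64, 64)   # toy per-pixel logits, branch 2
score = symmetric_kl(logits_clean_branch, logits_noisy_branch).mean(dim=(1, 2))
severely_noisy = score > score.median()           # toy split into the noisier half
print(score, severely_noisy)
```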
110
Wood DA, Kafiabadi S, Busaidi AA, Guilhem E, Montvila A, Lynch J, Townend M, Agarwal S, Mazumder A, Barker GJ, Ourselin S, Cole JH, Booth TC. Deep learning models for triaging hospital head MRI examinations. Med Image Anal 2022; 78:102391. PMID: 35183876; DOI: 10.1016/j.media.2022.102391.
Abstract
The growing demand for head magnetic resonance imaging (MRI) examinations, along with a global shortage of radiologists, has led to an increase in the time taken to report head MRI scans in recent years. For many neurological conditions, this delay can result in poorer patient outcomes and inflated healthcare costs. Potentially, computer vision models could help reduce reporting times for abnormal examinations by flagging abnormalities at the time of imaging, allowing radiology departments to prioritise limited resources into reporting these scans first. To date, however, the difficulty of obtaining large, clinically-representative labelled datasets has been a bottleneck to model development. In this work, we present a deep learning framework, based on convolutional neural networks, for detecting clinically-relevant abnormalities in minimally processed, hospital-grade axial T2-weighted and axial diffusion-weighted head MRI scans. The models were trained at scale using a Transformer-based neuroradiology report classifier to generate a labelled dataset of 70,206 examinations from two large UK hospital networks, and demonstrate fast (< 5 s), accurate (area under the receiver operating characteristic curve (AUC) > 0.9), and interpretable classification, with good generalisability between hospitals (ΔAUC ≤ 0.02). Through a simulation study we show that our best model would reduce the mean reporting time for abnormal examinations from 28 days to 14 days and from 9 days to 5 days at the two hospital networks, demonstrating feasibility for use in a clinical triage environment.
Affiliation(s)
- David A Wood: School of Biomedical Engineering and Imaging Sciences, King's College London, United Kingdom.
- Sina Kafiabadi: King's College Hospital NHS Foundation Trust, United Kingdom.
- Emily Guilhem: King's College Hospital NHS Foundation Trust, United Kingdom.
- Jeremy Lynch: King's College Hospital NHS Foundation Trust, United Kingdom.
- Siddharth Agarwal: School of Biomedical Engineering and Imaging Sciences, King's College London, United Kingdom.
- Asif Mazumder: Guy's and St Thomas' NHS Foundation Trust, United Kingdom.
- Gareth J Barker: Department of Neuroimaging, Institute of Psychiatry, Psychology, & Neuroscience, King's College London, United Kingdom.
- Sebastien Ourselin: School of Biomedical Engineering and Imaging Sciences, King's College London, United Kingdom.
- James H Cole: Department of Neuroimaging, Institute of Psychiatry, Psychology, & Neuroscience, King's College London, United Kingdom; Dementia Research Centre, Institute of Neurology, University College London, United Kingdom; Centre for Medical Image Computing, Department of Computer Science, University College London, United Kingdom.
- Thomas C Booth: School of Biomedical Engineering and Imaging Sciences, King's College London, United Kingdom; King's College Hospital NHS Foundation Trust, United Kingdom.
111
Li W, Li J, Wang Z, Polson J, Sisk AE, Sajed DP, Speier W, Arnold CW. PathAL: An Active Learning Framework for Histopathology Image Analysis. IEEE Trans Med Imaging 2022; 41:1176-1187. PMID: 34898432; PMCID: PMC9199991; DOI: 10.1109/tmi.2021.3135002.
Abstract
Deep neural networks, in particular convolutional networks, have rapidly become a popular choice for analyzing histopathology images. However, training these models relies heavily on a large number of samples manually annotated by experts, which is cumbersome and expensive. In addition, it is difficult to obtain a perfect set of labels due to the variability between expert annotations. This paper presents a novel active learning (AL) framework for histopathology image analysis, named PathAL. To reduce the required number of expert annotations, PathAL selects two groups of unlabeled data in each training iteration: one "informative" sample that requires additional expert annotation, and one "confident predictive" sample that is automatically added to the training set using the model's pseudo-labels. To reduce the impact of the noisy-labeled samples in the training set, PathAL systematically identifies noisy samples and excludes them to improve the generalization of the model. Our model advances the existing AL method for medical image analysis in two ways. First, we present a selection strategy to improve classification performance with fewer manual annotations. Unlike traditional methods focusing only on finding the most uncertain samples with low prediction confidence, we discover a large number of high confidence samples from the unlabeled set and automatically add them for training with assigned pseudo-labels. Second, we design a method to distinguish between noisy samples and hard samples using a heuristic approach. We exclude the noisy samples while preserving the hard samples to improve model performance. Extensive experiments demonstrate that our proposed PathAL framework achieves promising results on a prostate cancer Gleason grading task, obtaining similar performance with 40% fewer annotations compared to the fully supervised learning scenario. An ablation study is provided to analyze the effectiveness of each component in PathAL, and a pathologist reader study is conducted to validate our proposed algorithm.
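The two selection rules, querying the most uncertain sample for expert annotation and pseudo-labeling the most confident ones, can be sketched with predictive entropy and a confidence threshold. The toy probabilities and threshold below are assumptions, not PathAL's exact criteria.

```python
import numpy as np

probs = np.random.dirichlet(np.ones(5), size=100)        # toy softmax outputs (100 x 5)
entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)   # predictive entropy per sample

query_idx = int(entropy.argmax())                        # "informative": send to expert
confident_mask = probs.max(axis=1) > 0.95                # "confident predictive" samples
pseudo_labels = probs[confident_mask].argmax(axis=1)     # auto-added with pseudo-labels

print("query sample:", query_idx,
      "| pseudo-labeled:", int(confident_mask.sum()), "samples")
```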
112
Iorga M, Drakopoulos M, Naidech AM, Katsaggelos AK, Parrish TB, Hill VB. Labeling Noncontrast Head CT Reports for Common Findings Using Natural Language Processing. AJNR Am J Neuroradiol 2022; 43:721-726. PMID: 35483905; PMCID: PMC9089256; DOI: 10.3174/ajnr.a7500.
Abstract
BACKGROUND AND PURPOSE Prioritizing reading of noncontrast head CT examinations through an automated triage system may improve time to care for patients with acute neuroradiologic findings. We present a natural language processing approach for labeling findings in noncontrast head CT reports, which permits creation of a large, labeled dataset of head CT images for development of emergent-finding detection and reading-prioritization algorithms. MATERIALS AND METHODS In this retrospective study, 1002 clinical radiology reports from noncontrast head CTs collected between 2008 and 2013 were manually labeled across 12 common neuroradiologic finding categories. Each report was then encoded using an n-gram model of unigrams, bigrams, and trigrams. A logistic regression model was then trained to label each report for every common finding. Models were trained and assessed using a combination of L2 regularization and 5-fold cross-validation. RESULTS Model performance was strongest for the fracture, hemorrhage, herniation, mass effect, pneumocephalus, postoperative status, and volume loss models, in which the area under the receiver operating characteristic curve exceeded 0.95. Performance was relatively weaker for the edema, hydrocephalus, infarct, tumor, and white-matter disease models (area under the receiver operating characteristic curve > 0.85). Analysis of coefficients revealed finding-specific words among the top coefficients in each model. Class output probabilities were found to be a useful indicator of predictive error on individual report examples in higher-performing models. CONCLUSIONS Combining logistic regression with n-gram encoding is a robust approach to labeling common findings in noncontrast head CT reports.
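The described pipeline (unigram-to-trigram counts, an L2-regularised logistic regression, 5-fold cross-validation) maps directly onto a few lines of scikit-learn. The toy reports, labels and scoring choice below are assumptions for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

reports = [
    "acute subdural hemorrhage with midline shift",
    "no acute intracranial abnormality",
    "chronic small vessel ischemic changes, no hemorrhage",
    "large right mca territory infarct with mass effect",
] * 25                                   # toy corpus of 100 report impressions
hemorrhage = [1, 0, 0, 0] * 25           # toy binary label for one finding category

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 3)),             # unigrams, bigrams, trigrams
    LogisticRegression(penalty="l2", max_iter=1000), # L2-regularised classifier
)
scores = cross_val_score(model, reports, hemorrhage, cv=5, scoring="roc_auc")
print("5-fold AUC:", scores.mean())
```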
Affiliation(s)
- M Iorga: Departments of Radiology (M.I., M.D., T.B.P., V.B.H.) and Biomedical Engineering (M.I., A.K.K., T.B.P.).
- M Drakopoulos: Department of Radiology (M.I., M.D., T.B.P., V.B.H.).
- A M Naidech: Department of Neurology (A.M.N.), Northwestern University Feinberg School of Medicine, Chicago, Illinois.
- A K Katsaggelos: Departments of Biomedical Engineering (M.I., A.K.K., T.B.P.), Electrical and Computer Engineering (A.K.K.), and Computer Science (A.K.K.), Northwestern University, Chicago, Illinois.
- T B Parrish: Departments of Radiology (M.I., M.D., T.B.P., V.B.H.) and Biomedical Engineering (M.I., A.K.K., T.B.P.).
- V B Hill: Department of Radiology (M.I., M.D., T.B.P., V.B.H.).
113
Sousa J, Pereira T, Neves I, Silva F, Oliveira HP. The Influence of a Coherent Annotation and Synthetic Addition of Lung Nodules for Lung Segmentation in CT Scans. Sensors (Basel) 2022; 22:3443. PMID: 35591132; PMCID: PMC9100675; DOI: 10.3390/s22093443.
Abstract
Lung cancer is a highly prevalent pathology and a leading cause of cancer-related deaths. Most patients are diagnosed when the disease has already manifested itself, which usually is a sign of lung cancer in an advanced stage and, as a consequence, the 5-year survival rates are low. To increase the chances of survival, improving early cancer detection capacity is crucial, in which computed tomography (CT) scans play a key role. The manual evaluation of CTs is a time-consuming task, and computer-aided diagnosis (CAD) systems can help relieve that burden. The segmentation of the lung is one of the first steps in these systems, yet it is very challenging given the heterogeneity of lung diseases usually present and associated with cancer development. In our previous work, a segmentation model based on a ResNet34 and U-Net combination was developed on a cross-cohort dataset; it yielded good segmentation masks for multiple pathological conditions but misclassified some of the lung nodules. The multiple datasets used for model development originated from different annotation protocols, which generated inconsistencies in the learning process, and the annotations are usually not adequate for lung cancer studies since they do not comprise lung nodules. In addition, the initial datasets used for training presented a reduced number of nodules, which was shown not to be enough for the segmentation model to learn to include them as part of the lung. In this work, an objective protocol for lung mask segmentation was defined, and the previous annotations were carefully reviewed and corrected to create consistent and adequate ground-truth masks for the development of the segmentation model. Data augmentation with domain knowledge was used to create lung nodules in the cases used to train the model. The model developed achieved a Dice similarity coefficient (DSC) above 0.9350 for all test datasets and showed an ability to cope not only with a variety of lung patterns but also with the presence of lung nodules. This study shows the importance of using consistent annotations for the supervised learning process, which is a very time-consuming task but one of great importance for healthcare applications. Given the lack of massive datasets in the medical field, and the consequent lack of broad representativeness, data augmentation with domain knowledge could be a promising way to overcome this limitation for the development of learning models.
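The Dice similarity coefficient (DSC) used to evaluate the lung masks is simple to compute for binary masks. A minimal sketch with toy masks and an assumed smoothing constant:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """DSC = 2 * |A intersect B| / (|A| + |B|) for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

pred = np.zeros((128, 128), dtype=bool)      # toy predicted lung mask
pred[30:90, 30:90] = True
target = np.zeros((128, 128), dtype=bool)    # toy ground-truth lung mask
target[35:95, 35:95] = True
print(f"DSC: {dice_coefficient(pred, target):.4f}")
```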
Affiliation(s)
- Joana Sousa: INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; FEUP—Faculty of Engineering, University of Porto, 4200-465 Porto, Portugal.
- Tania Pereira: INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal.
- Inês Neves: ICBAS—Abel Salazar Biomedical Sciences Institute, University of Porto, 4050-313 Porto, Portugal.
- Francisco Silva: INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; FCUP—Faculty of Science, University of Porto, 4169-007 Porto, Portugal.
- Hélder P. Oliveira: INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; FCUP—Faculty of Science, University of Porto, 4169-007 Porto, Portugal.
- Correspondence: Joana Sousa.
114
Ramachandram D, Ramirez-GarciaLuna JL, Fraser RDJ, Martínez-Jiménez MA, Arriaga-Caballero JE, Allport J. Fully Automated Wound Tissue Segmentation Using Deep Learning on Mobile Devices: Cohort Study. JMIR Mhealth Uhealth 2022; 10:e36977. PMID: 35451982; PMCID: PMC9077502; DOI: 10.2196/36977.
Abstract
BACKGROUND Composition of tissue types within a wound is a useful indicator of its healing progression. Tissue composition is clinically used in wound healing tools (eg, Bates-Jensen Wound Assessment Tool) to assess risk and recommend treatment. However, wound tissue identification and the estimation of relative tissue composition are highly subjective. Consequently, incorrect assessments could be reported, leading to downstream impacts including inappropriate dressing selection, failure to identify wounds at risk of not healing, or failure to make appropriate referrals to specialists. OBJECTIVE This study aimed to measure inter- and intrarater variability in manual tissue segmentation and quantification among a cohort of wound care clinicians and determine if an objective assessment of tissue types (ie, size and amount) can be achieved using deep neural networks. METHODS A data set of 58 anonymized wound images of various types of chronic wounds from Swift Medical's Wound Database was used to conduct the inter- and intrarater agreement study. The data set was split into 3 subsets with 50% overlap between subsets to measure intrarater agreement. In this study, 4 different tissue types (epithelial, granulation, slough, and eschar) within the wound bed were independently labeled by the 5 wound clinicians at 1-week intervals using a browser-based image annotation tool. In addition, 2 deep convolutional neural network architectures were developed for wound segmentation and tissue segmentation and were used in sequence in the workflow. These models were trained using 465,187 and 17,000 image-label pairs, respectively. This is the largest and most diverse reported data set used for training deep learning models for wound and wound tissue segmentation. The resulting models offer robust performance in diverse imaging conditions, are unbiased toward skin tones, and could execute in near real time on mobile devices. RESULTS A poor to moderate interrater agreement in identifying tissue types in chronic wound images was reported. A very poor Krippendorff α value of .014 was observed for interrater variability when identifying epithelization, whereas granulation was most consistently identified by the clinicians. The intrarater intraclass correlation, ICC(3,1), however, indicates that raters were relatively consistent when labeling the same image multiple times over a period of time. Our deep learning models achieved a mean intersection over union of 0.8644 and 0.7192 for wound and tissue segmentation, respectively. A cohort of wound clinicians, by consensus, rated 91% (53/58) of the tissue segmentation results to be between fair and good in terms of tissue identification and segmentation quality. CONCLUSIONS The interrater agreement study validates that clinicians exhibit considerable variability when identifying and visually estimating wound tissue proportion. The proposed deep learning technique provides objective tissue identification and measurements to assist clinicians in documenting the wound more accurately and could have a significant impact on wound care when deployed at scale.
115
Liu H, Xu WD, Shang ZH, Wang XD, Zhou HY, Ma KW, Zhou H, Qi JL, Jiang JR, Tan LL, Zeng HM, Cai HJ, Wang KS, Qian YL. Breast Cancer Molecular Subtype Prediction on Pathological Images with Discriminative Patch Selection and Multi-Instance Learning. Front Oncol 2022; 12:858453. PMID: 35494021; PMCID: PMC9046851; DOI: 10.3389/fonc.2022.858453.
Abstract
Molecular subtypes of breast cancer are important references for personalized clinical treatment. For cost and labor savings, only one of the patient's paraffin blocks is usually selected for subsequent immunohistochemistry (IHC) to obtain molecular subtypes. Inevitable block sampling error is risky due to tumor heterogeneity and could result in a delay in treatment. Molecular subtype prediction from conventional H&E pathological whole-slide images (WSIs) using AI methods is useful and critical to assist pathologists in pre-screening the proper paraffin block for IHC. It is a challenging task since only WSI-level labels of molecular subtypes from IHC can be obtained, without detailed local region information. Gigapixel WSIs are divided into a huge number of patches to be computationally feasible for deep learning, but with coarse slide-level labels, patch-based methods may suffer from abundant noise patches, such as folds, overstained regions, or non-tumor tissues. A weakly supervised learning framework based on discriminative patch selection and multi-instance learning was proposed for breast cancer molecular subtype prediction from H&E WSIs. Firstly, a co-teaching strategy using two networks was adopted to learn molecular subtype representations and filter out some noise patches. Then, a balanced sampling strategy was used to handle the subtype imbalance in the dataset. In addition, a noise-patch filtering algorithm using the local outlier factor based on cluster centers was proposed to further select discriminative patches. Finally, a loss function integrating local patch information with global slide constraint information was used to fine-tune the MIL framework on the obtained discriminative patches and further improve the prediction performance of molecular subtyping. The experimental results confirmed the effectiveness of the proposed AI method, and our models outperformed even senior pathologists, showing the potential to assist pathologists in pre-screening paraffin blocks for IHC in the clinic.
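Outlier-based noise-patch filtering can be illustrated with scikit-learn's Local Outlier Factor applied to patch feature vectors; the paper's variant applies the local outlier factor with respect to cluster centers, so the simplified stand-in below, with toy embeddings, only conveys the general idea.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
tissue_patches = rng.normal(0.0, 1.0, size=(950, 256))   # toy patch embeddings
noise_patches = rng.normal(6.0, 1.0, size=(50, 256))     # toy fold/overstain patches
features = np.vstack([tissue_patches, noise_patches])

lof = LocalOutlierFactor(n_neighbors=20, contamination=0.05)
labels = lof.fit_predict(features)            # -1 = outlier (likely noise patch)
kept = features[labels == 1]                  # discriminative patches kept for MIL
print(f"kept {len(kept)} of {len(features)} patches")
```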
Affiliation(s)
- Hong Liu: Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
- Wen-Dong Xu: Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China.
- Zi-Hao Shang: Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China.
- Xiang-Dong Wang: Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
- Hai-Yan Zhou: Department of Pathology, Xiangya Hospital, Central South University, Changsha, China.
- Ke-Wen Ma: Department of Pathology, Xiangya Hospital, Central South University, Changsha, China.
- Huan Zhou: Department of Pathology, Xiangya Hospital, Central South University, Changsha, China.
- Jia-Lin Qi: Department of Pathology, Xiangya Hospital, Central South University, Changsha, China.
- Jia-Rui Jiang: Department of Pathology, Xiangya Hospital, Central South University, Changsha, China.
- Li-Lan Tan: Department of Pathology, Xiangya Hospital, Central South University, Changsha, China.
- Hui-Min Zeng: Department of Pathology, Xiangya Hospital, Central South University, Changsha, China.
- Hui-Juan Cai: Department of Pathology, Xiangya Hospital, Central South University, Changsha, China.
- Kuan-Song Wang: Department of Pathology, Xiangya Hospital, Central South University, Changsha, China; School of Basic Medical Science, Central South University, Changsha, China.
- Yue-Liang Qian: Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
- Correspondence: Hong Liu; Kuan-Song Wang.
116
Haghighat M, Browning L, Sirinukunwattana K, Malacrino S, Khalid Alham N, Colling R, Cui Y, Rakha E, Hamdy FC, Verrill C, Rittscher J. Automated quality assessment of large digitised histology cohorts by artificial intelligence. Sci Rep 2022; 12:5002. PMID: 35322056; PMCID: PMC8943120; DOI: 10.1038/s41598-022-08351-5.
Abstract
Research using whole slide images (WSIs) of histopathology slides has increased exponentially over recent years. Glass slides from retrospective cohorts, some with patient follow-up data, are digitised for the development and validation of artificial intelligence (AI) tools. Such resources, therefore, become very important, with the need to ensure that their quality is of the standard necessary for downstream AI development. However, manual quality control of large cohorts of WSIs by visual assessment is unfeasible, and whilst quality control AI algorithms exist, these focus on bespoke aspects of image quality, e.g. focus, or use traditional machine-learning methods, which are unable to classify the range of potential image artefacts that should be considered. In this study, we have trained and validated a multi-task deep neural network to automate the process of quality control of a large retrospective cohort of prostate cases from which glass slides have been scanned several years after production, to determine both the usability of the images at the diagnostic level (considered in this study to be the minimal standard for research) and the common image artefacts present. Using a two-layer approach, quality overlays of WSIs were generated from a quality assessment (QA) undertaken at patch level at 5× magnification. From these quality overlays the slide-level quality scores were predicted and then compared to those generated by three specialist urological pathologists, with a Pearson correlation of 0.89 for overall ‘usability’ (at a diagnostic level), and 0.87 and 0.82 for focus and H&E staining quality scores respectively. To demonstrate its wider potential utility, we subsequently applied our QA pipeline to the TCGA prostate cancer cohort and to a colorectal cancer cohort, for comparison. Our model, designated as PathProfiler, indicates comparable predicted usability of images from the cohorts assessed (86–90% of WSIs predicted to be usable), and perhaps more significantly is able to predict WSIs that could benefit from an intervention such as re-scanning or re-staining for quality improvement. We have shown in this study that AI can be used to automate the process of quality control of large retrospective WSI cohorts to maximise their utility for research.
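The two-layer idea, patch-level quality assessment aggregated into a slide-level score that is then compared with pathologists' scores via a Pearson correlation, can be sketched as follows. The mean-of-patch-scores aggregation and the toy numbers are assumptions, not PathProfiler's trained predictor.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_slides, n_patches = 40, 200
patch_quality = rng.uniform(0, 1, size=(n_slides, n_patches))  # toy patch-level QA scores
slide_scores = patch_quality.mean(axis=1)                      # naive slide-level summary

pathologist_scores = slide_scores + rng.normal(0, 0.02, size=n_slides)  # toy reference
r, p_value = pearsonr(slide_scores, pathologist_scores)
print(f"Pearson r = {r:.2f} (p = {p_value:.1e})")
```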
Affiliation(s)
- Maryam Haghighat: Department of Engineering Science, Institute of Biomedical Engineering (IBME), University of Oxford, Oxford, UK; CSIRO, Brisbane, QLD, Australia.
- Lisa Browning: Department of Cellular Pathology, Oxford University Hospitals NHS Foundation Trust, John Radcliffe Hospital, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, Oxfordshire, UK.
- Korsuk Sirinukunwattana: Department of Engineering Science, Institute of Biomedical Engineering (IBME), University of Oxford, Oxford, UK.
- Stefano Malacrino: Nuffield Department of Surgical Sciences, University of Oxford, John Radcliffe Hospital, Oxford, UK.
- Nasullah Khalid Alham: Department of Engineering Science, Institute of Biomedical Engineering (IBME), University of Oxford, Oxford, UK.
- Richard Colling: Department of Cellular Pathology, Oxford University Hospitals NHS Foundation Trust, John Radcliffe Hospital, Oxford, UK; Nuffield Department of Surgical Sciences, University of Oxford, John Radcliffe Hospital, Oxford, UK.
- Ying Cui: Nuffield Department of Surgical Sciences, University of Oxford, John Radcliffe Hospital, Oxford, UK.
- Emad Rakha: School of Medicine, University of Nottingham, Nottingham, UK.
- Freddie C Hamdy: NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, Oxfordshire, UK; Nuffield Department of Surgical Sciences, University of Oxford, John Radcliffe Hospital, Oxford, UK.
- Clare Verrill: Department of Cellular Pathology, Oxford University Hospitals NHS Foundation Trust, John Radcliffe Hospital, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, Oxfordshire, UK; Nuffield Department of Surgical Sciences, University of Oxford, John Radcliffe Hospital, Oxford, UK.
- Jens Rittscher: Department of Engineering Science, Institute of Biomedical Engineering (IBME), University of Oxford, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, Oxfordshire, UK.
117
Lipkova J, Chen TY, Lu MY, Chen RJ, Shady M, Williams M, Wang J, Noor Z, Mitchell RN, Turan M, Coskun G, Yilmaz F, Demir D, Nart D, Basak K, Turhan N, Ozkara S, Banz Y, Odening KE, Mahmood F. Deep learning-enabled assessment of cardiac allograft rejection from endomyocardial biopsies. Nat Med 2022; 28:575-582. PMID: 35314822; PMCID: PMC9353336; DOI: 10.1038/s41591-022-01709-2.
Abstract
Endomyocardial biopsy (EMB) screening represents the standard of care for detecting allograft rejections after heart transplant. Manual interpretation of EMBs is affected by substantial interobserver and intraobserver variability, which often leads to inappropriate treatment with immunosuppressive drugs, unnecessary follow-up biopsies and poor transplant outcomes. Here we present a deep learning-based artificial intelligence (AI) system for automated assessment of gigapixel whole-slide images obtained from EMBs, which simultaneously addresses detection, subtyping and grading of allograft rejection. To assess model performance, we curated a large dataset from the United States, as well as independent test cohorts from Turkey and Switzerland, which includes large-scale variability across populations, sample preparations and slide scanning instrumentation. The model detects allograft rejection with an area under the receiver operating characteristic curve (AUC) of 0.962; assesses the cellular and antibody-mediated rejection type with AUCs of 0.958 and 0.874, respectively; detects Quilty B lesions, benign mimics of rejection, with an AUC of 0.939; and differentiates between low-grade and high-grade rejections with an AUC of 0.833. In a human reader study, the AI system showed non-inferior performance to conventional assessment and reduced interobserver variability and assessment time. This robust evaluation of cardiac allograft rejection paves the way for clinical trials to establish the efficacy of AI-assisted EMB assessment and its potential for improving heart transplant outcomes.
Affiliation(s)
- Jana Lipkova: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA; Dana-Farber Cancer Institute, Boston, MA, USA.
- Tiffany Y Chen: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA; Dana-Farber Cancer Institute, Boston, MA, USA.
- Ming Y Lu: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA; Dana-Farber Cancer Institute, Boston, MA, USA; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA.
- Richard J Chen: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA; Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
- Maha Shady: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA; Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
- Mane Williams: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA; Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
- Jingwen Wang: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Computer Science, University of California San Diego (UCSD), La Jolla, CA, USA.
- Zahra Noor: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Richard N Mitchell: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Harvard-MIT Health Sciences and Technology (HST), Cambridge, MA, USA.
- Mehmet Turan: Institute of Biomedical Engineering, Bogazici University, Istanbul, Turkey.
- Gulfize Coskun: Institute of Biomedical Engineering, Bogazici University, Istanbul, Turkey.
- Funda Yilmaz: Faculty of Medicine, Department of Pathology, Ege University, Izmir, Turkey.
- Derya Demir: Faculty of Medicine, Department of Pathology, Ege University, Izmir, Turkey.
- Deniz Nart: Faculty of Medicine, Department of Pathology, Ege University, Izmir, Turkey.
- Kayhan Basak: Department of Pathology, University of Health Sciences, Ankara, Turkey.
- Nesrin Turhan: Department of Pathology, University of Health Sciences, Ankara, Turkey.
- Selvinaz Ozkara: Department of Pathology, University of Health Sciences, Ankara, Turkey.
- Yara Banz: Institute of Pathology, University of Bern, Bern, Switzerland.
- Katja E Odening: Department of Cardiology, Inselspital, Bern University Hospital, Bern, Switzerland; Institute of Physiology, University of Bern, Bern, Switzerland.
- Faisal Mahmood: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA; Dana-Farber Cancer Institute, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
- Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA.
| |
Collapse
|
118
|
Zhang J, Chatzichristos C, Vandecasteele K, Swinnen L, Broux V, Cleeren E, Van Paesschen W, De Vos M. Automatic annotation correction for wearable EEG based epileptic seizure detection. J Neural Eng 2022; 19. [PMID: 35158349 DOI: 10.1088/1741-2552/ac54c1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2021] [Accepted: 02/14/2022] [Indexed: 11/12/2022]
Abstract
OBJECTIVE Video-electroencephalography (vEEG), which defines the ground truth for the detection of epileptic seizures, is inadequate for long-term home monitoring. Thanks to their advantages in comfort and unobtrusiveness, wearable EEG devices have been suggested as a solution for home monitoring. However, one of the challenges in data-driven automated seizure detection with wearable EEG data is to have reliable seizure annotations. Seizure annotations on the gold-standard 25-channel vEEG recordings may not be optimal to delineate seizure activity on the concomitantly recorded wearable EEG, due to artifacts or absence of ictal activity on the limited set of electrodes of the wearable EEG. This paper aims to develop an automatic approach to correct the imperfect annotations of seizure activity on wearable EEG, which can be used to train seizure detection algorithms. APPROACH We first investigate the effectiveness of visually correcting the seizure annotations of the training set. We then propose a novel approach that automatically removes non-seizure data from wearable EEG in epochs annotated as seizures in the gold-standard vEEG recordings. The performance of the automatic annotation correction approach was evaluated by comparing seizure detection models trained with (1) the original vEEG seizure annotations, (2) visually corrected seizure annotations, and (3) automatically corrected seizure annotations. RESULTS The seizure detection approach trained with automatically corrected seizure annotations was more sensitive and produced fewer false-positive detections than both the approach trained with visually corrected seizure annotations and the approach trained with the original gold-standard vEEG seizure annotations. SIGNIFICANCE The wearable EEG seizure detection approach performs better when trained with automatically corrected seizure annotations.
Collapse
Affiliation(s)
- Jingwei Zhang
- Department of Electrical Engineering, STADIUS, KU Leuven, Kasteelpark Arenberg 10, Leuven, Flanders, 3000, BELGIUM
| | - Christos Chatzichristos
- Department of Electrical Engineering, STADIUS, KU Leuven, Kasteelpark Arenberg 10 - box 2446, Leuven, Flanders, 3000, BELGIUM
| | - Kaat Vandecasteele
- Department of Electrical Engineering, STADIUS, KU Leuven, Kasteelpark Arenberg 10, Leuven, Flanders, 3000, BELGIUM
| | - Lauren Swinnen
- KU Leuven, ON V Herestraat 49 - box 1022, Leuven, Flanders, 3000, BELGIUM
| | - Victoria Broux
- Katholieke Universiteit Leuven UZ Leuven, UZ Herestraat 49, Leuven, Flanders, 3000, BELGIUM
| | - Evy Cleeren
- Katholieke Universiteit Leuven UZ Leuven, ON II Herestraat 49 - box 1021, Leuven, Flanders, 3000, BELGIUM
| | - Wim Van Paesschen
- Katholieke Universiteit Leuven UZ Leuven, UZ Herestraat 49 - box 7003, Leuven, Flanders, 3000, BELGIUM
| | - Maarten De Vos
- Department of Electrical Engineering, KU Leuven, Kasteelpark Arenberg 10 - box 2440, Leuven, Flanders, 3000, BELGIUM
| |
Collapse
|
119
|
Adamson PM, Bhattbhatt V, Principi S, Beriwal S, Strain LS, Offe M, Wang AS, Vo N, Schmidt TG, Jordan P. Technical note: Evaluation of a V‐Net autosegmentation algorithm for pediatric CT scans: Performance, generalizability and application to patient‐specific CT dosimetry. Med Phys 2022; 49:2342-2354. [PMID: 35128672 PMCID: PMC9007850 DOI: 10.1002/mp.15521] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2021] [Revised: 12/23/2021] [Accepted: 01/08/2022] [Indexed: 11/09/2022] Open
Abstract
PURPOSE This study developed and evaluated a fully convolutional network (FCN) for pediatric CT organ segmentation and investigated the generalizability of the FCN across image heterogeneities such as CT scanner model protocols and patient age. We also evaluated the autosegmentation models as part of a software tool for patient-specific CT dose estimation. METHODS A collection of 359 pediatric CT datasets with expert organ contours were used for model development and evaluation. Autosegmentation models were trained for each organ using a modified FCN 3D V-Net. An independent test set of 60 patients was withheld for testing. To evaluate the impact of CT scanner model protocol and patient age heterogeneities, separate models were trained using a subset of scanner model protocols and pediatric age groups. Train and test sets were split to answer questions about the generalizability of pediatric FCN autosegmentation models to unseen age groups and scanner model protocols, as well as the merit of scanner model protocol or age-group-specific models. Finally, the organ contours resulting from the autosegmentation models were applied to patient-specific dose maps to evaluate the impact of segmentation errors on organ dose estimation. RESULTS Results demonstrate that the autosegmentation models generalize to CT scanner acquisition and reconstruction methods which were not present in the training dataset. While models are not equally generalizable across age groups, age-group-specific models do not hold any advantage over combining heterogeneous age groups into a single training set. Dice similarity coefficient (DSC) and mean surface distance results are presented for 19 organ structures, for example, median DSC of 0.52 (duodenum), 0.74 (pancreas), 0.92 (stomach), and 0.96 (heart). The FCN models achieve a mean dose error within 5% of expert segmentations for all 19 organs except for the spinal canal, where the mean error was 6.31%. CONCLUSIONS Overall, these results are promising for the adoption of FCN autosegmentation models for pediatric CT, including applications for patient-specific CT dose estimation.
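A short illustrative sketch of the Dice similarity coefficient used throughout the abstract; the toy volumes and the helper function are assumptions for demonstration, not the study's pipeline.

```python
# Minimal sketch (assumed, not from the paper): Dice similarity coefficient
# between a predicted and a reference binary organ mask, the per-organ metric
# the abstract reports (e.g. median DSC of 0.92 for the stomach).
import numpy as np

def dice_coefficient(pred: np.ndarray, ref: np.ndarray, eps: float = 1e-8) -> float:
    """DSC = 2 * |A intersect B| / (|A| + |B|) for boolean volumes."""
    pred = pred.astype(bool)
    ref = ref.astype(bool)
    intersection = np.logical_and(pred, ref).sum()
    return float(2.0 * intersection / (pred.sum() + ref.sum() + eps))

# Toy 3D volumes standing in for an autosegmented and an expert-contoured organ.
rng = np.random.default_rng(0)
reference = rng.random((32, 32, 32)) > 0.5
prediction = reference.copy()
prediction[:4] = ~prediction[:4]  # introduce some disagreement in a few slices
print(f"DSC = {dice_coefficient(prediction, reference):.3f}")
```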
Collapse
Affiliation(s)
| | | | - Sara Principi
- Department of Biomedical Engineering, Marquette University and Medical College of Wisconsin, Milwaukee, WI 53201, United States
| | | | - Linda S. Strain
- Department of Radiology, Children's Wisconsin and Medical College of Wisconsin, Milwaukee, WI 53226, United States
| | - Michael Offe
- Department of Biomedical Engineering, Marquette University and Medical College of Wisconsin, Milwaukee, WI 53201, United States
| | - Adam S. Wang
- Department of Radiology, Stanford University, Stanford, CA 94305, United States
| | - Nghia‐Jack Vo
- Department of Radiology, Children's Wisconsin and Medical College of Wisconsin, Milwaukee, WI 53226, United States
| | - Taly Gilat Schmidt
- Department of Biomedical Engineering, Marquette University and Medical College of Wisconsin, Milwaukee, WI 53201, United States
| | - Petr Jordan
- Varian Medical Systems, Palo Alto, CA 94304, United States
| |
Collapse
|
120
|
Vedmurthy P, Pinto ALR, Lin DDM, Comi AM, Ou Y. Study protocol: retrospectively mining multisite clinical data to presymptomatically predict seizure onset for individual patients with Sturge-Weber. BMJ Open 2022; 12:e053103. [PMID: 35121603 PMCID: PMC8819809 DOI: 10.1136/bmjopen-2021-053103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/05/2021] [Accepted: 01/13/2022] [Indexed: 11/11/2022] Open
Abstract
INTRODUCTION Secondary analysis of hospital-hosted clinical data can save time and cost compared with prospective clinical trials for neuroimaging biomarker development. We present such a study for Sturge-Weber syndrome (SWS), a rare neurovascular disorder that affects 1 in 20 000-50 000 newborns. Children with SWS are at risk for developing neurocognitive deficits by school age. A critical period for early intervention is before 2 years of age, but early diagnostic and prognostic biomarkers are lacking. We aim to retrospectively mine clinical data for SWS at two national centres to develop presymptomatic biomarkers. METHODS AND ANALYSIS We will retrospectively collect clinical, MRI and neurocognitive outcome data for patients with SWS who underwent brain MRI before 2 years of age at two national SWS care centres. Expert review of clinical records and MRI quality control will be used to refine the cohort. The merged multisite data will be used to develop algorithms for abnormality detection, lesion-symptom mapping to identify neural substrates, and machine learning to predict individual outcomes (presence or absence of seizures) by 2 years of age. Presymptomatic treatment, given between 0 and 2 years of age and before seizure onset, may delay or prevent seizures by 2 years of age and thereby improve neurocognitive outcomes. The proposed work, if successful, will be one of the largest and most comprehensive multisite databases for the presymptomatic phase of this rare disease. ETHICS AND DISSEMINATION This study involves human participants and was approved by the Boston Children's Hospital Institutional Review Board (IRB-P00014482 and IRB-P00025916) and the Johns Hopkins School of Medicine Institutional Review Board (NA_00043846). Participants gave informed consent to participate in the study before taking part. Institutional Review Board approval to retrospectively study these data has been obtained at each site (Kennedy Krieger Institute and Boston Children's Hospital). Results will be disseminated through presentations, publications, and sharing of the algorithms generated.
Collapse
Affiliation(s)
- Pooja Vedmurthy
- Department of Neurology and Developmental Medicine, Hugo Moser Research Institute, Baltimore, Maryland, USA
- Department of Neurology and Pediatrics, Kennedy Krieger Institute, Baltimore, MD, USA
| | - Anna L R Pinto
- Department of Neurology, Division of Epilepsy, Harvard Medical School, Boston, Massachusetts, USA
| | - Doris D M Lin
- Neuroradiology, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
| | - Anne M Comi
- Department of Neurology and Developmental Medicine, Hugo Moser Research Institute, Baltimore, Maryland, USA
- Department of Neurology and Pediatrics, Kennedy Krieger Institute, Baltimore, MD, USA
- Department of Neurology and Pediatrics, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
| | - Yangming Ou
- Fetal-Neonatal Neuroimaging and Developmental Science Center, Boston Children's Hospital, Boston, Massachusetts, USA
- Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA
- Department of Radiology, Boston Children's Hospital; Harvard Medical School, Boston, MA, USA
| |
Collapse
|
121
|
A Hybrid Robust-Learning Architecture for Medical Image Segmentation with Noisy Labels. FUTURE INTERNET 2022. [DOI: 10.3390/fi14020041] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023] Open
Abstract
Deep-learning models require large amounts of accurately labeled data. However, for medical image segmentation, high-quality labels rely on expert experience, and less-experienced operators provide noisy labels. How one might mitigate the negative effects caused by noisy labels for 3D medical image segmentation has not been fully investigated. In this paper, our purpose is to propose a novel hybrid robust-learning architecture to combat noisy labels for 3D medical image segmentation. Our method consists of three components. First, we focus on the noisy annotations of slices and propose a slice-level label-quality awareness method, which automatically generates label-quality scores for slices in a set. Second, we propose a shape-awareness regularization loss based on distance transform maps to introduce prior shape information and provide extra performance gains. Third, based on a re-weighting strategy, we propose an end-to-end hybrid robust-learning architecture to weaken the negative effects caused by noisy labels. Extensive experiments are performed on two representative datasets (i.e., liver segmentation and multi-organ segmentation). Our hybrid noise-robust architecture has shown competitive performance, compared to other methods. Ablation studies also demonstrate the effectiveness of slice-level label-quality awareness and a shape-awareness regularization loss for combating noisy labels.
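As a hedged sketch of the kind of distance-transform-based shape regularization the abstract describes (the paper's exact loss formulation is not reproduced here), one could penalize foreground probability that falls far outside a reference shape:

```python
# Generic shape-aware penalty built on a signed distance map; illustrative only.
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask: np.ndarray) -> np.ndarray:
    """Signed Euclidean distance to the mask boundary (negative inside the mask)."""
    mask = mask.astype(bool)
    inside = distance_transform_edt(mask)
    outside = distance_transform_edt(~mask)
    return outside - inside

def shape_aware_penalty(prob: np.ndarray, ref_mask: np.ndarray) -> float:
    """Penalize foreground probability placed far outside the reference shape."""
    sdm = signed_distance_map(ref_mask)
    return float(np.mean(prob * sdm))

# Toy example: a slice-sized probability map and a reference mask.
ref = np.zeros((64, 64), dtype=bool)
ref[20:44, 20:44] = True
prob = np.clip(ref.astype(float) + 0.1 * np.random.default_rng(1).random((64, 64)), 0, 1)
print(f"shape penalty = {shape_aware_penalty(prob, ref):.4f}")
```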
Collapse
|
122
|
Ashraf M, Robles WRQ, Kim M, Ko YS, Yi MY. A loss-based patch label denoising method for improving whole-slide image analysis using a convolutional neural network. Sci Rep 2022; 12:1392. [PMID: 35082315 PMCID: PMC8791954 DOI: 10.1038/s41598-022-05001-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2021] [Accepted: 01/05/2022] [Indexed: 12/24/2022] Open
Abstract
This paper proposes a deep learning-based patch label denoising method (LossDiff) for improving the classification of whole-slide images of cancer using a convolutional neural network (CNN). Automated whole-slide image classification is often challenging, requiring a large amount of labeled data. Pathologists annotate the region of interest by marking malignant areas, which pose a high risk of introducing patch-based label noise by involving benign regions that are typically small in size within the malignant annotations, resulting in low classification accuracy with many Type-II errors. To overcome this critical problem, this paper presents a simple yet effective method for noisy patch classification. The proposed method, validated using stomach cancer images, provides a significant improvement compared to other existing methods in patch-based cancer classification, with accuracies of 98.81%, 97.30% and 89.47% for binary, ternary, and quaternary classes, respectively. Moreover, we conduct several experiments at different noise levels using a publicly available dataset to further demonstrate the robustness of the proposed method. Given the high cost of producing explicit annotations for whole-slide images and the unavoidable error-prone nature of the human annotation of medical images, the proposed method has practical implications for whole-slide image annotation and automated cancer diagnosis.
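The abstract does not spell out the denoising rule, so the following is only a generic small-loss selection sketch of the kind loss-based patch denoising methods build on; the loss values and the keep fraction are hypothetical.

```python
# Illustrative sketch only: a generic "small-loss" filtering step. The exact
# LossDiff criterion from the paper is not reproduced here.
import numpy as np

def select_clean_patches(per_patch_loss: np.ndarray, keep_fraction: float = 0.8) -> np.ndarray:
    """Return indices of the patches with the smallest loss, treated as likely clean."""
    n_keep = max(1, int(round(keep_fraction * per_patch_loss.size)))
    return np.argsort(per_patch_loss)[:n_keep]

# Hypothetical cross-entropy losses for 10 patches; large values suggest label noise.
losses = np.array([0.12, 0.08, 2.31, 0.15, 0.09, 1.87, 0.11, 0.14, 0.10, 2.05])
clean_idx = select_clean_patches(losses, keep_fraction=0.7)
print("patches kept for the next training round:", clean_idx)
```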
Collapse
|
123
|
Mordhorst L, Morozova M, Papazoglou S, Fricke B, Oeschger JM, Tabarin T, Rusch H, Jäger C, Geyer S, Weiskopf N, Morawski M, Mohammadi S. Towards a representative reference for MRI-based human axon radius assessment using light microscopy. Neuroimage 2022; 249:118906. [PMID: 35032659 DOI: 10.1016/j.neuroimage.2022.118906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2021] [Revised: 01/06/2022] [Accepted: 01/11/2022] [Indexed: 11/26/2022] Open
Abstract
Non-invasive assessment of axon radii via MRI bears great potential for clinical and neuroscience research as it is a main determinant of the neuronal conduction velocity. However, there is a lack of representative histological reference data at the scale of the cross-section of MRI voxels for validating the MRI-visible, effective radius (r_eff). Because the current gold standard stems from neuroanatomical studies designed to estimate the bulk-determined arithmetic mean radius (r_arith) on small ensembles of axons, it is unsuited to estimate the tail-weighted r_eff. We propose CNN-based segmentation on high-resolution, large-scale light microscopy (lsLM) data to generate a representative reference for r_eff. In a human corpus callosum, we assessed estimation accuracy and bias of r_arith and r_eff. Furthermore, we investigated whether mapping anatomy-related variation of r_arith and r_eff is confounded by low-frequency variation of the image intensity, e.g., due to staining heterogeneity. Finally, we analyzed the error due to outstandingly large axons in r_eff. Compared to r_arith, r_eff was estimated with higher accuracy (maximum normalized-root-mean-square-error of r_eff: 8.5 %; r_arith: 19.5 %) and lower bias (maximum absolute normalized-mean-bias-error of r_eff: 4.8 %; r_arith: 13.4 %). While r_arith was confounded by variation of the image intensity, variation of r_eff seemed anatomy-related. The largest axons contributed between 0.8 % and 2.9 % to r_eff. In conclusion, the proposed method is a step towards representatively estimating r_eff at MRI voxel resolution. Further investigations are required to assess generalization to other brains and brain areas with different axon radii distributions.
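Assuming the tail-weighted definition of the effective radius commonly used in this literature, r_eff = (<r^6>/<r^2>)^(1/4), its contrast with the arithmetic mean radius can be illustrated as follows; the radius distribution is synthetic and the formula is an assumption stated here, not quoted from the paper.

```python
# Sketch under the commonly used tail-weighted definition of r_eff; synthetic data.
import numpy as np

def arithmetic_radius(radii: np.ndarray) -> float:
    return float(np.mean(radii))

def effective_radius(radii: np.ndarray) -> float:
    # r_eff = (<r^6> / <r^2>) ** (1/4), dominated by the large-radius tail.
    return float((np.mean(radii ** 6) / np.mean(radii ** 2)) ** 0.25)

# Hypothetical axon radii (micrometres) drawn from a right-skewed distribution.
rng = np.random.default_rng(42)
radii = rng.gamma(shape=3.0, scale=0.2, size=100_000)

print(f"r_arith = {arithmetic_radius(radii):.3f} um")
print(f"r_eff   = {effective_radius(radii):.3f} um")
```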
Collapse
Affiliation(s)
- Laurin Mordhorst
- Institute of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.
| | - Maria Morozova
- Department of Neurophysics, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Paul Flechsig Institute of Brain Research, Medical Faculty, Leipzig University, Leipzig, Germany
| | - Sebastian Papazoglou
- Institute of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Björn Fricke
- Institute of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Jan Malte Oeschger
- Institute of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Thibault Tabarin
- Institute of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Henriette Rusch
- Paul Flechsig Institute of Brain Research, Medical Faculty, Leipzig University, Leipzig, Germany
| | - Carsten Jäger
- Department of Neurophysics, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Stefan Geyer
- Department of Neurophysics, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Nikolaus Weiskopf
- Department of Neurophysics, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Felix Bloch Institute for Solid State Physics, Leipzig University, Leipzig, Germany
| | - Markus Morawski
- Department of Neurophysics, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Paul Flechsig Institute of Brain Research, Medical Faculty, Leipzig University, Leipzig, Germany
| | - Siawoosh Mohammadi
- Institute of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany; Department of Neurophysics, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| |
Collapse
|
124
|
Jaber MM, Abd SK, Ali SM. Adam Optimized Deep Learning Model for Segmenting ROI Region in Medical Imaging. PROCEEDINGS OF INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES AND INTELLIGENT SYSTEMS 2022:669-691. [DOI: 10.1007/978-3-030-85990-9_54] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/02/2023]
|
125
|
Robinet F, Akl Y, Ullah K, Nozarian F, Muller C, Frank R. Striving for Less: Minimally-Supervised Pseudo-Label Generation for Monocular Road Segmentation. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3193463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Affiliation(s)
- Francois Robinet
- Interdisciplinary Center for Security, Reliability and Trust (SnT), University of Luxembourg, Luxembourg
| | - Yussef Akl
- Interdisciplinary Center for Security, Reliability and Trust (SnT), University of Luxembourg, Luxembourg
| | - Kaleem Ullah
- Interdisciplinary Center for Security, Reliability and Trust (SnT), University of Luxembourg, Luxembourg
| | - Farzad Nozarian
- German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
| | - Christian Muller
- German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
| | - Raphael Frank
- Interdisciplinary Center for Security, Reliability and Trust (SnT), University of Luxembourg, Luxembourg
| |
Collapse
|
126
|
Sanders JW, Mok H, Hanania AN, Venkatesan AM, Tang C, Bruno TL, Thames HD, Kudchadker RJ, Frank SJ. Computer-aided segmentation on MRI for prostate radiotherapy, part II: Comparing human and computer observer populations and the influence of annotator variability on algorithm variability. Radiother Oncol 2021; 169:132-139. [PMID: 34979213 DOI: 10.1016/j.radonc.2021.12.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2021] [Revised: 12/21/2021] [Accepted: 12/22/2021] [Indexed: 11/29/2022]
Abstract
BACKGROUND AND PURPOSE Comparing deep learning (DL) algorithms to human interobserver variability, one of the largest sources of noise in human-performed annotations, is necessary to inform the clinical application, use, and quality assurance of DL for prostate radiotherapy. MATERIALS AND METHODS One hundred fourteen DL algorithms were developed on 295 prostate MRIs to segment the prostate, external urinary sphincter (EUS), seminal vesicles (SV), rectum, and bladder. Fifty prostate MRIs of 25 patients undergoing MRI-based low-dose-rate prostate brachytherapy were acquired as an independent test set. Groups of DL algorithms were created based on the loss functions used to train them, and the spatial entropy (SE) of their predictions on the 50 test MRIs was computed. Five human observers contoured the 50 test MRIs, and SE maps of their contours were compared with those of the groups of the DL algorithms. Additionally, similarity metrics were computed between DL algorithm predictions and consensus annotations of the 5 human observers' contours of the 50 test MRIs. RESULTS A DL algorithm yielded statistically significantly higher similarity metrics for the prostate than did the human observers (H) (prostate Matthews correlation coefficient, DL vs. H: planning-0.931 vs. 0.903, p < 0.001; postimplant-0.925 vs. 0.892, p < 0.001); the same was true for the 4 organs at risk. The SE maps revealed that the DL algorithms and human annotators were most variable in similar anatomical regions: the prostate-EUS, prostate-SV, prostate-rectum, and prostate-bladder junctions. CONCLUSIONS Annotation quality is an important consideration when developing, evaluating, and using DL algorithms clinically.
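A minimal sketch of the Matthews correlation coefficient between two binary masks, one of the similarity metrics reported above; the masks are synthetic stand-ins rather than study data.

```python
# Illustrative only (not the study's code): MCC between a predicted contour and
# a consensus contour, computed on flattened binary masks.
import numpy as np
from sklearn.metrics import matthews_corrcoef

rng = np.random.default_rng(7)
consensus = rng.random((128, 128)) > 0.6      # stand-in consensus mask
predicted = consensus.copy()
flip = rng.random((128, 128)) > 0.97          # a few per-pixel disagreements
predicted[flip] = ~predicted[flip]

mcc = matthews_corrcoef(consensus.ravel(), predicted.ravel())
print(f"MCC = {mcc:.3f}")
```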
Collapse
Affiliation(s)
- Jeremiah W Sanders
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, United States.
| | - Henry Mok
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, United States
| | - Alexander N Hanania
- Department of Radiation Oncology, Baylor College of Medicine, Houston, United States
| | - Aradhana M Venkatesan
- Department of Diagnostic Radiology, The University of Texas MD Anderson Cancer Center, Houston, United States
| | - Chad Tang
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, United States
| | - Teresa L Bruno
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, United States
| | - Howard D Thames
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, United States
| | - Rajat J Kudchadker
- Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, United States
| | - Steven J Frank
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, United States
| |
Collapse
|
127
|
Beekman C, van Beek S, Stam J, Sonke JJ, Remeijer P. Improving predictive CTV segmentation on CT and CBCT for cervical cancer by diffeomorphic registration of a prior. Med Phys 2021; 49:1701-1711. [PMID: 34964986 DOI: 10.1002/mp.15421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2021] [Revised: 11/14/2021] [Accepted: 11/26/2021] [Indexed: 11/07/2022] Open
Abstract
PURPOSE Automatic cervix-uterus segmentation of the clinical target volume (CTV) on CT and cone beam CT (CBCT) scans is challenged by the limited visibility and the non-anatomical definition of certain border regions. We study the potential performance gain of convolutional neural networks by regulating the segmentation predictions as diffeomorphic deformations of a segmentation prior. METHODS We introduce a 3D convolutional neural network (CNN) which segments the target scan by joint voxel-wise classification and the registration of a given prior. We compare this network to two other 3D baseline models: one treating segmentation as a classification problem (segmentation-only), the other as a registration problem (deformation-only). For reference and to highlight the benefits of a 3D model, these models are also benchmarked against a 2D segmentation model. Network performances are reported for CT and CBCT segmentation of the cervix-uterus CTV. We train the networks on data of 84 patients. The prior is provided by the CTV segmentation of a planning CT. Repeat CT or CBCT scans constitute the target scans to be segmented. RESULTS All 3D models outperformed the 2D segmentation model. For CT segmentation, combining classification and registration in the proposed joint model proved beneficial, achieving a Dice score of 0.87 and a mean squared error (MSE) of the surface distance below 1.7 mm. No such synergy was observed for CBCT segmentation, for which the joint and the deformation-only models performed similarly, achieving a Dice score of about 0.80 and an MSE surface distance of 2.5 mm. However, the segmentation-only model performed notably worse in this low-contrast regime. Visual inspection revealed that this performance drop translated into geometric inconsistencies between the prior and the target segmentation. Such inconsistencies were not observed for the deformation-based models. CONCLUSION Constraining the solution space of admissible segmentation predictions to those reachable by a diffeomorphic deformation of the prior proved beneficial, as it improved geometric consistency. Especially for CBCT, with its poor soft tissue contrast, this type of regularization becomes important, as shown by quantitative and qualitative evaluation.
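As a conceptual illustration of constraining predictions to deformations of a prior (not the paper's network), the sketch below resamples a toy prior mask under a dense displacement field:

```python
# Conceptual sketch only: warping a 2D prior segmentation by a smooth (here
# constant) displacement field, i.e. the kind of output the registration branch
# described above is restricted to.
import numpy as np
from scipy.ndimage import map_coordinates

def warp_mask(prior: np.ndarray, disp_y: np.ndarray, disp_x: np.ndarray) -> np.ndarray:
    """Resample the prior at displaced coordinates (backward warp, nearest-neighbour)."""
    yy, xx = np.meshgrid(np.arange(prior.shape[0]), np.arange(prior.shape[1]), indexing="ij")
    coords = np.stack([yy + disp_y, xx + disp_x])
    return map_coordinates(prior.astype(float), coords, order=0).astype(bool)

prior = np.zeros((64, 64), dtype=bool)
prior[20:40, 24:44] = True                  # toy planning-CT CTV prior
disp_y = np.full(prior.shape, 3.0)          # smooth displacement field (constant here)
disp_x = np.full(prior.shape, -2.0)
warped = warp_mask(prior, disp_y, disp_x)
print("prior voxels:", prior.sum(), "warped voxels:", warped.sum())
```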
Collapse
Affiliation(s)
- Chris Beekman
- Department of Radiation Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Suzanne van Beek
- Department of Radiation Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Jikke Stam
- Department of Radiation Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Jan-Jakob Sonke
- Department of Radiation Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Peter Remeijer
- Department of Radiation Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| |
Collapse
|
128
|
Wang L, Wang H, Huang Y, Yan B, Chang Z, Liu Z, Zhao M, Cui L, Song J, Li F. Trends in the application of deep learning networks in medical image analysis: Evolution between 2012 and 2020. Eur J Radiol 2021; 146:110069. [PMID: 34847395 DOI: 10.1016/j.ejrad.2021.110069] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2021] [Revised: 11/10/2021] [Accepted: 11/22/2021] [Indexed: 12/21/2022]
Abstract
PURPOSE To evaluate the general rules and future trajectories of deep learning (DL) networks in medical image analysis through bibliometric and hot spot analysis of original articles published between 2012 and 2020. METHODS Original articles related to DL and medical imaging were retrieved from the PubMed database. For the analysis, data regarding radiological subspecialties; imaging techniques; DL networks; sample size; study purposes, setting, origins and design; statistical analysis; funding sources; authors; and first authors' affiliation was manually extracted from each article. The Bibliographic Item Co-Occurrence Matrix Builder and VOSviewer were used to identify the research topics of the included articles and illustrate the future trajectories of studies. RESULTS The study included 2685 original articles. The number of publications on DL and medical imaging has increased substantially since 2017, accounting for 97.2% of all included articles. We evaluated the rules of the application of 47 DL networks to eight radiological tasks on 11 human organ sites. Neuroradiology, thorax, and abdomen were frequent research subjects, while thyroid was under-represented. Segmentation and classification tasks were the primary purposes. U-Net, ResNet, and VGG were the most frequently used Convolutional neural network-derived networks. GAN-derived networks were widely developed and applied in 2020, and transfer learning was highlighted in the COVID-19 studies. Brain, prostate, and diabetic retinopathy-related studies were mature research topics in the field. Breast- and lung-related studies were in a stage of rapid development. CONCLUSIONS This study evaluates the general rules and future trajectories of DL network application in medical image analyses and provides guidance for future studies.
Collapse
Affiliation(s)
- Lu Wang
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, PR China
| | - Hairui Wang
- Department of Radiology, Shengjing Hospital of China Medical University, Shenyang, Liaoning 110004, PR China
| | - Yingna Huang
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, PR China
| | - Baihui Yan
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, PR China
| | - Zhihui Chang
- Department of Radiology, Shengjing Hospital of China Medical University, Shenyang, Liaoning 110004, PR China
| | - Zhaoyu Liu
- Department of Radiology, Shengjing Hospital of China Medical University, Shenyang, Liaoning 110004, PR China
| | - Mingfang Zhao
- Department of Medical Oncology, The First Hospital of China Medical University, Shenyang, Liaoning 110001, PR China
| | - Lei Cui
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, PR China
| | - Jiangdian Song
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, PR China.
| | - Fan Li
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, PR China
| |
Collapse
|
129
|
Liu J, Li R, Sun C. Co-Correcting: Noise-Tolerant Medical Image Classification via Mutual Label Correction. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:3580-3592. [PMID: 34152981 DOI: 10.1109/tmi.2021.3091178] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
With the development of deep learning, medical image classification has been significantly improved. However, deep learning requires massive amounts of labeled data. While labeling samples by human experts is expensive and time-consuming, collecting labels from crowd-sourcing suffers from noise that may degrade the accuracy of classifiers. Therefore, approaches that can effectively handle label noise are highly desired. Unfortunately, recent progress on handling label noise in deep learning has gone largely unnoticed by the medical image analysis community. To fill the gap, this paper proposes a noise-tolerant medical image classification framework named Co-Correcting, which significantly improves classification accuracy and obtains more accurate labels through dual-network mutual learning, label probability estimation, and curriculum label correcting. On two representative medical image datasets and the MNIST dataset, we test six of the latest learning-with-noisy-labels methods and conduct comparative studies. The experiments show that Co-Correcting achieves the best accuracy and generalization under different noise ratios in various tasks. Our project can be found at: https://github.com/JiarunLiu/Co-Correcting.
Collapse
|
130
|
Leitner C, Jarolim R, Englmair B, Kruse A, Hernandez KAL, Konrad A, Su EYS, Schrottner J, Kelly LA, Lichtwark GA, Tilp M, Baumgartner C. A Human-Centered Machine-Learning Approach for Muscle-Tendon Junction Tracking in Ultrasound Images. IEEE Trans Biomed Eng 2021; 69:1920-1930. [PMID: 34818187 DOI: 10.1109/tbme.2021.3130548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Biomechanical and clinical gait research observes muscles and tendons in limbs to study their functions and behaviour. Therefore, movements of distinct anatomical landmarks, such as muscle-tendon junctions, are frequently measured. We propose a reliable and time-efficient machine-learning approach to track these junctions in ultrasound videos and support clinical biomechanists in gait analysis. To facilitate this process, we introduce a deep learning-based method. We gathered an extensive data set covering 3 functional movements and 2 muscles, collected on 123 healthy and 38 impaired subjects with 3 different ultrasound systems, and providing a total of 66864 annotated ultrasound images for our network training. Furthermore, we used data collected across independent laboratories and curated by researchers with varying levels of experience. For the evaluation of our method, a diverse test set was selected and independently verified by four specialists. We show that our model achieves similar performance scores to the four human specialists in identifying the muscle-tendon junction position. Our method provides time-efficient tracking of muscle-tendon junctions, with prediction times of up to 0.078 seconds per frame (approximately 100 times faster than manual labeling). All our code, trained models, and the test set are publicly available, and our model is provided as a free-to-use online service at https://deepmtj.org/.
Collapse
|
131
|
Yousefirizi F, Decazes P, Amyar A, Ruan S, Saboury B, Rahmim A. AI-Based Detection, Classification and Prediction/Prognosis in Medical Imaging: Towards Radiophenomics. PET Clin 2021; 17:183-212. [PMID: 34809866 DOI: 10.1016/j.cpet.2021.09.010] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
Abstract
Artificial intelligence (AI) techniques have significant potential to enable effective, robust, and automated image phenotyping including the identification of subtle patterns. AI-based detection searches the image space to find the regions of interest based on patterns and features. There is a spectrum of tumor histologies from benign to malignant that can be identified by AI-based classification approaches using image features. The extraction of minable information from images gives way to the field of "radiomics" and can be explored via explicit (handcrafted/engineered) and deep radiomics frameworks. Radiomics analysis has the potential to be used as a noninvasive technique for the accurate characterization of tumors to improve diagnosis and treatment monitoring. This work reviews AI-based techniques, with a special focus on oncological PET and PET/CT imaging, for different detection, classification, and prediction/prognosis tasks. We also discuss needed efforts to enable the translation of AI techniques to routine clinical workflows, and potential improvements and complementary techniques such as the use of natural language processing on electronic health records and neuro-symbolic AI techniques.
Collapse
Affiliation(s)
- Fereshteh Yousefirizi
- Department of Integrative Oncology, BC Cancer Research Institute, 675 West 10th Avenue, Vancouver, British Columbia V5Z 1L3, Canada.
| | - Pierre Decazes
- Department of Nuclear Medicine, Henri Becquerel Centre, Rue d'Amiens - CS 11516 - 76038 Rouen Cedex 1, France; QuantIF-LITIS, Faculty of Medicine and Pharmacy, Research Building - 1st floor, 22 boulevard Gambetta, 76183 Rouen Cedex, France
| | - Amine Amyar
- QuantIF-LITIS, Faculty of Medicine and Pharmacy, Research Building - 1st floor, 22 boulevard Gambetta, 76183 Rouen Cedex, France; General Electric Healthcare, Buc, France
| | - Su Ruan
- QuantIF-LITIS, Faculty of Medicine and Pharmacy, Research Building - 1st floor, 22 boulevard Gambetta, 76183 Rouen Cedex, France
| | - Babak Saboury
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Bethesda, MD, USA; Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD, USA; Department of Radiology, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
| | - Arman Rahmim
- Department of Integrative Oncology, BC Cancer Research Institute, 675 West 10th Avenue, Vancouver, British Columbia V5Z 1L3, Canada; Department of Radiology, University of British Columbia, Vancouver, British Columbia, Canada; Department of Physics, University of British Columbia, Vancouver, British Columbia, Canada
| |
Collapse
|
132
|
TopoResNet: A Hybrid Deep Learning Architecture and Its Application to Skin Lesion Classification. MATHEMATICS 2021. [DOI: 10.3390/math9222924] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
The application of artificial intelligence (AI) to various medical subfields has been a popular topic of research in recent years. In particular, deep learning has been widely used and has proven effective in many cases. Topological data analysis (TDA), a rising field at the intersection of mathematics, statistics, and computer science, offers new insights into data. In this work, we develop a novel deep learning architecture, TopoResNet, that integrates topological information into the residual neural network architecture. To demonstrate TopoResNet, we apply it to a skin lesion classification problem. We find that TopoResNet improves the accuracy and the stability of the training process.
Collapse
|
133
|
Towards targeted ultrasound-guided prostate biopsy by incorporating model and label uncertainty in cancer detection. Int J Comput Assist Radiol Surg 2021; 17:121-128. [PMID: 34783976 DOI: 10.1007/s11548-021-02485-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Accepted: 08/16/2021] [Indexed: 10/19/2022]
Abstract
PURPOSE Systematic prostate biopsy is widely used for cancer diagnosis. The procedure is blind to the underlying prostate tissue micro-structure; hence, it can lead to a high rate of false negatives. Development of a machine-learning model that can reliably identify suspicious cancer regions is highly desirable. However, the models proposed to date do not consider the uncertainty present in their outputs or in the data, information that could benefit clinical decision making for targeting biopsy. METHODS We propose a deep network for improved detection of prostate cancer in systematic biopsy that considers both label and model uncertainty. The architecture of our model is based on U-Net, trained with temporal enhanced ultrasound (TeUS) data. We estimate cancer detection uncertainty using test-time augmentation and test-time dropout. We then use uncertainty metrics to report the cancer probability for regions with high confidence to help clinical decision making during the biopsy procedure. RESULTS Experiments for prostate cancer classification include data from 183 prostate biopsy cores of 41 patients. We achieve an area under the curve, sensitivity, specificity, and balanced accuracy of 0.79, 0.78, 0.71 and 0.75, respectively. CONCLUSION Our key contribution is to automatically estimate model and label uncertainty towards enabling targeted ultrasound-guided prostate biopsy. We anticipate that such information about uncertainty can decrease the number of unnecessary biopsies while achieving a higher rate of cancer yield.
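A hedged sketch of the general test-time dropout/augmentation idea: repeat stochastic forward passes and only report regions whose predictions have a small spread. The stochastic_predict function is a placeholder, not the paper's TeUS model, and the threshold is an assumption.

```python
# Generic Monte Carlo test-time uncertainty sketch; illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def stochastic_predict(signal: np.ndarray) -> float:
    """Placeholder for one stochastic forward pass returning a cancer probability."""
    return float(np.clip(0.7 + 0.05 * rng.standard_normal(), 0.0, 1.0))

def predict_with_uncertainty(signal: np.ndarray, n_passes: int = 30, max_std: float = 0.08):
    """Mean probability, its spread, and whether the spread is small enough to report."""
    probs = np.array([stochastic_predict(signal) for _ in range(n_passes)])
    mean, std = probs.mean(), probs.std()
    return mean, std, std <= max_std

mean_p, std_p, confident = predict_with_uncertainty(np.zeros(100))
print(f"cancer probability {mean_p:.2f} +/- {std_p:.2f} -> report to clinician: {confident}")
```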
Collapse
|
134
|
Zhang Y, Lu Q, Monsoor T, Hussain SA, Qiao JX, Salamon N, Fallah A, Sim MS, Asano E, Sankar R, Staba RJ, Engel J, Speier W, Roychowdhury V, Nariai H. Refining epileptogenic high-frequency oscillations using deep learning: a reverse engineering approach. Brain Commun 2021; 4:fcab267. [PMID: 35169696 PMCID: PMC8833577 DOI: 10.1093/braincomms/fcab267] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2021] [Revised: 09/29/2021] [Accepted: 10/04/2021] [Indexed: 11/12/2022] Open
Abstract
Intracranially recorded interictal high-frequency oscillations have been proposed as a promising spatial biomarker of the epileptogenic zone. However, its visual verification is time-consuming and exhibits poor inter-rater reliability. Furthermore, no method is currently available to distinguish high-frequency oscillations generated from the epileptogenic zone (epileptogenic high-frequency oscillations) from those generated from other areas (non-epileptogenic high-frequency oscillations). To address these issues, we constructed a deep learning-based algorithm using chronic intracranial EEG data via subdural grids from 19 children with medication-resistant neocortical epilepsy to: (i) replicate human expert annotation of artefacts and high-frequency oscillations with or without spikes, and (ii) discover epileptogenic high-frequency oscillations by designing a novel weakly supervised model. The ‘purification power’ of deep learning is then used to automatically relabel the high-frequency oscillations to distill epileptogenic high-frequency oscillations. Using 12 958 annotated high-frequency oscillation events from 19 patients, the model achieved 96.3% accuracy on artefact detection (F1 score = 96.8%) and 86.5% accuracy on classifying high-frequency oscillations with or without spikes (F1 score = 80.8%) using patient-wise cross-validation. Based on the algorithm trained from 84 602 high-frequency oscillation events from nine patients who achieved seizure-freedom after resection, the majority of such discovered epileptogenic high-frequency oscillations were found to be ones with spikes (78.6%, P < 0.001). While the resection ratio of detected high-frequency oscillations (number of resected events/number of detected events) did not correlate significantly with post-operative seizure freedom (the area under the curve = 0.76, P = 0.06), the resection ratio of epileptogenic high-frequency oscillations positively correlated with post-operative seizure freedom (the area under the curve = 0.87, P = 0.01). We discovered that epileptogenic high-frequency oscillations had a higher signal intensity associated with ripple (80–250 Hz) and fast ripple (250–500 Hz) bands at the high-frequency oscillation onset and with a lower frequency band throughout the event time window (the inverted T-shaped), compared to non-epileptogenic high-frequency oscillations. We then designed perturbations on the input of the trained model for non-epileptogenic high-frequency oscillations to determine the model’s decision-making logic. The model confidence significantly increased towards epileptogenic high-frequency oscillations by the artificial introduction of the inverted T-shaped signal template (mean probability increase: 0.285, P < 0.001), and by the artificial insertion of spike-like signals into the time domain (mean probability increase: 0.452, P < 0.001). With this deep learning-based framework, we reliably replicated high-frequency oscillation classification tasks by human experts. Using a reverse engineering technique, we distinguished epileptogenic high-frequency oscillations from others and identified its salient features that aligned with current knowledge.
Collapse
Affiliation(s)
- Yipeng Zhang
- Department of Electrical and Computer Engineering, University of California, Los Angeles, CA 90095, USA
| | - Qiujing Lu
- Department of Electrical and Computer Engineering, University of California, Los Angeles, CA 90095, USA
| | - Tonmoy Monsoor
- Department of Electrical and Computer Engineering, University of California, Los Angeles, CA 90095, USA
| | - Shaun A. Hussain
- Division of Pediatric Neurology, Department of Pediatrics, UCLA Mattel Children’s Hospital, David Geffen School of Medicine, Los Angeles, CA 90095, USA
| | - Joe X. Qiao
- Division of Neuroradiology, Department of Radiology, UCLA Medical Center, David Geffen School of Medicine, Los Angeles, CA 90095, USA
| | - Noriko Salamon
- Division of Neuroradiology, Department of Radiology, UCLA Medical Center, David Geffen School of Medicine, Los Angeles, CA 90095, USA
| | - Aria Fallah
- Department of Neurosurgery, UCLA Medical Center, David Geffen School of Medicine, Los Angeles, CA 90095, USA
| | - Myung Shin Sim
- Department of Medicine, Statistics Core, University of California, Los Angeles, CA 90095, USA
| | - Eishi Asano
- Department of Pediatrics and Neurology, Children’s Hospital of Michigan, Wayne State University School of Medicine, Detroit, MI 48201, USA
| | - Raman Sankar
- Division of Pediatric Neurology, Department of Pediatrics, UCLA Mattel Children’s Hospital, David Geffen School of Medicine, Los Angeles, CA 90095, USA
- Department of Neurology, UCLA Medical Center, David Geffen School of Medicine, Los Angeles, CA 90095, USA
- The UCLA Children’s Discovery and Innovation Institute, Los Angeles, CA, USA
| | - Richard J. Staba
- Department of Neurology, UCLA Medical Center, David Geffen School of Medicine, Los Angeles, CA 90095, USA
| | - Jerome Engel
- Department of Neurology, UCLA Medical Center, David Geffen School of Medicine, Los Angeles, CA 90095, USA
- Department of Neurobiology, University of California, Los Angeles, CA 90095, USA
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA 90095, USA
- The Brain Research Institute, University of California, Los Angeles, CA 90095, USA
| | - William Speier
- Department of Radiological Sciences, University of California, Los Angeles, CA 90095, USA
- Department of Bioengineering, University of California, Los Angeles, CA 90095, USA
| | - Vwani Roychowdhury
- Department of Electrical and Computer Engineering, University of California, Los Angeles, CA 90095, USA
| | - Hiroki Nariai
- Division of Pediatric Neurology, Department of Pediatrics, UCLA Mattel Children’s Hospital, David Geffen School of Medicine, Los Angeles, CA 90095, USA
- The UCLA Children’s Discovery and Innovation Institute, Los Angeles, CA, USA
| |
Collapse
|
135
|
Ding W, Abdel-Basset M, Hawash H. RCTE: A reliable and consistent temporal-ensembling framework for semi-supervised segmentation of COVID-19 lesions. Inf Sci (N Y) 2021; 578:559-573. [PMID: 34305162 PMCID: PMC8294559 DOI: 10.1016/j.ins.2021.07.059] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2021] [Revised: 06/17/2021] [Accepted: 07/17/2021] [Indexed: 12/16/2022]
Abstract
The segmentation of COVID-19 lesions from computed tomography (CT) scans is crucial to develop an efficient automated diagnosis system. Deep learning (DL) has shown success in different segmentation tasks. However, an efficient DL approach requires a large amount of accurately annotated data, which is difficult to aggregate owing to the urgent situation of COVID-19. Inaccurate annotation can easily occur without experts, and segmentation performance is substantially worsened by noisy annotations. Therefore, this study presents a reliable and consistent temporal-ensembling (RCTE) framework for semi-supervised lesion segmentation. A segmentation network is integrated into a teacher-student architecture to segment infection regions from a limited number of annotated CT scans and a large number of unannotated CT scans. The network generates reliable and unreliable targets, and to evenly handle these targets potentially degrades performance. To address this, a reliable teacher-student architecture is introduced, where a reliable teacher network is the exponential moving average (EMA) of a reliable student network that is reliably renovated by restraining the student involvement to EMA when its loss is larger. We also present a noise-aware loss based on improvements to generalized cross-entropy loss to lead the segmentation performance toward noisy annotations. Comprehensive analysis validates the robustness of RCTE over recent cutting-edge semi-supervised segmentation techniques, with a 65.87% Dice score.
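Two building blocks named in the abstract can be sketched under their standard definitions (the paper's reliable-EMA rule and noise-aware loss are variants of these, not reproduced here): an exponential moving average teacher update and the generalized cross-entropy loss.

```python
# Standard-definition sketches, not the paper's exact formulations.
import numpy as np

def ema_update(teacher_params, student_params, alpha: float = 0.99):
    """teacher <- alpha * teacher + (1 - alpha) * student, parameter-wise."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher_params, student_params)]

def gce_loss(prob_true_class: np.ndarray, q: float = 0.7) -> float:
    """Generalized cross-entropy L_q = (1 - p^q) / q; tends to MAE as q -> 1, CE as q -> 0."""
    return float(np.mean((1.0 - prob_true_class ** q) / q))

teacher = [np.ones(3)]
student = [np.zeros(3)]
print("updated teacher params:", ema_update(teacher, student, alpha=0.95)[0])
print("GCE on noisy-looking predictions:", round(gce_loss(np.array([0.9, 0.2, 0.95]), q=0.7), 4))
```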
Collapse
Affiliation(s)
- Weiping Ding
- School of Information Science and Technology, Nantong University, Nantong 226019, China
| | - Mohamed Abdel-Basset
- Zagazig University, Shaibet an Nakareyah, Zagazig 2, 44519 Ash Sharqia Governorate, Egypt
| | - Hossam Hawash
- Zagazig University, Shaibet an Nakareyah, Zagazig 2, 44519 Ash Sharqia Governorate, Egypt
| |
Collapse
|
136
|
Kurian NC, Singh G, Hebbar P, Kodate S, Rane S, Sethi A. Robust Classification of Histology Images Exploiting Adversarial Auto Encoders. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:2871-2874. [PMID: 34891846 DOI: 10.1109/embc46164.2021.9630477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Deep learning (DL) thrives on the availability of a large number of high-quality images with reliable labels. Due to the large size of whole slide images in digital pathology, patches of manageable size are often mined for use in DL models. These patches are variable in quality, weakly supervised, individually less informative, and noisily labelled. To improve classification accuracy even with these noisy inputs and labels in histopathology, we propose a novel method for robust feature generation using an adversarial autoencoder (AAE). We utilize the likelihood of the features in the latent space of the AAE as a criterion to weigh the training samples. We propose different weighting schemes for our framework and evaluate the effectiveness of our methods on the publicly available BreakHis and BACH histopathology datasets. We observe consistent improvement in AUC scores using our methods, and conclude that robust supervision strategies should be further explored for computational pathology.
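A hedged illustration of likelihood-based sample weighting in an AAE latent space with a standard-normal prior; the weighting schemes in the paper may differ, and the latent codes here are synthetic.

```python
# Illustrative only: weight training patches by how likely their latent codes
# are under an isotropic standard-normal prior, so off-distribution (likely
# noisy) patches contribute less.
import numpy as np

def latent_log_likelihood(z: np.ndarray) -> np.ndarray:
    """Log-density of each latent code under an isotropic standard normal."""
    d = z.shape[1]
    return -0.5 * np.sum(z ** 2, axis=1) - 0.5 * d * np.log(2.0 * np.pi)

def sample_weights(z: np.ndarray) -> np.ndarray:
    """Map log-likelihoods to per-sample weights in (0, 1]; highest likelihood -> 1."""
    ll = latent_log_likelihood(z)
    return np.exp(ll - ll.max())

# Hypothetical latent codes: five near the prior, one far away (suspect patch).
z = np.vstack([np.random.default_rng(3).normal(size=(5, 8)), 6.0 * np.ones((1, 8))])
print(np.round(sample_weights(z), 3))
```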
Collapse
|
137
|
Robust autofocusing for scanning electron microscopy based on a dual deep learning network. Sci Rep 2021; 11:20933. [PMID: 34686722 PMCID: PMC8536763 DOI: 10.1038/s41598-021-00412-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2021] [Accepted: 10/11/2021] [Indexed: 11/23/2022] Open
Abstract
Scanning electron microscopy (SEM) is a high-resolution imaging technique with subnanometer spatial resolution that is widely used in materials science, basic science, and nanofabrication. However, conducting SEM is rather complex due to the nature of using an electron beam and the many parameters that must be adjusted to acquire high-quality images. Only trained operators can use SEM equipment properly, meaning that the use of SEM is restricted. To broaden the usability of SEM, we propose an autofocus method for a SEM system based on a dual deep learning network, which consists of an autofocusing-evaluation network (AENet) and an autofocusing-control network (ACNet). The AENet was designed to evaluate the quality of given images, with scores ranging from 0 to 9 regardless of the magnification. The ACNet can delicately control the focus of SEM online based on the AENet’s outputs for any lateral sample position and magnification. The results of these dual networks showed successful autofocus performance on three trained samples. Moreover, the robustness of the proposed method was demonstrated by autofocusing on unseen samples. We expect that our autofocusing system will not only contribute to expanding the versatility of SEM but will also be applicable to various microscopes.
Collapse
|
138
|
Oliveira E Carmo L, van den Merkhof A, Olczak J, Gordon M, Jutte PC, Jaarsma RL, IJpma FFA, Doornberg JN, Prijs J. An increasing number of convolutional neural networks for fracture recognition and classification in orthopaedics: are these externally validated and ready for clinical application? Bone Jt Open 2021; 2:879-885. [PMID: 34669518 PMCID: PMC8558452 DOI: 10.1302/2633-1462.210.bjo-2021-0133] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/22/2023] Open
Abstract
Aims The number of convolutional neural networks (CNN) available for fracture detection and classification is rapidly increasing. External validation of a CNN on a temporally separate (separated by time) or geographically separate (separated by location) dataset is crucial to assess generalizability of the CNN before application to clinical practice in other institutions. We aimed to answer the following questions: are current CNNs for fracture recognition externally valid?; which methods are applied for external validation (EV)?; and, what are reported performances of the EV sets compared to the internal validation (IV) sets of these CNNs? Methods The PubMed and Embase databases were systematically searched from January 2010 to October 2020 according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. The type of EV, characteristics of the external dataset, and diagnostic performance characteristics on the IV and EV datasets were collected and compared. Quality assessment was conducted using a seven-item checklist based on a modified Methodologic Index for NOn-Randomized Studies instrument (MINORS). Results Out of 1,349 studies, 36 reported development of a CNN for fracture detection and/or classification. Of these, only four (11%) reported a form of EV. One study used temporal EV, one conducted both temporal and geographical EV, and two used geographical EV. When comparing the CNN’s performance on the IV set versus the EV set, the following were found: AUCs of 0.967 (IV) versus 0.975 (EV), 0.976 (IV) versus 0.985 to 0.992 (EV), 0.93 to 0.96 (IV) versus 0.80 to 0.89 (EV), and F1-scores of 0.856 to 0.863 (IV) versus 0.757 to 0.840 (EV). Conclusion The number of externally validated CNNs in orthopaedic trauma for fracture recognition is still scarce. This greatly limits the potential for transfer of these CNNs from the developing institute to another hospital to achieve similar diagnostic performance. We recommend the use of geographical EV and statements such as the Consolidated Standards of Reporting Trials–Artificial Intelligence (CONSORT-AI), the Standard Protocol Items: Recommendations for Interventional Trials–Artificial Intelligence (SPIRIT-AI) and the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis–Machine Learning (TRIPOD-ML) to critically appraise performance of CNNs and improve methodological rigor, quality of future models, and facilitate eventual implementation in clinical practice. Cite this article: Bone Jt Open 2021;2(10):879–885.
Collapse
Affiliation(s)
- Luisa Oliveira E Carmo
- Department of Orthopaedic Surgery, University Medical Centre, University of Groningen, Groningen, Groningen, Netherlands
| | - Anke van den Merkhof
- Department of Orthopaedic Surgery, Flinders Medical Centre, Bedford Park, Adelaide, South Australia, Australia; Flinders University, Bedford Park, Adelaide, South Australia, Australia
| | - Jakub Olczak
- Institute of Clinical Sciences, Danderyd University Hospital, Karolinska Institute, Stockholm, Sweden
| | - Max Gordon
- Institute of Clinical Sciences, Danderyd University Hospital, Karolinska Institute, Stockholm, Sweden
| | - Paul C Jutte
- Department of Orthopaedic Surgery, University Medical Centre, University of Groningen, Groningen, Groningen, Netherlands
| | - Ruurd L Jaarsma
- Department of Orthopaedic Surgery, Flinders Medical Centre, Bedford Park, Adelaide, South Australia, Australia; Flinders University, Bedford Park, Adelaide, South Australia, Australia
| | - Frank F A IJpma
- Department of Trauma Surgery, University Medical Centre Groningen, University of Groningen, Groningen, Groningen, Netherlands
| | - Job N Doornberg
- Department of Orthopaedic Surgery, University Medical Centre, University of Groningen, Groningen, Groningen, Netherlands; Department of Orthopaedic Surgery, Flinders Medical Centre, Bedford Park, Adelaide, South Australia, Australia; Flinders University, Bedford Park, Adelaide, South Australia, Australia; Department of Trauma Surgery, University Medical Centre Groningen, University of Groningen, Groningen, Groningen, Netherlands
| | - Jasper Prijs
- Department of Orthopaedic Surgery, University Medical Centre, University of Groningen, Groningen, Groningen, Netherlands; Department of Orthopaedic Surgery, Flinders Medical Centre, Bedford Park, Adelaide, South Australia, Australia; Flinders University, Bedford Park, Adelaide, South Australia, Australia; Department of Trauma Surgery, University Medical Centre Groningen, University of Groningen, Groningen, Groningen, Netherlands
| | -
- Machine Learning Consortium
| |
Collapse
|
139
|
Wang S, Li C, Wang R, Liu Z, Wang M, Tan H, Wu Y, Liu X, Sun H, Yang R, Liu X, Chen J, Zhou H, Ben Ayed I, Zheng H. Annotation-efficient deep learning for automatic medical image segmentation. Nat Commun 2021; 12:5915. [PMID: 34625565 PMCID: PMC8501087 DOI: 10.1038/s41467-021-26216-9] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2021] [Accepted: 09/22/2021] [Indexed: 01/17/2023] Open
Abstract
Automatic medical image segmentation plays a critical role in scientific research and medical care. Existing high-performance deep learning methods typically rely on large training datasets with high-quality manual annotations, which are difficult to obtain in many clinical applications. Here, we introduce Annotation-effIcient Deep lEarning (AIDE), an open-source framework to handle imperfect training datasets. Methodological analyses and empirical evaluations are conducted, and we demonstrate that AIDE surpasses conventional fully-supervised models by presenting better performance on open datasets possessing scarce or noisy annotations. We further test AIDE in a real-life case study for breast tumor segmentation. Three datasets containing 11,852 breast images from three medical centers are employed, and AIDE, utilizing 10% training annotations, consistently produces segmentation maps comparable to those generated by fully-supervised counterparts or provided by independent radiologists. The 10-fold enhanced efficiency in utilizing expert labels has the potential to promote a wide range of biomedical applications.
Collapse
Affiliation(s)
- Shanshan Wang
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China.
- Peng Cheng Laboratory, Shenzhen, Guangdong, China.
- Pazhou Laboratory, Guangzhou, Guangdong, China.
| | - Cheng Li
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China.
| | - Rongpin Wang
- Department of Medical Imaging, Guizhou Provincial People's Hospital, Guiyang, Guizhou, China
| | - Zaiyi Liu
- Department of Medical Imaging, Guangdong General Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong, China
| | - Meiyun Wang
- Department of Medical Imaging, Henan Provincial People's Hospital & the People's Hospital of Zhengzhou University, Zhengzhou, Henan, China
| | - Hongna Tan
- Department of Medical Imaging, Henan Provincial People's Hospital & the People's Hospital of Zhengzhou University, Zhengzhou, Henan, China
| | - Yaping Wu
- Department of Medical Imaging, Henan Provincial People's Hospital & the People's Hospital of Zhengzhou University, Zhengzhou, Henan, China
| | - Xinfeng Liu
- Department of Medical Imaging, Guizhou Provincial People's Hospital, Guiyang, Guizhou, China
| | - Hui Sun
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
| | - Rui Yang
- Department of Urology, Renmin Hospital of Wuhan University, Wuhan, Hubei, China
| | - Xin Liu
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
| | - Jie Chen
- Peng Cheng Laboratory, Shenzhen, Guangdong, China
- School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University, Shenzhen, Guangdong, China
| | - Huihui Zhou
- Brain Cognition and Brain Disease Institute, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
| | | | - Hairong Zheng
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China.
| |
Collapse
|
140
|
Schmarje L, Brünger J, Santarossa M, Schröder SM, Kiko R, Koch R. Fuzzy Overclustering: Semi-Supervised Classification of Fuzzy Labels with Overclustering and Inverse Cross-Entropy. SENSORS (BASEL, SWITZERLAND) 2021; 21:6661. [PMID: 34640981 PMCID: PMC8512301 DOI: 10.3390/s21196661] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/24/2021] [Revised: 10/01/2021] [Accepted: 10/02/2021] [Indexed: 11/17/2022]
Abstract
Deep learning has been successfully applied to many classification problems including underwater challenges. However, a long-standing issue with deep learning is the need for large and consistently labeled datasets. Although current approaches in semi-supervised learning can decrease the required amount of annotated data by a factor of 10 or even more, this line of research still uses distinct classes. For underwater classification, and uncurated real-world datasets in general, clean class boundaries can often not be given due to a limited information content in the images and transitional stages of the depicted objects. This leads to different experts having different opinions and thus producing fuzzy labels which could also be considered ambiguous or divergent. We propose a novel framework for handling semi-supervised classifications of such fuzzy labels. It is based on the idea of overclustering to detect substructures in these fuzzy labels. We propose a novel loss to improve the overclustering capability of our framework and show the benefit of overclustering for fuzzy labels. We show that our framework is superior to previous state-of-the-art semi-supervised methods when applied to real-world plankton data with fuzzy labels. Moreover, we acquire 5 to 10% more consistent predictions of substructures.
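A minimal sketch of the overclustering layout described above (shapes and names are assumptions; the paper's inverse cross-entropy loss and the cluster-to-class assignment are not reproduced here): a second head predicts several times more clusters than there are annotated classes, so substructures of fuzzy classes have room to separate:

```python
import torch.nn as nn

class TwoHeadClassifier(nn.Module):
    def __init__(self, backbone, feat_dim, num_classes, overcluster_factor=5):
        super().__init__()
        self.backbone = backbone                              # any feature extractor
        self.class_head = nn.Linear(feat_dim, num_classes)    # trained on the annotated classes
        self.over_head = nn.Linear(feat_dim, num_classes * overcluster_factor)  # finer clusters

    def forward(self, x):
        features = self.backbone(x)
        return self.class_head(features), self.over_head(features)  # coarse logits, fine logits
```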
Collapse
Affiliation(s)
- Lars Schmarje
- Multimedia Information Processing Group, Kiel University, 24118 Kiel, Germany; (J.B.); (M.S.); (S.-M.S.); (R.K.)
| | - Johannes Brünger
- Multimedia Information Processing Group, Kiel University, 24118 Kiel, Germany; (J.B.); (M.S.); (S.-M.S.); (R.K.)
| | - Monty Santarossa
- Multimedia Information Processing Group, Kiel University, 24118 Kiel, Germany; (J.B.); (M.S.); (S.-M.S.); (R.K.)
| | - Simon-Martin Schröder
- Multimedia Information Processing Group, Kiel University, 24118 Kiel, Germany; (J.B.); (M.S.); (S.-M.S.); (R.K.)
| | - Rainer Kiko
- Laboratoire d’Océanographie de Villefranche, Sorbonne Université, 06230 Villefranche-sur-Mer, France;
| | - Reinhard Koch
- Multimedia Information Processing Group, Kiel University, 24118 Kiel, Germany; (J.B.); (M.S.); (S.-M.S.); (R.K.)
| |
Collapse
|
141
|
Lu M, Zhao Q, Poston KL, Sullivan EV, Pfefferbaum A, Shahid M, Katz M, Montaser-Kouhsari L, Schulman K, Milstein A, Niebles JC, Henderson VW, Fei-Fei L, Pohl KM, Adeli E. Quantifying Parkinson's disease motor severity under uncertainty using MDS-UPDRS videos. Med Image Anal 2021; 73:102179. [PMID: 34340101 PMCID: PMC8453121 DOI: 10.1016/j.media.2021.102179] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2021] [Revised: 06/28/2021] [Accepted: 07/13/2021] [Indexed: 11/15/2022]
Abstract
Parkinson's disease (PD) is a brain disorder that primarily affects motor function, leading to slow movement, tremor, and stiffness, as well as postural instability and difficulty with walking/balance. The severity of PD motor impairments is clinically assessed by part III of the Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS), a universally-accepted rating scale. However, experts often disagree on the exact scoring of individuals. In the presence of label noise, training a machine learning model using only scores from a single rater may introduce bias, while training models with multiple noisy ratings is a challenging task due to the inter-rater variabilities. In this paper, we introduce an ordinal focal neural network to estimate the MDS-UPDRS scores from input videos, to leverage the ordinal nature of MDS-UPDRS scores and combat class imbalance. To handle multiple noisy labels per exam, the training of the network is regularized via rater confusion estimation (RCE), which encodes the rating habits and skills of raters via a confusion matrix. We apply our pipeline to estimate MDS-UPDRS test scores from their video recordings including gait (with multiple Raters, R=3) and finger tapping scores (single rater). On a sizable clinical dataset for the gait test (N=55), we obtained a classification accuracy of 72% with majority vote as ground-truth, and an accuracy of ∼84% of our model predicting at least one of the raters' scores. Our work demonstrates how computer-assisted technologies can be used to track patients and their motor impairments, even when there is uncertainty in the clinical ratings. The latest version of the code will be available at https://github.com/mlu355/PD-Motor-Severity-Estimation.
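The rater-confusion idea can be sketched as follows (details assumed, not the authors' exact RCE module): each rater gets a learnable confusion matrix that maps the model's predicted score distribution to that rater's observed label distribution, so individual rating habits are absorbed by the matrices rather than by the video model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RaterConfusion(nn.Module):
    def __init__(self, num_raters, num_scores):
        super().__init__()
        # initialise near the identity: every rater is initially assumed to be reliable
        self.logits = nn.Parameter(torch.eye(num_scores).repeat(num_raters, 1, 1) * 4.0)

    def forward(self, pred_probs, rater_idx):
        cm = F.softmax(self.logits[rater_idx], dim=-1)             # (B, S, S), rows sum to 1
        return torch.bmm(pred_probs.unsqueeze(1), cm).squeeze(1)   # rater-specific label probabilities

def multi_rater_loss(pred_probs, labels_per_rater, confusion):
    # labels_per_rater: {rater_id: (exam_indices, observed_scores)} for exams rated by that rater
    loss = 0.0
    for rater_id, (idx, labels) in labels_per_rater.items():
        rater_idx = torch.full((len(idx),), rater_id, dtype=torch.long)
        noisy_probs = confusion(pred_probs[idx], rater_idx)
        loss = loss + F.nll_loss(torch.log(noisy_probs + 1e-8), labels)
    return loss
```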
Collapse
Affiliation(s)
- Mandy Lu
- Department of Computer Science, Stanford University, Stanford CA 94305, USA
| | - Qingyu Zhao
- Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford CA 94305, USA
| | - Kathleen L Poston
- Department of Neurology & Neurological Sciences, Stanford University, Stanford CA 94305, USA
| | - Edith V Sullivan
- Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford CA 94305, USA
| | - Adolf Pfefferbaum
- Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford CA 94305, USA; Center for Health Sciences, SRI International, Menlo Park CA 94025, USA
| | - Marian Shahid
- Department of Neurology & Neurological Sciences, Stanford University, Stanford CA 94305, USA
| | - Maya Katz
- Department of Neurology & Neurological Sciences, Stanford University, Stanford CA 94305, USA
| | - Leila Montaser-Kouhsari
- Department of Neurology & Neurological Sciences, Stanford University, Stanford CA 94305, USA
| | - Kevin Schulman
- Department of Medicine, Stanford University, Stanford CA 94305, USA
| | - Arnold Milstein
- Department of Medicine, Stanford University, Stanford CA 94305, USA
| | | | - Victor W Henderson
- Department of Epidemiology & Population Health, Stanford University, Stanford CA 94305, USA; Department of Neurology & Neurological Sciences, Stanford University, Stanford CA 94305, USA
| | - Li Fei-Fei
- Department of Computer Science, Stanford University, Stanford CA 94305, USA
| | - Kilian M Pohl
- Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford CA 94305, USA; Center for Health Sciences, SRI International, Menlo Park CA 94025, USA
| | - Ehsan Adeli
- Department of Computer Science, Stanford University, Stanford CA 94305, USA; Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford CA 94305, USA.
| |
Collapse
|
142
|
Wang L, Guo D, Wang G, Zhang S. Annotation-Efficient Learning for Medical Image Segmentation Based on Noisy Pseudo Labels and Adversarial Learning. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:2795-2807. [PMID: 33370237 DOI: 10.1109/tmi.2020.3047807] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Although deep learning has achieved state-of-the-art performance for medical image segmentation, its success relies on a large set of manually annotated images for training that are expensive to acquire. In this paper, we propose an annotation-efficient learning framework for segmentation tasks that avoids annotations of training images, where we use an improved Cycle-Consistent Generative Adversarial Network (GAN) to learn from a set of unpaired medical images and auxiliary masks obtained either from a shape model or public datasets. We first use the GAN to generate pseudo labels for our training images under the implicit high-level shape constraint represented by a Variational Auto-encoder (VAE)-based discriminator with the help of the auxiliary masks, and build a Discriminator-guided Generator Channel Calibration (DGCC) module which employs our discriminator's feedback to calibrate the generator for better pseudo labels. To learn from the pseudo labels that are noisy, we further introduce a noise-robust iterative learning method using a noise-weighted Dice loss. We validated our framework in two situations: objects with a simple shape model, such as the optic disc in fundus images and the fetal head in ultrasound images, and complex structures, such as the lung in X-ray images and the liver in CT images. Experimental results demonstrated that 1) our VAE-based discriminator and DGCC module help to obtain high-quality pseudo labels; 2) our proposed noise-robust learning method can effectively overcome the effect of noisy pseudo labels; and 3) the segmentation performance of our method without using annotations of training images is close to, or even comparable with, that of learning from human annotations.
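A noise-weighted Dice term of the kind mentioned above might look like the following sketch (the weighting rule shown, based on agreement between the current prediction and the pseudo label, is an assumption; the published loss may differ):

```python
import torch

def noise_weighted_dice_loss(probs, pseudo_labels, gamma=2.0, eps=1e-6):
    # probs, pseudo_labels: (B, 1, H, W) tensors with values in [0, 1]
    agreement = 1.0 - (probs - pseudo_labels).abs()     # high where prediction and pseudo label agree
    weights = agreement.detach() ** gamma               # sharpen: trust agreeing pixels more
    intersection = (weights * probs * pseudo_labels).sum(dim=(1, 2, 3))
    denominator = (weights * (probs + pseudo_labels)).sum(dim=(1, 2, 3))
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice.mean()
```

Pixels whose pseudo label conflicts strongly with the network's own prediction are down-weighted, so obvious pseudo-label noise contributes less to the Dice term.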
Collapse
|
143
|
Towards an improved label noise proportion estimation in small data: a Bayesian approach. INT J MACH LEARN CYB 2021. [DOI: 10.1007/s13042-021-01423-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
144
|
Liu L, Zhang Z, Li S, Ma K, Zheng Y. S-CUDA: Self-cleansing unsupervised domain adaptation for medical image segmentation. Med Image Anal 2021; 74:102214. [PMID: 34464837 DOI: 10.1016/j.media.2021.102214] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2020] [Revised: 08/09/2021] [Accepted: 08/10/2021] [Indexed: 01/08/2023]
Abstract
Medical image segmentation has hitherto achieved excellent progress with large-scale datasets, which empower us to train potent deep convolutional neural networks (DCNNs). However, labeling such large-scale datasets is laborious and error-prone, which makes noisy (or incorrect) labels a ubiquitous problem in real-world scenarios. In addition, data collected from different sites usually exhibit significant data distribution shift (or domain shift). As a result, noisy labels and domain shift become two common problems in medical imaging application scenarios, especially in medical image segmentation, and degrade the performance of deep learning models significantly. In this paper, we identify a novel problem hidden in medical image segmentation, namely unsupervised domain adaptation on noisily labeled data, and propose a novel algorithm named "Self-Cleansing Unsupervised Domain Adaptation" (S-CUDA) to address this issue. S-CUDA sets up a realistic scenario that solves the above problems simultaneously, where the training data (i.e., source domain) not only show domain shift w.r.t. the unsupervised test data (i.e., target domain) but also contain noisy labels. The key idea of S-CUDA is to learn noise-excluding and domain-invariant knowledge from noisy supervised data, which is applied to the highly corrupted data for label cleansing and further data recycling, as well as to the test data with domain shift for supervised propagation. To this end, we propose a novel framework leveraging noisy-label learning and domain adaptation techniques to cleanse the noisy labels and learn from trustable clean samples, thus enabling robust adaptation and prediction on the target domain. Specifically, we train two peer adversarial networks to identify high-confidence clean data and exchange them with each other, eliminating the error accumulation problem and narrowing the domain gap simultaneously. In the meantime, the high-confidence noisy data are detected and cleansed in order to reuse the contaminated training data. Therefore, our proposed method can not only cleanse the noisy labels in the training set but also take full advantage of the existing noisy data to update the parameters of the network. For evaluation, we conduct experiments on two popular datasets (REFUGE and Drishti-GS) for optic disc (OD) and optic cup (OC) segmentation, and on another public multi-vendor dataset for spinal cord gray matter (SCGM) segmentation. Experimental results show that our proposed method can cleanse noisy labels efficiently and at the same time obtain a model with better generalization performance, outperforming previous state-of-the-art methods by a large margin. Our code can be found at https://github.com/zzdxjtu/S-cuda.
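The peer-network exchange described above resembles co-teaching; a simplified sketch (the adversarial and domain-adaptation components are omitted, and the names and selection ratio are assumptions) is:

```python
import torch
import torch.nn.functional as F

def peer_exchange_step(net_a, net_b, images, noisy_masks, keep_ratio=0.8):
    # per-sample losses of each peer on the (possibly noisy) masks
    loss_a = F.binary_cross_entropy_with_logits(
        net_a(images), noisy_masks, reduction="none").mean(dim=(1, 2, 3))
    loss_b = F.binary_cross_entropy_with_logits(
        net_b(images), noisy_masks, reduction="none").mean(dim=(1, 2, 3))
    k = max(1, int(keep_ratio * len(images)))
    clean_for_b = torch.topk(-loss_a, k).indices   # A's lowest-loss (most likely clean) samples
    clean_for_a = torch.topk(-loss_b, k).indices   # B's lowest-loss samples
    # each peer is updated only on the samples its companion judged to be clean
    update_a = F.binary_cross_entropy_with_logits(net_a(images[clean_for_a]), noisy_masks[clean_for_a])
    update_b = F.binary_cross_entropy_with_logits(net_b(images[clean_for_b]), noisy_masks[clean_for_b])
    return update_a, update_b
```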
Collapse
Affiliation(s)
- Luyan Liu
- Tencent Jarvis Lab, Shenzhen 518040, China; Tencent Healthcare (Shenzhen) Co., LTD, China.
| | - Zhengdong Zhang
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing 100191, China
| | - Shuai Li
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing 100191, China
| | - Kai Ma
- Tencent Jarvis Lab, Shenzhen 518040, China; Tencent Healthcare (Shenzhen) Co., LTD, China
| | - Yefeng Zheng
- Tencent Jarvis Lab, Shenzhen 518040, China; Tencent Healthcare (Shenzhen) Co., LTD, China
| |
Collapse
|
145
|
Wang Y, Bi Z, Xie Y, Wu T, Zeng X, Chen S, Zhou D. Learning from Highly Confident Samples for Automatic Knee Osteoarthritis Severity Assessment: Data from the Osteoarthritis Initiative. IEEE J Biomed Health Inform 2021; 26:1239-1250. [PMID: 34347615 DOI: 10.1109/jbhi.2021.3102090] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Knee osteoarthritis (OA) is a chronic disease that considerably reduces patients' quality of life. Preventive therapies require early detection and lifelong monitoring of OA progression. In the clinical environment, the severity of OA is classified by the Kellgren and Lawrence (KL) grading system, ranging from KL-0 to KL-4. Recently, deep learning methods have been applied to OA severity assessment to improve accuracy and efficiency. Researchers fine-tuned convolutional neural networks (CNNs) on OA datasets and built end-to-end approaches. However, this task is still challenging due to the ambiguity between adjacent grades, especially in early-stage OA. Low-confidence samples, which are less representative than the typical ones, undermine the training process. Targeting the uncertainty in the OA dataset, we propose a novel learning scheme that dynamically separates the data into two sets according to their reliability. In addition, we design a hybrid loss function to help the CNN learn from the two sets accordingly. With the proposed approach, we emphasize the typical samples and control the impact of low-confidence cases. Experiments are conducted with five-fold cross-validation. Our method achieves a mean accuracy of 70.13% on the five-class OA assessment task, which outperforms all other state-of-the-art methods. Although early-stage OA detection still benefits from human intervention in lesion region selection, our approach achieves superior performance on the KL-0 vs. KL-2 task. Moreover, we design an experiment to validate large-scale automatic data refining during training. The result verifies our approach's ability to characterize low-confidence samples. The dataset used in this paper was obtained from the Osteoarthritis Initiative.
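A possible form of such a reliability split and hybrid loss is sketched below (the confidence threshold, the label-smoothed branch, and the weighting are assumptions, not the authors' exact formulation):

```python
import torch
import torch.nn.functional as F

def hybrid_loss(logits, labels, conf_threshold=0.7, soft_weight=0.3, smoothing=0.2):
    probs = F.softmax(logits, dim=1)
    conf = probs.gather(1, labels.unsqueeze(1)).squeeze(1)   # confidence in the assigned KL grade
    reliable = conf.detach() >= conf_threshold               # dynamic split into two sets
    loss = torch.zeros((), device=logits.device)
    if reliable.any():                                       # typical samples: plain cross-entropy
        loss = loss + F.cross_entropy(logits[reliable], labels[reliable])
    if (~reliable).any():                                    # ambiguous samples: down-weighted, smoothed
        loss = loss + soft_weight * F.cross_entropy(
            logits[~reliable], labels[~reliable], label_smoothing=smoothing)
    return loss
```

The split is recomputed every batch, so samples move between the reliable and ambiguous sets as the network's confidence evolves during training.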
Collapse
|
146
|
Zadeh Shirazi A, McDonnell MD, Fornaciari E, Bagherian NS, Scheer KG, Samuel MS, Yaghoobi M, Ormsby RJ, Poonnoose S, Tumes DJ, Gomez GA. A deep convolutional neural network for segmentation of whole-slide pathology images identifies novel tumour cell-perivascular niche interactions that are associated with poor survival in glioblastoma. Br J Cancer 2021; 125:337-350. [PMID: 33927352 PMCID: PMC8329064 DOI: 10.1038/s41416-021-01394-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2020] [Revised: 03/16/2021] [Accepted: 04/08/2021] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND Glioblastoma is the most aggressive type of brain cancer with high-levels of intra- and inter-tumour heterogeneity that contribute to its rapid growth and invasion within the brain. However, a spatial characterisation of gene signatures and the cell types expressing these in different tumour locations is still lacking. METHODS We have used a deep convolutional neural network (DCNN) as a semantic segmentation model to segment seven different tumour regions including leading edge (LE), infiltrating tumour (IT), cellular tumour (CT), cellular tumour microvascular proliferation (CTmvp), cellular tumour pseudopalisading region around necrosis (CTpan), cellular tumour perinecrotic zones (CTpnz) and cellular tumour necrosis (CTne) in digitised glioblastoma histopathological slides from The Cancer Genome Atlas (TCGA). Correlation analysis between segmentation results from tumour images together with matched RNA expression data was performed to identify genetic signatures that are specific to different tumour regions. RESULTS We found that spatially resolved gene signatures were strongly correlated with survival in patients with defined genetic mutations. Further in silico cell ontology analysis along with single-cell RNA sequencing data from resected glioblastoma tissue samples showed that these tumour regions had different gene signatures, whose expression was driven by different cell types in the regional tumour microenvironment. Our results further pointed to a key role for interactions between microglia/pericytes/monocytes and tumour cells that occur in the IT and CTmvp regions, which may contribute to poor patient survival. CONCLUSIONS This work identified key histopathological features that correlate with patient survival and detected spatially associated genetic signatures that contribute to tumour-stroma interactions and which should be investigated as new targets in glioblastoma. The source codes and datasets used are available in GitHub: https://github.com/amin20/GBM_WSSM .
Collapse
Affiliation(s)
- Amin Zadeh Shirazi
- Centre for Cancer Biology, SA Pathology and University of South Australia, Adelaide, SA, Australia
- Computational Learning Systems Laboratory, UniSA STEM, University of South Australia, Mawson Lakes, SA, Australia
| | - Mark D McDonnell
- Computational Learning Systems Laboratory, UniSA STEM, University of South Australia, Mawson Lakes, SA, Australia
| | - Eric Fornaciari
- Department of Mathematics of Computation, University of California, Los Angeles (UCLA), CA, USA
| | | | - Kaitlin G Scheer
- Centre for Cancer Biology, SA Pathology and University of South Australia, Adelaide, SA, Australia
| | - Michael S Samuel
- Centre for Cancer Biology, SA Pathology and University of South Australia, Adelaide, SA, Australia
- Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
| | - Mahdi Yaghoobi
- Electrical and Computer Engineering Department, Department of Artificial Intelligence, Islamic Azad University, Mashhad Branch, Mashhad, Iran
| | - Rebecca J Ormsby
- Flinders Health and Medical Research Institute, College of Medicine & Public Health, Flinders University, Adelaide, SA, Australia
| | - Santosh Poonnoose
- Flinders Health and Medical Research Institute, College of Medicine & Public Health, Flinders University, Adelaide, SA, Australia
- Department of Neurosurgery, Flinders Medical Centre, Bedford Park, SA, Australia
| | - Damon J Tumes
- Centre for Cancer Biology, SA Pathology and University of South Australia, Adelaide, SA, Australia
| | - Guillermo A Gomez
- Centre for Cancer Biology, SA Pathology and University of South Australia, Adelaide, SA, Australia.
| |
Collapse
|
147
|
Hsu W, Baumgartner C, Deserno TM. Notable Papers and New Directions in Sensors, Signals, and Imaging Informatics. Yearb Med Inform 2021; 30:150-158. [PMID: 34479386 PMCID: PMC8416210 DOI: 10.1055/s-0041-1726526] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022] Open
Abstract
OBJECTIVE To identify and highlight research papers representing noteworthy developments in signals, sensors, and imaging informatics in 2020. METHOD A broad literature search was conducted on PubMed and Scopus databases. We combined Medical Subject Heading (MeSH) terms and keywords to construct particular queries for sensors, signals, and image informatics. We only considered papers that have been published in journals providing at least three articles in the query response. Section editors then independently reviewed the titles and abstracts of preselected papers assessed on a three-point Likert scale. Papers were rated from 1 (do not include) to 3 (should be included) for each topical area (sensors, signals, and imaging informatics) and those with an average score of 2 or above were subsequently read and assessed again by two of the three co-editors. Finally, the top 14 papers with the highest combined scores were considered based on consensus. RESULTS The search for papers was executed in January 2021. After removing duplicates and conference proceedings, the query returned a set of 101, 193, and 529 papers for sensors, signals, and imaging informatics, respectively. We filtered out journals that had less than three papers in the query results, reducing the number of papers to 41, 117, and 333, respectively. From these, the co-editors identified 22 candidate papers with more than 2 Likert points on average, from which 14 candidate best papers were nominated after intensive discussion. At least five external reviewers then rated the remaining papers. The four finalist papers were found using the composite rating of all external reviewers. These best papers were approved by consensus of the International Medical Informatics Association (IMIA) Yearbook editorial board. CONCLUSIONS Sensors, signals, and imaging informatics is a dynamic field of intense research. The four best papers represent advanced approaches for combining, processing, modeling, and analyzing heterogeneous sensor and imaging data. The selected papers demonstrate the combination and fusion of multiple sensors and sensor networks using electrocardiogram (ECG), electroencephalogram (EEG), or photoplethysmogram (PPG) with advanced data processing, deep and machine learning techniques, and present image processing modalities beyond state-of-the-art that significantly support and further improve medical decision making.
Collapse
Affiliation(s)
- William Hsu
- Medical & Imaging Informatics, Department of Radiological Sciences, David Geffen School of Medicine at UCLA, United States of America
| | - Christian Baumgartner
- Institute of Health Care Engineering with European Testing Center of Medical Devices, Graz University of Technology, Austria
| | - Thomas M. Deserno
- Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Braunschweig, Germany
| | | |
Collapse
|
148
|
Guo S, Xu L, Feng C, Xiong H, Gao Z, Zhang H. Multi-level semantic adaptation for few-shot segmentation on cardiac image sequences. Med Image Anal 2021; 73:102170. [PMID: 34380105 DOI: 10.1016/j.media.2021.102170] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2021] [Revised: 06/04/2021] [Accepted: 07/12/2021] [Indexed: 01/01/2023]
Abstract
Obtaining manual labels is time-consuming and labor-intensive on cardiac image sequences. Few-shot segmentation can utilize limited labels to learn new tasks. However, it suffers from two challenges: spatial-temporal distribution bias and long-term information bias. These challenges derive from the impact of the time dimension on cardiac image sequences, resulting in serious over-adaptation. In this paper, we propose the multi-level semantic adaptation (MSA) for few-shot segmentation on cardiac image sequences. The MSA addresses the two biases by exploring the domain adaptation and the weight adaptation on the semantic features in multiple levels, including sequence-level, frame-level, and pixel-level. First, the MSA proposes the dual-level feature adjustment for domain adaptation in spatial and temporal directions. This adjustment explicitly aligns the frame-level feature and the sequence-level feature to improve the model adaptation on diverse modalities. Second, the MSA explores the hierarchical attention metric for weight adaptation in the frame-level feature and the pixel-level feature. This metric focuses on the similar frame and the target region to promote the model discrimination on the border features. The extensive experiments demonstrate that our MSA is effective in few-shot segmentation on cardiac image sequences with three modalities, i.e. MR, CT, and Echo (e.g. the average Dice is 0.9243), as well as superior to the ten state-of-the-art methods.
Collapse
Affiliation(s)
- Saidi Guo
- School of Biomedical Engineering, Sun Yat-sen University, China
| | - Lin Xu
- General Hospital of the Southern Theatre Command, PLA, Guangdong, China; The First School of Clinical Medicine, Southern Medical University, Guangdong, China
| | - Cheng Feng
- Department of Ultrasound, The Third People's Hospital of Shenzhen, Guangdong, China
| | - Huahua Xiong
- Department of Ultrasound, The First Affiliated Hospital of Shenzhen University, Shenzhen Second People's Hospital, Guangdong, China
| | - Zhifan Gao
- School of Biomedical Engineering, Sun Yat-sen University, China.
| | - Heye Zhang
- School of Biomedical Engineering, Sun Yat-sen University, China.
| |
Collapse
|
149
|
Wood DA, Kafiabadi S, Al Busaidi A, Guilhem EL, Lynch J, Townend MK, Montvila A, Kiik M, Siddiqui J, Gadapa N, Benger MD, Mazumder A, Barker G, Ourselin S, Cole JH, Booth TC. Deep learning to automate the labelling of head MRI datasets for computer vision applications. Eur Radiol 2021; 32:725-736. [PMID: 34286375 PMCID: PMC8660736 DOI: 10.1007/s00330-021-08132-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2021] [Revised: 06/02/2021] [Accepted: 06/14/2021] [Indexed: 02/07/2023]
Abstract
Objectives The purpose of this study was to build a deep learning model to derive labels from neuroradiology reports and assign these to the corresponding examinations, overcoming a bottleneck to computer vision model development. Methods Reference-standard labels were generated by a team of neuroradiologists for model training and evaluation. Three thousand examinations were labelled for the presence or absence of any abnormality by manually scrutinising the corresponding radiology reports (‘reference-standard report labels’); a subset of these examinations (n = 250) were assigned ‘reference-standard image labels’ by interrogating the actual images. Separately, 2000 reports were labelled for the presence or absence of 7 specialised categories of abnormality (acute stroke, mass, atrophy, vascular abnormality, small vessel disease, white matter inflammation, encephalomalacia), with a subset of these examinations (n = 700) also assigned reference-standard image labels. A deep learning model was trained using labelled reports and validated in two ways: comparing predicted labels to (i) reference-standard report labels and (ii) reference-standard image labels. The area under the receiver operating characteristic curve (AUC-ROC) was used to quantify model performance. Accuracy, sensitivity, specificity, and F1 score were also calculated. Results Accurate classification (AUC-ROC > 0.95) was achieved for all categories when tested against reference-standard report labels. A drop in performance (ΔAUC-ROC > 0.02) was seen for three categories (atrophy, encephalomalacia, vascular) when tested against reference-standard image labels, highlighting discrepancies in the original reports. Once trained, the model assigned labels to 121,556 examinations in under 30 min. Conclusions Our model accurately classifies head MRI examinations, enabling automated dataset labelling for downstream computer vision applications. Key Points • Deep learning is poised to revolutionise image recognition tasks in radiology; however, a barrier to clinical adoption is the difficulty of obtaining large labelled datasets for model training. • We demonstrate a deep learning model which can derive labels from neuroradiology reports and assign these to the corresponding examinations at scale, facilitating the development of downstream computer vision models. • We rigorously tested our model by comparing labels predicted on the basis of neuroradiology reports with two sets of reference-standard labels: (1) labels derived by manually scrutinising each radiology report and (2) labels derived by interrogating the actual images. Supplementary Information The online version contains supplementary material available at 10.1007/s00330-021-08132-0.
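The overall labelling pipeline, independent of the deep report classifier trained in the study, can be illustrated with a simple stand-in text classifier (a hedged sketch only; the function names and the TF-IDF/logistic-regression choice are illustrative, not the authors' model):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def label_examinations(labelled_reports, labels, unlabelled_reports, exam_ids):
    # fit on reports with reference-standard labels (e.g. abnormal vs normal)
    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=2),
                        LogisticRegression(max_iter=1000))
    clf.fit(labelled_reports, labels)
    # propagate predicted labels to the rest of the archive for downstream vision training
    predicted = clf.predict(unlabelled_reports)
    return dict(zip(exam_ids, predicted))   # exam-level labels keyed by examination ID
```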
Collapse
Affiliation(s)
- David A Wood
- School of Biomedical Engineering & Imaging Sciences, Kings College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK
| | - Sina Kafiabadi
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
| | - Aisha Al Busaidi
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
| | - Emily L Guilhem
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
| | - Jeremy Lynch
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
| | | | - Antanas Montvila
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK; Hospital of Lithuanian University of Health Sciences, Kaunas Clinics, Kaunas, Lithuania
| | - Martin Kiik
- School of Biomedical Engineering & Imaging Sciences, Kings College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK
| | - Juveria Siddiqui
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
| | - Naveen Gadapa
- Department of Neurology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
| | - Matthew D Benger
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
| | - Asif Mazumder
- Guy's and St Thomas' NHS Foundation Trust, Westminster Bridge Road, London, SE1 7EH, UK
| | - Gareth Barker
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, SE5 8AF, UK
| | - Sebastian Ourselin
- School of Biomedical Engineering & Imaging Sciences, Kings College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK
| | - James H Cole
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, SE5 8AF, UK; Centre for Medical Image Computing, Department of Computer Science, University College London, London, WC1V 6LJ, UK; Dementia Research Centre, University College London, London, WC1N 3BG, UK
| | - Thomas C Booth
- School of Biomedical Engineering & Imaging Sciences, Kings College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK; Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK.
| |
Collapse
|
150
|
Liu K, Shen Y, Wu N, Chłędowski J, Fernandez-Granda C, Geras KJ. Weakly-supervised High-resolution Segmentation of Mammography Images for Breast Cancer Diagnosis. PROCEEDINGS OF MACHINE LEARNING RESEARCH 2021; 143:268-285. [PMID: 35088055 PMCID: PMC8791642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
In the last few years, deep learning classifiers have shown promising results in image-based medical diagnosis. However, interpreting the outputs of these models remains a challenge. In cancer diagnosis, interpretability can be achieved by localizing the region of the input image responsible for the output, i.e. the location of a lesion. Alternatively, segmentation or detection models can be trained with pixel-wise annotations indicating the locations of malignant lesions. Unfortunately, acquiring such labels is labor-intensive and requires medical expertise. To overcome this difficulty, weakly-supervised localization can be utilized. These methods allow neural network classifiers to output saliency maps highlighting the regions of the input most relevant to the classification task (e.g. malignant lesions in mammograms) using only image-level labels (e.g. whether the patient has cancer or not) during training. When applied to high-resolution images, existing methods produce low-resolution saliency maps. This is problematic in applications in which suspicious lesions are small in relation to the image size. In this work, we introduce a novel neural network architecture to perform weakly-supervised segmentation of high-resolution images. The proposed model selects regions of interest via coarse-level localization, and then performs fine-grained segmentation of those regions. We apply this model to breast cancer diagnosis with screening mammography, and validate it on a large clinically-realistic dataset. Measured by Dice similarity score, our approach outperforms existing methods by a large margin in terms of localization performance of benign and malignant lesions, relatively improving the performance by 39.6% and 20.0%, respectively. Code and the weights of some of the models are available at https://github.com/nyukat/GLAM.
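The coarse-to-fine selection step can be sketched as follows (patch size, pooling rule, and network interfaces are assumptions; see the GLAM repository linked above for the actual model): a low-resolution saliency map ranks fixed-size patches of the full mammogram, and only the top-scoring patches are passed to the fine-grained segmentation stage, keeping memory use tractable at high resolution:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_patches(image, coarse_net, patch=512, top_k=4):
    # image: (1, 1, H, W) full-resolution mammogram; H and W assumed divisible by `patch`
    saliency = coarse_net(F.interpolate(image, scale_factor=0.25))    # low-resolution saliency map
    saliency = F.interpolate(saliency, size=image.shape[-2:])         # upsample back to full size
    pooled = F.avg_pool2d(saliency, kernel_size=patch, stride=patch)  # mean saliency per patch
    flat = pooled.flatten()
    cols = pooled.shape[-1]
    patches = []
    for idx in torch.topk(flat, k=min(top_k, flat.numel())).indices:
        r, c = divmod(idx.item(), cols)
        patches.append(image[..., r * patch:(r + 1) * patch, c * patch:(c + 1) * patch])
    return patches   # passed to a fine-grained segmentation network in the second stage
```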
Collapse
Affiliation(s)
| | | | - Nan Wu
- NYU Center for Data Science
| | | | | | | |
Collapse
|