1
Dominguez-Morales JP, Duran-Lopez L, Marini N, Vicente-Diaz S, Linares-Barranco A, Atzori M, Müller H. A systematic comparison of deep learning methods for Gleason grading and scoring. Med Image Anal 2024; 95:103191. [PMID: 38728903] [DOI: 10.1016/j.media.2024.103191]
Abstract
Prostate cancer is the second most frequent cancer in men worldwide after lung cancer. Its diagnosis is based on the Gleason score, which evaluates the abnormality of cells in glands through the analysis of the different Gleason patterns within tissue samples. Recent advances in computational pathology, a domain aiming to develop algorithms that automatically analyze digitized histopathology images, have led to a wide variety of datasets and algorithms for Gleason grading and scoring. However, there is no clear consensus on which methods are best suited for each problem in relation to the characteristics of the data and labels. This paper provides a systematic comparison, on nine datasets, of state-of-the-art training approaches for deep neural networks (including fully-supervised learning, weakly-supervised learning, semi-supervised learning, Additive-MIL, Attention-Based MIL, Dual-Stream MIL, TransMIL and CLAM) applied to Gleason grading and scoring tasks. The nine datasets are collected from pathology institutes and openly accessible repositories. The results show that the best methods for the Gleason grading and Gleason scoring tasks are fully-supervised learning and CLAM, respectively, guiding researchers toward the best practice to adopt depending on the task to solve and the labels that are available.
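Several of the weakly-supervised methods compared here (Attention-Based MIL, Additive-MIL, CLAM) share one core operator: an attention pooling step that aggregates patch embeddings into a single slide-level representation. As a rough orientation only, here is a minimal NumPy sketch of plain attention-based MIL pooling; the parameter names `V` and `w` and all dimensions are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_mil_pool(H, V, w):
    """Attention-based MIL pooling: each patch embedding h_k gets a
    learned weight a_k, and the slide-level embedding is the weighted sum."""
    scores = np.tanh(H @ V) @ w          # one scalar score per patch
    a = np.exp(scores - scores.max())
    a = a / a.sum()                      # softmax attention weights
    return a @ H, a                      # slide embedding, patch weights

# toy "slide": 12 patch embeddings of dimension 8
H = rng.normal(size=(12, 8))
V = rng.normal(size=(8, 4))              # illustrative learned parameters
w = rng.normal(size=4)
z, a = attention_mil_pool(H, V, w)
```

In the trained methods, `V` and `w` are learned end-to-end and the weights `a` double as an interpretability map over patches.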
Affiliation(s)
- Juan P Dominguez-Morales
- Robotics and Technology of Computers Lab., ETSII-EPS, Universidad de Sevilla, Sevilla 41012, Spain; SCORE Lab, I3US, Universidad de Sevilla, Spain
- Lourdes Duran-Lopez
- Robotics and Technology of Computers Lab., ETSII-EPS, Universidad de Sevilla, Sevilla 41012, Spain; SCORE Lab, I3US, Universidad de Sevilla, Spain
- Niccolò Marini
- Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO Valais), Technopôle 3, Sierre 3960, Switzerland; Centre Universitaire d'Informatique, University of Geneva, Carouge 1227, Switzerland
- Saturnino Vicente-Diaz
- Robotics and Technology of Computers Lab., ETSII-EPS, Universidad de Sevilla, Sevilla 41012, Spain; SCORE Lab, I3US, Universidad de Sevilla, Spain
- Alejandro Linares-Barranco
- Robotics and Technology of Computers Lab., ETSII-EPS, Universidad de Sevilla, Sevilla 41012, Spain; SCORE Lab, I3US, Universidad de Sevilla, Spain
- Manfredo Atzori
- Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO Valais), Technopôle 3, Sierre 3960, Switzerland; Department of Neuroscience, University of Padua, Via Giustiniani 2, Padua 35128, Italy
- Henning Müller
- Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO Valais), Technopôle 3, Sierre 3960, Switzerland; Medical Faculty, University of Geneva, Geneva 1211, Switzerland
2
Claudio Quiros A, Coudray N, Yeaton A, Yang X, Liu B, Le H, Chiriboga L, Karimkhan A, Narula N, Moore DA, Park CY, Pass H, Moreira AL, Le Quesne J, Tsirigos A, Yuan K. Mapping the landscape of histomorphological cancer phenotypes using self-supervised learning on unannotated pathology slides. Nat Commun 2024; 15:4596. [PMID: 38862472] [DOI: 10.1038/s41467-024-48666-7]
Abstract
Cancer diagnosis and management depend upon the extraction of complex information from microscopy images by pathologists, which requires time-consuming expert interpretation prone to human bias. Supervised deep learning approaches have proven powerful, but are inherently limited by the cost and quality of annotations used for training. Therefore, we present Histomorphological Phenotype Learning, a self-supervised methodology requiring no labels and operating via the automatic discovery of discriminatory features in image tiles. Tiles are grouped into morphologically similar clusters which constitute an atlas of histomorphological phenotypes (HP-Atlas), revealing trajectories from benign to malignant tissue via inflammatory and reactive phenotypes. These clusters have distinct features which can be identified using orthogonal methods, linking histologic, molecular and clinical phenotypes. Applied to lung cancer, we show that they align closely with patient survival, with histopathologically recognised tumor types and growth patterns, and with transcriptomic measures of immunophenotype. These properties are maintained in a multi-cancer study.
Affiliation(s)
- Adalberto Claudio Quiros
- School of Computing Science, University of Glasgow, Glasgow, Scotland, UK
- School of Cancer Sciences, University of Glasgow, Glasgow, Scotland, UK
- Nicolas Coudray
- Applied Bioinformatics Laboratories, NYU Grossman School of Medicine, New York, NY, USA
- Department of Cell Biology, NYU Grossman School of Medicine, New York, NY, USA
- Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, NY, USA
- Anna Yeaton
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Xinyu Yang
- School of Computing Science, University of Glasgow, Glasgow, Scotland, UK
- Bojing Liu
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Solna, Sweden
- Hortense Le
- Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, NY, USA
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Luis Chiriboga
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Afreen Karimkhan
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Navneet Narula
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- David A Moore
- Department of Cellular Pathology, University College London Hospital, London, UK
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Christopher Y Park
- Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, NY, USA
- Harvey Pass
- Department of Cardiothoracic Surgery, NYU Grossman School of Medicine, New York, NY, USA
- Andre L Moreira
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- John Le Quesne
- School of Cancer Sciences, University of Glasgow, Glasgow, Scotland, UK
- Cancer Research UK Scotland Institute, Glasgow, Scotland, UK
- Queen Elizabeth University Hospital, Greater Glasgow and Clyde NHS Trust, Glasgow, Scotland, UK
- Aristotelis Tsirigos
- Applied Bioinformatics Laboratories, NYU Grossman School of Medicine, New York, NY, USA
- Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, NY, USA
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Ke Yuan
- School of Computing Science, University of Glasgow, Glasgow, Scotland, UK
- School of Cancer Sciences, University of Glasgow, Glasgow, Scotland, UK
- Cancer Research UK Scotland Institute, Glasgow, Scotland, UK
3
Beau M, Herzfeld DJ, Naveros F, Hemelt ME, D’Agostino F, Oostland M, Sánchez-López A, Chung YY, Maibach M, Kyranakis S, Stabb HN, Martínez Lopera MG, Lajko A, Zedler M, Ohmae S, Hall NJ, Clark BA, Cohen D, Lisberger SG, Kostadinov D, Hull C, Häusser M, Medina JF. A deep-learning strategy to identify cell types across species from high-density extracellular recordings. bioRxiv 2024:2024.01.30.577845. [PMID: 38352514] [PMCID: PMC10862837] [DOI: 10.1101/2024.01.30.577845]
Abstract
High-density probes allow electrophysiological recordings from many neurons simultaneously across entire brain circuits but don't reveal cell type. Here, we develop a strategy to identify cell types from extracellular recordings in awake animals, revealing the computational roles of neurons with distinct functional, molecular, and anatomical properties. We combine optogenetic activation and pharmacology using the cerebellum as a testbed to generate a curated ground-truth library of electrophysiological properties for Purkinje cells, molecular layer interneurons, Golgi cells, and mossy fibers. We train a semi-supervised deep-learning classifier that predicts cell types with greater than 95% accuracy based on waveform, discharge statistics, and layer of the recorded neuron. The classifier's predictions agree with expert classification on recordings using different probes, in different laboratories, from functionally distinct cerebellar regions, and across animal species. Our classifier extends the power of modern dynamical systems analyses by revealing the unique contributions of simultaneously-recorded cell types during behavior.
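The paper's classifier is a semi-supervised deep network trained on a ground-truth library; purely to illustrate the three feature families it consumes (waveform shape, discharge statistics, and recording layer), here is a toy supervised stand-in on synthetic units. All feature values, class assignments, and the random-forest choice are invented for this sketch and are not the paper's method:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def make_units(rate_mu, width_mu, layer, n=60):
    """Synthetic units described by spike width (ms), mean firing
    rate (Hz), and a categorical layer code."""
    return np.column_stack([
        rng.normal(width_mu, 0.05, n),   # waveform: spike width
        rng.normal(rate_mu, 5.0, n),     # discharge statistic: rate
        np.full(n, layer),               # recording layer
    ])

# three invented cell-type-like clusters
X = np.vstack([make_units(60, 0.25, 0),
               make_units(15, 0.20, 1),
               make_units(5, 0.40, 2)])
y = np.repeat([0, 1, 2], 60)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
acc = clf.score(X, y)
```

The point of the sketch is only that waveform, rate statistics, and layer jointly separate unit classes; the paper's >95% accuracy comes from a far richer, semi-supervised deep model validated against expert labels.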
Affiliation(s)
- Maxime Beau
- Wolfson Institute for Biomedical Research, University College London, London, UK
- David J. Herzfeld
- Department of Neurobiology, Duke University School of Medicine, Durham, NC, USA
- Francisco Naveros
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Department of Computer Engineering, Automation and Robotics, Research Centre for Information and Communication Technologies, University of Granada, Granada, Spain
- Marie E. Hemelt
- Department of Neurobiology, Duke University School of Medicine, Durham, NC, USA
- Federico D’Agostino
- Wolfson Institute for Biomedical Research, University College London, London, UK
- Marlies Oostland
- Wolfson Institute for Biomedical Research, University College London, London, UK
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Young Yoon Chung
- Wolfson Institute for Biomedical Research, University College London, London, UK
- Michael Maibach
- Wolfson Institute for Biomedical Research, University College London, London, UK
- Stephen Kyranakis
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Hannah N. Stabb
- Wolfson Institute for Biomedical Research, University College London, London, UK
- Agoston Lajko
- Wolfson Institute for Biomedical Research, University College London, London, UK
- Marie Zedler
- Wolfson Institute for Biomedical Research, University College London, London, UK
- Shogo Ohmae
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Nathan J. Hall
- Department of Neurobiology, Duke University School of Medicine, Durham, NC, USA
- Beverley A. Clark
- Wolfson Institute for Biomedical Research, University College London, London, UK
- Dana Cohen
- The Leslie and Susan Gonda Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat-Gan, Israel
- Dimitar Kostadinov
- Wolfson Institute for Biomedical Research, University College London, London, UK
- Centre for Developmental Neurobiology, King’s College London, London, UK
- Court Hull
- Department of Neurobiology, Duke University School of Medicine, Durham, NC, USA
- Michael Häusser
- Wolfson Institute for Biomedical Research, University College London, London, UK
- Javier F. Medina
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
4
Durand M, Largouët C, de Beaufort LB, Dourmad JY, Gaillard C. Estimation of gestating sows' welfare status based on machine learning methods and behavioral data. Sci Rep 2023; 13:21042. [PMID: 38030686] [PMCID: PMC10686986] [DOI: 10.1038/s41598-023-46925-z]
Abstract
Estimating welfare status at the individual level on the farm is a current challenge for improving livestock monitoring. New technologies offer opportunities to analyze livestock behavior with machine learning and sensors. The aim of this study was to estimate components of the welfare status of gestating sows based on machine learning methods and behavioral data. The dataset combined individual and group measures of behavior (activity, social and feeding behaviors). A clustering method was used to estimate the welfare status of 69 sows (housed in four groups) during different periods (two days per week) of gestation (between 6 and 10 periods, depending on the group). Three clusters were identified and labelled (scapegoat, gentle and aggressive). Environmental conditions and the sows' health influenced the proportion of sows in each cluster, whereas the characteristics of the sow (age, body weight or body condition) did not. The results also confirmed the importance of group behavior for the welfare of each individual. A decision tree was then trained to classify the sows into the three welfare categories obtained from the clustering step. This classification relied on data from an automatic feeder and automated video analysis, achieving an accuracy exceeding 72%. This study shows the potential of an automatic decision-support system to categorize welfare based on the behavior of each gestating sow and of the group.
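The two-step pipeline described above, unsupervised clustering followed by an interpretable classifier trained on the cluster labels, can be sketched with scikit-learn. The behavioral features, their separations, and the cluster interpretations below are synthetic stand-ins, not the study's data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

# toy per-sow, per-period features:
# [daily activity, aggressive interactions, feeder visits]
X = np.vstack([
    rng.normal([5.0, 0.5, 8.0], 0.5, size=(30, 3)),   # "gentle"-like
    rng.normal([7.0, 4.0, 6.0], 0.5, size=(30, 3)),   # "aggressive"-like
    rng.normal([3.0, 0.5, 4.0], 0.5, size=(30, 3)),   # "scapegoat"-like
])

# step 1: unsupervised clustering proposes welfare categories
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# step 2: a shallow decision tree reproduces those categories,
# yielding an interpretable classifier for new observations
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, clusters)
acc = tree.score(X, clusters)
```

The design choice mirrors the study's logic: the clustering defines the categories once, and the tree gives farm staff a readable rule set to apply to incoming sensor data.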
Affiliation(s)
- Maëva Durand
- PEGASE, INRAE, Institut Agro, 35590, Saint Gilles, France
5
Yang Y, Sun K, Gao Y, Wang K, Yu G. Preparing Data for Artificial Intelligence in Pathology with Clinical-Grade Performance. Diagnostics (Basel) 2023; 13:3115. [PMID: 37835858] [PMCID: PMC10572440] [DOI: 10.3390/diagnostics13193115]
Abstract
Pathology is decisive for disease diagnosis but relies heavily on experienced pathologists. In recent years, there has been growing interest in using artificial intelligence in pathology (AIP) to enhance diagnostic accuracy and efficiency. However, the impressive performance of deep learning-based AIP in laboratory settings often proves challenging to replicate in clinical practice. Because data preparation is central to AIP, this paper reviews AIP-related studies in the PubMed database published from January 2017 to February 2022; 118 studies were included. An in-depth analysis of data preparation methods is conducted, encompassing the acquisition of pathological tissue slides, data cleaning, screening, and subsequent digitization. Expert review, image annotation, and dataset division for model training and validation are also discussed. Furthermore, the reasons behind the challenges in reproducing the high performance of AIP in clinical settings are examined, and effective strategies to enhance AIP's clinical performance are presented. The robustness of AIP depends on a randomized collection of representative disease slides, rigorous quality control and screening, correction of digital discrepancies, reasonable annotation, and sufficient data volume. Digital pathology is fundamental to clinical-grade AIP, and data standardization together with weakly supervised learning based on whole-slide images (WSIs) are effective ways to overcome obstacles to performance reproduction. The key to reproducibility lies in representative data, an adequate amount of labeling, and consistency across multiple centers; digital pathology for clinical diagnosis, data standardization, and WSI-based weakly supervised learning will hopefully build clinical-grade AIP.
Affiliation(s)
- Yuanqing Yang
- Department of Biomedical Engineering, School of Basic Medical Sciences, Central South University, Changsha 410013, China
- Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing 100084, China
- Kai Sun
- Department of Biomedical Engineering, School of Basic Medical Sciences, Central South University, Changsha 410013, China
- Furong Laboratory, Changsha 410013, China
- Yanhua Gao
- Department of Ultrasound, Shaanxi Provincial People’s Hospital, Xi’an 710068, China
- Kuansong Wang
- Department of Pathology, School of Basic Medical Sciences, Central South University, Changsha 410013, China
- Department of Pathology, Xiangya Hospital, Central South University, Changsha 410013, China
- Gang Yu
- Department of Biomedical Engineering, School of Basic Medical Sciences, Central South University, Changsha 410013, China
6
Xiang H, Shen J, Yan Q, Xu M, Shi X, Zhu X. Multi-scale representation attention based deep multiple instance learning for gigapixel whole slide image analysis. Med Image Anal 2023; 89:102890. [PMID: 37467642] [DOI: 10.1016/j.media.2023.102890]
Abstract
Recently, convolutional neural networks (CNNs) directly using whole slide images (WSIs) for tumor diagnosis and analysis have attracted considerable attention, because they only require the slide-level label for model training, without any additional annotations. However, directly handling gigapixel WSIs remains a challenging task, due to the billions of pixels and the intra-slide variations in each WSI. To overcome this problem, in this paper, we propose a novel end-to-end interpretable deep MIL framework for WSI analysis, using a two-branch deep neural network and a multi-scale representation attention mechanism to directly extract features from all patches of each WSI. Specifically, we first divide each WSI into bag-, patch- and cell-level images, and then assign the slide-level label to its corresponding bag-level images, so that WSI classification becomes a MIL problem. Additionally, we design a novel multi-scale representation attention mechanism, and embed it into a two-branch deep network to simultaneously mine the bag with a correct label, the significant patches, and their cell-level information. Extensive experiments demonstrate the superior performance of the proposed framework over recent state-of-the-art methods, in terms of classification accuracy and model interpretability. All source code is released at: https://github.com/xhangchen/MRAN/.
Affiliation(s)
- Hangchen Xiang
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Junyi Shen
- Division of Liver Surgery, Department of General Surgery, West China Hospital, Sichuan University, Chengdu 610044, China
- Qingguo Yan
- Department of Pathology, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, School of Medicine, Northwest University, 229 Taibai North Road, Xi'an 710069, China
- Meilian Xu
- School of Electronic Information and Artificial Intelligence, Leshan Normal University, Leshan 614000, China
- Xiaoshuang Shi
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Xiaofeng Zhu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
7
Xu Y, Zheng X, Li Y, Ye X, Cheng H, Wang H, Lyu J. Exploring patient medication adherence and data mining methods in clinical big data: A contemporary review. J Evid Based Med 2023; 16:342-375. [PMID: 37718729] [DOI: 10.1111/jebm.12548]
Abstract
BACKGROUND: Increasingly, patient medication adherence data are being consolidated from claims databases and electronic health records (EHRs). Such databases offer an indirect avenue to gauge medication adherence in our data-rich healthcare milieu. The surge in data accessibility, coupled with the pressing need for its conversion to actionable insights, has spotlighted data mining, with machine learning (ML) emerging as a pivotal technique. Nonadherence poses heightened health risks and escalates medical costs. This paper elucidates the synergistic interaction between medical database mining for medication adherence and the role of ML in fostering knowledge discovery.
METHODS: We conducted a comprehensive review of EHR applications in the realm of medication adherence, leveraging ML techniques. We expounded on the evolution and structure of medical databases pertinent to medication adherence and harnessed both supervised and unsupervised ML paradigms to delve into adherence and its ramifications.
RESULTS: Our study underscores the applications of medical databases and ML, encompassing both supervised and unsupervised learning, for medication adherence in clinical big data. Databases like SEER and NHANES, often underutilized due to their intricacies, have gained prominence. Employing ML to excavate patient medication logs from these databases facilitates adherence analysis. Such findings are pivotal for clinical decision-making, risk stratification, and scholarly pursuits, aiming to elevate healthcare quality.
CONCLUSION: Advanced data mining in the era of big data has revolutionized medication adherence research, thereby enhancing patient care. Emphasizing bespoke interventions and research could herald transformative shifts in therapeutic modalities.
Affiliation(s)
- Yixian Xu
- Department of Anesthesiology, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Xinkai Zheng
- Department of Dermatology, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Yuanjie Li
- Planning & Discipline Construction Office, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Xinmiao Ye
- Department of Anesthesiology, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Hongtao Cheng
- School of Nursing, Jinan University, Guangzhou, China
- Hao Wang
- Department of Anesthesiology, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Jun Lyu
- Department of Clinical Research, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Traditional Chinese Medicine Informatization, Guangzhou, China
8
Sun K, Chen Y, Bai B, Gao Y, Xiao J, Yu G. Automatic Classification of Histopathology Images across Multiple Cancers Based on Heterogeneous Transfer Learning. Diagnostics (Basel) 2023; 13:1277. [PMID: 37046497] [PMCID: PMC10093253] [DOI: 10.3390/diagnostics13071277]
Abstract
Background: Current artificial intelligence (AI) in histopathology typically specializes in a single task, resulting in a heavy workload of collecting and labeling a sufficient number of images for each type of cancer. Heterogeneous transfer learning (HTL) is expected to alleviate these data bottlenecks and establish models with performance comparable to supervised learning (SL).
Methods: An accurate source-domain model was trained using 28,634 colorectal cancer (CRC) patches. Additionally, 1000 sentinel lymph node patches and 1008 breast patches were used to train two target-domain models. The difference in feature distribution between sentinel lymph node metastasis or breast cancer and CRC was reduced by heterogeneous domain adaptation, and the maximum mean discrepancy between subdomains was used for knowledge transfer to achieve accurate classification across multiple cancers.
Results: HTL on 1000 sentinel lymph node patches (L-HTL-1000) outperforms SL on 1000 sentinel lymph node patches (L-SL-1-1000) (average area under the curve (AUC) and standard deviation of L-HTL-1000 vs. L-SL-1-1000: 0.949 ± 0.004 vs. 0.931 ± 0.008, p = 0.008). There is no significant difference between L-HTL-1000 and SL on 7104 patches (L-SL-2-7104) (0.949 ± 0.004 vs. 0.948 ± 0.008, p = 0.742). Similar results are observed for breast cancer: B-HTL-1008 vs. B-SL-1-1008: 0.962 ± 0.017 vs. 0.943 ± 0.018, p = 0.008; B-HTL-1008 vs. B-SL-2-5232: 0.962 ± 0.017 vs. 0.951 ± 0.023, p = 0.148.
Conclusions: HTL is capable of building accurate AI models for similar cancers using a small amount of data, based on a large dataset for a given type of cancer. HTL holds great promise for accelerating the development of AI in histopathology.
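The knowledge-transfer step above relies on a kernel two-sample statistic, the (squared) maximum mean discrepancy (MMD), between source and target feature distributions. A minimal NumPy sketch with an RBF kernel; the bandwidth and toy feature distributions are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared MMD with an RBF kernel — zero when
    the two samples share a distribution, larger under domain shift."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

# same distribution vs. a mean-shifted "target domain"
same = rbf_mmd2(rng.normal(size=(50, 4)), rng.normal(size=(50, 4)))
shifted = rbf_mmd2(rng.normal(size=(50, 4)), rng.normal(2.0, 1.0, size=(50, 4)))
```

Domain-adaptation methods of this family add such a term to the training loss so that minimizing it pulls source and target (sub)domain features toward a shared representation.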
9
Karasmanoglou A, Antonakakis M, Zervakis M. ECG-Based Semi-Supervised Anomaly Detection for Early Detection and Monitoring of Epileptic Seizures. Int J Environ Res Public Health 2023; 20:5000. [PMID: 36981911] [PMCID: PMC10049350] [DOI: 10.3390/ijerph20065000]
Abstract
Epilepsy is one of the most common brain diseases, characterized by frequent recurrent seizures or "ictal" states. During these ictal states, a patient experiences uncontrollable muscular contractions, inducing loss of mobility and balance, which may result in injury or even death. Extensive investigation is vital to establish a systematic approach for predicting seizures and informing patients ahead of time. Most methodologies developed so far focus on detecting abnormalities in electroencephalogram (EEG) recordings. However, research has indicated that certain pre-ictal alterations in the Autonomic Nervous System (ANS) can be detected in patient electrocardiogram (ECG) signals, which could provide the basis for a robust seizure-prediction approach. Recently proposed ECG-based seizure warning systems utilize machine learning models to classify a patient's condition; such approaches require large, diverse, and thoroughly annotated ECG datasets, limiting their application potential. In this work, we investigate anomaly detection models in a patient-specific context with low supervision requirements. Specifically, we consider One-Class SVM (OCSVM), Minimum Covariance Determinant (MCD) Estimator, and Local Outlier Factor (LOF) models to quantify the novelty or abnormality of pre-ictal short-term (2-3 min) Heart Rate Variability (HRV) features, trained on a reference interval considered to contain stable heart rate as the only form of supervision. Our models are evaluated against labels that were either hand-picked or automatically generated (weak labels) by a two-phase clustering procedure, on samples of the "Post-Ictal Heart Rate Oscillations in Partial Epilepsy" (PIHROPE) dataset recorded at Beth Israel Deaconess Medical Center (Harvard Medical School, Boston, Massachusetts). The models achieve detection in 9 out of 10 cases, with average AUCs of over 93% across all models and warning times ranging from 6 to 30 min prior to seizure. The proposed anomaly detection and monitoring approach can potentially pave the way for early detection and warning of seizure incidents based on body-sensor inputs.
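The three low-supervision detectors named above map directly onto scikit-learn estimators fitted on a "stable" reference interval only. A sketch on synthetic stand-ins for HRV feature windows; the dimensions, contamination rate, and anomaly shift are invented for illustration:

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.covariance import EllipticEnvelope       # MCD-based
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)

# toy data: "stable" HRV feature windows for training, and a test set
# mixing stable windows with pre-ictal-like outliers
baseline = rng.normal(0.0, 1.0, size=(200, 5))
stable_test = rng.normal(0.0, 1.0, size=(40, 5))
anomalous = rng.normal(4.0, 1.0, size=(10, 5))
X_test = np.vstack([stable_test, anomalous])

models = {
    "OCSVM": OneClassSVM(nu=0.05, gamma="scale"),
    "MCD": EllipticEnvelope(contamination=0.05, random_state=0),
    "LOF": LocalOutlierFactor(novelty=True, contamination=0.05),
}
flags = {}
for name, model in models.items():
    model.fit(baseline)                  # reference interval = only supervision
    flags[name] = model.predict(X_test)  # +1 = normal, -1 = anomaly
```

In a warning system, a run of consecutive `-1` flags on incoming HRV windows would trigger the pre-ictal alert.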
10
Zhou H, Xu L, Ren Z, Zhu J, Lee C. Machine learning-augmented surface-enhanced spectroscopy toward next-generation molecular diagnostics. Nanoscale Adv 2023; 5:538-570. [PMID: 36756499] [PMCID: PMC9890940] [DOI: 10.1039/d2na00608a]
Abstract
The world today is witnessing the significant role and huge demand for molecular detection and screening in healthcare and medical diagnosis, especially during the outbreak of COVID-19. Surface-enhanced spectroscopy techniques, including Surface-Enhanced Raman Scattering (SERS) and Infrared Absorption (SEIRA), provide lattice and molecular vibrational fingerprint information which is directly linked to the molecular constituents, chemical bonds, and configuration. These properties make them an unambiguous, nondestructive, and label-free toolkit for molecular diagnostics and screening. However, new issues in molecular diagnostics, such as increasing molecular species, faster spread of viruses, and higher requirements for detection accuracy and sensitivity, have brought great challenges to detection technology. Advancements in artificial intelligence and machine learning (ML) techniques show promising potential in empowering SERS and SEIRA with rapid analysis and automatic data processing to jointly tackle the challenge. This review introduces the combination of ML and SERS/SEIRA by investigating how ML algorithms can be beneficial to SERS/SEIRA, discussing the general process of combining ML and SEIRA/SERS, highlighting the molecular diagnostics and screening applications based on ML-combined SEIRA/SERS, and providing perspectives on the future development of ML-integrated SEIRA/SERS. In general, this review offers comprehensive knowledge about the recent advances and the future outlook regarding ML-integrated SEIRA/SERS for molecular diagnostics and screening.
Affiliation(s)
- Hong Zhou
- Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583
- Center for Intelligent Sensors and MEMS (CISM), National University of Singapore, Singapore 117608
- Liangge Xu
- Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583
- Center for Intelligent Sensors and MEMS (CISM), National University of Singapore, Singapore 117608
- National Key Laboratory of Special Environment Composite Technology, Harbin Institute of Technology, Harbin 150001, China
- Zhihao Ren
- Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583
- Center for Intelligent Sensors and MEMS (CISM), National University of Singapore, Singapore 117608
- Jiaqi Zhu
- National Key Laboratory of Special Environment Composite Technology, Harbin Institute of Technology, Harbin 150001, China
- Chengkuo Lee
- Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583
- Center for Intelligent Sensors and MEMS (CISM), National University of Singapore, Singapore 117608
- NUS Suzhou Research Institute (NUSRI), Suzhou 215123, China
11
Lorenz C, Hao X, Tomka T, Rüttimann L, Hahnloser RH. Interactive extraction of diverse vocal units from a planar embedding without the need for prior sound segmentation. Front Bioinform 2023; 2:966066. [PMID: 36710910 PMCID: PMC9880044 DOI: 10.3389/fbinf.2022.966066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2022] [Accepted: 11/14/2022] [Indexed: 01/15/2023] Open
Abstract
Annotating and proofreading data sets of complex natural behaviors such as vocalizations are tedious tasks because instances of a given behavior need to be correctly segmented from background noise and must be classified with minimal false positive error rate. Low-dimensional embeddings have proven very useful for this task because they can provide a visual overview of a data set in which distinct behaviors appear in different clusters. However, low-dimensional embeddings introduce errors because they fail to preserve distances; and embeddings represent only objects of fixed dimensionality, which conflicts with vocalizations that have variable dimensions stemming from their variable durations. To mitigate these issues, we introduce a semi-supervised, analytical method for simultaneous segmentation and clustering of vocalizations. We define a given vocalization type by specifying pairs of high-density regions in the embedding plane of sound spectrograms, one region associated with vocalization onsets and the other with offsets. We demonstrate our two-neighborhood (2N) extraction method on the task of clustering adult zebra finch vocalizations embedded with UMAP. We show that 2N extraction allows the identification of short and long vocal renditions from continuous data streams without initially committing to a particular segmentation of the data. Also, 2N extraction achieves much lower false positive error rate than comparable approaches based on a single defining region. Along with our method, we present a graphical user interface (GUI) for visualizing and annotating data.
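The two-neighborhood (2N) rule described in this abstract can be sketched as follows. This is a minimal illustration under invented assumptions: circular high-density regions and toy 2D coordinates stand in for the paper's UMAP embedding and region selection; the helper names are hypothetical.

```python
# Hypothetical sketch of the 2N idea: a vocalization type is defined by an
# onset region and an offset region in a 2D embedding plane; a candidate
# segment is kept only if its onset embedding falls in the onset region AND
# its offset embedding falls in the offset region.

def in_region(point, center, radius):
    """True if a 2D point lies within a circular high-density region."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    return dx * dx + dy * dy <= radius * radius

def extract_2n(candidates, onset_region, offset_region):
    """Keep candidates whose (onset, offset) embeddings match both regions.

    candidates: list of (onset_xy, offset_xy) pairs.
    Each region: (center_xy, radius).
    """
    kept = []
    for onset_xy, offset_xy in candidates:
        if (in_region(onset_xy, *onset_region)
                and in_region(offset_xy, *offset_region)):
            kept.append((onset_xy, offset_xy))
    return kept

# Toy data: two candidates match both regions, one matches only the onset.
candidates = [((0.1, 0.0), (5.0, 5.1)),   # matches both -> kept
              ((0.2, 0.1), (5.2, 4.9)),   # matches both -> kept
              ((0.0, 0.2), (9.0, 9.0))]   # offset outside -> rejected
units = extract_2n(candidates,
                   onset_region=((0.0, 0.0), 1.0),
                   offset_region=((5.0, 5.0), 1.0))
```

Requiring both endpoints to match is what lets the method reject false positives that a single defining region would accept.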
Affiliation(s)
- Corinna Lorenz
- Institute of Neuroinformatics and Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland; Université Paris-Saclay, CNRS, Institut des Neurosciences Paris-Saclay, Saclay, France
- Xinyu Hao
- Institute of Neuroinformatics and Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland; Tianjin University, School of Electrical and Information Engineering, Tianjin, China
- Tomas Tomka
- Institute of Neuroinformatics and Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Linus Rüttimann
- Institute of Neuroinformatics and Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Richard H.R. Hahnloser
- Institute of Neuroinformatics and Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland (corresponding author)
12
Folmsbee J, Zhang L, Lu X, Rahman J, Gentry J, Conn B, Vered M, Roy P, Gupta R, Lin D, Samankan S, Dhorajiva P, Peter A, Wang M, Israel A, Brandwein-Weber M, Doyle S. Histology segmentation using active learning on regions of interest in oral cavity squamous cell carcinoma. J Pathol Inform 2022; 13:100146. [PMID: 36268093 PMCID: PMC9577135 DOI: 10.1016/j.jpi.2022.100146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2022] [Revised: 09/15/2022] [Accepted: 09/22/2022] [Indexed: 11/28/2022] Open
Abstract
In digital pathology, deep learning has been shown to have a wide range of applications, from cancer grading to segmenting structures such as glomeruli. One of the main hurdles for digital pathology to be truly effective is the size of the dataset needed for generalization across the spectrum of possible morphologies. Small datasets limit classifiers' ability to generalize. Yet when we move to larger datasets of whole slide images (WSIs) of tissue, these datasets can cause network bottlenecks, as each WSI at its original magnification can be upwards of 100 000 by 100 000 pixels and over a gigabyte in file size. Compounding this problem, high-quality pathologist annotations are difficult to obtain, as the volume of annotations necessary to create a classifier that can generalize would be extremely costly in terms of pathologist-hours. In this work, we use Active Learning (AL), a process for iterative interactive training, to create a modified U-Net classifier at the region of interest (ROI) scale. We then compare this to Random Learning (RL), in which images for addition to the retraining dataset are randomly selected. Our hypothesis is that AL yields better segmentation results than randomly selecting images to annotate. We show that after 3 iterations, AL, with an average Dice coefficient of 0.461, outperforms RL, with an average Dice coefficient of 0.375, by 0.086.
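The AL-versus-RL selection step this abstract compares can be sketched as below. The model, scores, and data are toy stand-ins (the paper trains a U-Net on ROIs): AL queries the pool items the current model is least certain about, RL samples uniformly, and both then move the queried items into the labeled training set.

```python
import random

# Hedged sketch of active-learning (AL) vs random-learning (RL) selection.
# Uncertainty sampling is one common AL query strategy; the abstract does
# not specify the paper's exact criterion, so treat this as illustrative.

def select_active(pool, probs, k):
    """Pick the k pool items whose predicted probability is closest to 0.5,
    i.e. the items the current model is most uncertain about."""
    ranked = sorted(pool, key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]

def select_random(pool, k, seed=0):
    """RL baseline: sample k items uniformly from the unlabeled pool."""
    rng = random.Random(seed)
    return rng.sample(pool, k)

# Toy pool: item id -> current model probability of the positive class.
probs = {0: 0.98, 1: 0.51, 2: 0.05, 3: 0.47, 4: 0.92, 5: 0.55}
pool = list(probs)

queried_al = select_active(pool, probs, k=3)   # most ambiguous items
queried_rl = select_random(pool, k=3)          # uniformly sampled items
```

After each query round, the selected items are annotated and the segmentation network is retrained; the abstract's result is that the uncertainty-driven choice reaches a higher Dice score than the random one for the same annotation budget.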
Affiliation(s)
- Jonathan Folmsbee
- Department of Pathology & Anatomical Sciences, University at Buffalo SUNY, Buffalo, NY, USA; Department of Biomedical Engineering, University at Buffalo SUNY, Buffalo, NY, USA. Corresponding author at: Jacobs School, 955 Main Street, Room 4205, Pathology and Anatomical Sciences, Buffalo, NY 14203, USA
- Lei Zhang
- Department of Pathology & Anatomical Sciences, University at Buffalo SUNY, Buffalo, NY, USA
- Xulei Lu
- Icahn School of Medicine, The Mount Sinai Hospital, New York, NY, USA
- Jawaria Rahman
- Department of Pathology, Case Western University, Cleveland, OH, USA
- John Gentry
- Department of Pathology, Nebraska Medical Health System, Omaha, NE, USA
- Brendan Conn
- Department of Pathology, University of Edinburgh, Edinburgh, UK
- Marilena Vered
- Department of Oral Pathology, Oral Medicine and Maxillofacial Imaging, School of Dental Medicine, Tel Aviv University, Tel Aviv, Israel; Institute of Pathology, Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel
- Paromita Roy
- Department of Pathology, Tata Memorial Cancer Center, Mumbai, India
- Ruta Gupta
- Department of Tissue Pathology and Diagnostic Oncology, NSW Health Pathology, Royal Prince Alfred Hospital and University of Sydney, Sydney, Australia
- Diana Lin
- Department of Pathology, The University of Alabama at Birmingham, Birmingham, AL, USA
- Shabnam Samankan
- Department of Pathology, George Washington University Hospital, Washington, DC, USA
- Pooja Dhorajiva
- Department of Oncologic Surgical Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Anu Peter
- Department of Pathology, University of Pennsylvania, Philadelphia, PA, USA
- Minhua Wang
- Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
- Anna Israel
- Department of Anatomic Pathology, Robert J. Tomsich Pathology and Laboratory Medicine Institute, Cleveland Clinic, Cleveland, OH, USA
- Scott Doyle
- Department of Pathology & Anatomical Sciences, University at Buffalo SUNY, Buffalo, NY, USA; Department of Biomedical Engineering, University at Buffalo SUNY, Buffalo, NY, USA
13
Artificial Intelligence-Assisted Renal Pathology: Advances and Prospects. J Clin Med 2022; 11:jcm11164918. [PMID: 36013157 PMCID: PMC9410196 DOI: 10.3390/jcm11164918] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2022] [Revised: 07/30/2022] [Accepted: 08/11/2022] [Indexed: 11/17/2022] Open
Abstract
Digital imaging and advanced microscopy play a pivotal role in the diagnosis of kidney diseases. In recent years, great achievements have been made in digital imaging, providing novel approaches for precise quantitative assessment of nephropathology and relieving the burden on renal pathologists. Developing novel artificial intelligence (AI)-assisted technology through multidisciplinary interaction among computer engineers, renal specialists, and nephropathologists could prove beneficial for renal pathology diagnosis. An increasing number of publications has demonstrated the rapid growth of AI-based technology in nephrology. In this review, we offer an overview of AI-assisted renal pathology, including AI concepts and the workflow for processing digital image data, focusing on the impressive advances of AI applications in disease-specific contexts. In particular, this review describes the applied computer vision algorithms for the segmentation of kidney structures, diagnosis of specific pathological changes, and prognosis prediction based on images. Lastly, we discuss challenges and prospects to provide an objective view of this topic.
14
Liu P, Qian W, Cao J, Xu D. Semi-supervised medical image classification via increasing prediction diversity. Appl Intell 2022. [DOI: 10.1007/s10489-022-04012-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
15
Semi-supervised learning framework for oil and gas pipeline failure detection. Sci Rep 2022; 12:13758. [PMID: 35962052 PMCID: PMC9374783 DOI: 10.1038/s41598-022-16830-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2022] [Accepted: 07/18/2022] [Indexed: 11/28/2022] Open
Abstract
Quantifying failure events of oil and gas pipelines in real or near-real time facilitates a faster and more appropriate response plan. Developing a data-driven pipeline failure assessment model, however, faces a major challenge: failure history, in the form of incident reports, suffers from limited and missing information, making it difficult to incorporate a persistent input configuration into a supervised machine learning model. The literature falls short on appropriate solutions for utilizing incomplete databases and incident reports in the pipeline failure problem. This work proposes a semi-supervised machine learning framework that mines existing oil and gas pipeline failure databases. The proposed cluster-impute-classify (CIC) approach maps a relevant subset of the failure databases through which missing information in the incident report is reconstructed. A classifier is then trained on the fly to learn the functional relationship between the descriptors from a diverse feature set. The proposed approach, presented within an ensemble learning architecture, is easily scalable to various pipeline failure databases. The results show up to 91% detection accuracy and stable generalization against an increased rate of missing information.
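The cluster-impute-classify sequence named in this abstract can be sketched with toy data. Everything here is an invented stand-in for the paper's pipeline databases: a record with a missing field is matched to the nearest cluster of complete records, the missing value is imputed from that cluster's mean, and a classifier can then run on the completed record.

```python
# Minimal, assumption-laden sketch of the CIC (cluster-impute-classify) idea.
# Records are (f1, f2) pairs; f2 may be missing (None), as in an incomplete
# incident report.

def mean(xs):
    return sum(xs) / len(xs)

def cic_impute(record, clusters):
    """Impute a missing f2 from the cluster whose mean f1 is nearest.

    record: (f1, f2_or_None); clusters: dict name -> list of complete
    (f1, f2) records.
    """
    f1, f2 = record
    if f2 is not None:
        return record            # nothing to impute
    # match the incomplete record to the nearest cluster by f1
    best = min(clusters.values(),
               key=lambda rows: abs(f1 - mean([r[0] for r in rows])))
    return (f1, mean([r[1] for r in best]))

# Toy "failure database" split into two clusters of complete records.
clusters = {"low":  [(1.0, 10.0), (2.0, 12.0)],
            "high": [(8.0, 40.0), (9.0, 44.0)]}
completed = cic_impute((8.5, None), clusters)   # matched to the "high" cluster
```

A downstream classifier (the "classify" step) would then be trained on the completed records; the paper wraps this whole sequence in an ensemble.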
16
Bashiri FS, Caskey JR, Mayampurath A, Dussault N, Dumanian J, Bhavani SV, Carey KA, Gilbert ER, Winslow CJ, Shah NS, Edelson DP, Afshar M, Churpek MM. Identifying infected patients using semi-supervised and transfer learning. J Am Med Inform Assoc 2022; 29:1696-1704. [PMID: 35869954 PMCID: PMC9471712 DOI: 10.1093/jamia/ocac109] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2022] [Revised: 06/13/2022] [Accepted: 07/01/2022] [Indexed: 11/12/2022] Open
Abstract
Objectives
Early identification of infection improves outcomes, but developing models for early identification requires determining infection status with manual chart review, limiting sample size. Therefore, we aimed to compare semi-supervised and transfer learning algorithms with algorithms based solely on manual chart review for identifying infection in hospitalized patients.
Materials and Methods
This multicenter retrospective study of admissions to 6 hospitals included “gold-standard” labels of infection from manual chart review and “silver-standard” labels from nonchart-reviewed patients using the Sepsis-3 infection criteria based on antibiotic and culture orders. “Gold-standard” labeled admissions were randomly allocated to training (70%) and testing (30%) datasets. Using patient characteristics, vital signs, and laboratory data from the first 24 hours of admission, we derived deep learning and non-deep learning models using transfer learning and semi-supervised methods. Performance was compared in the gold-standard test set using discrimination and calibration metrics.
Results
The study comprised 432 965 admissions, of which 2724 underwent chart review. In the test set, deep learning and non-deep learning approaches had similar discrimination (area under the receiver operating characteristic curve of 0.82). Semi-supervised and transfer learning approaches did not improve discrimination over models fit using only silver- or gold-standard data. Transfer learning had the best calibration (unreliability index P value: .997, Brier score: 0.173), followed by self-learning gradient boosted machine (P value: .67, Brier score: 0.170).
Discussion
Deep learning and non-deep learning models performed similarly for identifying infection, as did models developed using Sepsis-3 and manual chart review labels.
Conclusion
In a multicenter study of almost 3000 chart-reviewed patients, semi-supervised and transfer learning models showed similar performance for model discrimination as baseline XGBoost, while transfer learning improved calibration.
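The gold/silver self-training setup described in the Methods can be sketched as follows. A toy 1D threshold model stands in for the paper's gradient boosted machine, and all data and numbers are invented: a model fit on gold (chart-reviewed) labels pseudo-labels the unreviewed pool, and the model is refit on the combined set.

```python
# Hedged sketch of self-training with "silver-standard" labels; the model
# and data are toy stand-ins, not the study's actual pipeline.

def fit_threshold(xs, ys):
    """Pick the decision threshold on x that maximizes training accuracy."""
    best_t, best_acc = 0.0, -1.0
    for t in xs:
        acc = sum((x >= t) == bool(y) for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Gold-standard (chart-reviewed) training data.
gold_x = [0.1, 0.2, 0.8, 0.9]
gold_y = [0,   0,   1,   1]
t0 = fit_threshold(gold_x, gold_y)

# Silver labels: the gold-trained model pseudo-labels the unreviewed pool.
silver_x = [0.15, 0.3, 0.7, 0.85]
silver_y = [int(x >= t0) for x in silver_x]

# Self-training: refit on gold + silver together.
t1 = fit_threshold(gold_x + silver_x, gold_y + silver_y)
```

In the study itself, this style of semi-supervision did not improve discrimination over models fit on either label source alone, though transfer learning improved calibration.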
Affiliation(s)
- Fereshteh S Bashiri
- Department of Medicine, University of Wisconsin-Madison, Madison, Wisconsin, USA
- John R Caskey
- Department of Medicine, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Anoop Mayampurath
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Nicole Dussault
- Pritzker School of Medicine, University of Chicago, Chicago, Illinois, USA
- Jay Dumanian
- Pritzker School of Medicine, University of Chicago, Chicago, Illinois, USA
- Kyle A Carey
- Department of Medicine, University of Chicago, Chicago, Illinois, USA
- Emily R Gilbert
- Department of Medicine, Loyola University, Chicago, Illinois, USA
- Christopher J Winslow
- Department of Medicine, NorthShore University HealthSystem, Evanston, Illinois, USA
- Nirav S Shah
- Department of Medicine, University of Chicago, Chicago, Illinois, USA
- Department of Medicine, NorthShore University HealthSystem, Evanston, Illinois, USA
- Dana P Edelson
- Department of Medicine, University of Chicago, Chicago, Illinois, USA
- Majid Afshar
- Department of Medicine, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Matthew M Churpek
- Department of Medicine, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, USA
17
Yi H. Efficient machine learning algorithm for electroencephalogram modeling in brain–computer interfaces. Neural Comput Appl 2022. [DOI: 10.1007/s00521-020-04861-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
18
Barragán-Montero A, Bibal A, Dastarac MH, Draguet C, Valdés G, Nguyen D, Willems S, Vandewinckele L, Holmström M, Löfman F, Souris K, Sterpin E, Lee JA. Towards a safe and efficient clinical implementation of machine learning in radiation oncology by exploring model interpretability, explainability and data-model dependency. Phys Med Biol 2022; 67:10.1088/1361-6560/ac678a. [PMID: 35421855 PMCID: PMC9870296 DOI: 10.1088/1361-6560/ac678a] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Accepted: 04/14/2022] [Indexed: 01/26/2023]
Abstract
The interest in machine learning (ML) has grown tremendously in recent years, partly due to the performance leap brought by new deep learning techniques, convolutional neural networks for images, increased computational power, and the wider availability of large datasets. Most fields of medicine follow this trend, and radiation oncology is notably at the forefront, with a long tradition of digital images and fully computerized workflows. ML models are driven by data and, in contrast with many statistical or physical models, they can be very large and complex, with countless generic parameters. This inevitably raises two questions, namely the tight dependence between the models and the datasets that feed them, and the interpretability of the models, which degrades as their complexity grows. Any problems in the data used to train a model will later be reflected in its performance. This, together with the low interpretability of ML models, makes their implementation into the clinical workflow particularly difficult. Building tools for risk assessment and quality assurance of ML models must then involve two main points: interpretability and data-model dependency. After a joint introduction to both radiation oncology and ML, this paper reviews the main risks and current solutions when applying the latter to workflows in the former. Risks associated with data and models, as well as their interaction, are detailed. Next, the core concepts of interpretability, explainability, and data-model dependency are formally defined and illustrated with examples. Afterwards, a broad discussion goes through key applications of ML in radiation oncology workflows, as well as vendors' perspectives on the clinical implementation of ML.
Affiliation(s)
- Ana Barragán-Montero
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Adrien Bibal
- PReCISE, NaDI Institute, Faculty of Computer Science, UNamur and CENTAL, ILC, UCLouvain, Belgium
- Margerie Huet Dastarac
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Camille Draguet
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Department of Oncology, Laboratory of Experimental Radiotherapy, KU Leuven, Belgium
- Gilmer Valdés
- Department of Radiation Oncology, Department of Epidemiology and Biostatistics, University of California, San Francisco, United States of America
- Dan Nguyen
- Medical Artificial Intelligence and Automation (MAIA) Laboratory, Department of Radiation Oncology, UT Southwestern Medical Center, United States of America
- Siri Willems
- ESAT/PSI, KU Leuven, Belgium & MIRC, UZ Leuven, Belgium
- Kevin Souris
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Edmond Sterpin
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Department of Oncology, Laboratory of Experimental Radiotherapy, KU Leuven, Belgium
- John A Lee
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
19
Few-Shot Learning with Collateral Location Coding and Single-Key Global Spatial Attention for Medical Image Classification. Electronics 2022. [DOI: 10.3390/electronics11091510] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Humans are born with the ability to learn quickly: they can discern objects from a few samples, acquire new skills in a short period of time, and make decisions based on limited prior experience and knowledge. Existing deep learning models for medical image classification, by contrast, often rely on a large number of labeled training samples, and the fast-learning ability of deep neural networks has yet to be developed. In addition, retraining a deep model when it encounters classes it has never seen before requires a large amount of time and computing resources. For healthcare applications, however, enabling a model to generalize to new clinical scenarios is of great importance. Existing image classification methods cannot explicitly use the location information of each pixel, making them insensitive to cues related only to location. They also rely on local convolution and cannot properly utilize global information, which is essential for image classification. To alleviate these problems, we propose a collateral location coding that helps the network explicitly exploit the location information of each pixel, making it easier to recognize location-only cues, and a single-key global spatial attention designed to let the pixels at each location perceive global spatial information in a low-cost way. Experimental results on three medical image benchmark datasets demonstrate that our proposed algorithm outperforms state-of-the-art approaches in both effectiveness and generalization ability.
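The location-coding idea can be illustrated with a coordinate-channel sketch. This is an assumption on my part: the paper's "collateral location coding" is not specified here, so the snippet shows the generic CoordConv-style trick of appending normalized row/column coordinate channels so a network can see absolute pixel position explicitly.

```python
# Illustrative only: build normalized (row, col) coordinate channels that
# could be concatenated to an image tensor so convolutions become
# location-aware. Shapes and normalization are assumptions.

def coord_channels(h, w):
    """Return (row_channel, col_channel), each h x w, normalized to [0, 1]."""
    rows = [[r / (h - 1) for _ in range(w)] for r in range(h)]
    cols = [[c / (w - 1) for c in range(w)] for _ in range(h)]
    return rows, cols

rows, cols = coord_channels(3, 3)
```

With such channels appended, a cue that depends only on where a pixel sits (e.g. near an image border) becomes directly learnable by an ordinary convolution.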
20
Wu R, Das N, Chaba S, Gandhi S, Chau DH, Chu X. A Cluster-then-label Approach for Few-shot Learning with Application to Automatic Image Data Labeling. ACM J Data Inf Qual 2022. [DOI: 10.1145/3491232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
Few-shot learning (FSL) aims at learning to generalize from only a small number of labeled examples for a given target task. Most current state-of-the-art FSL methods typically have two limitations. First, they usually require access to a source dataset (in a similar domain) with abundant labeled examples, which may not always be possible due to privacy concerns and copyright issues. Second, they typically do not offer any estimation of the generalization error on the target FSL task because the handful of labeled examples must be used for training and cannot spare a validation subset. In this paper, we propose a cluster-then-label approach to perform few-shot learning. Our approach does not require access to the labeled source dataset and provides an estimation of generalization error. We show empirically, on four benchmark datasets, that our approach provides competitive predictive performance to state-of-the-art FSL approaches and our generalization error estimation is accurate. Finally, we explore the application of our proposed method to automatic image data labeling. We compare our method with existing automatic data labeling systems. The end-to-end performance of our method outperforms the state-of-the-art automatic data labeling system Snuba by 26% and is only 7% away from the fully supervised upper bound.
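The cluster-then-label step can be sketched on invented 1D data: unlabeled points are clustered first, then each cluster inherits the label of the few-shot example(s) nearest its centroid. The tiny k-means here is a toy stand-in, not the paper's actual procedure.

```python
# Hedged sketch of cluster-then-label for few-shot learning.

def nearest(x, centroids):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))

def kmeans_1d(xs, centroids, iters=10):
    """Tiny 1D k-means: alternate assignment and centroid update."""
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for x in xs:
            groups[nearest(x, centroids)].append(x)
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids

def cluster_then_label(unlabeled, shots, centroids):
    """shots: list of (x, label) few-shot examples. Each cluster takes the
    label of the shot nearest its centroid; unlabeled points then take
    their cluster's label."""
    cluster_label = {i: min(shots, key=lambda s: abs(s[0] - c))[1]
                     for i, c in enumerate(centroids)}
    return [cluster_label[nearest(x, centroids)] for x in unlabeled]

unlabeled = [0.9, 1.1, 1.0, 5.0, 5.2, 4.8]
centroids = kmeans_1d(unlabeled, centroids=[0.0, 6.0])
labels = cluster_then_label(unlabeled, shots=[(1.05, "cat"), (5.1, "dog")],
                            centroids=centroids)
```

Because the clustering uses only unlabeled data, no labeled source dataset is needed, which is the independence property the abstract emphasizes.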
Affiliation(s)
- Renzhi Wu
- Georgia Institute of Technology, USA
- Xu Chu
- Georgia Institute of Technology, USA
21
Ciga O, Xu T, Martel AL. Self supervised contrastive learning for digital histopathology. Mach Learn Appl 2022. [DOI: 10.1016/j.mlwa.2021.100198] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022] Open
22
McAlpine ED, Michelow P, Celik T. The Utility of Unsupervised Machine Learning in Anatomic Pathology. Am J Clin Pathol 2022; 157:5-14. [PMID: 34302331 DOI: 10.1093/ajcp/aqab085] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2021] [Accepted: 04/18/2021] [Indexed: 01/29/2023] Open
Abstract
OBJECTIVES Developing accurate supervised machine learning algorithms is hampered by the lack of representative annotated datasets. Most data in anatomic pathology are unlabeled, and creating large, annotated datasets is a time-consuming and laborious process. Unsupervised learning, which does not require annotated data, has the potential to assist with this challenge. This review aims to introduce the concept of unsupervised learning and illustrate how clustering, generative adversarial networks (GANs), and autoencoders have the potential to address the lack of annotated data in anatomic pathology. METHODS A review of unsupervised learning with examples from the literature was carried out. RESULTS Clustering can be used as part of semisupervised learning, where labels are propagated from a subset of annotated data points to the remaining unlabeled data points in a dataset. GANs may assist by generating large amounts of synthetic data and performing color normalization. Autoencoders allow training of a network on a large, unlabeled dataset and transferring learned representations to a classifier using a smaller, labeled subset (unsupervised pretraining). CONCLUSIONS Unsupervised machine learning techniques such as clustering, GANs, and autoencoders, used individually or in combination, may help address the lack of annotated data in pathology and improve the process of developing supervised learning models.
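The label-propagation use of clustering mentioned in the Results can be sketched as below. The 1D feature space, class names, and nearest-neighbor rule are invented illustrations of the general idea, not a method from the review: labels from a small annotated subset spread outward to unlabeled points.

```python
# Hedged sketch of label propagation from a few annotated seeds: repeatedly
# label the unlabeled point closest to the already-labeled set, taking the
# label of its nearest labeled neighbor.

def propagate_labels(labeled, unlabeled):
    """labeled: dict x -> label; unlabeled: list of x values."""
    labeled = dict(labeled)
    pending = list(unlabeled)
    while pending:
        # pick the pending point nearest to any labeled point
        x = min(pending, key=lambda u: min(abs(u - v) for v in labeled))
        neighbor = min(labeled, key=lambda v: abs(x - v))
        labeled[x] = labeled[neighbor]
        pending.remove(x)
    return labeled

# Two annotated seeds stand in for the small labeled subset.
seeds = {0.0: "benign", 10.0: "malignant"}
result = propagate_labels(seeds, [1.0, 2.0, 9.0])
```

In practice the propagation would run in a learned feature space (e.g. autoencoder representations) rather than on raw 1D values.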
Affiliation(s)
- Ewen D McAlpine
- Division of Anatomical Pathology, School of Pathology, University of the Witwatersrand, Johannesburg, South Africa
- National Health Laboratory Service, Johannesburg, South Africa
- Pamela Michelow
- Division of Anatomical Pathology, School of Pathology, University of the Witwatersrand, Johannesburg, South Africa
- National Health Laboratory Service, Johannesburg, South Africa
- Turgay Celik
- School of Electrical and Information Engineering, University of the Witwatersrand, Johannesburg, South Africa
- Wits Institute of Data Science, University of the Witwatersrand, Johannesburg, South Africa
23
McAlpine ED, Michelow PM, Celik T. The Dynamics of Pathology Dataset Creation Using Urine Cytology as an Example. Acta Cytol 2021; 66:46-54. [PMID: 34662874 DOI: 10.1159/000519273] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2021] [Accepted: 08/26/2021] [Indexed: 11/19/2022]
Abstract
INTRODUCTION Dataset creation is one of the first tasks required for training AI algorithms but is underestimated in pathology. High-quality data are essential for training algorithms, and data should be labelled accurately and include sufficient morphological diversity. The dynamics and challenges of labelling a urine cytology dataset using The Paris System (TPS) criteria are presented. METHODS 2,454 images were labelled by pathologist consensus via video conferencing over a 14-day period. During the labelling sessions, the dynamics of the labelling process were recorded. Quality assurance images were randomly selected from images labelled in previous sessions within this study and randomly distributed throughout new labelling sessions. To assess the effect of time on the labelling process, the labelled set of images was split into 2 groups according to the median relative label time, and the time taken to label images and intersession agreement were assessed. RESULTS Labelling sessions ranged from 24 m 11 s to 41 m 06 s in length, with a median of 33 m 47 s. The majority of the 2,454 images were labelled as benign urothelial cells, with atypical and malignant urothelial cells more sparsely represented. The time taken to label individual images ranged from 1 s to 42 s, with a median of 2.9 s. Labelling times differed significantly among categories: the median label time for the atypical urothelial category was 7.2 s, followed by the malignant urothelial category at 3.8 s and the benign urothelial category at 2.9 s. The overall intersession agreement for quality assurance images was substantial. The level of agreement differed among classes of urothelial cells: the benign and malignant urothelial cell classes showed almost perfect agreement, whereas the atypical urothelial cell class showed moderate agreement. Image labelling appeared to speed up over time, and there was no evidence of worsening intersession agreement with session time.
DISCUSSION/CONCLUSION Important aspects of pathology dataset creation are presented, illustrating the significant resources required for labelling a large dataset. We present evidence that the time taken to categorise urine cytology images varies by diagnosis/class. The known challenges relating to the reproducibility of the AUC (atypical) category in TPS, compared to the NHGUC (benign) and HGUC (malignant) categories, are also confirmed.
Affiliation(s)
- Ewen David McAlpine
- National Health Laboratory Service and Division of Anatomical Pathology, University of the Witwatersrand, Johannesburg, South Africa
- Pamela M Michelow
- National Health Laboratory Service and Division of Anatomical Pathology, University of the Witwatersrand, Johannesburg, South Africa
- Turgay Celik
- School of Electrical and Information Engineering and Wits Institute of Data Science, University of the Witwatersrand, Johannesburg, South Africa
24
Koohbanani NA, Unnikrishnan B, Khurram SA, Krishnaswamy P, Rajpoot N. Self-Path: Self-Supervision for Classification of Pathology Images With Limited Annotations. IEEE Trans Med Imaging 2021; 40:2845-2856. [PMID: 33523807 DOI: 10.1109/tmi.2021.3056023] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
While high-resolution pathology images lend themselves well to 'data hungry' deep learning algorithms, obtaining exhaustive annotations on these images for learning is a major challenge. In this article, we propose a self-supervised convolutional neural network (CNN) framework to leverage unlabeled data for learning generalizable and domain-invariant representations in pathology images. Our proposed framework, termed Self-Path, employs multi-task learning, where the main task is tissue classification and the pretext tasks are a variety of self-supervised tasks with labels inherent to the input images. We introduce novel pathology-specific self-supervision tasks that leverage contextual, multi-resolution, and semantic features in pathology images for semi-supervised learning and domain adaptation. We investigate the effectiveness of Self-Path on 3 different pathology datasets. Our results show that Self-Path with the pathology-specific pretext tasks achieves state-of-the-art performance for semi-supervised learning when small amounts of labeled data are available. Further, we show that Self-Path improves domain adaptation for histopathology image classification when no labeled data are available for the target domain. This approach can potentially be employed for other applications in computational pathology, where the annotation budget is often limited or large amounts of unlabeled image data are available.
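A generic pretext task of the kind Self-Path builds on can be sketched as follows. This is illustrative only: rotation prediction is a well-known self-supervision task, while the paper's own pretext tasks are pathology-specific; the 2x2 "patch" is a toy stand-in for an image tile.

```python
# Illustrative self-supervision sketch: labels (number of quarter turns)
# come for free from the transformation, so no annotator is needed.

def rot90(grid):
    """Rotate a square grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def make_pretext_examples(patch):
    """Generate (rotated_patch, k) pairs, k = 0..3 quarter turns applied."""
    examples, g = [], patch
    for k in range(4):
        examples.append((g, k))
        g = rot90(g)
    return examples

patch = [[1, 2],
         [3, 4]]
examples = make_pretext_examples(patch)
```

A network trained to predict `k` must learn orientation-sensitive features of the tissue, which is the kind of label-free signal the main classification task can then reuse.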
25
Cain CN, Sudol PE, Berrier KL, Synovec RE. Development of variance rank initiated-unsupervised sample indexing for gas chromatography-mass spectrometry analysis. Talanta 2021; 233:122495. [PMID: 34215113] [DOI: 10.1016/j.talanta.2021.122495] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3]
Abstract
Traditional non-targeted chemometric workflows for gas chromatography-mass spectrometry (GC-MS) data rely on supervised methods, which require a priori knowledge of sample class membership. Herein, we propose a simple, unsupervised chemometric workflow known as variance rank initiated-unsupervised sample indexing (VRI-USI). VRI-USI discovers analyte peaks exhibiting high relative variance across all samples, followed by k-means clustering on the individual peaks. Based upon how the samples cluster for a given peak, a sample index assignment is provided. By a probabilistic argument, if the same sample index assignment appears for several discovered peaks, this outcome strongly suggests that the samples are properly classified by that particular assignment. Thus, relevant chemical differences between the samples have been discovered in an unsupervised fashion. The VRI-USI workflow is demonstrated on three increasingly difficult datasets: simulations, yeast metabolomics, and human cancer metabolomics. For simulated GC-MS datasets, VRI-USI discovered 85-90% of analytes modeled to vary between sample classes. Nineteen of 53 peaks in the peak table developed for the yeast metabolome dataset had the same sample index assignments, indicating that those indices are most likely due to class-distinguishing chemical differences. A t-test revealed that 22 of 53 peaks were statistically significant (p < 0.05) when using those sample index assignments. Likewise, for the human cancer metabolomics study, VRI-USI discovered 25 analytes that were statistically different (p < 0.05) using the sample index assignments determined to highlight meaningful sample-based differences. For all datasets, the sample index assignments deduced by VRI-USI matched the class-based differences known from prior information. VRI-USI holds promise as an exploratory data analysis workflow for studies in which analysts do not readily have a priori class information or want to uncover the underlying nature of their dataset.
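As a rough illustration of the workflow just described (relative-variance ranking, per-peak k-means, and a tally of recurring sample index assignments), the sketch below reimplements the idea on a synthetic peak table. It is not the authors' code; the function name, the number of top peaks, and the synthetic data are all invented for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

def vri_usi(peak_table, n_top=5, k=2, seed=0):
    """Illustrative VRI-USI sketch. peak_table: (n_samples, n_peaks) peak areas."""
    # 1) Rank peaks by relative variance across samples.
    rel_var = peak_table.var(axis=0) / (peak_table.mean(axis=0) ** 2 + 1e-12)
    top = np.argsort(rel_var)[::-1][:n_top]
    assignments = []
    for j in top:
        # 2) k-means clustering on each high-variance peak individually.
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(
            peak_table[:, j].reshape(-1, 1))
        # Relabel so the cluster with the lowest mean is always index 0,
        # making identical sample splits comparable across peaks.
        order = np.argsort([peak_table[labels == c, j].mean() for c in range(k)])
        remap = np.empty(k, dtype=int)
        remap[order] = np.arange(k)
        assignments.append(tuple(remap[labels]))
    # 3) The assignment recurring across the most peaks is the candidate
    #    class-distinguishing sample index assignment.
    uniq, counts = np.unique(np.array(assignments), axis=0, return_counts=True)
    return uniq[np.argmax(counts)], int(counts.max())

# Synthetic peak table: 20 samples, 30 peaks; peaks 0-3 differ between classes.
rng = np.random.default_rng(0)
classes = np.array([0] * 10 + [1] * 10)
peaks = rng.normal(1.0, 0.05, (20, 30))
peaks[:, :4] += classes[:, None] * 2.0
index_assignment, support = vri_usi(peaks)
```

With four class-separating peaks among the top peaks by relative variance, the same canonical assignment recurs and recovers the class split without any labels.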
Affiliation(s)
- Caitlin N Cain
- Department of Chemistry, Box 351700, University of Washington, Seattle, WA, 98195, USA
- Paige E Sudol
- Department of Chemistry, Box 351700, University of Washington, Seattle, WA, 98195, USA
- Kelsey L Berrier
- Department of Chemistry, Box 351700, University of Washington, Seattle, WA, 98195, USA
- Robert E Synovec
- Department of Chemistry, Box 351700, University of Washington, Seattle, WA, 98195, USA.
26
Gomes J, Kong J, Kurc T, Melo ACMA, Ferreira R, Saltz JH, Teodoro G. Building robust pathology image analyses with uncertainty quantification. Comput Methods Programs Biomed 2021; 208:106291. [PMID: 34333205] [DOI: 10.1016/j.cmpb.2021.106291] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7]
Abstract
BACKGROUND AND OBJECTIVE Computerized pathology image analysis is an important tool in research and clinical settings, which enables quantitative tissue characterization and can assist a pathologist's evaluation. The aim of our study is to systematically quantify and minimize uncertainty in the output of computer-based pathology image analysis. METHODS Uncertainty quantification (UQ) and sensitivity analysis (SA) methods, such as Variance-Based Decomposition (VBD) and Morris One-At-a-Time (MOAT), are employed to track and quantify uncertainty in a real-world application with large Whole Slide Imaging datasets: 943 Breast Invasive Carcinoma (BRCA) and 381 Lung Squamous Cell Carcinoma (LUSC) patients. Because these studies are compute intensive, high-performance computing systems and efficient UQ/SA methods were combined to keep execution practical. UQ/SA highlighted the application parameters that most affect the results, as well as the nuclear features that carry most of the uncertainty. Using this information, we built a method for selecting stable features that minimize application output uncertainty. RESULTS The results show that input parameter variations significantly impact all stages (segmentation, feature computation, and survival analysis) of the use-case application. We then identified and classified features according to their robustness to parameter variation; using the proposed feature selection strategy, patient grouping stability in survival analysis improved by 17% and 34% for BRCA and LUSC, respectively. CONCLUSIONS This strategy created more robust analyses, demonstrating that SA and UQ are important methods that may increase confidence in digital pathology.
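Of the SA methods named above, Morris One-At-a-Time screening is simple enough to sketch directly: it averages absolute "elementary effects" of one-parameter-at-a-time perturbations. The toy model below stands in for a segmentation/feature pipeline and is purely illustrative, not the paper's implementation.

```python
import numpy as np

def morris_oat(model, n_params, n_traj=50, delta=0.1, seed=0):
    """Mean absolute elementary effect (mu*) per parameter."""
    rng = np.random.default_rng(seed)
    effects = [[] for _ in range(n_params)]
    for _ in range(n_traj):
        x = rng.uniform(0, 1 - delta, n_params)   # random base point
        base = model(x)
        for i in rng.permutation(n_params):       # perturb one parameter at a time
            x2 = x.copy()
            x2[i] += delta
            effects[i].append((model(x2) - base) / delta)
    # mu* ranks how strongly each input parameter drives the output.
    return np.array([np.mean(np.abs(e)) for e in effects])

# Toy "pipeline": parameter 0 dominates, parameter 2 is inert.
toy = lambda p: 10 * p[0] + 1 * p[1] + 0 * p[2]
mu_star = morris_oat(toy, n_params=3)
```

Parameters with large mu* would then be candidates for tighter control, while features insensitive to them would count as "stable" in the sense used above.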
Affiliation(s)
- Jeremias Gomes
- Department of Computer Science, University of Brasília, Brasília, Brazil
- Jun Kong
- Biomedical Informatics Department, Emory University, Atlanta, USA; Department of Biomedical Engineering, Emory-Georgia Institute of Technology, Atlanta, USA; Department of Mathematics and Statistics, Georgia State University, Atlanta, USA
- Tahsin Kurc
- Biomedical Informatics Department, Stony Brook University, Stony Brook, USA; Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, USA
- Alba C M A Melo
- Department of Computer Science, University of Brasília, Brasília, Brazil
- Renato Ferreira
- Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Joel H Saltz
- Biomedical Informatics Department, Stony Brook University, Stony Brook, USA
- George Teodoro
- Department of Computer Science, University of Brasília, Brasília, Brazil; Biomedical Informatics Department, Stony Brook University, Stony Brook, USA; Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil.
27
Barragán-Montero A, Javaid U, Valdés G, Nguyen D, Desbordes P, Macq B, Willems S, Vandewinckele L, Holmström M, Löfman F, Michiels S, Souris K, Sterpin E, Lee JA. Artificial intelligence and machine learning for medical imaging: A technology review. Phys Med 2021; 83:242-256. [PMID: 33979715] [PMCID: PMC8184621] [DOI: 10.1016/j.ejmp.2021.04.016] [Citation(s) in RCA: 90] [Impact Index Per Article: 30.0]
Abstract
Artificial intelligence (AI) has recently become a very popular buzzword, as a consequence of disruptive technical advances and impressive experimental results, notably in the field of image analysis and processing. In medicine, specialties where images are central, like radiology, pathology or oncology, have seized the opportunity, and considerable research and development efforts have been deployed to transfer the potential of AI to clinical applications. With AI becoming a more mainstream tool for typical medical imaging analysis tasks, such as diagnosis, segmentation, or classification, the key to safe and efficient use of clinical AI applications lies, in part, with informed practitioners. The aim of this review is to present the basic technological pillars of AI, together with state-of-the-art machine learning methods and their application to medical imaging. In addition, we discuss new trends and future research directions. This will help the reader understand how AI methods are becoming a ubiquitous tool in any medical image analysis workflow and pave the way for the clinical implementation of AI-based solutions.
Affiliation(s)
- Ana Barragán-Montero
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium.
- Umair Javaid
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium
- Gilmer Valdés
- Department of Radiation Oncology, Department of Epidemiology and Biostatistics, University of California, San Francisco, USA
- Dan Nguyen
- Medical Artificial Intelligence and Automation (MAIA) Laboratory, Department of Radiation Oncology, UT Southwestern Medical Center, USA
- Paul Desbordes
- Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), UCLouvain, Belgium
- Benoit Macq
- Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), UCLouvain, Belgium
- Siri Willems
- ESAT/PSI, KU Leuven Belgium & MIRC, UZ Leuven, Belgium
- Steven Michiels
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium
- Kevin Souris
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium
- Edmond Sterpin
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium; KU Leuven, Department of Oncology, Laboratory of Experimental Radiotherapy, Belgium
- John A Lee
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium
28
Wang X, Chen H, Xiang H, Lin H, Lin X, Heng PA. Deep virtual adversarial self-training with consistency regularization for semi-supervised medical image classification. Med Image Anal 2021; 70:102010. [PMID: 33677262] [DOI: 10.1016/j.media.2021.102010] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3]
Abstract
Convolutional neural networks have achieved prominent success on a variety of medical imaging tasks when a large amount of labeled training data is available. However, the acquisition of expert annotations for medical data is usually expensive and time-consuming, which poses a great challenge for supervised learning approaches. In this work, we propose a novel semi-supervised deep learning method, i.e., deep virtual adversarial self-training with consistency regularization, for large-scale medical image classification. To effectively exploit useful information from unlabeled data, we leverage self-training and consistency regularization to harness the underlying knowledge, which helps improve the discrimination capability of training models. More concretely, the model first uses its prediction for pseudo-labeling on the weakly-augmented input image. A pseudo-label is kept only if the corresponding class probability is of high confidence. Then the model prediction is encouraged to be consistent with the strongly-augmented version of the same input image. To improve the robustness of the network against virtual adversarial perturbed input, we incorporate virtual adversarial training (VAT) on both labeled and unlabeled data into the course of training. Hence, the network is trained by minimizing a combination of three types of losses, including a standard supervised loss on labeled data, a consistency regularization loss on unlabeled data, and a VAT loss on both labeled and unlabeled data. We extensively evaluate the proposed semi-supervised deep learning methods on two challenging medical image classification tasks: breast cancer screening from ultrasound images and multi-class ophthalmic disease classification from optical coherence tomography B-scan images. Experimental results demonstrate that the proposed method outperforms both supervised baseline and other state-of-the-art methods by a large margin on all tasks.
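The objective described above combines a supervised term with a confidence-masked consistency term between weak- and strong-augmentation predictions. The numpy sketch below computes those two terms on made-up probability vectors; it is a simplified illustration (the VAT term and all model/augmentation machinery are omitted), not the paper's training code.

```python
import numpy as np

def xent(p, onehot):
    """Per-sample cross-entropy between predicted probs and a one-hot target."""
    return -np.sum(onehot * np.log(p + 1e-12), axis=1)

def semi_loss(p_labeled, y_onehot, p_weak, p_strong, tau=0.95, lam=1.0):
    sup = xent(p_labeled, y_onehot).mean()              # supervised CE
    conf = p_weak.max(axis=1)                           # pseudo-label confidence
    pseudo = np.eye(p_weak.shape[1])[p_weak.argmax(axis=1)]
    mask = (conf >= tau).astype(float)                  # keep confident ones only
    cons = (mask * xent(p_strong, pseudo)).sum() / max(mask.sum(), 1.0)
    return sup + lam * cons

p_l = np.array([[0.9, 0.1], [0.2, 0.8]])     # predictions on labeled images
y   = np.array([[1.0, 0.0], [0.0, 1.0]])     # ground-truth labels
p_w = np.array([[0.99, 0.01], [0.6, 0.4]])   # weak-augmentation predictions
p_s = np.array([[0.7, 0.3], [0.5, 0.5]])     # strong-augmentation predictions
loss = semi_loss(p_l, y, p_w, p_s)
```

Only the first unlabeled sample clears the confidence threshold, so only its strong-augmentation prediction contributes to the consistency term.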
Affiliation(s)
- Xi Wang
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Hao Chen
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China.
- Huiling Xiang
- Department of Ultrasound, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, 651 Dongfeng East Road, Guangzhou 510060, China
- Huangjing Lin
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Xi Lin
- Department of Ultrasound, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, 651 Dongfeng East Road, Guangzhou 510060, China.
- Pheng-Ann Heng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China; Shenzhen Key Laboratory of Virtual Reality and Human Interaction Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China
29
Laohakiat S, Sa-ing V. An incremental density-based clustering framework using fuzzy local clustering. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2020.08.052] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7]
30
Reducing annotation effort in digital pathology: A Co-Representation learning framework for classification tasks. Med Image Anal 2021; 67:101859. [DOI: 10.1016/j.media.2020.101859] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7]
31
Predicting and Interpreting Students' Grades in Distance Higher Education through a Semi-Regression Method. Appl Sci (Basel) 2020. [DOI: 10.3390/app10238413] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3]
Abstract
Multi-view learning is a machine learning approach aiming to exploit the knowledge retrieved from data represented by multiple feature subsets, known as views. Co-training is considered the most representative form of multi-view learning and a very effective semi-supervised classification algorithm for building highly accurate and robust predictive models. Even though it has been implemented in various scientific fields, it has not been adequately used in educational data mining and learning analytics, since the hypothesis about the existence of two feature views cannot be easily satisfied. Some notable studies have emerged recently dealing with semi-supervised classification tasks, such as student performance or student dropout prediction, while semi-supervised regression remains uncharted territory. Therefore, the present study attempts to implement a semi-regression algorithm for predicting the grades of undergraduate students in the final exams of a one-year online course, which exploits three independent and naturally formed feature views, since they are derived from different sources. Moreover, we examine a well-established framework for interpreting the acquired results regarding their contribution to the final outcome per student/instance. To this purpose, a plethora of experiments is conducted based on data offered by the Hellenic Open University and representative machine learning algorithms. The experimental results demonstrate that early prognosis of students at risk of failure can be accurately achieved, compared to supervised models, even for a small amount of data collected during the first two semesters. The robustness of the applied semi-supervised regression scheme, along with supervised learners, and the investigation of features' reasoning could highly benefit the educational domain.
32
Li X, Plataniotis KN. How much off-the-shelf knowledge is transferable from natural images to pathology images? PLoS One 2020; 15:e0240530. [PMID: 33052964] [PMCID: PMC7556818] [DOI: 10.1371/journal.pone.0240530] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5]
Abstract
Deep learning has achieved great success in natural image classification. To overcome data scarcity in computational pathology, recent studies exploit transfer learning to reuse knowledge gained from natural images in pathology image analysis, aiming to build effective pathology image diagnosis models. Since transferability of knowledge heavily depends on the similarity of the original and target tasks, significant differences in image content and statistics between pathology images and natural images raise the questions: how much knowledge is transferable? Is the transferred information equally contributed by pre-trained layers? If not, is there a sweet spot in transfer learning that balances the transferred model's complexity and performance? To answer these questions, this paper proposes a framework to quantify the knowledge gained by a particular layer, conducts an empirical investigation in pathology-image-centered transfer learning, and reports some interesting observations. In particular, compared to the performance baseline obtained by a random-weight model, though the transferability of off-the-shelf representations from deep layers heavily depends on the specific pathology image set, the general representations generated by early layers do convey transferred knowledge in various image classification applications. The trade-off between transferable performance and transferred model complexity observed in this study encourages further investigation of specific metrics and tools to quantify the effectiveness of transfer learning in the future.
Affiliation(s)
- Xingyu Li
- Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada
- Konstantinos N. Plataniotis
- The Edward S. Rogers Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada
33
Pulido JV, Guleria S, Ehsan L, Fasullo M, Lippman R, Mutha P, Shah T, Syed S, Brown DE. Semi-Supervised Classification of Noisy, Gigapixel Histology Images. Proc IEEE Int Symp Bioinformatics Bioeng 2020; 2020:563-568. [PMID: 34046246] [PMCID: PMC8144886] [DOI: 10.1109/bibe50027.2020.00097] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3]
Abstract
One of the greatest obstacles to the adoption of deep neural networks for new medical applications is that training these models typically requires a large amount of manually labeled training samples. In this work, we investigate the semi-supervised scenario where one has access to large amounts of unlabeled data and only a few labeled samples. We study the performance of MixMatch and FixMatch, two popular semi-supervised learning methods, on a histology dataset. More specifically, we study how these models behave in a highly noisy and imbalanced setting. The findings here motivate the development of semi-supervised methods to ameliorate problems commonly encountered in medical data applications.
Affiliation(s)
- J Vince Pulido
- Applied Physics Laboratory, Johns Hopkins University, Laurel, MD
- Shan Guleria
- Dept. of Internal Medicine, Rush University Medical Center, Chicago, IL
- Lubaina Ehsan
- School of Medicine, University of Virginia, Charlottesville, VA
- Matthew Fasullo
- Division of Gastroenterology, Hepatology and Nutrition, Virginia Commonwealth University, Richmond, VA
- Robert Lippman
- Hunter Holmes McGuire, Veterans Affairs Medical Center, Richmond, VA
- Pritesh Mutha
- Hunter Holmes McGuire, Veterans Affairs Medical Center, Richmond, VA
- Tilak Shah
- Hunter Holmes McGuire, Veterans Affairs Medical Center, Richmond, VA
- Sana Syed
- School of Medicine, University of Virginia, Charlottesville, VA
- Donald E Brown
- School of Data Science, University of Virginia, Charlottesville, VA
34
Zhuo L, Cheng Y, Liu S, Yang Y, Tang S, Zhen J, Zhao J, Zhan S. A Multiview Model for Detecting the Inappropriate Use of Prescription Medication: Machine Learning Approach. JMIR Med Inform 2020; 8:e16312. [PMID: 32209527] [PMCID: PMC7381037] [DOI: 10.2196/16312] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3]
Abstract
Background The inappropriate use of prescription medication has recently garnered worldwide attention, but most national policies do not effectively provide for early detection or timely intervention. Objective This study aimed to develop, and assess the validity of, a model that can detect the inappropriate use of prescription medication by combining a multiview topic model with a topic-matching method. Methods A multiview extension of the latent Dirichlet allocation algorithm for topic modeling was chosen to generate diagnosis-medication topics, with data obtained from the Chinese Monitoring Network for Rational Use of Drugs (CMNRUD) database. Topic mapping allowed for calculating the degree to which diagnoses and medications were similarly distributed and, by setting a threshold, for identifying prescription misuse. The Beijing Regional Prescription Review Database (BRPRD) was used as the gold standard to assess the model's validity. We also conducted a sensitivity analysis using random samples of validated prescriptions and evaluated the model's performance. Results A total of 44 million prescriptions were used to generate topics using the diagnoses and medications from the CMNRUD database. A random sample (15,000 prescriptions) from the BRPRD was used for validation, and it was found that the model had a sensitivity of 81.8%, specificity of 47.4%, positive-predictive value of 14.5%, and negative-predictive value of 96.0%. The model showed superior stability under different sampling proportions. Conclusions A method that combines multiview topic modeling and topic matching can detect the inappropriate use of prescription medication. This model, which has mediocre specificity and moderate sensitivity, can be used as a primary screening tool and will likely complement and improve the process of manually reviewing prescriptions.
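The topic-matching step, i.e. flagging a prescription when the topic distributions implied by its diagnoses and its medications are too dissimilar, can be sketched with a divergence threshold. The distributions and threshold below are invented for illustration; they are not the fitted CMNRUD model or the study's operating point.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence between two topic distributions (base-2)."""
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log2((a + 1e-12) / (b + 1e-12)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def flag_prescription(diag_topics, med_topics, threshold=0.3):
    """Flag as potentially inappropriate when diagnosis-side and
    medication-side topic distributions diverge beyond the threshold."""
    return js_divergence(diag_topics, med_topics) > threshold

# A prescription whose medication topics track its diagnosis topics...
matched = flag_prescription(np.array([0.7, 0.2, 0.1]),
                            np.array([0.6, 0.3, 0.1]))
# ...versus one whose topic mass sits on an unrelated topic.
mismatched = flag_prescription(np.array([0.9, 0.05, 0.05]),
                               np.array([0.05, 0.05, 0.9]))
```

In the study, the analogous similarity score and threshold come from the multiview LDA model rather than hand-set vectors; the sketch only shows the matching logic.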
Affiliation(s)
- Lin Zhuo
- Research Center of Clinical Epidemiology, Peking University Third Hospital, Beijing, China; Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Yinchu Cheng
- Department of Pharmacy, Peking University Third Hospital, Beijing, China
- Shaoqin Liu
- School of Electronics Engineering and Computer Science, Peking University, Beijing, China
- Yu Yang
- Center for Data Science in Medicine and Health, Peking University, Beijing, China
- Shuang Tang
- School of Electronics Engineering and Computer Science, Peking University, Beijing, China
- Jiancun Zhen
- Department of Pharmacy, Ji Shui Tan Hospital and Fourth Medical College of Peking University, Beijing, China
- Junfeng Zhao
- School of Electronics Engineering and Computer Science, Peking University, Beijing, China
- Siyan Zhan
- Research Center of Clinical Epidemiology, Peking University Third Hospital, Beijing, China; Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
35
McAlpine ED, Michelow P. The cytopathologist's role in developing and evaluating artificial intelligence in cytopathology practice. Cytopathology 2020; 31:385-392. [PMID: 31957101] [DOI: 10.1111/cyt.12799] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3]
Abstract
Artificial intelligence (AI) technologies have the potential to transform cytopathology practice, and it is important for cytopathologists to embrace this and place themselves at the forefront of implementing these technologies in cytopathology. This review illustrates an archetypal AI workflow from project conception to implementation in a diagnostic setting and details the cytopathologist's role and level of involvement at each stage of the process. Cytopathologists need to develop and maintain a basic understanding of AI, drive decisions regarding the development and implementation of AI in cytopathology, participate in the generation of datasets used to train and evaluate AI algorithms, understand how the performance of these algorithms is assessed, participate in the validation of these algorithms (either at a regulatory level or in the laboratory setting), and ensure continuous quality assurance of algorithms deployed in a diagnostic setting. In addition, cytopathologists should ensure that these algorithms are developed, trained, tested and deployed in an ethical manner. Cytopathologists need to become informed consumers of these AI algorithms by understanding their workings and limitations, how their performance is assessed and how to validate and verify their output in clinical practice.
Affiliation(s)
- Ewen D McAlpine
- Cytology Unit, Department of Anatomical Pathology, Faculty of Health Science, National Health Laboratory Service, University of the Witwatersrand, Johannesburg, South Africa
- Pamela Michelow
- Cytology Unit, Department of Anatomical Pathology, Faculty of Health Science, National Health Laboratory Service, University of the Witwatersrand, Johannesburg, South Africa
36
Kurc T, Bakas S, Ren X, Bagari A, Momeni A, Huang Y, Zhang L, Kumar A, Thibault M, Qi Q, Wang Q, Kori A, Gevaert O, Zhang Y, Shen D, Khened M, Ding X, Krishnamurthi G, Kalpathy-Cramer J, Davis J, Zhao T, Gupta R, Saltz J, Farahani K. Segmentation and Classification in Digital Pathology for Glioma Research: Challenges and Deep Learning Approaches. Front Neurosci 2020; 14:27. [PMID: 32153349] [PMCID: PMC7046596] [DOI: 10.3389/fnins.2020.00027] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3]
Abstract
Biomedical imaging is an important source of information in cancer research. Characterizations of cancer morphology at onset, progression, and in response to treatment provide complementary information to that gleaned from genomics and clinical data. Accurate extraction and classification of both visual and latent image features is an increasingly complex challenge due to the increased complexity and resolution of biomedical image data. In this paper, we present four deep learning-based image analysis methods from the Computational Precision Medicine (CPM) satellite event of the 21st International Medical Image Computing and Computer Assisted Intervention (MICCAI 2018) conference. One method is a segmentation method designed to segment nuclei in whole slide tissue images (WSIs) of adult diffuse glioma cases. It achieved a Dice similarity coefficient of 0.868 with the CPM challenge datasets. Three methods are classification methods developed to categorize adult diffuse glioma cases into oligodendroglioma and astrocytoma classes using radiographic and histologic image data. These methods achieved accuracy values of 0.75, 0.80, and 0.90, measured as the ratio of the number of correct classifications to the number of total cases, with the challenge datasets. The evaluations of the four methods indicate that (1) carefully constructed deep learning algorithms are able to produce high accuracy in the analysis of biomedical image data and (2) the combination of radiographic with histologic image information improves classification performance.
Affiliation(s)
- Tahsin Kurc
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, United States
- Spyridon Bakas
- Center for Biomedical Image Computing and Analytics, University of Pennsylvania, Philadelphia, PA, United States
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Xuhua Ren
- Institute for Medical Imaging Technology, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Aditya Bagari
- Department of Engineering Design, Indian Institute of Technology Madras, Chennai, India
- Alexandre Momeni
- Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA, United States
- Yue Huang
- School of Informatics, Xiamen University, Xiamen, China
- Lichi Zhang
- Institute for Medical Imaging Technology, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Ashish Kumar
- Department of Engineering Design, Indian Institute of Technology Madras, Chennai, India
- Marc Thibault
- Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA, United States
- Qi Qi
- School of Informatics, Xiamen University, Xiamen, China
- Qian Wang
- Institute for Medical Imaging Technology, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Avinash Kori
- Department of Engineering Design, Indian Institute of Technology Madras, Chennai, India
- Olivier Gevaert
- Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA, United States
- Yunlong Zhang
- School of Informatics, Xiamen University, Xiamen, China
- Dinggang Shen
- Department of Radiology and BRIC, The University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea
- Mahendra Khened
- Department of Engineering Design, Indian Institute of Technology Madras, Chennai, India
- Xinghao Ding
- School of Informatics, Xiamen University, Xiamen, China
- Jayashree Kalpathy-Cramer
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
- James Davis
- Department of Pathology, Stony Brook University, Stony Brook, NY, United States
- Tianhao Zhao
- Department of Pathology, Stony Brook University, Stony Brook, NY, United States
- Rajarsi Gupta
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, United States
- Department of Pathology, Stony Brook University, Stony Brook, NY, United States
- Joel Saltz
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, United States
- Keyvan Farahani
- Cancer Imaging Program, National Cancer Institute, National Institutes of Health, Bethesda, MD, United States
37
Active Semi-Supervised Random Forest for Hyperspectral Image Classification. Remote Sens (Basel) 2019. [DOI: 10.3390/rs11242974] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6]
Abstract
Random forest (RF) has achieved great success in hyperspectral image (HSI) classification. However, RF cannot reach its full potential when labeled samples are limited. To address this issue, we propose a unified framework that embeds active learning (AL) and semi-supervised learning (SSL) into RF (ASSRF), exploiting both simultaneously to improve the performance of RF. The objective is to train classifiers with relatively high classification accuracy from a small number of manually labeled samples. To achieve this goal, a new query function is designed to select the most informative samples for manual labeling, and a new pseudolabeling strategy is introduced to select samples for pseudolabeling. Compared with other AL- and SSL-based methods, the proposed method has several advantages. First, ASSRF uses spatial information to construct the AL query function, which selects more informative samples. Second, in addition to providing more labeled samples for SSL, the proposed pseudolabeling method avoids the bias introduced by AL-labeled samples. Finally, the proposed model retains the advantages of RF. Experiments on three real hyperspectral data sets show that the proposed method outperforms other state-of-the-art methods.
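The AL/SSL loop this abstract describes can be sketched with a plain random forest. A minimal sketch, assuming a synthetic dataset, a margin-based query function (the paper's spatial query function is not reproduced), and an illustrative 0.95 pseudo-label cut-off:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
labeled = np.zeros(len(y), dtype=bool)
labeled[:30] = True          # small initial pool of oracle-labeled samples
pseudo = {}                  # sample index -> pseudo-label

rf = RandomForestClassifier(n_estimators=100, random_state=0)
for _ in range(5):
    # train on oracle-labeled plus pseudo-labeled samples
    train_idx = np.where(labeled)[0]
    y_train = y[train_idx]
    if pseudo:
        train_idx = np.concatenate([train_idx, np.fromiter(pseudo, dtype=int)])
        y_train = np.concatenate([y_train,
                                  np.fromiter(pseudo.values(), dtype=int)])
    rf.fit(X[train_idx], y_train)

    pool = np.where(~labeled)[0]
    proba = rf.predict_proba(X[pool])

    # AL query: the sample with the smallest margin between its two most
    # probable classes is the most informative, so it goes to the oracle
    top2 = np.sort(proba, axis=1)[:, -2:]
    query = int(pool[np.argmin(top2[:, 1] - top2[:, 0])])
    labeled[query] = True
    pseudo.pop(query, None)  # an oracle label supersedes any pseudo-label

    # SSL: pseudo-label the highly confident remaining pool samples
    for j, i in enumerate(pool):
        if proba[j].max() > 0.95 and not labeled[i]:
            pseudo[int(i)] = int(rf.classes_[proba[j].argmax()])
```

Each round adds exactly one oracle-labeled sample and any number of pseudo-labeled ones, so the forest is retrained on a steadily growing training set.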
38
Shi X, Su H, Xing F, Liang Y, Qu G, Yang L. Graph temporal ensembling based semi-supervised convolutional neural network with noisy labels for histopathology image analysis. Med Image Anal 2019; 60:101624. [PMID: 31841948] [DOI: 10.1016/j.media.2019.101624]
Abstract
Although convolutional neural networks have achieved tremendous success in histopathology image classification, they usually require large-scale, cleanly annotated data and are sensitive to noisy labels. Unfortunately, labeling large-scale image sets is laborious and expensive for pathologists, and the resulting labels are not always reliable. To address these problems, we propose a novel self-ensembling deep architecture that leverages the semantic information of annotated images, exploits the information hidden in unlabeled data, and remains robust to noisy labels. Specifically, the architecture first creates ensemble targets for the feature and label predictions of training samples, using an exponential moving average (EMA) to aggregate feature and label predictions across multiple previous training epochs. The ensemble targets within the same class are then mapped into a cluster to further enhance them. Next, a consistency cost is used to form consensus predictions under different configurations. Finally, we validate the method with extensive experiments on lung and breast cancer datasets containing thousands of images. It achieves 90.5% and 89.5% image classification accuracy using only 20% of labeled patients on the two datasets, respectively, which is comparable to the baseline trained with all labeled patients. Experiments also demonstrate robustness to a small percentage of noisy labels.
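The EMA ensemble-target idea at the core of this architecture can be illustrated in a few lines. A sketch under stated assumptions: the Dirichlet draws stand in for per-epoch softmax outputs, alpha = 0.6 is an assumed decay rate, and the clustering step and the training loop itself are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes, alpha = 4, 3, 0.6

Z = np.zeros((n_samples, n_classes))       # running EMA of predictions
for epoch in range(1, 11):
    # stand-in for the network's softmax outputs at this epoch
    preds = rng.dirichlet(np.ones(n_classes), size=n_samples)
    Z = alpha * Z + (1 - alpha) * preds    # aggregate across epochs
    targets = Z / (1 - alpha ** epoch)     # startup bias correction
    # consistency cost: pull current predictions toward ensemble targets
    consistency_cost = np.mean((preds - targets) ** 2)
```

After bias correction each target row is again a proper distribution (rows sum to 1), so it can serve directly as a soft ensemble label.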
Affiliation(s)
- Xiaoshuang Shi: J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, United States
- Hai Su: J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, United States
- Fuyong Xing: Department of Biostatistics and Informatics, University of Colorado, Denver, United States
- Yun Liang: J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, United States
- Gang Qu: J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, United States
- Lin Yang: J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, United States
39
Gupta R, Kurc T, Sharma A, Almeida JS, Saltz J. The Emergence of Pathomics. Current Pathobiology Reports 2019. [DOI: 10.1007/s40139-019-00200-x]
40
Wu G, Zhang D, Chen W, Zuo W, Xia Z. Robust Deep Softmax Regression Against Label Noise for Unsupervised Domain Adaptation. Int J Pattern Recogn 2019. [DOI: 10.1142/s0218001419400020]
Abstract
Domain adaptation aims to generalize a classification model from a source domain to a different but related target domain. Recent studies have revealed the benefit of deep convolutional features trained on large datasets (e.g., ImageNet) in alleviating domain discrepancy. However, the literature shows that the transferability of features decreases as (i) the difference between the source and target domains increases, or (ii) the layers move toward the top of the network. Therefore, even with deep features, domain adaptation remains necessary. In this paper, we propose a novel unsupervised domain adaptation (UDA) model for deep neural networks that is learned from labeled source samples and unlabeled target samples simultaneously. Target samples without labels are assigned pseudo labels according to their maximum classification scores during training of the UDA model. However, due to the domain discrepancy, label noise is generally inevitable and degrades the performance of the adaptation model. Thus, to use the target samples effectively, three robust deep softmax regression (RDSR) functions are applied to samples with high, medium and low classification confidence, respectively. Extensive experiments show that our method yields state-of-the-art results, demonstrating the effectiveness of the robust deep softmax regression classifier in UDA.
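The confidence-tiered pseudo-labeling step can be sketched directly. The score matrix and the 0.8/0.5 cut-offs below are illustrative assumptions, and the three robust regression losses themselves are not reproduced:

```python
import numpy as np

# stand-in classification scores for four unlabeled target samples
scores = np.array([[0.90, 0.05, 0.05],
                   [0.60, 0.30, 0.10],
                   [0.40, 0.35, 0.25],
                   [0.20, 0.75, 0.05]])

# pseudo label = class with the maximum score; confidence = that score
pseudo_labels = scores.argmax(axis=1)
conf = scores.max(axis=1)

# split into the three confidence tiers (assumed cut-offs 0.8 and 0.5);
# each tier would get its own robust softmax-regression term
high = np.where(conf >= 0.8)[0]
medium = np.where((conf >= 0.5) & (conf < 0.8))[0]
low = np.where(conf < 0.5)[0]
```

Low-confidence pseudo labels are the noisiest, which is why a single loss over all target samples would be dominated by label noise; tiering lets each loss make weaker or stronger use of its samples.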
Affiliation(s)
- Guangbin Wu: State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, P. R. China
- David Zhang: Department of Computing, The Hong Kong Polytechnic University, Hong Kong, P. R. China
- Weishan Chen: State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, P. R. China
- Wangmeng Zuo: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, P. R. China
- Zhuang Xia: State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, P. R. China
41
Cordier T, Lanzén A, Apothéloz-Perret-Gentil L, Stoeck T, Pawlowski J. Embracing Environmental Genomics and Machine Learning for Routine Biomonitoring. Trends Microbiol 2019; 27:387-397. [DOI: 10.1016/j.tim.2018.10.012]
42
Saber H, Somai M, Rajah GB, Scalzo F, Liebeskind DS. Predictive analytics and machine learning in stroke and neurovascular medicine. Neurol Res 2019; 41:681-690. [PMID: 31038007] [DOI: 10.1080/01616412.2019.1609159]
Abstract
Advances in predictive analytics and machine learning, supported by an ever-increasing wealth of data and processing power, are transforming almost every industry. The accuracy and precision of predictive analytics have increased significantly over the past few years and are evolving at an exponential pace. There have been significant breakthroughs in applying predictive analytics to healthcare, where it is held up as the foundation of precision medicine. Yet, although research in the field is expanding, with a profuse volume of papers applying machine learning algorithms to medical data, very few have contributed meaningfully to clinical care. This lack of impact stands in stark contrast to the enormous relevance of machine learning to many other industries. Regardless of its current contribution, predictive analytics is expected to fundamentally change the way we diagnose and treat diseases, as well as how biomedical research is conducted. In this review, we describe the main tools and techniques in predictive analytics, analyze trends in the application of these techniques over recent years, provide examples of applications in medicine, specifically in stroke and neurovascular research, and outline current limitations.
Affiliation(s)
- Hamidreza Saber: Department of Neurology, Wayne State University, Detroit, MI, USA
- Melek Somai: Neuro-Epidemiology and Ageing Research Unit, School of Public Health, Imperial College London, London, UK
- Gary B Rajah: Department of Neurosurgery, Wayne State University, Detroit, MI, USA
- Fabien Scalzo: Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA
- David S Liebeskind: Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA
43
Vu QD, Graham S, Kurc T, To MNN, Shaban M, Qaiser T, Koohbanani NA, Khurram SA, Kalpathy-Cramer J, Zhao T, Gupta R, Kwak JT, Rajpoot N, Saltz J, Farahani K. Methods for Segmentation and Classification of Digital Microscopy Tissue Images. Front Bioeng Biotechnol 2019; 7:53. [PMID: 31001524] [PMCID: PMC6454006] [DOI: 10.3389/fbioe.2019.00053]
Abstract
High-resolution microscopy images of tissue specimens provide detailed information about the morphology of normal and diseased tissue. Image analysis of tissue morphology can help cancer researchers develop a better understanding of cancer biology. Segmentation of nuclei and classification of tissue images are two common tasks in tissue image analysis. Developing accurate and efficient algorithms for these tasks is challenging because of the complexity of tissue morphology and tumor heterogeneity. In this paper we present two computer algorithms: one designed for segmentation of nuclei and the other for classification of whole slide tissue images. The segmentation algorithm implements a multiscale deep residual aggregation network to accurately segment nuclear material and then separate clumped nuclei into individual nuclei. The classification algorithm first carries out patch-level classification with a deep learning method; patch-level statistical and morphological features are then used as input to a random forest regression model for whole slide image classification. Both algorithms were evaluated in the MICCAI 2017 Digital Pathology challenge, where the segmentation algorithm achieved an accuracy score of 0.78 and the classification algorithm an accuracy score of 0.81, the highest scores in the challenge.
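The two-stage classification design (patch-level scores summarized into slide-level features for a random forest) can be sketched as follows. The beta-distributed patch probabilities, the four summary statistics, and the slide scores are illustrative stand-ins for the CNN outputs and the paper's actual feature set:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def slide_features(patch_probs):
    """Summarize per-patch positive-class probabilities for one slide."""
    return np.array([patch_probs.mean(),
                     patch_probs.std(),
                     np.percentile(patch_probs, 90),
                     (patch_probs > 0.5).mean()])  # fraction of positive patches

# synthetic training set: 40 slides x 100 patch probabilities each,
# with slide-level scores spread over [0, 1]
X = np.stack([slide_features(rng.beta(a, 2.0, size=100))
              for a in np.linspace(0.5, 4.0, 40)])
y = np.linspace(0.0, 1.0, 40)

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
pred = rf.predict(X[:1])   # slide-level score for the first slide
```

The aggregation step is what decouples the two stages: the patch classifier can be trained on patch labels alone, while the slide-level model only ever sees the fixed-length summary vector.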
Affiliation(s)
- Quoc Dang Vu: Department of Computer Science and Engineering, Sejong University, Seoul, South Korea
- Simon Graham: Department of Computer Science, University of Warwick, Coventry, United Kingdom
- Tahsin Kurc: Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, United States
- Minh Nguyen Nhat To: Department of Computer Science and Engineering, Sejong University, Seoul, South Korea
- Muhammad Shaban: Department of Computer Science, University of Warwick, Coventry, United Kingdom
- Talha Qaiser: Department of Computer Science, University of Warwick, Coventry, United Kingdom
- Syed Ali Khurram: School of Clinical Dentistry, The University of Sheffield, Sheffield, United Kingdom
- Jayashree Kalpathy-Cramer: Department of Radiology, Harvard Medical School and Mass General Hospital, Boston, MA, United States
- Tianhao Zhao: Department of Biomedical Informatics and Department of Pathology, Stony Brook University, Stony Brook, NY, United States
- Rajarsi Gupta: Department of Biomedical Informatics and Department of Pathology, Stony Brook University, Stony Brook, NY, United States
- Jin Tae Kwak: Department of Computer Science and Engineering, Sejong University, Seoul, South Korea
- Nasir Rajpoot: Department of Computer Science, University of Warwick, Coventry, United Kingdom
- Joel Saltz: Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, United States
- Keyvan Farahani: Cancer Imaging Program, National Cancer Institute, National Institutes of Health, Bethesda, MD, United States
44
Kapil A, Meier A, Zuraw A, Steele KE, Rebelatto MC, Schmidt G, Brieu N. Deep Semi Supervised Generative Learning for Automated Tumor Proportion Scoring on NSCLC Tissue Needle Biopsies. Sci Rep 2018; 8:17343. [PMID: 30478349] [PMCID: PMC6255873] [DOI: 10.1038/s41598-018-35501-5]
Abstract
The level of PD-L1 expression in immunohistochemistry (IHC) assays is a key biomarker for identifying Non-Small-Cell Lung Cancer (NSCLC) patients who may respond to anti PD-1/PD-L1 treatments. Quantifying PD-L1 expression currently relies on a pathologist's visual estimate of the percentage of tumor cells showing PD-L1 staining (the tumor proportion score, or TPS). Known challenges, such as differences in positivity estimation around clinically relevant cut-offs and sub-optimal sample quality, make visual scoring tedious and subjective, yielding scoring variability between pathologists. In this work, we propose a novel deep learning solution that enables the first automated and objective scoring of PD-L1 expression in late-stage NSCLC needle biopsies. To account for the small amount of tissue available in biopsy images and to limit the manual annotation needed for training, we explore semi-supervised approaches against standard fully supervised methods. We consolidate the manual annotations used for training, as well as the visual TPS scores used for quantitative evaluation, with multiple pathologists. Concordance measures computed on a set of slides unseen during training provide evidence that our automatic scoring method matches visual scoring on the considered dataset while ensuring repeatability and objectivity.
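Once cells have been segmented and classified, the TPS itself is a simple ratio: the percentage of tumor cells that show PD-L1 staining. A minimal sketch, where the per-cell flags are illustrative stand-ins for network output:

```python
import numpy as np

# stand-in per-cell output of a segmentation/classification model
is_tumor = np.array([1, 1, 1, 0, 1, 0, 1, 1])       # 1 = tumor cell
pdl1_positive = np.array([1, 0, 1, 0, 0, 0, 1, 0])  # 1 = PD-L1 stained

tumor_cells = is_tumor.sum()
stained_tumor_cells = (is_tumor & pdl1_positive).sum()
tps = 100.0 * stained_tumor_cells / tumor_cells  # percentage of stained tumor cells
```

With these stand-in flags, 3 of 6 tumor cells are stained, so `tps` is 50.0; the automation effort in the paper lies in producing the per-cell flags reliably from small biopsy samples, not in this final ratio.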
45
Peering Into the Black Box of Artificial Intelligence: Evaluation Metrics of Machine Learning Methods. AJR Am J Roentgenol 2018; 212:38-43. [PMID: 30332290] [DOI: 10.2214/ajr.18.20224]
Abstract
OBJECTIVE Machine learning (ML) and artificial intelligence (AI) are rapidly becoming among the most talked about and controversial topics in radiology and medicine. Over the past few years, the number of ML- or AI-focused studies in the literature has increased almost exponentially, and ML has become a hot topic at academic and industry conferences. However, despite increased awareness of ML as a tool, many medical professionals have a poor understanding of how ML works and how to critically appraise the studies and tools presented to them. Thus, we present a brief overview of ML, explain the metrics used in ML and how to interpret them, and explain some of the technical jargon associated with the field so that readers with a medical background and a basic knowledge of statistics can feel more comfortable when examining ML applications. CONCLUSION Attention to sample size, overfitting, underfitting, and cross-validation, together with a broad knowledge of machine learning metrics, can help those with little or no technical knowledge begin to assess machine learning studies. However, transparency in methods and sharing of algorithms are vital to allow clinicians to assess these tools themselves.
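The metrics this review walks through (accuracy, sensitivity, specificity, AUC) and the cross-validation safeguard against overfitting can be demonstrated on a synthetic dataset with scikit-learn; the classifier and data here are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             recall_score, roc_auc_score)
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)
preds = clf.predict(X)
probs = clf.predict_proba(X)[:, 1]

tn, fp, fn, tp = confusion_matrix(y, preds).ravel()
accuracy = accuracy_score(y, preds)
sensitivity = recall_score(y, preds)   # tp / (tp + fn)
specificity = tn / (tn + fp)           # true negatives among all negatives
auc = roc_auc_score(y, probs)          # threshold-free ranking quality

# cross-validation reports held-out performance, guarding against the
# overfitting that training-set metrics above cannot reveal
cv_acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
```

Comparing `accuracy` (computed on the training set) with `cv_acc.mean()` is the simplest version of the sample-size/overfitting check the authors recommend.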