1
Tommasino C, Merolla F, Russo C, Staibano S, Rinaldi AM. Histopathological Image Deep Feature Representation for CBIR in Smart PACS. J Digit Imaging 2023; 36:2194-2209. [PMID: 37296349 PMCID: PMC10501985 DOI: 10.1007/s10278-023-00832-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Revised: 03/16/2023] [Accepted: 04/12/2023] [Indexed: 06/12/2023] Open
Abstract
Pathological Anatomy is moving toward computerized processes, mainly because the extensive digitization of histology slides has made many Whole Slide Images (WSIs) available. Their use is essential, especially in cancer diagnosis and research, and raises the pressing need for increasingly effective information archiving and retrieval systems. Picture Archiving and Communication Systems (PACSs) represent a practical way to archive and organize this growing amount of data. The design and implementation of a robust and accurate methodology for querying them in the pathology domain using a novel approach are mandatory. In particular, the Content-Based Image Retrieval (CBIR) methodology can be integrated into PACSs using a query-by-example task. In this context, one of the crucial points of CBIR concerns the representation of images as feature vectors, and the accuracy of retrieval mainly depends on feature extraction. Thus, our study explored different representations of WSI patches by features extracted from pre-trained Convolutional Neural Networks (CNNs). In order to perform a helpful comparison, we evaluated features extracted from different layers of state-of-the-art CNNs using different dimensionality reduction techniques. Furthermore, we provided a qualitative analysis of the obtained results. The evaluation showed encouraging results for our proposed framework.
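The query-by-example idea in this abstract (feature vectors, dimensionality reduction, nearest-neighbour lookup) can be illustrated with a minimal sketch. It is not the authors' pipeline: the features below are random numbers standing in for CNN activations, PCA is chosen only as one common reduction technique, and the names `fit_pca`, `project`, and `query_by_example` are hypothetical.

```python
import numpy as np

def fit_pca(features, n_components):
    """Return (mean, components) from an SVD-based PCA of row-vector features."""
    mean = features.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(features, mean, components):
    """Map features into the reduced PCA space."""
    return (features - mean) @ components.T

def query_by_example(query_vec, index_vecs, k=5):
    """Indices of the k indexed vectors closest to query_vec (Euclidean)."""
    dists = np.linalg.norm(index_vecs - query_vec, axis=1)
    return np.argsort(dists)[:k]

# Assumption: synthetic 512-D vectors stand in for features of 200 WSI patches.
rng = np.random.default_rng(0)
patch_features = rng.normal(size=(200, 512))

mean, comps = fit_pca(patch_features, n_components=32)
index = project(patch_features, mean, comps)

# Query with an already indexed patch; it should come back as its own best match.
top = query_by_example(index[17], index, k=5)
```

In a real system the random matrix would be replaced by activations taken from a chosen layer of a pre-trained network, and the reduced index would be stored alongside the archive.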
Affiliation(s)
- Cristian Tommasino
  - Department of Electrical Engineering and Information Technology, University of Napoli Federico II, Via Claudio 21, Naples, 80125 Italy
- Francesco Merolla
  - Department of Advanced Biomedical Sciences, Pathology Section, University of Naples Federico II, Naples, 80131 Italy
- Cristiano Russo
  - Department of Electrical Engineering and Information Technology, University of Napoli Federico II, Via Claudio 21, Naples, 80125 Italy
- Stefania Staibano
  - Department of Medicine and Health Sciences V. Tiberio, University of Molise, Campobasso, 86100 Italy
- Antonio Maria Rinaldi
  - Department of Electrical Engineering and Information Technology, University of Napoli Federico II, Via Claudio 21, Naples, 80125 Italy
2
Al-Thelaya K, Gilal NU, Alzubaidi M, Majeed F, Agus M, Schneider J, Househ M. Applications of discriminative and deep learning feature extraction methods for whole slide image analysis: A survey. J Pathol Inform 2023; 14:100335. [PMID: 37928897 PMCID: PMC10622844 DOI: 10.1016/j.jpi.2023.100335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 07/17/2023] [Accepted: 07/19/2023] [Indexed: 11/07/2023] Open
Abstract
Digital pathology technologies, including whole slide imaging (WSI), have significantly improved modern clinical practices by facilitating storing, viewing, processing, and sharing digital scans of tissue glass slides. Researchers have proposed various artificial intelligence (AI) solutions for digital pathology applications, such as automated image analysis, to extract diagnostic information from WSI for improving pathology productivity, accuracy, and reproducibility. Feature extraction methods play a crucial role in transforming raw image data into meaningful representations for analysis, facilitating the characterization of tissue structures, cellular properties, and pathological patterns. These features have diverse applications in several digital pathology applications, such as cancer prognosis and diagnosis. Deep learning-based feature extraction methods have emerged as a promising approach to accurately represent WSI contents and have demonstrated superior performance in histology-related tasks. In this survey, we provide a comprehensive overview of feature extraction methods, including both manual and deep learning-based techniques, for the analysis of WSIs. We review relevant literature, analyze the discriminative and geometric features of WSIs (i.e., features suited to support the diagnostic process and extracted by "engineered" methods as opposed to AI), and explore predictive modeling techniques using AI and deep learning. This survey examines the advances, challenges, and opportunities in this rapidly evolving field, emphasizing the potential for accurate diagnosis, prognosis, and decision-making in digital pathology.
Affiliation(s)
- Khaled Al-Thelaya
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Nauman Ullah Gilal
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Mahmood Alzubaidi
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Fahad Majeed
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Marco Agus
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Jens Schneider
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Mowafa Househ
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
3
Rasoolijaberi M, Babaei M, Riasatian A, Hemati S, Ashrafi P, Gonzalez R, Tizhoosh HR. Multi-Magnification Image Search in Digital Pathology. IEEE J Biomed Health Inform 2022; 26:4611-4622. [PMID: 35687644 DOI: 10.1109/jbhi.2022.3181531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
This paper investigates the effect of magnification on content-based image search in digital pathology archives and proposes to use multi-magnification image representation. Image search in large archives of digital pathology slides provides researchers and medical professionals with an opportunity to match records of current and past patients and learn from evidently diagnosed and treated cases. When working with microscopes, pathologists switch between different magnification levels while examining tissue specimens to find and evaluate various morphological features. Inspired by the conventional pathology workflow, we have investigated several magnification levels in digital pathology and their combinations to minimize the gap between AI-enabled image search methods and clinical settings. The proposed searching framework does not rely on any regional annotation and potentially applies to millions of unlabelled (raw) whole slide images. This paper suggests two approaches for combining magnification levels and compares their performance. The first approach obtains a single-vector deep feature representation for a digital slide, whereas the second approach works with a multi-vector deep feature representation. We report the search results of 20×, 10×, and 5× magnifications and their combinations on a subset of The Cancer Genome Atlas (TCGA) repository. The experiments verify that cell-level information at the highest magnification is essential for searching for diagnostic purposes. In contrast, low-magnification information may improve this assessment depending on the tumor type. Our multi-magnification approach achieved up to 11% F1-score improvement in searching among the urinary tract and brain tumor subtypes compared to the single-magnification image search.
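The distinction between a single-vector and a multi-vector slide representation can be sketched as follows. The abstract does not say how the magnification levels are combined, so everything here is an assumption for illustration: concatenation for the single-vector case, a mean of per-magnification Euclidean distances for the multi-vector case, random vectors in place of deep features, and the hypothetical names `single_vector` and `multi_vector_distance`.

```python
import numpy as np

MAGS = ("20x", "10x", "5x")  # magnification levels named in the abstract

def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    return v / np.linalg.norm(v)

def single_vector(features_by_mag):
    """Assumed combination: concatenate normalized per-magnification vectors."""
    return np.concatenate([l2_normalize(features_by_mag[m]) for m in MAGS])

def multi_vector_distance(a, b):
    """Assumed comparison: mean Euclidean distance taken per magnification."""
    per_mag = [np.linalg.norm(l2_normalize(a[m]) - l2_normalize(b[m])) for m in MAGS]
    return float(np.mean(per_mag))

# Assumption: random 128-D vectors stand in for per-magnification slide features.
rng = np.random.default_rng(1)
slide_a = {m: rng.normal(size=128) for m in MAGS}
slide_b = {m: rng.normal(size=128) for m in MAGS}

vec_a = single_vector(slide_a)                      # one 384-D vector per slide
dist_ab = multi_vector_distance(slide_a, slide_b)   # compares slides level by level
```

Keeping the levels separate lets a search weight or drop a magnification per tumor type, which a concatenated vector cannot do without re-indexing.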
4
Reena MR, Ameer PM. A content-based image retrieval system for the diagnosis of lymphoma using blood micrographs: An incorporation of deep learning with a traditional learning approach. Comput Biol Med 2022; 145:105463. [PMID: 35421794 DOI: 10.1016/j.compbiomed.2022.105463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Revised: 03/24/2022] [Accepted: 03/25/2022] [Indexed: 12/01/2022]
Abstract
Lymphomas, or cancers of the lymphatic system, account for around half of all blood cancers diagnosed each year. Lymphoma is a condition that is difficult to diagnose, and accurate diagnosis is critical for effective treatment. Manual microscopic analysis of blood cells requires the involvement of medical experts, whose precision is dependent on their abilities, and it takes time. This paper describes a content-based image retrieval system that uses deep learning-based feature extraction and a traditional learning method for feature reduction to retrieve similar images from a database to aid early/initial lymphoma diagnosis. The proposed algorithm employs a pre-trained network called ResNet-101 to extract image features required to distinguish four types of cells: lymphoma cells, blasts, lymphocytes, and other cells. The issue of class imbalance is resolved by over-sampling the training data followed by data augmentation. Deep learning features are extracted using the activations of the feature layer in the pre-trained net, then dimensionality reduction techniques are used to select discriminant features for the image retrieval system. Euclidean distance is used as the similarity measure to retrieve similar images from the database. The experimentation uses a microscopic blood image dataset with 1673 leukocytes of the categories blasts, lymphoma, lymphocytes, and other cells. The proposed algorithm achieves 98.74% precision in lymphoma cell classification and 99.22% precision @10 for lymphoma cell image retrieval. Experimental findings confirm our approach's practicability and effectiveness. Extended studies endorse the idea of using the prescribed system in actual medical applications, helping doctors diagnose lymphoma, dramatically reducing human resource requirements.
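The precision @10 figure quoted above is a precision-at-k score over the images returned by a Euclidean-distance search. A minimal sketch of that metric follows, on a toy six-image database with k = 4; the 2-D points, the labels, and the function names `retrieve` and `precision_at_k` are illustrative assumptions, not the paper's data or code.

```python
import numpy as np

def retrieve(query, database, k):
    """Indices of the k database vectors nearest to the query (Euclidean)."""
    dists = np.linalg.norm(database - query, axis=1)
    return np.argsort(dists)[:k]

def precision_at_k(retrieved_labels, query_label):
    """Fraction of retrieved items whose label matches the query's label."""
    return float(np.mean(np.asarray(retrieved_labels) == query_label))

# Assumption: toy 2-D "features" forming two well-separated classes.
database = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.2],   # lymphoma-like cluster
    [5.0, 5.0], [5.1, 4.9], [4.8, 5.2],   # blast-like cluster
])
labels = np.array(["lymphoma", "lymphoma", "lymphoma", "blast", "blast", "blast"])

query = np.array([0.05, 0.05])
top = retrieve(query, database, k=4)
# Three of the four nearest items are lymphoma, so precision@4 is 0.75.
p_at_4 = precision_at_k(labels[top], "lymphoma")
```

With only three relevant items in the database, precision@4 cannot exceed 0.75, which is why reported precision-at-k values depend on class sizes as well as on feature quality.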
Affiliation(s)
- M Roy Reena
  - Department of Electronics and Communication Engineering, National Institute of Technology, Calicut, India
- P M Ameer
  - Department of Electronics and Communication Engineering, National Institute of Technology, Calicut, India
5
Foran DJ, Durbin EB, Chen W, Sadimin E, Sharma A, Banerjee I, Kurc T, Li N, Stroup AM, Harris G, Gu A, Schymura M, Gupta R, Bremer E, Balsamo J, DiPrima T, Wang F, Abousamra S, Samaras D, Hands I, Ward K, Saltz JH. An Expandable Informatics Framework for Enhancing Central Cancer Registries with Digital Pathology Specimens, Computational Imaging Tools, and Advanced Mining Capabilities. J Pathol Inform 2022; 13:5. [PMID: 35136672 PMCID: PMC8794027 DOI: 10.4103/jpi.jpi_31_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2021] [Accepted: 04/30/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Population-based state cancer registries are an authoritative source for cancer statistics in the United States. They routinely collect a variety of data, including patient demographics, primary tumor site, stage at diagnosis, first course of treatment, and survival, on every cancer case that is reported across all U.S. states and territories. The goal of our project is to enrich NCI's Surveillance, Epidemiology, and End Results (SEER) registry data with high-quality population-based biospecimen data in the form of digital pathology, machine-learning-based classifications, and quantitative histopathology imaging feature sets (referred to here as Pathomics features). MATERIALS AND METHODS As part of the project, the underlying informatics infrastructure was designed, tested, and implemented through close collaboration with several participating SEER registries to ensure consistency with registry processes, computational scalability, and ability to support creation of population cohorts that span multiple sites. Utilizing computational imaging algorithms and methods to both generate indices and search for matches makes it possible to reduce inter- and intra-observer inconsistencies and to improve the objectivity with which large image repositories are interrogated. RESULTS Our team has created and continues to expand a well-curated repository of high-quality digitized pathology images corresponding to subjects whose data are routinely collected by the collaborating registries. Our team has systematically deployed and tested key, visual analytic methods to facilitate automated creation of population cohorts for epidemiological studies and tools to support visualization of feature clusters and evaluation of whole-slide images. 
As part of these efforts, we are developing and optimizing advanced search and matching algorithms to facilitate automated, content-based retrieval of digitized specimens based on their underlying image features and staining characteristics. CONCLUSION To meet the challenges of this project, we established the analytic pipelines, methods, and workflows to support the expansion and management of a growing repository of high-quality digitized pathology and information-rich, population cohorts containing objective imaging and clinical attributes to facilitate studies that seek to discriminate among different subtypes of disease, stratify patient populations, and perform comparisons of tumor characteristics within and across patient cohorts. We have also successfully developed a suite of tools based on a deep-learning method to perform quantitative characterizations of tumor regions, assess infiltrating lymphocyte distributions, and generate objective nuclear feature measurements. As part of these efforts, our team has implemented reliable methods that enable investigators to systematically search through large repositories to automatically retrieve digitized pathology specimens and correlated clinical data based on their computational signatures.
Affiliation(s)
- David J. Foran
  - Center for Biomedical Informatics, Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
  - Department of Pathology and Laboratory Medicine, Rutgers-Robert Wood Johnson Medical School, Piscataway, NJ, USA
- Eric B. Durbin
  - Kentucky Cancer Registry, Markey Cancer Center, University of Kentucky, Lexington, KY, USA
  - Division of Biomedical Informatics, Department of Internal Medicine, College of Medicine, Lexington, KY, USA
- Wenjin Chen
  - Center for Biomedical Informatics, Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
- Evita Sadimin
  - Center for Biomedical Informatics, Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
  - Department of Pathology and Laboratory Medicine, Rutgers-Robert Wood Johnson Medical School, Piscataway, NJ, USA
- Ashish Sharma
  - Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA
- Imon Banerjee
  - Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA
- Tahsin Kurc
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Nan Li
  - Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA
- Antoinette M. Stroup
  - New Jersey State Cancer Registry, Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
- Gerald Harris
  - New Jersey State Cancer Registry, Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
- Annie Gu
  - Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA
- Maria Schymura
  - New York State Cancer Registry, New York State Department of Health, Albany, NY, USA
- Rajarsi Gupta
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Erich Bremer
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Joseph Balsamo
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Tammy DiPrima
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Feiqiao Wang
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Shahira Abousamra
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
- Dimitris Samaras
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
- Isaac Hands
  - Division of Biomedical Informatics, Department of Internal Medicine, College of Medicine, Lexington, KY, USA
- Kevin Ward
  - Georgia State Cancer Registry, Georgia Department of Public Health, Atlanta, GA, USA
- Joel H. Saltz
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
6
Fast and scalable search of whole-slide images via self-supervised deep learning. Nat Biomed Eng 2022; 6:1420-1434. [PMID: 36217022 PMCID: PMC9792371 DOI: 10.1038/s41551-022-00929-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 07/19/2021] [Accepted: 07/15/2022] [Indexed: 01/14/2023]
Abstract
The adoption of digital pathology has enabled the curation of large repositories of gigapixel whole-slide images (WSIs). Computationally identifying WSIs with similar morphologic features within large repositories without requiring supervised training can have significant applications. However, the retrieval speeds of algorithms for searching similar WSIs often scale with the repository size, which limits their clinical and research potential. Here we show that self-supervised deep learning can be leveraged to search for and retrieve WSIs at speeds that are independent of repository size. The algorithm, which we named SISH (for self-supervised image search for histology) and provide as an open-source package, requires only slide-level annotations for training, encodes WSIs into meaningful discrete latent representations and leverages a tree data structure for fast searching followed by an uncertainty-based ranking algorithm for WSI retrieval. We evaluated SISH on multiple tasks (including retrieval tasks based on tissue-patch queries) and on datasets spanning over 22,000 patient cases and 56 disease subtypes. SISH can also be used to aid the diagnosis of rare cancer types for which the number of available WSIs is often insufficient to train supervised deep-learning models.
7
Pantanowitz L, Michelow P, Hazelhurst S, Kalra S, Choi C, Shah S, Babaie M, Tizhoosh HR. A Digital Pathology Solution to Resolve the Tissue Floater Conundrum. Arch Pathol Lab Med 2021; 145:359-364. [PMID: 32886759 DOI: 10.5858/arpa.2020-0034-oa] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 05/11/2020] [Indexed: 11/06/2022]
Abstract
CONTEXT.— Pathologists may encounter extraneous pieces of tissue (tissue floaters) on glass slides because of specimen cross-contamination. Troubleshooting this problem, including performing molecular tests for tissue identification if available, is time consuming and often does not satisfactorily resolve the problem. OBJECTIVE.— To demonstrate the feasibility of using an image search tool to resolve the tissue floater conundrum. DESIGN.— A glass slide was produced containing 2 separate hematoxylin and eosin (H&E)-stained tissue floaters. This fabricated slide was digitized along with the 2 slides containing the original tumors used to create these floaters. These slides were then embedded into a dataset of 2325 whole slide images comprising a wide variety of H&E stained diagnostic entities. Digital slides were broken up into patches and the patch features converted into barcodes for indexing and easy retrieval. A deep learning-based image search tool was employed to extract features from patches via barcodes, hence enabling image matching to each tissue floater. RESULTS.— There was a very high likelihood of finding a correct tumor match for the queried tissue floater when searching the digital database. Search results repeatedly yielded a correct match within the top 3 retrieved images. The retrieval accuracy improved when greater proportions of the floater were selected. The time to run a search was completed within several milliseconds. CONCLUSIONS.— Using an image search tool offers pathologists an additional method to rapidly resolve the tissue floater conundrum, especially for those laboratories that have transitioned to going fully digital for primary diagnosis.
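The abstract says patch features are converted into barcodes for indexing but does not describe the encoding. The sketch below shows one plausible scheme, assumed purely for illustration: each feature vector becomes a binary code marking where consecutive values increase, and codes are compared by Hamming distance. The random features and the names `to_barcode` and `hamming` are hypothetical.

```python
import numpy as np

def to_barcode(features):
    """Binary code: 1 where the feature vector rises between neighbouring entries."""
    return (np.diff(features) > 0).astype(np.uint8)

def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return int(np.count_nonzero(a != b))

# Assumption: random 64-D vectors stand in for deep features of 50 archived patches.
rng = np.random.default_rng(2)
archive = rng.normal(size=(50, 64))
codes = np.array([to_barcode(f) for f in archive])

# Query with a slightly perturbed copy of patch 7, mimicking a floater fragment.
query = archive[7] + rng.normal(scale=0.01, size=64)
q_code = to_barcode(query)

# The archived code with the smallest Hamming distance is the proposed match.
best = int(np.argmin([hamming(q_code, c) for c in codes]))
```

Binary codes of this kind are compact and can be compared with bitwise operations, which is consistent with the very short search times the abstract reports.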
Affiliation(s)
- Liron Pantanowitz
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania (Pantanowitz)
  - Department of Anatomical Pathology, University of the Witwatersrand and National Health Laboratory Services, Johannesburg, South Africa (Pantanowitz, Michelow)
  - Pantanowitz is now with the Department of Pathology & Clinical Labs, University of Michigan, Ann Arbor
- Pamela Michelow
  - Department of Anatomical Pathology, University of the Witwatersrand and National Health Laboratory Services, Johannesburg, South Africa (Pantanowitz, Michelow)
- Scott Hazelhurst
  - School of Electrical & Information Engineering and Sydney Brenner Institute for Molecular Bioscience, University of the Witwatersrand, Johannesburg, South Africa (Hazelhurst)
- Shivam Kalra
  - Kimia Lab, University of Waterloo, Waterloo, Ontario, Canada (Kalra, Babaie, Tizhoosh)
  - Huron Digital Pathology, Engineering Department, St. Jacobs, Ontario, Canada (Kalra, Choi, Shah)
- Charles Choi
  - Huron Digital Pathology, Engineering Department, St. Jacobs, Ontario, Canada (Kalra, Choi, Shah)
- Sultaan Shah
  - Huron Digital Pathology, Engineering Department, St. Jacobs, Ontario, Canada (Kalra, Choi, Shah)
- Morteza Babaie
  - Kimia Lab, University of Waterloo, Waterloo, Ontario, Canada (Kalra, Babaie, Tizhoosh)
- Hamid R Tizhoosh
  - Kimia Lab, University of Waterloo, Waterloo, Ontario, Canada (Kalra, Babaie, Tizhoosh)
8
Objective Diagnosis for Histopathological Images Based on Machine Learning Techniques: Classical Approaches and New Trends. Mathematics 2020. [DOI: 10.3390/math8111863] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 02/02/2023]
Abstract
Histopathology refers to the examination by a pathologist of biopsy samples. Histopathology images are captured by a microscope to locate, examine, and classify many diseases, such as different cancer types. They provide a detailed view of different types of diseases and their tissue status. These images are an essential resource with which to define biological compositions or analyze cell and tissue structures. This imaging modality is very important for diagnostic applications. The analysis of histopathology images is a prolific and relevant research area supporting disease diagnosis. In this paper, the challenges of histopathology image analysis are evaluated. An extensive review of conventional and deep learning techniques which have been applied in histological image analyses is presented. This review summarizes many current datasets and highlights important challenges and constraints with recent deep learning techniques, alongside possible future research avenues. Despite the progress made in this research area so far, it is still a significant area of open research because of the variety of imaging techniques and disease-specific characteristics.
9
Schaer R, Otálora S, Jimenez-Del-Toro O, Atzori M, Müller H. Deep Learning-Based Retrieval System for Gigapixel Histopathology Cases and the Open Access Literature. J Pathol Inform 2019; 10:19. [PMID: 31367471 PMCID: PMC6639847 DOI: 10.4103/jpi.jpi_88_18] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/21/2018] [Accepted: 05/17/2019] [Indexed: 11/08/2022] Open
Abstract
Background: The introduction of digital pathology into clinical practice has led to the development of clinical workflows with digital images, in connection with pathology reports. Still, most of the current work is time-consuming manual analysis of image areas at different scales. Links with data in the biomedical literature are rare, and a need for search based on visual similarity within whole slide images (WSIs) exists. Objectives: The main objective of the work presented is to integrate content-based visual retrieval with a WSI viewer in a prototype. Another objective is to connect cases analyzed in the viewer with cases or images from the biomedical literature, including the search through visual similarity and text. Methods: An innovative retrieval system for digital pathology is integrated with a WSI viewer, allowing to define regions of interest (ROIs) in images as queries for finding visually similar areas in the same or other images and to zoom in/out to find structures at varying magnification levels. The algorithms are based on a multimodal approach, exploiting both text information and content-based image features. Results: The retrieval system allows viewing WSIs and searching for regions that are visually similar to manually defined ROIs in various data sources (proprietary and public datasets, e.g., scientific literature). The system was tested by pathologists, highlighting its capabilities and suggesting ways to improve it and make it more usable in clinical practice. Conclusions: The developed system can enhance the practice of pathologists by enabling them to use their experience and knowledge to control artificial intelligence tools for navigating repositories of images for clinical decision support and teaching, where the comparison with visually similar cases can help to avoid misinterpretations. 
The system is available as open source, allowing the scientific community to test, ideate and develop similar systems for research and clinical practice.
Affiliation(s)
- Roger Schaer
  - Institute of Information Systems, HES-SO (University of Applied Sciences of Western Switzerland), Sierre, Switzerland
- Sebastian Otálora
  - Institute of Information Systems, HES-SO (University of Applied Sciences of Western Switzerland), Sierre, Switzerland
  - Department of Computer Science, University of Geneva (UNIGE), Geneva, Switzerland
- Oscar Jimenez-Del-Toro
  - Institute of Information Systems, HES-SO (University of Applied Sciences of Western Switzerland), Sierre, Switzerland
  - Department of Computer Science, University of Geneva (UNIGE), Geneva, Switzerland
- Manfredo Atzori
  - Institute of Information Systems, HES-SO (University of Applied Sciences of Western Switzerland), Sierre, Switzerland
- Henning Müller
  - Institute of Information Systems, HES-SO (University of Applied Sciences of Western Switzerland), Sierre, Switzerland
  - Department of Computer Science, University of Geneva (UNIGE), Geneva, Switzerland
10
Hegde N, Hipp JD, Liu Y, Emmert-Buck M, Reif E, Smilkov D, Terry M, Cai CJ, Amin MB, Mermel CH, Nelson PQ, Peng LH, Corrado GS, Stumpe MC. Similar image search for histopathology: SMILY. NPJ Digit Med 2019; 2:56. [PMID: 31304402 PMCID: PMC6588631 DOI: 10.1038/s41746-019-0131-z] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 01/31/2019] [Accepted: 05/29/2019] [Indexed: 12/19/2022] Open
Abstract
The increasing availability of large institutional and public histopathology image datasets is enabling the searching of these datasets for diagnosis, research, and education. Although these datasets typically have associated metadata such as diagnosis or clinical notes, even carefully curated datasets rarely contain annotations of the location of regions of interest on each image. As pathology images are extremely large (up to 100,000 pixels in each dimension), further laborious visual search of each image may be needed to find the feature of interest. In this paper, we introduce a deep-learning-based reverse image search tool for histopathology images: Similar Medical Images Like Yours (SMILY). We assessed SMILY's ability to retrieve search results in two ways: using pathologist-provided annotations, and via prospective studies where pathologists evaluated the quality of SMILY search results. As a negative control in the second evaluation, pathologists were blinded to whether search results were retrieved by SMILY or randomly. In both types of assessments, SMILY was able to retrieve search results with similar histologic features, organ site, and prostate cancer Gleason grade compared with the original query. SMILY may be a useful general-purpose tool in the pathologist's arsenal, to improve the efficiency of searching large archives of histopathology images, without the need to develop and implement specific tools for each application.
Affiliation(s)
- Yun Liu
  - Google AI Healthcare, Mountain View, CA 94043 USA
- Emily Reif
  - Google AI Healthcare, Mountain View, CA 94043 USA
- Mahul B. Amin
  - Department of Pathology and Laboratory Medicine, University of Tennessee Health Science Center, Memphis, TN 38163 USA
- Lily H. Peng
  - Google AI Healthcare, Mountain View, CA 94043 USA
- Martin C. Stumpe
  - Google AI Healthcare, Mountain View, CA 94043 USA
  - Present Address: AI and Data Science, Tempus Labs Inc, Chicago, IL USA
11
Machine learning approaches for pathologic diagnosis. Virchows Arch 2019; 475:131-138. [PMID: 31222375 DOI: 10.1007/s00428-019-02594-w] [Citation(s) in RCA: 62] [Impact Index Per Article: 12.4] [Received: 08/21/2018] [Revised: 04/07/2019] [Accepted: 06/03/2019] [Indexed: 02/06/2023]
Abstract
Machine learning techniques, especially deep learning techniques such as convolutional neural networks, have been successfully applied to general image recognitions since their overwhelming performance at the 2012 ImageNet Large Scale Visual Recognition Challenge. Recently, such techniques have also been applied to various medical, including histopathological, images to assist the process of medical diagnosis. In some cases, deep learning-based algorithms have already outperformed experienced pathologists for recognition of histopathological images. However, pathological images differ from general images in some aspects, and thus, machine learning of histopathological images requires specialized learning methods. Moreover, many pathologists are skeptical about the ability of deep learning technology to accurately recognize histopathological images because what the learned neural network recognizes is often indecipherable to humans. In this review, we first introduce various applications incorporating machine learning developed to assist the process of pathologic diagnosis, and then describe machine learning problems related to histopathological image analysis, and review potential ways to solve these problems.
|
12
|
Cui L, Feng J, Zhang Z, Yang L. High throughput automatic muscle image segmentation using parallel framework. BMC Bioinformatics 2019; 20:158. [PMID: 30922212 PMCID: PMC6437912 DOI: 10.1186/s12859-019-2719-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/23/2018] [Accepted: 03/07/2019] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND: Fast and accurate automatic segmentation of skeletal muscle cell images is crucial for the diagnosis of muscle-related diseases, as it greatly reduces labor-intensive manual annotation. Recently, several methods have been presented for automatic muscle cell segmentation. However, most methods exhibit high model complexity and time cost, and they do not adapt to large-scale images such as whole-slide scanned specimens. METHODS: In this paper, we propose a novel distributed computing approach, which adopts both data and model parallelism, for fast muscle cell segmentation. Using master-worker parallelism, the image data in the master is distributed onto multiple workers based on the Spark cloud computing platform. On each worker node, we first detect cell contours using a structured random forest (SRF) contour detector with fast parallel prediction and generate region candidates using a superpixel technique. Next, we propose a novel hierarchical tree-based region selection algorithm for cell segmentation based on the conditional random field (CRF) algorithm. We divide the region selection algorithm into multiple sub-problems, which can be further parallelized using multi-core programming. RESULTS: We test the performance of the proposed method on a large-scale haematoxylin and eosin (H&E) stained skeletal muscle image dataset. Compared with the standalone implementation, the proposed method achieves more than 10 times speed improvement on very large-scale muscle images containing hundreds to thousands of cells. Meanwhile, our proposed method produces high-quality segmentation results compared with several state-of-the-art methods. CONCLUSIONS: This paper presents a parallel muscle image segmentation method with both data and model parallelism on multiple machines. The parallel strategy exhibits high compatibility with our muscle segmentation framework. The proposed method achieves high-throughput, effective cell segmentation on large-scale muscle images.
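The master-worker data parallelism summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a local thread pool stands in for the Spark workers, and the names `split_into_tiles`, `segment_tile`, and `run_master` are hypothetical. The worker step is a placeholder that returns the tile area where a real pipeline would run contour detection and region selection.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(height, width, tile):
    """Master step: cut an image extent into (row, col, h, w) tile boxes."""
    return [
        (r, c, min(tile, height - r), min(tile, width - c))
        for r in range(0, height, tile)
        for c in range(0, width, tile)
    ]


def segment_tile(tile_box):
    """Worker step (placeholder): a real worker would segment cells here.

    This stand-in only returns the tile area so the sketch stays runnable.
    """
    _row, _col, h, w = tile_box
    return h * w


def run_master(height, width, tile, workers=4):
    """Distribute tiles to a pool of workers and gather results in tile order."""
    tiles = split_into_tiles(height, width, tile)
    # Assumption: a thread pool replaces the distributed cluster of the paper.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(segment_tile, tiles))
    return tiles, results
```

Calling `run_master(1000, 1500, 512)` splits a 1000 x 1500 extent into six tiles and returns one placeholder result per tile, in the same order as the tiles.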
Affiliation(s)
- Lei Cui
- Department of Information Science and Technology, Northwest University, Xi'an, China
- Jun Feng
- Department of Information Science and Technology, Northwest University, Xi'an, China
- Zizhao Zhang
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, USA
- Lin Yang
- Department of Information Science and Technology, Northwest University, Xi'an, China
|
13
|
Van Es SL. Digital pathology: semper ad meliora. Pathology 2018; 51:1-10. [PMID: 30522785 DOI: 10.1016/j.pathol.2018.10.011] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 07/08/2018] [Revised: 10/03/2018] [Accepted: 10/04/2018] [Indexed: 02/07/2023]
Abstract
This review is an evidence-based summary of digital pathology: past, present and future. It discusses digital surgical pathology and the cytopathology digitisation challenge, as well as the performance of digital histopathology and cytopathology as a diagnostic tool, particularly in contrast to user perceptions. The time and cost efficiency of digital pathology, learning curves, education and quality assurance are emphasised, together with the importance of validating systems. The review concludes with a discussion of digital pathology as a source of 'big data' and where this might lead pathologists in the digital pathology future.
Affiliation(s)
- Simone L Van Es
- Department of Pathology, School of Medical Sciences, UNSW, Sydney, NSW, Australia.
|
14
|
Komura D, Ishikawa S. Machine Learning Methods for Histopathological Image Analysis. Comput Struct Biotechnol J 2018; 16:34-42. [PMID: 30275936 PMCID: PMC6158771 DOI: 10.1016/j.csbj.2018.01.001] [Citation(s) in RCA: 340] [Impact Index Per Article: 56.7] [Received: 08/30/2017] [Revised: 12/03/2017] [Accepted: 01/14/2018] [Indexed: 12/12/2022] Open
Abstract
The abundant accumulation of digital histopathological images has led to increased demand for their analysis, such as computer-aided diagnosis using machine learning techniques. However, digital pathological images and related tasks raise some issues that need to be considered. In this mini-review, we introduce the application of digital pathological image analysis using machine learning algorithms, address some problems specific to such analysis, and propose possible solutions.
Affiliation(s)
- Daisuke Komura
- Department of Genomic Pathology, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
|
15
|
Li Z, Zhang X, Müller H, Zhang S. Large-scale retrieval for medical image analytics: A comprehensive review. Med Image Anal 2017; 43:66-84. [PMID: 29031831 DOI: 10.1016/j.media.2017.09.007] [Citation(s) in RCA: 75] [Impact Index Per Article: 10.7] [Received: 04/01/2017] [Revised: 08/01/2017] [Accepted: 09/29/2017] [Indexed: 12/27/2022]
Abstract
Over the past decades, medical image analytics has been greatly facilitated by the explosion of digital imaging techniques, which have produced huge amounts of medical images with ever-increasing quality and diversity. However, conventional methods for analyzing medical images have achieved limited success, as they are not capable of tackling the huge amount of image data. In this paper, we review state-of-the-art approaches for large-scale medical image analysis, which are mainly based on recent advances in computer vision, machine learning and information retrieval. Specifically, we first present the general pipeline of large-scale retrieval and summarize the challenges and opportunities of medical image analytics on a large scale. Then, we provide a comprehensive review of algorithms and techniques relevant to major processes in the pipeline, including feature representation, feature indexing, searching, etc. On the basis of existing work, we introduce the evaluation protocols and multiple applications of large-scale medical image retrieval, with a variety of exploratory and diagnostic scenarios. Finally, we discuss future directions of large-scale retrieval, which can further improve the performance of medical image analysis.
Affiliation(s)
- Zhongyu Li
- Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Xiaofan Zhang
- Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Henning Müller
- Information Systems Institute, HES-SO Valais, Sierre, Switzerland
- Shaoting Zhang
- Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
|
16
|
Chennubhotla C, Clarke LP, Fedorov A, Foran D, Harris G, Helton E, Nordstrom R, Prior F, Rubin D, Saltz JH, Shalley E, Sharma A. An Assessment of Imaging Informatics for Precision Medicine in Cancer. Yearb Med Inform 2017; 26:110-119. [PMID: 29063549 DOI: 10.15265/iy-2017-041] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022] Open
Abstract
Objectives: Precision medicine requires the measurement, quantification, and cataloging of medical characteristics to identify the most effective medical intervention. However, the amount of available data exceeds our current capacity to extract meaningful information. We examine the informatics needs to achieve precision medicine from the perspective of quantitative imaging and oncology. Methods: The National Cancer Institute (NCI) organized several workshops on the topic of medical imaging and precision medicine. The observations and recommendations are summarized herein. Results: Recommendations include: use of standards in data collection and clinical correlates to promote interoperability; data sharing and validation of imaging tools; clinician's feedback in all phases of research and development; use of open-source architecture to encourage reproducibility and reusability; use of challenges which simulate real-world situations to incentivize innovation; partnership with industry to facilitate commercialization; and education in academic communities regarding the challenges involved with translation of technology from the research domain to clinical utility and the benefits of doing so. Conclusions: This article provides a survey of the role and priorities for imaging informatics to help advance quantitative imaging in the era of precision medicine. While these recommendations were drawn from oncology, they are relevant and applicable to other clinical domains where imaging aids precision medicine.
|
17
|
Madabhushi A, Lee G. Image analysis and machine learning in digital pathology: Challenges and opportunities. Med Image Anal 2016; 33:170-175. [PMID: 27423409 DOI: 10.1016/j.media.2016.06.037] [Citation(s) in RCA: 461] [Impact Index Per Article: 57.6] [Received: 04/13/2016] [Revised: 06/13/2016] [Accepted: 06/30/2016] [Indexed: 02/05/2023]
Abstract
With the rise in whole slide scanner technology, large numbers of tissue slides are being scanned, represented, and archived digitally. While digital pathology has substantial implications for telepathology, second opinions, and education, there are also huge research opportunities in image computing with this new source of "big data". It is well known that there is fundamental prognostic data embedded in pathology images. The ability to mine "sub-visual" image features from digital pathology slide images, features that may not be visually discernible by a pathologist, offers the opportunity for better quantitative modeling of disease appearance and hence possibly improved prediction of disease aggressiveness and patient outcome. However, the compelling opportunities in precision medicine offered by big digital pathology data come with their own set of computational challenges. Image analysis and computer-assisted detection and diagnosis tools previously developed in the context of radiographic images are woefully inadequate to deal with the data density in high-resolution digitized whole slide images. Additionally, there has been recent substantial interest in combining and fusing radiologic imaging and proteomics- and genomics-based measurements with features extracted from digital pathology images for better prognostic prediction of disease aggressiveness and patient outcome. Again, there is a paucity of powerful tools for combining disease-specific features that manifest across multiple different length scales. The purpose of this review is to discuss developments in computational image analysis tools for predictive modeling of digital pathology images from a detection, segmentation, feature extraction, and tissue classification perspective. We discuss the emergence of new handcrafted feature approaches for improved predictive modeling of tissue appearance and also review the emergence of deep learning schemes for both object detection and tissue classification. We also briefly review some of the state of the art in fusion of radiology and pathology images and in combining digital pathology derived image measurements with molecular "omics" features for better predictive modeling. The review ends with a brief discussion of some of the technical and computational challenges to be overcome and reflects on future opportunities for the quantitation of histopathology.
Affiliation(s)
- Anant Madabhushi
- Department of Biomedical Engineering, Center for Computational Imaging and Personalized Diagnostics, Case Western Reserve University, Cleveland, OH 44106-7207, United States
- George Lee
- Department of Biomedical Engineering, Center for Computational Imaging and Personalized Diagnostics, Case Western Reserve University, Cleveland, OH 44106-7207, United States
|
18
|
Yildirim E, Foran DJ. Parallel Versus Distributed Data Access for Gigapixel-Resolution Histology Images: Challenges and Opportunities. IEEE J Biomed Health Inform 2016; 21:1049-1057. [PMID: 27323383 DOI: 10.1109/jbhi.2016.2580145] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
Recent advances in digital pathology technology have led to significant improvements in terms of both the quality and resolution of the resulting images, which now often exceed several gigabytes each. Today, several leading institutions across the country utilize whole-slide imaging (WSI) as part of their routine workflow. WSIs have utility in a wide range of diagnostic and investigative pathology applications. The fact that these images are both large in size (about 30 GB when uncompressed) and generated in nonstandard proprietary formats has limited wider adoption of these technologies and makes the task of accessing, processing, and analyzing them in high-throughput fashion extremely challenging. The common approach for such data analytic applications is to preprocess the large whole-slide images into smaller files and store them in a generic format. However, this approach limits the advantages that might be realized if different scalability levels and data unit sizes could be dynamically changed based on the specifications of the task at hand and the architectural limits of the infrastructure (e.g., node memory size). Such strategies also introduce extra processing time to the workflow. To address these challenges, we present, in this paper, novel scalable access methods for parallel file systems and distributed file/object storage systems. Experimental results gathered during the course of our studies show that these methods provide opportunities not realizable using traditional approaches. We demonstrate tangible scalability and high-throughput advantages using a Lustre parallel file system and AWS S3 distributed storage system.
|
19
|
Kurc T, Qi X, Wang D, Wang F, Teodoro G, Cooper L, Nalisnik M, Yang L, Saltz J, Foran DJ. Scalable analysis of Big pathology image data cohorts using efficient methods and high-performance computing strategies. BMC Bioinformatics 2015; 16:399. [PMID: 26627175 PMCID: PMC4667532 DOI: 10.1186/s12859-015-0831-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 03/25/2015] [Accepted: 11/16/2015] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND: We describe a suite of tools and methods that form a core set of capabilities for researchers and clinical investigators to evaluate multiple analytical pipelines and quantify sensitivity and variability of the results while conducting large-scale studies in investigative pathology and oncology. The overarching objective of the current investigation is to address the challenges of large data sizes and high computational demands. RESULTS: The proposed tools and methods take advantage of state-of-the-art parallel machines and efficient content-based image searching strategies. The content-based image retrieval (CBIR) algorithms can quickly detect and retrieve image patches similar to a query patch using a hierarchical analysis approach. The analysis component based on high-performance computing can carry out consensus clustering on 500,000 data points using a large shared memory system. CONCLUSIONS: Our work demonstrates that efficient CBIR algorithms and high-performance computing can be leveraged for efficient analysis of large microscopy images to meet the challenges of clinically salient applications in pathology. These technologies enable researchers and clinical investigators to make more effective use of the rich informational content contained within digitized microscopy specimens.
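The query-by-example retrieval mentioned in this abstract can be sketched as a similarity ranking over patch feature vectors. The sketch below is an illustration only: it uses a flat cosine-similarity ranking, not the hierarchical analysis the authors describe, and the names `cosine_similarity` and `retrieve` as well as the in-memory dictionary index are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, index, k=3):
    """Return the k (patch_id, score) pairs most similar to the query vector.

    `index` is a hypothetical mapping from patch id to feature vector.
    """
    scored = [(cosine_similarity(query, vec), patch_id) for patch_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(patch_id, score) for score, patch_id in scored[:k]]
```

Given an index such as `{"a": [1, 0, 0], "b": [0.9, 0.1, 0], "c": [0, 1, 0]}` and the query `[1, 0, 0]`, `retrieve(query, index, k=2)` ranks patch `a` first and patch `b` second. A production system would replace the linear scan with an index structure suited to large collections.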
Affiliation(s)
- Tahsin Kurc
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA.
- Xin Qi
- Department of Pathology & Laboratory Medicine, Rutgers -- Robert Wood Johnson Medical School, New Brunswick, USA.
- Rutgers Cancer Institute of New Jersey, New Brunswick, USA.
- Daihou Wang
- Department of Electrical and Computer Engineering, Rutgers University, New Brunswick, USA.
- Fusheng Wang
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA.
- Department of Computer Science, Stony Brook University, Stony Brook, USA.
- George Teodoro
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA.
- Department of Computer Science, University of Brasilia, Brasília, Brazil.
- Lee Cooper
- Department of Biomedical Informatics, Emory University, Atlanta, USA.
- Michael Nalisnik
- Department of Biomedical Informatics, Emory University, Atlanta, USA.
- Lin Yang
- Department of Biomedical Engineering, University of Florida, Gainesville, USA.
- Joel Saltz
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA.
- David J Foran
- Department of Pathology & Laboratory Medicine, Rutgers -- Robert Wood Johnson Medical School, New Brunswick, USA.
- Rutgers Cancer Institute of New Jersey, New Brunswick, USA.
|
20
|
Pospischil A, Folkers G. How much reproducibility do we need in human and veterinary pathology? Exp Toxicol Pathol 2015; 67:77-80. [DOI: 10.1016/j.etp.2014.11.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/03/2014] [Revised: 11/17/2014] [Accepted: 11/17/2014] [Indexed: 10/24/2022]
|