1
|
Cang Z, Ning X, Nie A, Xu M, Zhang J. SCAN-IT: Domain segmentation of spatial transcriptomics images by graph neural network. BMVC : PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE. BRITISH MACHINE VISION CONFERENCE 2021; 32:406. [PMID: 36227018 PMCID: PMC9552951] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Complex biological tissues consist of numerous cells organized in a highly coordinated manner and carry out various biological functions. Therefore, segmenting a tissue into spatial and functional domains is critically important for understanding and controlling these biological functions. The emerging spatial transcriptomics technologies allow simultaneous measurements of thousands of genes with precise spatial information, providing an unprecedented opportunity for dissecting biological tissues. However, how to utilize such noisy, sparse, and high-dimensional data for tissue segmentation remains a major challenge. Here, we develop a deep learning-based method, named SCAN-IT, by transforming the spatial domain identification problem into an image segmentation problem, with cells mimicking pixels and expression values of genes within a cell representing the color channels. Specifically, SCAN-IT relies on geometric modeling, graph neural networks, and an informatics approach, DeepGraphInfomax. We demonstrate that SCAN-IT can handle datasets from a wide range of spatial transcriptomics techniques, including the ones with high spatial resolution but low gene coverage as well as those with low spatial resolution but high gene coverage. We show that SCAN-IT outperforms state-of-the-art methods using a benchmark dataset with ground truth domain annotations.
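The "cells as pixels, genes as color channels" framing can be illustrated with a small sketch. This is not the SCAN-IT implementation: the k-nearest-neighbor rule, the function name `knn_graph`, and the toy coordinates and expression values are all assumptions made for illustration. It only shows how cell positions might be turned into a graph whose nodes carry expression vectors, which is the kind of input a graph neural network operates on.

```python
import math


def knn_graph(coords, k):
    """Return an undirected edge set linking each cell to its k nearest cells."""
    edges = set()
    for i, p in enumerate(coords):
        # Rank every other cell by Euclidean distance to cell i.
        others = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: math.dist(p, coords[j]),
        )
        for j in others[:k]:
            # Store each edge once, with the smaller index first.
            edges.add((min(i, j), max(i, j)))
    return edges


# Hypothetical toy data: 4 cells with (x, y) positions.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 4.0)]
# One expression vector per cell; each gene plays the role of a color channel.
expression = [[3, 0, 1], [2, 0, 0], [4, 1, 1], [0, 7, 2]]

edges = knn_graph(coords, k=1)
print(sorted(edges))
```

The brute-force neighbor search is quadratic in the number of cells; a spatial index would be the usual choice for real datasets.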
Affiliation(s)
- Zixuan Cang
  - Department of Mathematics, University of California, Irvine, Irvine, CA, United States
- Annika Nie
  - University High School, Irvine, CA, United States
- Min Xu
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, United States
- Jing Zhang
  - Department of Computer Science, University of California, Irvine, Irvine, CA, United States
|
2
|
Bhargava R, Madabhushi A. Emerging Themes in Image Informatics and Molecular Analysis for Digital Pathology. Annu Rev Biomed Eng 2017; 18:387-412. [PMID: 27420575 DOI: 10.1146/annurev-bioeng-112415-114722] [Citation(s) in RCA: 86] [Impact Index Per Article: 12.3] [Indexed: 12/17/2022]
Abstract
Pathology is essential for research in disease and development, as well as for clinical decision making. For more than 100 years, pathology practice has involved analyzing images of stained, thin tissue sections by a trained human using an optical microscope. Technological advances are now driving major changes in this paradigm toward digital pathology (DP). The digital transformation of pathology goes beyond recording, archiving, and retrieving images, providing new computational tools to inform better decision making for precision medicine. First, we discuss some emerging innovations in both computational image analytics and imaging instrumentation in DP. Second, we discuss molecular contrast in pathology. Molecular DP has traditionally been an extension of pathology with molecularly specific dyes. Label-free, spectroscopic images are rapidly emerging as another important information source, and we describe the benefits and potential of this evolution. Third, we describe multimodal DP, which is enabled by computational algorithms and combines the best characteristics of structural and molecular pathology. Finally, we provide examples of application areas in telepathology, education, and precision medicine. We conclude by discussing challenges and emerging opportunities in this area.
Affiliation(s)
- Rohit Bhargava
  - Departments of Bioengineering, Chemical and Biomolecular Engineering, Electrical and Computer Engineering, Mechanical Science and Engineering, and Chemistry, and Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
- Anant Madabhushi
  - Center for Computational Imaging and Personalized Diagnostics; Departments of Biomedical Engineering, Urology, Pathology, Radiology, Radiation Oncology, General Medical Sciences, Electrical Engineering, and Computer Science; and Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, Ohio 44106
|
3
|
Konhar R, Debnath M, Marbaniang JV, Biswal DK, Tandon P. Age estimation for the genus Cymbidium (Orchidaceae: Epidendroideae) with implementation of fossil data calibration using molecular markers (ITS2 & matK) and phylogeographic inference from ancestral area reconstruction. J Bioinform Comput Biol 2017; 14:1660001. [DOI: 10.1142/s0219720016600015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Intercontinental dislocations between tropical regions harboring two-thirds of the flowering plants have always drawn attention from taxonomists and biogeographers. One such angiosperm family is Orchidaceae, with an herbaceous habit and high species diversity in the tropics. Here, we investigate the evolutionary and biogeographical history of the genus Cymbidium, which represents a monophyletic subfamily (Epidendroideae) of the orchids and comprises 50-odd species that are distinctly distributed in tropical to temperate regions. Little is known about correlations among the level of CAM activity (one of the photosynthetic pathways often regarded as an adaptation to water stress in land plants), habitat, life forms, and phylogenetic relationships of orchids from an evolutionary perspective. A relatively well-resolved and highly supported phylogeny for Cymbidium orchids is reconstructed based on sequence analysis of ITS2 and matK regions from the chloroplast DNA available in public repositories, viz. GenBank at NCBI. This study performs a genus-level analysis by integrating different molecular matrices with existing fossil data on orchids in a molecular Bayesian relaxed clock implemented in BEAST, and assesses divergence times for the genus Cymbidium with a focus on the evolutionary history of photosynthetic characters. Our study has enabled age estimations (45 Ma) as well as ancestral area reconstruction for the genus Cymbidium using BEAST by adding two previously analyzed internal calibration points.
Affiliation(s)
- Ruchishree Konhar
  - Bioinformatics Centre, North-Eastern Hill University, Shillong, Meghalaya, India
- Manish Debnath
  - Bioinformatics Centre, North-Eastern Hill University, Shillong, Meghalaya, India
|
4
|
Viswanath SE, Tiwari P, Lee G, Madabhushi A. Dimensionality reduction-based fusion approaches for imaging and non-imaging biomedical data: concepts, workflow, and use-cases. BMC Med Imaging 2017; 17:2. [PMID: 28056889 PMCID: PMC5217665 DOI: 10.1186/s12880-016-0172-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/21/2016] [Accepted: 12/09/2016] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND With a wide array of multi-modal, multi-protocol, and multi-scale biomedical data being routinely acquired for disease characterization, there is a pressing need for quantitative tools to combine these varied channels of information. The goal of these integrated predictors is to combine these varied sources of information, while improving on the predictive ability of any individual modality. A number of application-specific data fusion methods have been previously proposed in the literature which have attempted to reconcile the differences in dimensionalities and length scales across different modalities. Our objective in this paper was to help identify methodological choices that need to be made in order to build a data fusion technique, as it is not always clear which strategy is optimal for a particular problem. As a comprehensive review of all possible data fusion methods was outside the scope of this paper, we have focused on fusion approaches that employ dimensionality reduction (DR). METHODS In this work, we quantitatively evaluate 4 non-overlapping existing instantiations of DR-based data fusion, within 3 different biomedical applications comprising over 100 studies. These instantiations utilized different knowledge representation and knowledge fusion methods, allowing us to examine the interplay of these modules in the context of data fusion. The use cases considered in this work involve the integration of (a) radiomics features from T2w MRI with peak area features from MR spectroscopy for identification of prostate cancer in vivo, (b) histomorphometric features (quantitative features extracted from histopathology) with protein mass spectrometry features for predicting 5-year biochemical recurrence in prostate cancer patients, and (c) volumetric measurements on T1w MRI with protein expression features to discriminate between patients with and without Alzheimer's disease.
RESULTS AND CONCLUSIONS Our preliminary results in these specific use cases indicated that the use of kernel representations in conjunction with DR-based fusion may be most effective, as a weighted multi-kernel-based DR approach resulted in the highest area under the ROC curve of over 0.8. By contrast non-optimized DR-based representation and fusion methods yielded the worst predictive performance across all 3 applications. Our results suggest that when the individual modalities demonstrate relatively poor discriminability, many of the data fusion methods may not yield accurate, discriminatory representations either. In summary, to outperform the predictive ability of individual modalities, methodological choices for data fusion must explicitly account for the sparsity of and noise in the feature space.
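The area under the ROC curve reported above has a simple rank-based reading: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs for six cases (1 = disease, 0 = no disease).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1]

auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives some context for the "over 0.8" figure quoted in the abstract.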
Affiliation(s)
- Satish E Viswanath
  - Department of Biomedical Engineering, Case Western Reserve University, 10900 Euclid Ave, Wickenden 523, Cleveland, OH, USA
- Pallavi Tiwari
  - Department of Biomedical Engineering, Case Western Reserve University, 10900 Euclid Ave, Wickenden 523, Cleveland, OH, USA
- George Lee
  - Department of Biomedical Engineering, Case Western Reserve University, 10900 Euclid Ave, Wickenden 523, Cleveland, OH, USA
- Anant Madabhushi
  - Department of Biomedical Engineering, Case Western Reserve University, 10900 Euclid Ave, Wickenden 523, Cleveland, OH, USA
|
5
|
Chen F, Liu X, Yu C, Chen Y, Tang H, Zhang L. Water lilies as emerging models for Darwin's abominable mystery. HORTICULTURE RESEARCH 2017; 4:17051. [PMID: 28979789 PMCID: PMC5626932 DOI: 10.1038/hortres.2017.51] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 05/14/2017] [Revised: 06/30/2017] [Accepted: 07/26/2017] [Indexed: 05/02/2023]
Abstract
Water lilies are not only highly favored aquatic ornamental plants with cultural and economic importance but they also occupy a critical evolutionary space that is crucial for understanding the origin and early evolutionary trajectory of flowering plants. The birth and rapid radiation of flowering plants has interested many scientists and was considered 'an abominable mystery' by Charles Darwin. In searching for the angiosperm evolutionary origin and its underlying mechanisms, the genome of Amborella has shed some light on the molecular features of one of the basal angiosperm lineages; however, little is known regarding the genetics and genomics of another basal angiosperm lineage, namely, the water lily. In this study, we reviewed current molecular research and note that water lily research has entered the genomic era. We propose that the genome of the water lily is critical for studying the contentious relationship of basal angiosperms and Darwin's 'abominable mystery'. Four pantropical water lilies, especially the recently sequenced Nymphaea colorata, have characteristics such as small size, rapid growth rate and numerous seeds and can act as the best model for understanding the origin of angiosperms. The water lily genome is also valuable for revealing the genetics of ornamental traits and will largely accelerate the molecular breeding of water lilies.
Affiliation(s)
- Fei Chen
  - Center for Genomics and Biotechnology; State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops; Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology; Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Xing Liu
  - Center for Genomics and Biotechnology; State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops; Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology; Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Cuiwei Yu
  - Zhejiang Humanities Landscape Co., LTD, Hangzhou 310030, China
- Yuchu Chen
  - Zhejiang Humanities Landscape Co., LTD, Hangzhou 310030, China
- Haibao Tang
  - Center for Genomics and Biotechnology; State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops; Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology; Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Liangsheng Zhang
  - Center for Genomics and Biotechnology; State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops; Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology; Fujian Agriculture and Forestry University, Fuzhou 350002, China
|
6
|
ITS2 secondary structure for species circumscription: case study in southern African Strychnos L. (Loganiaceae). Genetica 2016; 144:639-650. [DOI: 10.1007/s10709-016-9931-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 01/08/2016] [Accepted: 10/03/2016] [Indexed: 10/20/2022]
|
7
|
Abstract
Identification and clustering of orthologous genes plays an important role in developing evolutionary models such as validating convergent and divergent phylogeny and predicting functional proteins in newly sequenced species of unverified nucleotide protein mappings. Here, we introduce an application of subspace clustering as applied to orthologous gene sequences and discuss the initial results. The working hypothesis is based upon the concept that genetic changes between nucleotide sequences coding for proteins among selected species and groups may lie within a union of subspaces for clusters of the orthologous groups. Estimates for the subspace dimensions were computed for a small population sample. A series of experiments was performed to cluster randomly selected sequences. The experimental design allows for both false positives and false negatives, and estimates for the statistical significance are provided. The clustering results are consistent with the main hypothesis. A simple random mutation binary tree model is used to simulate speciation events that show the interdependence of the subspace rank versus time and mutation rates. The simple mutation model is found to be largely consistent with the observed subspace clustering singular value results. Our study indicates that the subspace clustering method may be applied in orthology analysis.
Affiliation(s)
- Tim Wallace
  - Department of Computer Science, Tennessee State University, Nashville, Tennessee
- Ali Sekmen
  - Department of Computer Science, Tennessee State University, Nashville, Tennessee
- Xiaofei Wang
  - Department of Biological Sciences, Tennessee State University, Nashville, Tennessee
|
8
|
Lee G, Singanamalli A, Wang H, Feldman MD, Master SR, Shih NNC, Spangler E, Rebbeck T, Tomaszewski JE, Madabhushi A. Supervised multi-view canonical correlation analysis (sMVCCA): integrating histologic and proteomic features for predicting recurrent prostate cancer. IEEE TRANSACTIONS ON MEDICAL IMAGING 2015; 34:284-297. [PMID: 25203987 DOI: 10.1109/tmi.2014.2355175] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Indexed: 06/03/2023]
Abstract
In this work, we present a new methodology to facilitate prediction of recurrent prostate cancer (CaP) following radical prostatectomy (RP) via the integration of quantitative image features and protein expression in the excised prostate. Creating a fused predictor from high-dimensional data streams is challenging because the classifier must 1) account for the "curse of dimensionality" problem, which hinders classifier performance when the number of features exceeds the number of patient studies and 2) balance potential mismatches in the number of features across different channels to avoid classifier bias towards channels with more features. Our new data integration methodology, supervised Multi-view Canonical Correlation Analysis (sMVCCA), aims to integrate infinite views of high-dimensional data to provide more amenable data representations for disease classification. Additionally, we demonstrate sMVCCA using Spearman's rank correlation which, unlike Pearson's correlation, can account for nonlinear correlations and outliers. Forty CaP patients with pathological Gleason scores 6-8 were considered for this study. Twenty-one of these men revealed biochemical recurrence (BCR) following RP, while 19 did not. For each patient, 189 quantitative histomorphometric attributes and 650 protein expression levels were extracted from the primary tumor nodule. The fused histomorphometric/proteomic representation via sMVCCA combined with a random forest classifier predicted BCR with a mean AUC of 0.74 and a maximum AUC of 0.9286. We found sMVCCA to perform statistically significantly (p < 0.05) better than comparative state-of-the-art data fusion strategies for predicting BCR. Furthermore, Kaplan-Meier analysis demonstrated improved BCR-free survival prediction for the sMVCCA-fused classifier as compared to histology or proteomic features alone.
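The contrast between Spearman's and Pearson's correlation mentioned above can be made concrete with a short sketch. It is not part of sMVCCA itself: the helper names (`pearson`, `ranks`, `spearman`) and the toy data are assumptions for illustration. Spearman's coefficient is computed here as Pearson's coefficient applied to ranks, so a monotone but nonlinear relationship still scores 1.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical monotone but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print(round(pearson(x, y), 4), round(spearman(x, y), 4))
```

Because ranks discard magnitudes, a single extreme value shifts the Spearman coefficient far less than it shifts the Pearson coefficient.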
|
9
|
Caisová L, Melkonian M. Evolution of helix formation in the ribosomal Internal Transcribed Spacer 2 (ITS2) and its significance for RNA secondary structures. J Mol Evol 2014; 78:324-37. [PMID: 24908393 DOI: 10.1007/s00239-014-9625-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 01/15/2014] [Accepted: 05/19/2014] [Indexed: 01/25/2023]
Abstract
Helices are the most common elements of RNA secondary structure. Despite intensive investigations of various types of RNAs, the evolutionary history of the formation of new helices (novel helical structures) remains largely elusive. Here, by studying the nuclear ribosomal Internal Transcribed Spacer 2 (ITS2), a fast-evolving part of the eukaryotic nuclear ribosomal operon, we identify two possible types of helix formation: one type is "dichotomous helix formation"--transition from one large helix to two smaller helices by invagination of the apical part of a helix, which significantly changes the shape of the original secondary structure but does not increase its complexity (i.e., the total length of the RNA). An alternative type is "lateral helix formation"--origin of an extra helical region by the extension of a bulge loop or a spacer in a multi-helix loop of the original helix, which does not disrupt the pre-existing structure but increases RNA size. Moreover, we present examples from the RNA sequence literature indicating that both types of helix formation may have implications for RNA evolution beyond ITS2.
Affiliation(s)
- Lenka Caisová
  - Universität zu Köln, Biozentrum Köln, Botanisches Institut, Zülpicher Str. 47b, 50674, Köln, Germany
|
10
|
Sparks R, Madabhushi A. Statistical Shape Model for Manifold Regularization: Gleason grading of prostate histology. COMPUTER VISION AND IMAGE UNDERSTANDING : CVIU 2013; 117:1138-1146. [PMID: 23888106 PMCID: PMC3718190 DOI: 10.1016/j.cviu.2012.11.011] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 06/02/2023]
Abstract
Gleason patterns of prostate cancer histopathology, characterized primarily by morphological and architectural attributes of histological structures (glands and nuclei), have been found to be highly correlated with disease aggressiveness and patient outcome. Gleason patterns 4 and 5 are highly correlated with more aggressive disease and poorer patient outcome, while Gleason patterns 1-3 tend to reflect more favorable patient outcome. Because Gleason grading is done manually by a pathologist visually examining glass (or digital) slides, subtle morphologic and architectural differences of histological attributes, in addition to other factors, may result in grading errors and hence cause high inter-observer variability. Recently, some researchers have proposed computerized decision support systems to automatically grade Gleason patterns by using features pertaining to nuclear architecture, gland morphology, as well as tissue texture. Automated characterization of gland morphology has been shown to distinguish between intermediate Gleason patterns 3 and 4 with high accuracy. Manifold learning (ML) schemes attempt to generate a low-dimensional manifold representation of a higher-dimensional feature space while simultaneously preserving nonlinear relationships between object instances. Classification can then be performed in the low-dimensional space with high accuracy. However, ML is sensitive to the samples contained in the dataset; changes in the dataset may alter the manifold structure. In this paper, we present a manifold regularization technique to constrain the low-dimensional manifold to a specific range of possible manifold shapes, the range being determined via a statistical shape model of manifolds (SSMM).
In this work, we demonstrate applications of the SSMM in (1) identifying samples on the manifold which contain noise, defined as those samples which deviate from the SSMM, and (2) accurate out-of-sample extrapolation (OSE) of newly acquired samples onto a manifold constrained by the SSMM. We demonstrate these applications of the SSMM in the context of distinguishing between Gleason patterns 3 and 4 using glandular morphologic features in a prostate histopathology dataset of 58 patient studies. Identifying and eliminating noisy samples from the manifold via the SSMM results in a statistically significant improvement in area under the receiver operating characteristic curve (AUC), 0.832 ± 0.048 with removal of noisy samples compared to an AUC of 0.779 ± 0.075 without removal of samples. The use of the SSMM for OSE of newly acquired glands also shows statistically significant improvement in AUC, 0.834 ± 0.051 with the SSMM compared to 0.779 ± 0.054 without the SSMM. Similar results were observed for the synthetic Swiss Roll and Helix datasets.
Affiliation(s)
- Rachel Sparks
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, 08854
  - Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH, 44106
- Anant Madabhushi
  - Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH, 44106
|
11
|
Liu R, Wang X, Aihara K, Chen L. Early diagnosis of complex diseases by molecular biomarkers, network biomarkers, and dynamical network biomarkers. Med Res Rev 2013; 34:455-78. [PMID: 23775602 DOI: 10.1002/med.21293] [Citation(s) in RCA: 189] [Impact Index Per Article: 17.2] [Indexed: 12/12/2022]
Abstract
Many studies have been carried out for early diagnosis of complex diseases by finding accurate and robust biomarkers specific to the respective diseases. In particular, the recent rapid advance of high-throughput technologies provides unprecedentedly rich information to characterize various disease genotypes and phenotypes in a global and also dynamical manner, which significantly accelerates the study of biomarkers from both theoretical and clinical perspectives. Traditionally, molecular biomarkers that distinguish disease samples from normal samples are widely adopted in clinical practice due to their ease of data measurement. However, many of them suffer from low coverage and high false-positive or false-negative rates, which seriously limit their further clinical applications. To overcome those difficulties, network biomarkers (or module biomarkers) have attracted much attention and also achieve better performance, because a network (or subnetwork) is considered to be a more robust way to characterize diseases than individual molecules. But both molecular biomarkers and network biomarkers mainly distinguish disease samples from normal samples, and because of their static nature they generally cannot reliably identify predisease samples, and thus lack the ability to support early diagnosis. Based on nonlinear dynamical theory and complex network theory, a new concept of dynamical network biomarkers (DNBs, or a dynamical network of biomarkers) has been developed, which differs from traditional static approaches: the DNB is able to distinguish a predisease state from normal and disease states with even a small number of samples, and therefore has great potential to achieve "real" early diagnosis of complex diseases.
In this paper, we comprehensively review the recent advances and developments on molecular biomarkers, network biomarkers, and DNBs in particular, focusing on the biomarkers for early diagnosis of complex diseases considering a small number of samples and high-throughput data (or big data). Detailed comparisons of various types of biomarkers as well as their applications are also discussed.
Affiliation(s)
- Rui Liu
  - Department of Mathematics, South China University of Technology, Guangzhou, 510640, China
|
12
|
Khan AM, El-Daly H, Simmons E, Rajpoot NM. HyMaP: A hybrid magnitude-phase approach to unsupervised segmentation of tumor areas in breast cancer histology images. J Pathol Inform 2013; 4:S1. [PMID: 23766931 PMCID: PMC3678741 DOI: 10.4103/2153-3539.109802] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 01/21/2013] [Accepted: 01/21/2013] [Indexed: 11/04/2022] Open
Abstract
BACKGROUND Segmentation of areas containing tumor cells in standard H&E histopathology images of breast (and several other tissues) is a key task for computer-assisted assessment and grading of histopathology slides. Good segmentation of tumor regions is also vital for automated scoring of immunohistochemical stained slides to restrict the scoring or analysis to areas containing tumor cells only and avoid potentially misleading results from analysis of stromal regions. Furthermore, detection of mitotic cells is critical for calculating key measures such as mitotic index, a key criterion for grading several types of cancers including breast cancer. We show that tumor segmentation can allow detection and quantification of mitotic cells from the standard H&E slides with a high degree of accuracy without need for special stains, in turn making the whole process more cost-effective. METHOD Based on the tissue morphology, breast histology image contents can be divided into four regions: Tumor, Hypocellular Stroma (HypoCS), Hypercellular Stroma (HyperCS), and tissue fat (Background). Background is removed during the preprocessing stage on the basis of color thresholding, while HypoCS and HyperCS regions are segmented by calculating features using magnitude and phase spectra in the frequency domain, respectively, and performing unsupervised segmentation on these features. RESULTS All images in the database were hand segmented by two expert pathologists. The algorithms considered here are evaluated on three pixel-wise accuracy measures: precision, recall, and F1-Score. The segmentation results obtained by combining HypoCS and HyperCS yield high F1-Scores of 0.86 and 0.89 with respect to the ground truth. CONCLUSIONS In this paper, we show that segmentation of breast histopathology images into hypocellular stroma and hypercellular stroma can be achieved using magnitude and phase spectra in the frequency domain.
The segmentation leads to demarcation of tumor margins leading to improved accuracy of mitotic cell detection.
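The three pixel-wise measures named in the RESULTS can be sketched as follows. This is a generic illustration, not the authors' evaluation code: the function name `pixel_scores` and the tiny binary masks are assumptions, with 1 marking a pixel labeled as the region of interest.

```python
def pixel_scores(pred, truth):
    """Return (precision, recall, F1) for two same-sized binary masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # predicted region, truly region
            elif p and not t:
                fp += 1  # predicted region, truly background
            elif t and not p:
                fn += 1  # missed region pixel
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical 2 x 3 masks: algorithm output versus a pathologist's annotation.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]

precision, recall, f1 = pixel_scores(pred, truth)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

F1 is the harmonic mean of precision and recall, so it stays low unless both are high, which makes it a stricter summary than plain pixel accuracy when background pixels dominate.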
Affiliation(s)
- Adnan M Khan
  - Department of Computer Science, University of Warwick, UK
|