1
Binson VA, Thomas S, Subramoniam M, Arun J, Naveen S, Madhu S. A Review of Machine Learning Algorithms for Biomedical Applications. Ann Biomed Eng 2024; 52:1159-1183. [PMID: 38383870 DOI: 10.1007/s10439-024-03459-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Accepted: 01/24/2024] [Indexed: 02/23/2024]
Abstract
As the amount and complexity of biomedical data continue to increase, machine learning methods are becoming a popular tool for creating prediction models of the underlying biomedical processes. Although all machine learning methods aim to fit models to data, the methodologies used can vary greatly and may seem daunting at first. A comprehensive review of various machine learning algorithms for biomedical applications is presented. The key concepts of machine learning are supervised and unsupervised learning, feature selection, and evaluation metrics. Technical insights on the major machine learning methods such as decision trees, random forests, support vector machines, and k-nearest neighbors are analyzed. Next, dimensionality reduction methods such as principal component analysis and t-distributed stochastic neighbor embedding, and their applications in biomedical data analysis, are reviewed. Moreover, feedforward neural networks, convolutional neural networks, and recurrent neural networks are predominantly utilized in biomedical applications. In addition, the identification of emerging directions in machine learning methodology will serve as a useful reference for individuals involved in biomedical research, clinical practice, and related professions who are interested in understanding and applying machine learning algorithms in their research or practice.
Affiliation(s)
- V A Binson
- Department of Electronics Engineering, Saintgits College of Engineering, Kottayam, India
- Sania Thomas
- Department of Computer Science and Engineering, Saintgits College of Engineering, Kottayam, India
- M Subramoniam
- Department of Electronics Engineering, Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu, India
- J Arun
- Centre for Waste Management-International Research Centre, Sathyabama Institute of Science and Technology, Chennai, 600119, India
- S Naveen
- Department of Automobile Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India
- S Madhu
- Department of Automobile Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India.
2
M S K, Rajaguru H, Nair AR. Enhancement of Classifier Performance with Adam and RanAdam Hyper-Parameter Tuning for Lung Cancer Detection from Microarray Data-In Pursuit of Precision. Bioengineering (Basel) 2024; 11:314. [PMID: 38671736 PMCID: PMC11047746 DOI: 10.3390/bioengineering11040314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Revised: 03/18/2024] [Accepted: 03/20/2024] [Indexed: 04/28/2024] Open
Abstract
Microarray gene expression analysis is a powerful technique used in cancer classification and research to identify and understand gene expression patterns that can differentiate between different cancer types, subtypes, and stages. However, microarray databases are highly redundant, inherently nonlinear, and noisy. Therefore, extracting meaningful information from such a huge database is challenging. The paper adopts the Fast Fourier Transform (FFT) and Mixture Model (MM) for dimensionality reduction and utilises the Dragonfly optimisation algorithm as the feature selection technique. The classifiers employed in this research are Nonlinear Regression, Naïve Bayes, Decision Tree, Random Forest and SVM (RBF). The classifiers' performances are analysed with and without feature selection methods. Finally, Adaptive Moment Estimation (Adam) and Random Adaptive Moment Estimation (RanAdam) hyper-parameter tuning techniques are used to improve the classifiers. The SVM (RBF) classifier with the Fast Fourier Transform dimensionality reduction method and Dragonfly feature selection achieved the highest accuracy of 98.343% with RanAdam hyper-parameter tuning compared to other classifiers.
Affiliation(s)
- Karthika M S
- Department of Information Technology, Bannari Amman Institute of Technology, Sathyamangalam 638401, India;
- Harikumar Rajaguru
- Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathyamangalam 638401, India;
- Ajin R. Nair
- Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathyamangalam 638401, India;
3
Medina-Ortiz D, Contreras S, Quiroz C, Olivera-Nappa Á. Development of Supervised Learning Predictive Models for Highly Non-linear Biological, Biomedical, and General Datasets. Front Mol Biosci 2020; 7:13. [PMID: 32118039 PMCID: PMC7031350 DOI: 10.3389/fmolb.2020.00013] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/09/2019] [Accepted: 01/22/2020] [Indexed: 11/13/2022] Open
Abstract
In highly non-linear datasets, attributes or features do not allow readily finding visual patterns for identifying common underlying behaviors. Therefore, it is not possible to achieve classification or regression using linear or mildly non-linear hyperspace partition functions. Hence, supervised learning models based on the application of most existing algorithms are limited, and their performance metrics are low. Linear transformations of variables, such as principal components analysis, cannot avoid the problem, and even models based on artificial neural networks and deep learning are unable to improve the metrics. Sometimes, even when features allow classification or regression in reported cases, performance metrics of supervised learning algorithms remain unsatisfyingly low. This problem is recurrent in many areas of study, for example the clinical, biotechnological, and protein engineering areas, where many of the attributes are correlated in an unknown and very non-linear fashion or are categorical and difficult to relate to a target response variable. In such areas, being able to create predictive models would dramatically impact the quality of their outcomes, generating an immediate added value for both the scientific community and the general public. In this manuscript, we present RV-Clustering, a library of unsupervised learning algorithms, and a new methodology designed to find optimum partitions within highly non-linear datasets that allow deconvoluting variables and notably improving performance metrics in supervised learning classification or regression models. The partitions obtained are statistically cross-validated, ensuring correct representativity and no over-fitting. We have successfully tested RV-Clustering in several highly non-linear datasets with different origins.
The approach herein proposed has generated classification and regression models with high-performance metrics, which further supports its ability to generate predictive models for highly non-linear datasets. Advantageously, the method does not require significant human input, which guarantees a higher usability in the biological, biomedical, and protein engineering community with no specific knowledge in the machine learning area.
Affiliation(s)
- David Medina-Ortiz
- Departamento de Ingeniería Química, Biotecnología y Materiales, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Santiago, Chile; Centre for Biotechnology and Bioengineering, Universidad de Chile, Santiago, Chile
- Sebastián Contreras
- Centre for Biotechnology and Bioengineering, Universidad de Chile, Santiago, Chile
- Cristofer Quiroz
- Facultad de Ingeniería, Universidad Autónoma de Chile, Talca, Chile
- Álvaro Olivera-Nappa
- Departamento de Ingeniería Química, Biotecnología y Materiales, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Santiago, Chile; Centre for Biotechnology and Bioengineering, Universidad de Chile, Santiago, Chile
4
Seok HS. Performance comparison of dimensionality reduction methods on RNA-Seq data from the GTEx project. Genes Genomics 2019; 42:225-234. [PMID: 31833048 DOI: 10.1007/s13258-019-00896-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2019] [Accepted: 11/22/2019] [Indexed: 11/25/2022]
Abstract
BACKGROUND One of the apparent characteristics of bioinformatics data is the combination of a very large number of features and a relatively small number of samples. The vast number of features makes intuitive understanding of a target domain difficult. Dimensionality reduction or manifold learning has the potential to circumvent this obstacle, but restricted methods have been preferred. OBJECTIVE The objective of this study is to observe the characteristics of various dimensionality reduction methods, namely locally linear embedding (LLE), multi-dimensional scaling (MDS), principal component analysis (PCA), spectral embedding (SE), and t-distributed Stochastic Neighbor Embedding (t-SNE), on the RNA-Seq dataset from the genotype-tissue expression (GTEx) project. RESULTS The characteristics of the dimensionality reduction methods are observed on the nine groups of three different tissues in the reduced space with dimensionality of two, three, and four. The visualization results report that each dimensionality reduction method produces a very distinct reduced space. The quantitative results are obtained as the performance of k-means clustering. Clustering in the reduced space from non-linear methods such as LLE, t-SNE and SE achieved better results than in the reduced space produced by linear methods like PCA and MDS. CONCLUSIONS The experimental results recommend the application of both linear and non-linear dimensionality reduction methods on the target data for grasping the underlying characteristics of the datasets intuitively.
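The evaluation idea in this abstract, scoring a reduced space by how well k-means recovers known groups, can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the 2D points stand in for an already-reduced embedding, the function names `kmeans` and `purity` are hypothetical, and cluster purity is used only as one simple stand-in because the abstract does not say which clustering score was reported.

```python
import math
from collections import Counter

def kmeans(points, k, iters=100):
    """Lloyd's algorithm. Centroids start at the first k points (illustrative choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels

def purity(labels, truth):
    """Fraction of points that belong to the majority true group of their cluster."""
    total = 0
    for c in set(labels):
        groups = Counter(t for l, t in zip(labels, truth) if l == c)
        total += groups.most_common(1)[0][1]
    return total / len(labels)

# Toy "reduced space": two well-separated groups (assumed data, not GTEx).
embedded = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
truth = ["a", "b", "a", "b", "a", "b"]

labels = kmeans(embedded, k=2)
score = purity(labels, truth)
```

Running the same scoring on embeddings from different reduction methods would give one number per method to compare, which is the shape of the comparison the abstract describes.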
Affiliation(s)
- Ho-Sik Seok
- Department of Computer and Communications Engineering, Kangwon National University, Chuncheon-si, Gangwon-do, 24341, South Korea.
5
Akazawa Y, Mizuno S, Fujinami N, Suzuki T, Yoshioka Y, Ochiya T, Nakamoto Y, Nakatsura T. Usefulness of serum microRNA as a predictive marker of recurrence and prognosis in biliary tract cancer after radical surgery. Sci Rep 2019; 9:5925. [PMID: 30976046 PMCID: PMC6459925 DOI: 10.1038/s41598-019-42392-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/25/2018] [Accepted: 03/20/2019] [Indexed: 02/06/2023] Open
Abstract
Biliary tract cancer (BTC) is an aggressive type of malignant tumour. Even after radical resection, the risk of recurrence is still high, resulting in a poor prognosis. Here, we investigated the usefulness of serum miRNAs as predictive markers of recurrence and prognosis for patients with BTC after radical surgery using 66 serum samples that were collected at three time points from 22 patients with BTC who underwent radical surgery. Using microarray analysis, we successfully identified six specific miRNAs (miR-1225-3p, miR-1234-3p, miR-1260b, miR-1470, miR-6834-3p, and miR-6875-5p) associated with recurrence and prognosis of BTC after radical surgery. In addition, using a combination of these miRNAs, we developed a recurrence predictive index to predict recurrence in patients with BTC after operation with high accuracy. Patients having higher index scores (≥ cut-off) had significantly worse recurrence-free survival (RFS) and overall survival (OS) than those with lower index scores (< cut-off). Furthermore, the index was an independent factor related to RFS and OS by univariate and multivariate analyses using a Cox proportional hazards model. Overall, our results provided compelling evidence for the potential usefulness of specific serum miRNAs as effective predictive tools for recurrence and prognosis in patients with BTC who underwent radical surgery.
Affiliation(s)
- Yu Akazawa
- Division of Cancer Immunotherapy, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Kashiwa, Japan; Second Department of Internal Medicine, Faculty of Medical Sciences, University of Fukui, Fukui, Japan
- Shoichi Mizuno
- Division of Cancer Immunotherapy, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Kashiwa, Japan
- Norihiro Fujinami
- Division of Cancer Immunotherapy, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Kashiwa, Japan
- Toshihiro Suzuki
- Division of Cancer Immunotherapy, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Kashiwa, Japan
- Yusuke Yoshioka
- Division of Molecular and Cellular Medicine, National Cancer Center Research Institute, Tokyo, Japan
- Takahiro Ochiya
- Division of Molecular and Cellular Medicine, National Cancer Center Research Institute, Tokyo, Japan; Institute of Medical Science, Tokyo Medical University, Tokyo, Japan
- Yasunari Nakamoto
- Second Department of Internal Medicine, Faculty of Medical Sciences, University of Fukui, Fukui, Japan
- Tetsuya Nakatsura
- Division of Cancer Immunotherapy, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Kashiwa, Japan.
6
Pfaffl MW, Riedmaier-Sprenzel I. New surveillance concepts in food safety in meat producing animals: the advantage of high throughput 'omics' technologies - A review. Asian-Australasian Journal of Animal Sciences 2018; 31:1062-1071. [PMID: 29879820 PMCID: PMC6039326 DOI: 10.5713/ajas.18.0155] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/07/2018] [Accepted: 05/23/2018] [Indexed: 12/14/2022]
Abstract
The misuse of anabolic hormones or illegal drugs is a ubiquitous problem in animal husbandry and in food safety. The ban on growth promotants in food producing animals in the European Union is well controlled. However, application regimens that are difficult to detect persist, including newly designed anabolic drugs and complex hormone cocktails. Therefore identification of molecular endogenous biomarkers which are based on the physiological response after the illicit treatment has become a focus of detection methods. The analysis of the ‘transcriptome’ has been shown to have promise to discover the misuse of anabolic drugs, by indirect detection of their pharmacological action in organs or selected tissues. Various studies have measured gene expression changes after illegal drug or hormone application. So-called transcriptomic biomarkers were quantified at the mRNA and/or microRNA level by reverse transcription-quantitative polymerase chain reaction (RT-qPCR) technology or by more modern ‘omics’ and high throughput technologies including RNA-sequencing (RNA-Seq). With the addition of advanced bioinformatical approaches such as hierarchical clustering analysis or dynamic principal components analysis, a valid ‘biomarker signature’ can be established to discriminate between treated and untreated individuals. It has been shown in numerous animal and cell culture studies, that identification of treated animals is possible via our transcriptional biomarker approach. The high throughput sequencing approach is also capable of discovering new biomarker candidates and, in combination with quantitative RT-qPCR, validation and confirmation of biomarkers has been possible. These results from animal production and food safety studies demonstrate that analysis of the transcriptome has high potential as a new screening method using transcriptional ‘biomarker signatures’ based on the physiological response triggered by illegal substances.
Affiliation(s)
- Michael W Pfaffl
- Animal Physiology and Immunology, TUM School of Life Sciences, Technical University of Munich Weihenstephan, Weihenstephaner Berg 3, 85354 Freising, Germany
- Irmgard Riedmaier-Sprenzel
- Animal Physiology and Immunology, TUM School of Life Sciences, Technical University of Munich Weihenstephan, Weihenstephaner Berg 3, 85354 Freising, Germany; Eurofins Medigenomix Forensik GmbH, Anzinger Straße 7a, 85560 Ebersberg, Germany
7
Bhargava R, Madabhushi A. Emerging Themes in Image Informatics and Molecular Analysis for Digital Pathology. Annu Rev Biomed Eng 2017; 18:387-412. [PMID: 27420575 DOI: 10.1146/annurev-bioeng-112415-114722] [Citation(s) in RCA: 86] [Impact Index Per Article: 10.8] [Indexed: 12/17/2022]
Abstract
Pathology is essential for research in disease and development, as well as for clinical decision making. For more than 100 years, pathology practice has involved analyzing images of stained, thin tissue sections by a trained human using an optical microscope. Technological advances are now driving major changes in this paradigm toward digital pathology (DP). The digital transformation of pathology goes beyond recording, archiving, and retrieving images, providing new computational tools to inform better decision making for precision medicine. First, we discuss some emerging innovations in both computational image analytics and imaging instrumentation in DP. Second, we discuss molecular contrast in pathology. Molecular DP has traditionally been an extension of pathology with molecularly specific dyes. Label-free, spectroscopic images are rapidly emerging as another important information source, and we describe the benefits and potential of this evolution. Third, we describe multimodal DP, which is enabled by computational algorithms and combines the best characteristics of structural and molecular pathology. Finally, we provide examples of application areas in telepathology, education, and precision medicine. We conclude by discussing challenges and emerging opportunities in this area.
Affiliation(s)
- Rohit Bhargava
- Departments of Bioengineering, Chemical and Biomolecular Engineering, Electrical and Computer Engineering, Mechanical Science and Engineering, and Chemistry, and Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801;
- Anant Madabhushi
- Center for Computational Imaging and Personalized Diagnostics; Departments of Biomedical Engineering, Urology, Pathology, Radiology, Radiation Oncology, General Medical Sciences, Electrical Engineering, and Computer Science; and Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, Ohio 44106;
8
Alilou M, Beig N, Orooji M, Rajiah P, Velcheti V, Rakshit S, Reddy N, Yang M, Jacono F, Gilkeson RC, Linden P, Madabhushi A. An integrated segmentation and shape-based classification scheme for distinguishing adenocarcinomas from granulomas on lung CT. Med Phys 2017; 44:3556-3569. [PMID: 28295386 DOI: 10.1002/mp.12208] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 02/28/2016] [Revised: 02/20/2017] [Accepted: 02/27/2017] [Indexed: 12/30/2022] Open
Abstract
PURPOSE Distinguishing between benign granulomas and adenocarcinomas is confounded by their similar visual appearance on routine CT scans. Unfortunately, owing to the inability to discriminate these lesions radiographically, many patients with benign granulomas are subjected to unnecessary surgical wedge resections and biopsies for pathologic confirmation of cancer presence or absence. This suggests the need for improved computerized characterization of these nodules in order to distinguish between these two classes of lesions on CT scans. While there has been substantial interest in the use of textural analysis for radiomic characterization of lung nodules, relatively less work has been done in shape-based characterization of lung nodules, particularly with respect to granulomas and adenocarcinomas. The primary goal of this study is to evaluate the role of 3D shape features for discrimination of benign granulomas from malignant adenocarcinomas on lung CT images. Towards this end we present an integrated framework for segmentation, feature characterization and classification of these nodules on CT. METHODS The nodule segmentation method starts with separation of lung regions from the surrounding lung anatomy. Next, the lung CT scans are projected into and represented in a three dimensional spectral embedding (SE) space, allowing for better determination of the boundaries of the nodule. This then enables the application of a gradient vector flow active contour (SEGvAC) model for nodule boundary extraction. A set of 24 shape features from both 2D slices and 3D surface of the segmented nodules are extracted, including features pertaining to the angularity, spiculation, elongation and nodule compactness. A feature selection scheme, PCA-VIP, is employed to identify the most discriminating set of features to distinguish granulomas from adenocarcinomas within a learning set of 82 patients.
The features thus identified were then combined with a support vector machine classifier and independently validated on a distinct test set comprising 67 patients. The performance of the classifier for both the training and validation cohorts was evaluated by the area under the receiver operating characteristic (ROC) curve. RESULTS We used 82 and 67 studies from two different institutions respectively for training and independent validation of the model and the shape features. The Dice coefficient between automatically segmented nodules by SEGvAC and the manual delineations by expert radiologists (readers) was 0.84 ± 0.04 whereas inter-reader segmentation agreement was 0.79 ± 0.12. We also identified a set of consistent features (Roughness, Convexity and Sphericity) that were found to be strongly correlated across both manual and automated nodule segmentations (R > 0.80, p < 0.0001) and capture the marginal smoothness and 3D compactness of the nodules. On the independent validation set of 67 studies our classifier yielded a ROC AUC of 0.72 and 0.64 for manually and automatically segmented nodules respectively. On a subset of 20 studies, the AUCs for the two expert radiologists and 1 pulmonologist were found to be 0.82, 0.68 and 0.58 respectively. CONCLUSIONS The major finding of this study was that certain shape features appear to differentially express between granulomas and adenocarcinomas and thus computer extracted shape cues could be used to distinguish these radiographically similar pathologies.
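The Dice coefficient reported above measures the overlap between two segmentations as twice the shared volume divided by the sum of both volumes. A minimal sketch of that formula follows, assuming each mask is given as a set of voxel coordinates; the function name `dice` and the toy masks are illustrative and not taken from the study.

```python
def dice(mask_a, mask_b):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two sets of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))

# Toy masks (assumed data): 4 voxels each, 3 of them shared.
auto = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
manual = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}

score = dice(auto, manual)  # 2 * 3 / (4 + 4)
```

A value of 1 means identical masks and 0 means no shared voxels, so the reported 0.84 for automated versus manual delineation indicates more overlap than the 0.79 observed between readers.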
Affiliation(s)
- Mehdi Alilou
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH, 44106, USA
- Niha Beig
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH, 44106, USA
- Mahdi Orooji
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH, 44106, USA
- Prabhakar Rajiah
- Department of Radiology, University of Texas Southwestern Medical Centre, Dallas, TX, 75390, USA
- Sagar Rakshit
- Taussig Cancer Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
- Niyoti Reddy
- Taussig Cancer Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
- Michael Yang
- Department of Pathology, University Hospital Cleveland Medical Center, Cleveland, OH, 44106, USA
- Frank Jacono
- Division of Pulmonology and Critical Care, Louis Stokes Cleveland VA Medical Center, Cleveland, OH, 44106, USA
- Robert C Gilkeson
- Department of Radiology, University Hospital Cleveland Medical Center, Cleveland, OH, 44106, USA
- Philip Linden
- Division of Thoracic and Esophageal Surgery, University Hospital Cleveland Medical Center, Cleveland, OH, 44106, USA
- Anant Madabhushi
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH, 44106, USA
9
Viswanath SE, Tiwari P, Lee G, Madabhushi A. Dimensionality reduction-based fusion approaches for imaging and non-imaging biomedical data: concepts, workflow, and use-cases. BMC Med Imaging 2017; 17:2. [PMID: 28056889 PMCID: PMC5217665 DOI: 10.1186/s12880-016-0172-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 06/21/2016] [Accepted: 12/09/2016] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND With a wide array of multi-modal, multi-protocol, and multi-scale biomedical data being routinely acquired for disease characterization, there is a pressing need for quantitative tools to combine these varied channels of information. The goal of these integrated predictors is to combine these varied sources of information, while improving on the predictive ability of any individual modality. A number of application-specific data fusion methods have been previously proposed in the literature which have attempted to reconcile the differences in dimensionalities and length scales across different modalities. Our objective in this paper was to help identify methodological choices that need to be made in order to build a data fusion technique, as it is not always clear which strategy is optimal for a particular problem. As a comprehensive review of all possible data fusion methods was outside the scope of this paper, we have focused on fusion approaches that employ dimensionality reduction (DR). METHODS In this work, we quantitatively evaluate 4 non-overlapping existing instantiations of DR-based data fusion, within 3 different biomedical applications comprising over 100 studies. These instantiations utilized different knowledge representation and knowledge fusion methods, allowing us to examine the interplay of these modules in the context of data fusion. The use cases considered in this work involve the integration of (a) radiomics features from T2w MRI with peak area features from MR spectroscopy for identification of prostate cancer in vivo, (b) histomorphometric features (quantitative features extracted from histopathology) with protein mass spectrometry features for predicting 5 year biochemical recurrence in prostate cancer patients, and (c) volumetric measurements on T1w MRI with protein expression features to discriminate between patients with and without Alzheimer's disease.
RESULTS AND CONCLUSIONS Our preliminary results in these specific use cases indicated that the use of kernel representations in conjunction with DR-based fusion may be most effective, as a weighted multi-kernel-based DR approach resulted in the highest area under the ROC curve of over 0.8. By contrast non-optimized DR-based representation and fusion methods yielded the worst predictive performance across all 3 applications. Our results suggest that when the individual modalities demonstrate relatively poor discriminability, many of the data fusion methods may not yield accurate, discriminatory representations either. In summary, to outperform the predictive ability of individual modalities, methodological choices for data fusion must explicitly account for the sparsity of and noise in the feature space.
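The area under the ROC curve used to compare the fusion methods can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition, counting ties as one half. It is an illustration under assumptions: the function name `roc_auc` and the toy scores are hypothetical, and the pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(scores, labels):
    """AUC as P(score_pos > score_neg), with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Compare every positive score against every negative score.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy classifier outputs (assumed data): 3 positives, 3 negatives.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)  # 8 of 9 positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking, which gives a reference point for the "over 0.8" figure quoted in the abstract.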
Affiliation(s)
- Satish E Viswanath
- Department of Biomedical Engineering, Case Western Reserve University, 10900 Euclid Ave, Wickenden 523, Cleveland, OH, USA.
- Pallavi Tiwari
- Department of Biomedical Engineering, Case Western Reserve University, 10900 Euclid Ave, Wickenden 523, Cleveland, OH, USA
- George Lee
- Department of Biomedical Engineering, Case Western Reserve University, 10900 Euclid Ave, Wickenden 523, Cleveland, OH, USA
- Anant Madabhushi
- Department of Biomedical Engineering, Case Western Reserve University, 10900 Euclid Ave, Wickenden 523, Cleveland, OH, USA
10
Adaptive Dimensionality Reduction with Semi-Supervision (AdDReSS): Classifying Multi-Attribute Biomedical Data. PLoS One 2016; 11:e0159088. [PMID: 27421116 PMCID: PMC4946789 DOI: 10.1371/journal.pone.0159088] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/13/2016] [Accepted: 06/27/2016] [Indexed: 11/19/2022] Open
Abstract
Medical diagnostics is often a multi-attribute problem, necessitating sophisticated tools for analyzing high-dimensional biomedical data. Mining this data often results in two crucial bottlenecks: 1) high dimensionality of features used to represent rich biological data and 2) small amounts of labelled training data due to the expense of consulting highly specific medical expertise necessary to assess each study. Currently, no approach that we are aware of has attempted to use active learning (AL) in the context of dimensionality reduction approaches for improving the construction of low dimensional representations. We present our novel methodology, AdDReSS (Adaptive Dimensionality Reduction with Semi-Supervision), to demonstrate that fewer labeled instances identified via AL in embedding space are needed for creating a more discriminative embedding representation compared to randomly selected instances. We tested our methodology on a wide variety of domains including prostate gene expression, ovarian proteomic spectra, brain magnetic resonance imaging, and breast histopathology. Across these various high dimensional biomedical datasets with 100+ observations each and all parameters considered, the median classification accuracy across all experiments showed AdDReSS (88.7%) to outperform SSAGE, a SSDR method using random sampling (85.5%), and Graph Embedding (81.5%). Furthermore, we found that embeddings generated via AdDReSS achieved a mean 35.95% improvement in Raghavan efficiency, a measure of learning rate, over SSAGE. Our results demonstrate the value of AdDReSS to provide low dimensional representations of high dimensional biomedical data while achieving higher classification rates with fewer labelled examples as compared to without active learning.
11
Buschmann D, Haberberger A, Kirchner B, Spornraft M, Riedmaier I, Schelling G, Pfaffl MW. Toward reliable biomarker signatures in the age of liquid biopsies - how to standardize the small RNA-Seq workflow. Nucleic Acids Res 2016; 44:5995-6018. [PMID: 27317696 PMCID: PMC5291277 DOI: 10.1093/nar/gkw545] [Citation(s) in RCA: 78] [Impact Index Per Article: 8.7] [Received: 04/12/2016] [Accepted: 06/03/2016] [Indexed: 12/21/2022] Open
Abstract
Small RNA-Seq has emerged as a powerful tool in transcriptomics, gene expression profiling and biomarker discovery. Sequencing cell-free nucleic acids, particularly microRNA (miRNA), from liquid biopsies additionally provides exciting possibilities for molecular diagnostics, and might help establish disease-specific biomarker signatures. The complexity of the small RNA-Seq workflow, however, bears challenges and biases that researchers need to be aware of in order to generate high-quality data. Rigorous standardization and extensive validation are required to guarantee reliability, reproducibility and comparability of research findings. Hypotheses based on flawed experimental conditions can be inconsistent and even misleading. Comparable to the well-established MIQE guidelines for qPCR experiments, this work aims at establishing guidelines for experimental design and pre-analytical sample processing, standardization of library preparation and sequencing reactions, as well as facilitating data analysis. We highlight bottlenecks in small RNA-Seq experiments, point out the importance of stringent quality control and validation, and provide a primer for differential expression analysis and biomarker discovery. Following our recommendations will encourage better sequencing practice, increase experimental transparency and lead to more reproducible small RNA-Seq results. This will ultimately enhance the validity of biomarker signatures, and allow reliable and robust clinical predictions.
Affiliation(s)
- Dominik Buschmann
- Department of Animal Physiology and Immunology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Weihenstephaner Berg 3, 85354 Freising, Germany; Institute of Human Genetics, University Hospital, Ludwig-Maximilians-University Munich, Goethestraße 29, 80336 München, Germany
- Anna Haberberger
- Department of Animal Physiology and Immunology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Weihenstephaner Berg 3, 85354 Freising, Germany
- Benedikt Kirchner
- Department of Animal Physiology and Immunology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Weihenstephaner Berg 3, 85354 Freising, Germany
- Melanie Spornraft
- Department of Animal Physiology and Immunology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Weihenstephaner Berg 3, 85354 Freising, Germany
- Irmgard Riedmaier
- Eurofins Medigenomix Forensik GmbH, Anzinger Straße 7a, 85560 Ebersberg, Germany; Department of Anesthesiology, University Hospital, Ludwig-Maximilians-University Munich, Marchioninistraße 15, 81377 München, Germany
- Gustav Schelling
- Department of Physiology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Weihenstephaner Berg 3, 85354 Freising, Germany
- Michael W Pfaffl
- Department of Animal Physiology and Immunology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Weihenstephaner Berg 3, 85354 Freising, Germany
|
12
|
Sparks R, Madabhushi A. Out-of-Sample Extrapolation utilizing Semi-Supervised Manifold Learning (OSE-SSL): Content Based Image Retrieval for Histopathology Images. Sci Rep 2016; 6:27306. [PMID: 27264985 PMCID: PMC4893667 DOI: 10.1038/srep27306] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 12/17/2015] [Accepted: 05/16/2016] [Indexed: 12/22/2022]
Abstract
Content-based image retrieval (CBIR) retrieves database images most similar to the query image by (1) extracting quantitative image descriptors and (2) calculating similarity between database and query image descriptors. Recently, manifold learning (ML) has been used to perform CBIR in a low dimensional representation of the high dimensional image descriptor space to avoid the curse of dimensionality. ML schemes are computationally expensive, requiring an eigenvalue decomposition (EVD) for every new query image to learn its low dimensional representation. We present out-of-sample extrapolation utilizing semi-supervised ML (OSE-SSL) to learn the low dimensional representation without recomputing the EVD for each query image. OSE-SSL incorporates semantic information, in the form of partial class labels, into an ML scheme such that the low dimensional representation co-localizes semantically similar images. In the context of prostate histopathology, gland morphology is an integral component of the Gleason score, which enables discrimination between levels of prostate cancer aggressiveness. Images are represented by shape features extracted from the prostate gland. CBIR with OSE-SSL for prostate histology obtained from 58 patient studies yielded an area under the precision-recall curve (AUPRC) of 0.53 ± 0.03; by comparison, CBIR with Principal Component Analysis (PCA) to learn a low dimensional space yielded an AUPRC of 0.44 ± 0.01.
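The abstract above scores retrieval quality by the area under the precision-recall curve. As a minimal sketch of how such a score can be obtained for one query, the snippet below computes average precision over a ranked result list, a common step-wise summary of the precision-recall curve. The function name and the example ranking are hypothetical, and this is not necessarily the exact AUPRC computation the authors used.

```python
def average_precision(ranked_relevance):
    """Average precision of a ranked retrieval list.

    ranked_relevance: sequence of 0/1 flags, best-ranked result first,
    where 1 means the retrieved image shares the query's class.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision measured at each rank where recall increases.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Hypothetical ranking: relevant images retrieved at ranks 1 and 3.
example_ap = average_precision([1, 0, 1, 0])
```

Averaging this value over all query images gives a single retrieval score that can be compared across embedding methods.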
Affiliation(s)
- Rachel Sparks
- University College of London, Centre for Medical Image Computing, London, UK
- Anant Madabhushi
- Case Western Reserve University, Department of Biomedical Engineering, Cleveland, OH, USA
|
13
|
Kumar M, Rath NK, Rath SK. Analysis of microarray leukemia data using an efficient MapReduce-based K-nearest-neighbor classifier. J Biomed Inform 2016; 60:395-409. [DOI: 10.1016/j.jbi.2016.03.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 09/18/2015] [Revised: 02/28/2016] [Accepted: 03/02/2016] [Indexed: 10/22/2022]
|
14
|
Shimomura A, Shiino S, Kawauchi J, Takizawa S, Sakamoto H, Matsuzaki J, Ono M, Takeshita F, Niida S, Shimizu C, Fujiwara Y, Kinoshita T, Tamura K, Ochiya T. Novel combination of serum microRNA for detecting breast cancer in the early stage. Cancer Sci 2016; 107:326-34. [PMID: 26749252 PMCID: PMC4814263 DOI: 10.1111/cas.12880] [Citation(s) in RCA: 238] [Impact Index Per Article: 26.4] [Received: 09/02/2015] [Revised: 12/23/2015] [Accepted: 01/03/2016] [Indexed: 12/18/2022]
Abstract
MicroRNA (miRNA), which are stably present in serum, have been reported to be potentially useful for detecting cancer. In the present study, we examined the expression profiles of serum miRNA in several large cohorts to identify novel miRNA that can be used to detect early stage breast cancer. We comprehensively evaluated the serum miRNA expression profiles using highly sensitive microarray analysis. A total of 1280 serum samples of breast cancer patients stored in the National Cancer Center Biobank were used. In addition, 2836 serum samples were obtained from non‐cancer controls, 451 from patients with other types of cancers, and 63 from patients with non‐breast benign diseases. The samples were divided into a training cohort including non‐cancer controls, other cancers and breast cancer, and a test cohort including non‐cancer controls and breast cancer. The training cohort was used to identify a combination of miRNA that could detect breast cancer, and the test cohort was used to validate that combination. miRNA expressions were compared between patients with breast cancer and non‐breast cancer, and a combination of five miRNA (miR‐1246, miR‐1307‐3p, miR‐4634, miR‐6861‐5p and miR‐6875‐5p) was found to be able to detect breast cancer. This combination had a sensitivity of 97.3%, specificity of 82.9% and accuracy of 89.7% for breast cancer in the test cohort. In addition, this combination could detect early stage breast cancer (sensitivity of 98.0% for Tis).
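The sensitivity, specificity and accuracy figures quoted above all derive from the four cells of a confusion matrix. The following minimal sketch shows the arithmetic; the function name and the counts are hypothetical illustrations and are not data from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp/fn: diseased samples called positive/negative.
    tn/fp: non-diseased samples called negative/positive.
    """
    sensitivity = tp / (tp + fn)   # fraction of cancers detected
    specificity = tn / (tn + fp)   # fraction of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical test cohort: 100 cancer samples and 100 controls.
metrics = diagnostic_metrics(tp=90, fn=10, tn=80, fp=20)
```

Because accuracy depends on the ratio of cases to controls in the cohort, it is usually read alongside sensitivity and specificity rather than in isolation.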
Affiliation(s)
- Akihiko Shimomura
- Department of Breast and Medical Oncology, National Cancer Center Hospital, Tokyo, Japan; Department of Medical Oncology and Translational Research, Graduate School of Medical Sciences, Kumamoto University, Kumamoto, Japan
- Sho Shiino
- Department of Breast Surgery, National Cancer Center Hospital, Tokyo, Japan
- Junpei Kawauchi
- New Frontiers Research Institute, Toray Industries, Kanagawa, Japan
- Satoko Takizawa
- New Frontiers Research Institute, Toray Industries, Kanagawa, Japan
- Hiromi Sakamoto
- Division of Genetics, Fundamental Innovative Oncology Core Center, National Cancer Center Research Institute, Tokyo, Japan
- Juntaro Matsuzaki
- Division of Molecular and Cellular Medicine, Fundamental Innovative Oncology Core Center, National Cancer Center Research Institute, Tokyo, Japan
- Makiko Ono
- Department of Breast and Medical Oncology, National Cancer Center Hospital, Tokyo, Japan; Division of Molecular and Cellular Medicine, Fundamental Innovative Oncology Core Center, National Cancer Center Research Institute, Tokyo, Japan
- Fumitaka Takeshita
- Department of Functional Analysis, Fundamental Innovative Oncology Core Center, National Cancer Center Research Institute, Tokyo, Japan
- Shumpei Niida
- Medical Genome Center, National Center for Geriatrics and Gerontology, Aichi, Japan
- Chikako Shimizu
- Department of Breast and Medical Oncology, National Cancer Center Hospital, Tokyo, Japan
- Yasuhiro Fujiwara
- Department of Breast and Medical Oncology, National Cancer Center Hospital, Tokyo, Japan
- Takayuki Kinoshita
- Department of Breast Surgery, National Cancer Center Hospital, Tokyo, Japan
- Kenji Tamura
- Department of Breast and Medical Oncology, National Cancer Center Hospital, Tokyo, Japan; Department of Medical Oncology and Translational Research, Graduate School of Medical Sciences, Kumamoto University, Kumamoto, Japan
- Takahiro Ochiya
- Division of Molecular and Cellular Medicine, Fundamental Innovative Oncology Core Center, National Cancer Center Research Institute, Tokyo, Japan
|
15
|
Ring BZ, Hout DR, Morris SW, Lawrence K, Schweitzer BL, Bailey DB, Lehmann BD, Pietenpol JA, Seitz RS. Generation of an algorithm based on minimal gene sets to clinically subtype triple negative breast cancer patients. BMC Cancer 2016; 16:143. [PMID: 26908167 PMCID: PMC4763445 DOI: 10.1186/s12885-016-2198-0] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 07/31/2015] [Accepted: 02/17/2016] [Indexed: 12/20/2022]
Abstract
BACKGROUND Recently, a gene expression algorithm, TNBCtype, was developed that can divide triple-negative breast cancer (TNBC) into molecularly-defined subtypes. The algorithm has potential to provide predictive value for TNBC subtype-specific response to various treatments. A retrospective analysis using TNBCtype on neoadjuvant clinical trial data from TNBC patients demonstrated that TNBC subtype and pathological complete response to neoadjuvant chemotherapy were significantly associated. Herein we describe an expression algorithm reduced to 101 genes with the power to subtype TNBC tumors similarly to the original 2188-gene expression algorithm and to predict patient outcomes. METHODS The new classification model was built using the same expression data sets used for the original TNBCtype algorithm. Gene set enrichment followed by shrunken centroid analysis was used for feature reduction, then elastic-net regularized linear modeling was used to identify genes for a centroid model classifying all subtypes, comprising 101 genes. Both this new "lean" algorithm and the original 2188-gene model were applied to an independent clinical trial cohort of 139 TNBC patients treated initially with neoadjuvant doxorubicin/cyclophosphamide and then randomized to receive either paclitaxel or ixabepilone, to determine the association of pathologic complete response with the subtypes. RESULTS The new 101-gene expression model reproduced the classification provided by the 2188-gene algorithm and was highly concordant in the same set of seven TNBC cohorts used to generate the TNBCtype algorithm (87%), as well as in the independent clinical trial cohort (88%), when cases with significant correlations to multiple subtypes were excluded. Analysis of clinical responses in both neoadjuvant treatment arms found BL2 to be significantly associated with poor response (Odds Ratio (OR) = 0.12, p = 0.03 for the 2188-gene model; OR = 0.23, p < 0.03 for the 101-gene model). Additionally, while the BL1 subtype trended towards significance in the 2188-gene model (OR = 1.91, p = 0.14), the 101-gene model demonstrated significant association with improved response in patients with the BL1 subtype (OR = 3.59, p = 0.02). CONCLUSIONS These results demonstrate that a model using small gene sets can recapitulate the TNBC subtypes identified by the original 2188-gene model and, in the case of standard chemotherapy, retain the ability to predict therapeutic response.
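The odds ratios reported above compare the odds of response inside a subtype with the odds outside it. As a minimal sketch of that calculation from a 2x2 table, assuming simple unadjusted counts (the function name and the counts are hypothetical and are not taken from the trial):

```python
def odds_ratio(resp_in, nonresp_in, resp_out, nonresp_out):
    """Unadjusted odds ratio for response, subtype versus all others.

    resp_in/nonresp_in: responders and non-responders within the subtype.
    resp_out/nonresp_out: responders and non-responders outside it.
    """
    odds_in = resp_in / nonresp_in
    odds_out = resp_out / nonresp_out
    return odds_in / odds_out


# Hypothetical counts: 2 of 10 subtype patients respond, 20 of 30 others do.
example_or = odds_ratio(2, 8, 20, 10)
```

A value below 1, as in this made-up table, indicates lower odds of response within the subtype; a value above 1 indicates higher odds.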
Affiliation(s)
- Brian Z Ring
- Institute of Personalized and Genomic Medicine, College of Life Science, Huazhong University of Science and Technology, Wuhan, China.
- David R Hout
- Insight Genetics Incorporated, Nashville, Tennessee, USA.
- Kasey Lawrence
- Insight Genetics Incorporated, Nashville, Tennessee, USA.
- Brian D Lehmann
- Department of Biochemistry, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, Tennessee, USA.
- Jennifer A Pietenpol
- Department of Biochemistry, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, Tennessee, USA.
- Robert S Seitz
- Insight Genetics Incorporated, Nashville, Tennessee, USA.
|
16
|
Hu C, Sepulcre J, Johnson KA, Fakhri GE, Lu YM, Li Q. Matched signal detection on graphs: Theory and application to brain imaging data classification. Neuroimage 2016; 125:587-600. [PMID: 26481679 DOI: 10.1016/j.neuroimage.2015.10.026] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 04/06/2015] [Revised: 08/11/2015] [Accepted: 10/11/2015] [Indexed: 12/23/2022]
Affiliation(s)
- Chenhui Hu
- Center for Advanced Medical Imaging Sciences, NMMI, Radiology, Massachusetts General Hospital, Boston, MA, USA; School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
- Jorge Sepulcre
- NMMI, Radiology, Massachusetts General Hospital, Boston, MA, USA
- Keith A Johnson
- NMMI, Radiology, Massachusetts General Hospital, Boston, MA, USA
- Georges E Fakhri
- Center for Advanced Medical Imaging Sciences, NMMI, Radiology, Massachusetts General Hospital, Boston, MA, USA
- Yue M Lu
- School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
- Quanzheng Li
- Center for Advanced Medical Imaging Sciences, NMMI, Radiology, Massachusetts General Hospital, Boston, MA, USA.
|
17
|
Ginsburg SB, Lee G, Ali S, Madabhushi A. Feature Importance in Nonlinear Embeddings (FINE): Applications in Digital Pathology. IEEE TRANSACTIONS ON MEDICAL IMAGING 2016; 35:76-88. [PMID: 26186772 DOI: 10.1109/tmi.2015.2456188] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 06/04/2023]
Abstract
Quantitative histomorphometry (QH) refers to the process of computationally modeling disease appearance on digital pathology images by extracting hundreds of image features and using them to predict disease presence or outcome. Since constructing a robust and interpretable classifier is challenging in a high dimensional feature space, dimensionality reduction (DR) is often implemented prior to classifier construction. However, when DR is performed it can be challenging to quantify the contribution of each of the original features to the final classification result. We have previously presented a method for scoring features based on their importance for classification on an embedding derived via principal components analysis (PCA). However, nonlinear DR involves the eigen-decomposition of a kernel matrix rather than the data itself, compounding the issue of classifier interpretability. In this paper we present feature importance in nonlinear embeddings (FINE), an extension of our PCA-based feature scoring method to kernel PCA (KPCA), as well as several NLDR algorithms that can be cast as variants of KPCA. FINE is applied to four digital pathology datasets to identify key QH features for predicting the risk of breast and prostate cancer recurrence. Measures of nuclear and glandular architecture and clusteredness were found to play an important role in predicting the likelihood of recurrence of both breast and prostate cancers. Compared to the t-test, Fisher score, and Gini index, FINE was able to identify a stable set of features that provide good classification accuracy on four publicly available datasets from the NIPS 2003 Feature Selection Challenge.
|
18
|
Classification of microarray using MapReduce based proximal support vector machine classifier. Knowl Based Syst 2015. [DOI: 10.1016/j.knosys.2015.09.005] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Indexed: 11/18/2022]
|
19
|
Sridhar A, Doyle S, Madabhushi A. Content-based image retrieval of digitized histopathology in boosted spectrally embedded spaces. J Pathol Inform 2015; 6:41. [PMID: 26167385 PMCID: PMC4498317 DOI: 10.4103/2153-3539.159441] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 09/17/2013] [Accepted: 11/04/2014] [Indexed: 01/07/2023]
Abstract
Context: Content-based image retrieval (CBIR) systems allow for retrieval of images from within a database that are similar in visual content to a query image. This is useful for digital pathology, where text-based descriptors alone might be inadequate to accurately describe image content. By representing images via a set of quantitative image descriptors, the similarity between a query image and archived, annotated images in a database can be computed and the most similar images retrieved. Recently, non-linear dimensionality reduction methods have become popular for embedding high-dimensional data into a reduced-dimensional space while preserving local object adjacencies, thereby allowing for object similarity to be determined more accurately in the reduced-dimensional space. However, most dimensionality reduction methods implicitly assume, in computing the reduced-dimensional representation, that all features are equally important. Aims: In this paper we present boosted spectral embedding (BoSE), which utilizes a boosted distance metric to selectively weight individual features (based on training data) to subsequently map the data into a reduced-dimensional space. Settings and Design: BoSE is evaluated against spectral embedding (SE) (which employs equal feature weighting) in the context of CBIR of digitized prostate and breast cancer histopathology images. Materials and Methods: The following datasets, which comprised a total of 154 hematoxylin and eosin stained histopathology images, were used: (1) Prostate cancer histopathology (benign vs. malignant), (2) estrogen receptor (ER) + breast cancer histopathology (low vs. high grade), and (3) HER2+ breast cancer histopathology (low vs. high levels of lymphocytic infiltration). Statistical Analysis Used: We plotted and calculated the area under precision-recall curves (AUPRC) and calculated classification accuracy using the Random Forest classifier. Results: BoSE outperformed SE in terms of both CBIR-based (area under the precision-recall curve) and classifier-based (classification accuracy) measures, on average across all of the dimensions tested for all three datasets: (1) Prostate cancer histopathology (AUPRC: BoSE = 0.79, SE = 0.63; Accuracy: BoSE = 0.93, SE = 0.80), (2) ER + breast cancer histopathology (AUPRC: BoSE = 0.79, SE = 0.68; Accuracy: BoSE = 0.96, SE = 0.96), and (3) HER2+ breast cancer histopathology (AUPRC: BoSE = 0.54, SE = 0.44; Accuracy: BoSE = 0.93, SE = 0.91). Conclusion: Our results suggest that BoSE could serve as an important tool for CBIR and classification of high-dimensional biomedical data.
Affiliation(s)
- Akshay Sridhar
- Department of Biomedical Engineering, Rutgers University, The State University of New Jersey, Piscataway, NJ 08854, USA
- Scott Doyle
- Department of Biomedical Engineering, Rutgers University, The State University of New Jersey, Piscataway, NJ 08854, USA
- Anant Madabhushi
- Department of Biomedical Engineering, Rutgers University, The State University of New Jersey, Piscataway, NJ 08854, USA
|
20
|
Kojima M, Sudo H, Kawauchi J, Takizawa S, Kondou S, Nobumasa H, Ochiai A. MicroRNA markers for the diagnosis of pancreatic and biliary-tract cancers. PLoS One 2015; 10:e0118220. [PMID: 25706130 PMCID: PMC4338196 DOI: 10.1371/journal.pone.0118220] [Citation(s) in RCA: 104] [Impact Index Per Article: 10.4] [Received: 07/24/2014] [Accepted: 01/11/2015] [Indexed: 12/21/2022]
Abstract
It is difficult to detect pancreatic cancer or biliary-tract cancer at an early stage using current diagnostic technology. Utilizing microRNA (miRNA) markers that are stably present in peripheral blood, we aimed to identify pancreatic and biliary-tract cancers in patients. With "3D-Gene", a highly sensitive microarray, we examined comprehensive miRNA expression profiles in 571 serum samples obtained from healthy patients, patients with pancreatic, biliary-tract, or other digestive cancers, and patients with non-malignant abnormalities in the pancreas or biliary tract. The samples were randomly divided into training and test cohorts, and candidate miRNA markers were independently evaluated. We found 81 miRNAs for pancreatic cancer and 66 miRNAs for biliary-tract cancer that showed statistically different expression compared with healthy controls. Among those markers, 55 miRNAs were common in both the pancreatic and biliary-tract cancer samples. The previously reported miR-125a-3p was one of the common markers; however, it was also expressed in other types of digestive-tract cancers, suggesting that it is not specific to cancer types. In order to discriminate the pancreato-biliary cancers from all other clinical conditions including the healthy controls, non-malignant abnormalities, and other types of cancers, we developed a diagnostic index using expression profiles of the 10 most significant miRNAs. A combination of eight miRNAs (miR-6075, miR-4294, miR-6880-5p, miR-6799-5p, miR-125a-3p, miR-4530, miR-6836-3p, and miR-4476) achieved a sensitivity, specificity, accuracy and AUC of 80.3%, 97.6%, 91.6% and 0.953, respectively. In contrast, CA19-9 and CEA gave sensitivities of 65.6% and 40.0%, specificities of 92.9% and 88.6%, and accuracies of 82.1% and 71.8%, respectively, in the same test cohort. This diagnostic index identified 18/21 operable pancreatic cancers and 38/48 operable biliary-tract cancers in the entire cohort. 
Our results suggest that the assessment of these miRNA markers is clinically valuable to identify patients with pancreato-biliary cancers who could benefit from surgical intervention.
Affiliation(s)
- Motohiro Kojima
- Department of Pathology, National Cancer Center Hospital East, Kashiwa, Chiba, Japan
- Hiroko Sudo
- New Frontiers Research Laboratories, Toray Industries, Inc., Kamakura, Kanagawa, Japan
- Junpei Kawauchi
- New Frontiers Research Laboratories, Toray Industries, Inc., Kamakura, Kanagawa, Japan
- Satoko Takizawa
- New Frontiers Research Laboratories, Toray Industries, Inc., Kamakura, Kanagawa, Japan
- Satoshi Kondou
- New Projects Development Division, Toray Industries, Inc., Kamakura, Kanagawa, Japan
- Hitoshi Nobumasa
- New Projects Development Division, Toray Industries, Inc., Kamakura, Kanagawa, Japan
- Atsushi Ochiai
- Department of Pathology, National Cancer Center Hospital East, Kashiwa, Chiba, Japan
|
21
|
Lee G, Singanamalli A, Wang H, Feldman MD, Master SR, Shih NNC, Spangler E, Rebbeck T, Tomaszewski JE, Madabhushi A. Supervised multi-view canonical correlation analysis (sMVCCA): integrating histologic and proteomic features for predicting recurrent prostate cancer. IEEE TRANSACTIONS ON MEDICAL IMAGING 2015; 34:284-297. [PMID: 25203987 DOI: 10.1109/tmi.2014.2355175] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Indexed: 06/03/2023]
Abstract
In this work, we present a new methodology to facilitate prediction of recurrent prostate cancer (CaP) following radical prostatectomy (RP) via the integration of quantitative image features and protein expression in the excised prostate. Creating a fused predictor from high-dimensional data streams is challenging because the classifier must 1) account for the "curse of dimensionality" problem, which hinders classifier performance when the number of features exceeds the number of patient studies and 2) balance potential mismatches in the number of features across different channels to avoid classifier bias towards channels with more features. Our new data integration methodology, supervised Multi-view Canonical Correlation Analysis (sMVCCA), aims to integrate infinite views of high-dimensional data to provide more amenable data representations for disease classification. Additionally, we demonstrate sMVCCA using Spearman's rank correlation which, unlike Pearson's correlation, can account for nonlinear correlations and outliers. Forty CaP patients with pathological Gleason scores 6-8 were considered for this study. Twenty-one of these men showed biochemical recurrence (BCR) following RP, while 19 did not. For each patient, 189 quantitative histomorphometric attributes and 650 protein expression levels were extracted from the primary tumor nodule. The fused histomorphometric/proteomic representation via sMVCCA combined with a random forest classifier predicted BCR with a mean AUC of 0.74 and a maximum AUC of 0.9286. We found sMVCCA to perform statistically significantly (p < 0.05) better than comparative state-of-the-art data fusion strategies for predicting BCR. Furthermore, Kaplan-Meier analysis demonstrated improved BCR-free survival prediction for the sMVCCA-fused classifier as compared to histology or proteomic features alone.
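The contrast drawn above between Spearman's and Pearson's correlation can be illustrated with a small pure-Python sketch. Spearman's coefficient is Pearson's coefficient applied to ranks, so it equals 1 for any strictly increasing relationship, including nonlinear ones. The helper names and the toy data are hypothetical, and this shows only the correlation measure, not the sMVCCA method itself.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: y is a strictly increasing but nonlinear function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
r_pearson = pearson(x, y)    # below 1 because the relation is not linear
r_spearman = spearman(x, y)  # 1 up to rounding because the order is preserved
```

Working on ranks also bounds the influence of any single extreme value, which is the sense in which the rank-based measure is less sensitive to outliers.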
|
22
|
Multiplicative distance: a method to alleviate distance instability for high-dimensional data. Knowl Inf Syst 2014. [DOI: 10.1007/s10115-014-0813-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/24/2022]
|
23
|
Classification of Microarray Data Using Kernel Fuzzy Inference System. INTERNATIONAL SCHOLARLY RESEARCH NOTICES 2014; 2014:769159. [PMID: 27433543 PMCID: PMC4897118 DOI: 10.1155/2014/769159] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/28/2014] [Revised: 05/28/2014] [Accepted: 06/12/2014] [Indexed: 12/02/2022]
Abstract
The DNA microarray classification technique has gained more popularity in both research and practice. In real data analysis, such as microarray data, the dataset contains a huge number of insignificant and irrelevant features that tend to lose useful information. Classes with high relevance and feature sets with high significance are generally referred for the selected features, which determine the samples classification into their respective classes. In this paper, kernel fuzzy inference system (K-FIS) algorithm is applied to classify the microarray data (leukemia) using t-test as a feature selection method. Kernel functions are used to map original data points into a higher-dimensional (possibly infinite-dimensional) feature space defined by a (usually nonlinear) function ϕ through a mathematical process called the kernel trick. This paper also presents a comparative study for classification using K-FIS along with support vector machine (SVM) for different set of features (genes). Performance parameters available in the literature such as precision, recall, specificity, F-measure, ROC curve, and accuracy are considered to analyze the efficiency of the classification model. From the proposed approach, it is apparent that K-FIS model obtains similar results when compared with SVM model. This is an indication that the proposed approach relies on kernel function.
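The kernel trick mentioned above can be made concrete with a degree-2 polynomial kernel on two-dimensional inputs, for which the feature map ϕ is small enough to write out. The sketch below checks that the kernel value computed in the input space equals the inner product of the explicitly mapped points. The function names and sample points are hypothetical, and the kernel used in the cited paper may differ.

```python
import math


def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot ** 2


def poly2_feature_map(x):
    """Explicit feature map for 2-D input whose inner product gives k."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


# Hypothetical sample points.
x, y = (1.0, 2.0), (3.0, 4.0)

# Implicit route: one dot product in the 2-D input space.
implicit = poly2_kernel(x, y)

# Explicit route: map both points to 3-D, then take the dot product.
explicit = sum(a * b for a, b in zip(poly2_feature_map(x), poly2_feature_map(y)))
```

Both routes give the same number, but the implicit route never builds the mapped vectors, which is what makes very high-dimensional (or infinite-dimensional) feature spaces usable in practice.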
|
24
|
Cantor-Rivera D, Khan AR, Goubran M, Mirsattari SM, Peters TM. Detection of temporal lobe epilepsy using support vector machines in multi-parametric quantitative MR imaging. Comput Med Imaging Graph 2014; 41:14-28. [PMID: 25103878 DOI: 10.1016/j.compmedimag.2014.07.002] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 01/09/2014] [Revised: 06/11/2014] [Accepted: 07/09/2014] [Indexed: 11/30/2022]
Abstract
The detection of MRI abnormalities that can be associated with seizures in the study of temporal lobe epilepsy (TLE) is a challenging task. In many cases, patients with a record of epileptic activity do not present any discernible MRI findings. In this domain, we propose a method that combines quantitative relaxometry and diffusion tensor imaging (DTI) with support vector machines (SVM), aiming to improve TLE detection. The main contribution of this work is two-fold: on one hand, the feature selection process, principal component analysis (PCA) transformations of the feature space, and SVM parameterization are analyzed as factors constituting a classification model and influencing its quality. On the other hand, several of these classification models are studied to determine the optimal strategy for the identification of TLE patients using data collected from multi-parametric quantitative MRI. A total of 17 TLE patients and 19 control volunteers were analyzed. Four images were considered for each subject (T1 map, T2 map, fractional anisotropy, and mean diffusivity), generating 936 regions of interest per subject; 8 different classification models were then studied, each comprising a distinct set of factors. Subjects were correctly classified with an accuracy of 88.9%. Further analysis revealed that the heterogeneous nature of the disease impeded an optimal outcome. After dividing patients into cohesive groups (9 left-sided seizure onset, 8 right-sided seizure onset), perfect classification for the left group was achieved (100% accuracy) whereas the accuracy for the right group remained the same (88.9%). We conclude that a linear SVM combined with an ANOVA-based feature selection+PCA method is a good alternative in scenarios like ours where feature spaces are high dimensional and the sample size is limited. The good accuracy results and the localization of the respective features in the temporal lobe suggest that a multi-parametric quantitative MRI, ROI-based, SVM classification could be used for the identification of TLE patients. This method has the potential to improve the diagnostic assessment, especially for patients who do not have any obvious lesions in standard radiological examinations.
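ANOVA-based feature selection, as named in the abstract above, typically scores each feature by a one-way F statistic: the variance between class means divided by the variance within classes. The sketch below computes that score for a single feature; the function name and the values are hypothetical, and it covers only the scoring step, not the authors' full selection, PCA and SVM pipeline.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for one feature.

    groups: list of lists, one list of feature values per class.
    Raises ZeroDivisionError if every group has zero internal variance.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Spread of class means around the grand mean, weighted by class size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Spread of samples around their own class mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical feature values for patients and controls.
f_score = anova_f([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

Features would then be ranked by this score and the top-scoring ones kept. With few subjects, that ranking should be done inside each cross-validation fold so the test data do not influence which features are chosen.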
Affiliation(s)
- Diego Cantor-Rivera
- Imaging Research Laboratories, Robarts Research Institute, London, ON, Canada N6A 5K8; Biomedical Engineering Graduate Program, Western University, London, ON, Canada.
- Ali R Khan
- Imaging Research Laboratories, Robarts Research Institute, London, ON, Canada N6A 5K8; Department of Medical Biophysics, Western University, London, ON, Canada.
- Maged Goubran
- Imaging Research Laboratories, Robarts Research Institute, London, ON, Canada N6A 5K8; Biomedical Engineering Graduate Program, Western University, London, ON, Canada.
- Seyed M Mirsattari
- Department of Clinical Neurological Sciences, Medical Biophysics, Medical Imaging and Psychology, Western University, London, ON, Canada; London Health Sciences Centre, University Hospital, B10-110, 339 Windermere Road, London, ON, Canada N6A 5A5.
- Terry M Peters
- Imaging Research Laboratories, Robarts Research Institute, London, ON, Canada N6A 5K8; Department of Medical Biophysics, Western University, London, ON, Canada; Biomedical Engineering Graduate Program, Western University, London, ON, Canada.
|
25
|
Agner SC, Xu J, Madabhushi A. Spectral embedding based active contour (SEAC) for lesion segmentation on breast dynamic contrast enhanced magnetic resonance imaging. Med Phys 2013; 40:032305. [PMID: 23464337 DOI: 10.1118/1.4790466] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022]
Abstract
PURPOSE Segmentation of breast lesions on dynamic contrast enhanced (DCE) magnetic resonance imaging (MRI) is the first step in lesion diagnosis in a computer-aided diagnosis framework. Because manual segmentation of such lesions is both time consuming and highly susceptible to human error and issues of reproducibility, an automated lesion segmentation method is highly desirable. Traditional automated image segmentation methods such as boundary-based active contour (AC) models require a strong gradient at the lesion boundary. Even when region-based terms are introduced to an AC model, grayscale image intensities often do not allow for clear definition of foreground and background region statistics. Thus, there is a need to find alternative image representations that might provide (1) strong gradients at the margin of the object of interest (OOI); and (2) larger separation between intensity distributions and region statistics for the foreground and background, which are necessary to halt evolution of the AC model upon reaching the border of the OOI. METHODS In this paper, the authors introduce a spectral embedding (SE) based AC (SEAC) for lesion segmentation on breast DCE-MRI. SE, a nonlinear dimensionality reduction scheme, is applied to the DCE time series in a voxelwise fashion to reduce several time point images to a single parametric image where every voxel is characterized by the three dominant eigenvectors. This parametric eigenvector image (PrEIm) representation allows for better capture of image region statistics and stronger gradients for use with a hybrid AC model, which is driven by both boundary and region information. They compare SEAC to ACs that employ fuzzy c-means (FCM) and principal component analysis (PCA) as alternative image representations. Segmentation performance was evaluated by boundary and region metrics as well as comparing lesion classification using morphological features from SEAC, PCA+AC, and FCM+AC. 
RESULTS On a cohort of 50 breast DCE-MRI studies, PrEIm yielded overall better region and boundary-based statistics compared to the original DCE-MR image, FCM, and PCA based image representations. Additionally, SEAC outperformed a hybrid AC applied to both PCA and FCM image representations. Mean dice similarity coefficient (DSC) for SEAC was significantly better (DSC = 0.74 ± 0.21) than FCM+AC (DSC = 0.50 ± 0.32) and similar to PCA+AC (DSC = 0.73 ± 0.22). Boundary-based metrics of mean absolute difference and Hausdorff distance followed the same trends. Of the automated segmentation methods, breast lesion classification based on morphologic features derived from SEAC segmentation using a support vector machine classifier also performed better (AUC = 0.67 ± 0.05; p < 0.05) than FCM+AC (AUC = 0.50 ± 0.07), and PCA+AC (AUC = 0.49 ± 0.07). CONCLUSIONS In this work, we presented SEAC, an accurate, general purpose AC segmentation tool that could be applied to any imaging domain that employs time series data. SE allows for projection of time series data into a PrEIm representation so that every voxel is characterized by the dominant eigenvectors, capturing the global and local time-intensity curve similarities in the data. This PrEIm allows for the calculation of strong tensor gradients and better region statistics than the original image intensities or alternative image representations such as PCA and FCM. The PrEIm also allows for building a more accurate hybrid AC scheme.
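The Dice similarity coefficient (DSC) reported above measures the overlap between an automated segmentation and a reference segmentation: twice the number of shared voxels divided by the total number of voxels in both masks. The following is a minimal sketch of that standard definition, not the authors' implementation; the function name `dice_coefficient` and the toy masks are assumptions made for illustration.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks.

    Masks are flat sequences of 0/1 values of equal length
    (for example, a flattened 2D or 3D segmentation).
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as lesion in both masks.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return 2.0 * intersection / (size_a + size_b)


# Hypothetical toy masks: 3 voxels each, 2 of them shared.
automated = [0, 1, 1, 1, 0, 0]
manual = [0, 0, 1, 1, 1, 0]
print(dice_coefficient(automated, manual))
```

A DSC of 1 indicates identical masks and 0 indicates no overlap, so the reported means of 0.74 and 0.50 can be read on that scale.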
Affiliation(s)
- Shannon C Agner
- Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey 08854, USA.
|
26
|
Transcriptional biomarkers--high throughput screening, quantitative verification, and bioinformatical validation methods. Methods 2012; 59:3-9. [PMID: 22967906 DOI: 10.1016/j.ymeth.2012.08.012] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Received: 07/20/2012] [Revised: 08/21/2012] [Accepted: 08/25/2012] [Indexed: 02/08/2023]
Abstract
Molecular biomarkers have found their way into many research fields, especially molecular medicine, medical diagnostics, disease prognosis, and risk assessment, but also other areas such as food safety. Different definitions of the term biomarker exist, but on the whole biomarkers are measurable biological molecules that are characteristic of a specific physiological status, including drug intervention and normal or pathological processes. There are various examples of molecular biomarkers that are already used successfully in clinical diagnostics, especially as prognostic or diagnostic tools for diseases. Molecular biomarkers can be identified on different molecular levels, namely the genome, the epigenome, the transcriptome, the proteome, the metabolome and the lipidome. With special "omic" technologies, nowadays often high throughput technologies, these molecular biomarkers can be identified and quantitatively measured. This article describes the different molecular levels on which biomarker research is possible, including some biomarker candidates that have already been identified. The transcriptomic approach is described in detail, including available high throughput methods, molecular levels, quantitative verification, and biostatistical requirements for transcriptional biomarker identification and validation.
|
27
|
He L, Long LR, Antani S, Thoma GR. Histology image analysis for carcinoma detection and grading. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2012; 107:538-56. [PMID: 22436890 PMCID: PMC3587978 DOI: 10.1016/j.cmpb.2011.12.007] [Citation(s) in RCA: 161] [Impact Index Per Article: 12.4] [Received: 09/23/2010] [Revised: 09/27/2011] [Accepted: 12/13/2011] [Indexed: 05/25/2023]
Abstract
This paper presents an overview of the image analysis techniques in the domain of histopathology, specifically, for the objective of automated carcinoma detection and classification. As in other biomedical imaging areas such as radiology, many computer assisted diagnosis (CAD) systems have been implemented to aid histopathologists and clinicians in cancer diagnosis and research, with the aim of significantly reducing the labor and subjectivity of traditional manual intervention with histology images. The task of automated histology image analysis is usually not simple due to the unique characteristics of histology imaging, including the variability in image preparation techniques, clinical interpretation protocols, and the complex structures and very large size of the images themselves. In this paper we discuss those characteristics, provide relevant background information about slide preparation and interpretation, and review the application of digital image processing techniques to the field of histology image analysis. In particular, emphasis is given to state-of-the-art image segmentation methods for feature extraction and disease classification. Four major carcinomas of cervix, prostate, breast, and lung are selected to illustrate the functions and capabilities of existing CAD systems.
Affiliation(s)
- Lei He
- National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, USA.
|
28
|
Abstract
Applications of clustering algorithms in biomedical research are ubiquitous, with typical examples including gene expression data analysis, genomic sequence analysis, biomedical document mining, and MRI image analysis. However, due to the diversity of cluster analysis, the differing terminologies, goals, and assumptions underlying different clustering algorithms can be daunting. Thus, determining the right match between clustering algorithms and biomedical applications has become particularly important. This paper is presented to provide biomedical researchers with an overview of the status quo of clustering algorithms, to illustrate examples of biomedical applications based on cluster analysis, and to help biomedical researchers select the most suitable clustering algorithms for their own applications.
Affiliation(s)
- Rui Xu
- Industrial Artificial Intelligence Laboratory, GE Global Research Center, Niskayuna, NY 12309, USA.
|
29
|
Golugula A, Lee G, Madabhushi A. Evaluating feature selection strategies for high dimensional, small sample size datasets. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2012; 2011:949-52. [PMID: 22254468 DOI: 10.1109/iembs.2011.6090214] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/09/2022]
Abstract
In this work, we analyze and evaluate different strategies for comparing Feature Selection (FS) schemes on High Dimensional (HD) biomedical datasets (e.g. gene and protein expression studies) with a small sample size (SSS). Additionally, we define a new measure, Robustness, specifically for comparing the ability of an FS scheme to be invariant to changes in its training data. While classifier accuracy has been the de facto method for evaluating FS schemes, on account of the curse of dimensionality problem, it might not always be the appropriate measure for HD/SSS datasets. SSS lends the dataset a higher probability of containing data that is not representative of the true distribution of the whole population. However, an ideal FS scheme must be robust enough to produce the same results each time there are changes to the training data. In this study, we employed the robustness performance measure in conjunction with classifier accuracy (measured via the K-Nearest Neighbor and Random Forest classifiers) to quantitatively compare five different FS schemes (T-test, F-test, Kolmogorov-Smirnov Test, Wilks Lambda Test and Wilcoxon Rank Sum Test) on 5 HD/SSS gene and protein expression datasets corresponding to ovarian cancer, lung cancer, bone lesions, celiac disease, and coronary heart disease. Of the five FS schemes compared, the Wilcoxon Rank Sum Test was found to outperform other FS schemes in terms of classification accuracy and robustness. Our results suggest that both classifier accuracy and robustness should be considered when deciding on the appropriate FS scheme for HD/SSS datasets.
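The abstract does not say how robustness is computed. As one plausible reading, assumed here and not taken from the paper, robustness can be sketched as the average overlap (Jaccard index) between the feature subsets an FS scheme selects on different resamples of the training data. The ranking criterion below is a rank-sum style statistic; all function names, the 80% subsampling fraction, and the synthetic data are hypothetical.

```python
import random
from itertools import combinations


def rank_sum_score(values, labels):
    """Absolute deviation of the class-1 rank sum from its expected value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Group tied values and give each the average of their 1-based ranks.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n1 = sum(labels)
    observed = sum(r for r, y in zip(ranks, labels) if y == 1)
    expected = n1 * (len(values) + 1) / 2.0
    return abs(observed - expected)


def select_top_k(rows, labels, k):
    """Indices of the k features with the largest rank-sum score."""
    n_features = len(rows[0])
    scores = [rank_sum_score([row[f] for row in rows], labels)
              for f in range(n_features)]
    # Ties are broken by feature index so the result is deterministic.
    ranked = sorted(range(n_features), key=lambda f: (-scores[f], f))
    return set(ranked[:k])


def selection_robustness(rows, labels, k, n_resamples=20, fraction=0.8, seed=0):
    """Mean pairwise Jaccard overlap of feature subsets across subsamples."""
    rng = random.Random(seed)
    size = max(2, int(fraction * len(rows)))
    subsets = []
    for _ in range(n_resamples):
        idx = rng.sample(range(len(rows)), size)
        subsets.append(select_top_k([rows[i] for i in idx],
                                    [labels[i] for i in idx], k))
    pairs = list(combinations(subsets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


# Synthetic HD/SSS-like data: 20 samples, 30 features, only feature 0 informative.
rng = random.Random(1)
labels = [0] * 10 + [1] * 10
data = [[10.0 * y + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(29)]
        for y in labels]
print(sorted(select_top_k(data, labels, 3)))
print(selection_robustness(data, labels, 3))
```

A value near 1 means the scheme keeps choosing the same features when the training data change, and a value near 0 means the selection is unstable, which is the behaviour the abstract argues should be weighed alongside classifier accuracy.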
Affiliation(s)
- Abhishek Golugula
- Department of Electrical and Computer Engineering, Rutgers University, Piscataway, New Jersey 08854, USA
|
30
|
Nanni L, Brahnam S, Lumini A. Combining multiple approaches for gene microarray classification. Bioinformatics 2012; 28:1151-7. [DOI: 10.1093/bioinformatics/bts108] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Indexed: 11/13/2022]
|
31
|
Viswanath S, Madabhushi A. Consensus embedding: theory, algorithms and application to segmentation and classification of biomedical data. BMC Bioinformatics 2012; 13:26. [PMID: 22316103 PMCID: PMC3395843 DOI: 10.1186/1471-2105-13-26] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 08/23/2011] [Accepted: 02/08/2012] [Indexed: 11/21/2022]
Abstract
Background Dimensionality reduction (DR) enables the construction of a lower dimensional space (embedding) from a higher dimensional feature space while preserving object-class discriminability. However several popular DR approaches suffer from sensitivity to choice of parameters and/or presence of noise in the data. In this paper, we present a novel DR technique known as consensus embedding that aims to overcome these problems by generating and combining multiple low-dimensional embeddings, hence exploiting the variance among them in a manner similar to ensemble classifier schemes such as Bagging. We demonstrate theoretical properties of consensus embedding which show that it will result in a single stable embedding solution that preserves information more accurately as compared to any individual embedding (generated via DR schemes such as Principal Component Analysis, Graph Embedding, or Locally Linear Embedding). Intelligent sub-sampling (via mean-shift) and code parallelization are utilized to provide for an efficient implementation of the scheme. Results Applications of consensus embedding are shown in the context of classification and clustering as applied to: (1) image partitioning of white matter and gray matter on 10 different synthetic brain MRI images corrupted with 18 different combinations of noise and bias field inhomogeneity, (2) classification of 4 high-dimensional gene-expression datasets, (3) cancer detection (at a pixel-level) on 16 image slices obtained from 2 different high-resolution prostate MRI datasets. In over 200 different experiments concerning classification and segmentation of biomedical data, consensus embedding was found to consistently outperform both linear and non-linear DR methods within all applications considered. 
Conclusions We have presented a novel framework termed consensus embedding which leverages ensemble classification theory within dimensionality reduction, allowing for application to a wide range of high-dimensional biomedical data classification and segmentation problems. Our generalizable framework allows for improved representation and classification in the context of both imaging and non-imaging data. The algorithm offers a promising solution to problems that currently plague DR methods, and may allow for extension to other areas of biomedical data analysis.
Affiliation(s)
- Satish Viswanath
- Dept. of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, New Jersey 08854, USA.
|
32
|
Reutlinger M, Schneider G. Nonlinear dimensionality reduction and mapping of compound libraries for drug discovery. J Mol Graph Model 2012; 34:108-17. [PMID: 22326864 DOI: 10.1016/j.jmgm.2011.12.006] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 09/14/2011] [Revised: 12/13/2011] [Accepted: 12/14/2011] [Indexed: 01/29/2023]
Abstract
Visualization of 'chemical space' and compound distributions has received much attention from medicinal chemists, as it may help to intuitively comprehend pharmaceutically relevant molecular features. It has been realized that for meaningful feature extraction from complex multivariate chemical data, such as compound libraries represented by many molecular descriptors, nonlinear projection techniques are required. Recent advances in machine learning and artificial intelligence have resulted in a transfer of such methods to chemistry. We provide an overview of prominent visualization methods based on nonlinear dimensionality reduction, and highlight applications in drug discovery. Emphasis is on neural network techniques, kernel methods and stochastic embedding approaches, which have been successfully used for ligand-based virtual screening, SAR landscape analysis, combinatorial library design, and screening compound selection.
Affiliation(s)
- Michael Reutlinger
- Swiss Federal Institute of Technology (ETH), Department of Chemistry and Applied Biosciences, Zurich, Switzerland
|
33
|
Agner SC, Soman S, Libfeld E, McDonald M, Thomas K, Englander S, Rosen MA, Chin D, Nosher J, Madabhushi A. Textural kinetics: a novel dynamic contrast-enhanced (DCE)-MRI feature for breast lesion classification. J Digit Imaging 2011; 24:446-63. [PMID: 20508965 DOI: 10.1007/s10278-010-9298-1] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.5] [Indexed: 11/26/2022]
Abstract
Dynamic contrast-enhanced (DCE)-magnetic resonance imaging (MRI) of the breast has emerged as an adjunct imaging tool to conventional X-ray mammography due to its high detection sensitivity. Despite the increasing use of breast DCE-MRI, specificity in distinguishing malignant from benign breast lesions is low, and interobserver variability in lesion classification is high. The novel contribution of this paper is in the definition of a new DCE-MRI descriptor that we call textural kinetics, which attempts to capture spatiotemporal changes in breast lesion texture in order to distinguish malignant from benign lesions. We qualitatively and quantitatively demonstrated on 41 breast DCE-MRI studies that textural kinetic features outperform signal intensity kinetics and lesion morphology features in distinguishing benign from malignant lesions. A probabilistic boosting tree (PBT) classifier in conjunction with textural kinetic descriptors yielded an accuracy of 90%, sensitivity of 95%, specificity of 82%, and an area under the curve (AUC) of 0.92. Graph embedding, used for qualitative visualization of a low-dimensional representation of the data, showed the best separation between benign and malignant lesions when using textural kinetic features. The PBT classifier results and trends were also corroborated via a support vector machine classifier which showed that textural kinetic features outperformed the morphological, static texture, and signal intensity kinetics descriptors. When textural kinetic attributes were combined with morphologic descriptors, the resulting PBT classifier yielded 89% accuracy, 99% sensitivity, 76% specificity, and an AUC of 0.91.
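The accuracy, sensitivity, and specificity figures quoted above all derive from the same four confusion-matrix counts. The following minimal sketch shows the standard definitions, assuming labels are coded 1 for malignant and 0 for benign; the function name `binary_metrics` and the toy labels are illustrative assumptions and are unrelated to the study's data.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, called malignant
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, called benign
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, called malignant
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, called benign
    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical toy labels: 4 malignant and 2 benign lesions.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
```

Because sensitivity depends only on the malignant cases and specificity only on the benign ones, a classifier can trade one for the other, which is consistent with the combined-feature result above gaining sensitivity while losing specificity.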
Affiliation(s)
- Shannon C Agner
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
|
34
|
Riedmaier I, Pfaffl MW, Meyer HHD. The analysis of the transcriptome as a new approach for biomarker development to trace the abuse of anabolic steroid hormones. Drug Test Anal 2011; 3:676-81. [DOI: 10.1002/dta.304] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 03/18/2011] [Revised: 05/02/2011] [Accepted: 05/04/2011] [Indexed: 01/20/2023]
|
35
|
Viswanath S, Bloch BN, Chappelow J, Patel P, Rofsky N, Lenkinski R, Genega E, Madabhushi A. Enhanced Multi-Protocol Analysis via Intelligent Supervised Embedding (EMPrAvISE): Detecting Prostate Cancer on Multi-Parametric MRI. PROCEEDINGS OF SPIE--THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING 2011; 7963:79630U. [PMID: 25301991 DOI: 10.1117/12.878312] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]
Abstract
Currently, there is significant interest in developing methods for quantitative integration of multi-parametric (structural, functional) imaging data with the objective of building automated meta-classifiers to improve disease detection, diagnosis, and prognosis. Such techniques are required to address the differences in dimensionalities and scales of individual protocols, while deriving an integrated multi-parametric data representation which best captures all disease-pertinent information available. In this paper, we present a scheme called Enhanced Multi-Protocol Analysis via Intelligent Supervised Embedding (EMPrAvISE); a powerful, generalizable framework applicable to a variety of domains for multi-parametric data representation and fusion. Our scheme utilizes an ensemble of embeddings (via dimensionality reduction, DR); thereby exploiting the variance amongst multiple uncorrelated embeddings in a manner similar to ensemble classifier schemes (e.g. Bagging, Boosting). We apply this framework to the problem of prostate cancer (CaP) detection on 12 3 Tesla pre-operative in vivo multi-parametric (T2-weighted, Dynamic Contrast Enhanced, and Diffusion-weighted) magnetic resonance imaging (MRI) studies, in turn comprising a total of 39 2D planar MR images. We first align the different imaging protocols via automated image registration, followed by quantification of image attributes from individual protocols. Multiple embeddings are generated from the resultant high-dimensional feature space which are then combined intelligently to yield a single stable solution. Our scheme is employed in conjunction with graph embedding (for DR) and probabilistic boosting trees (PBTs) to detect CaP on multi-parametric MRI. Finally, a probabilistic pairwise Markov Random Field algorithm is used to apply spatial constraints to the result of the PBT classifier, yielding a per-voxel classification of CaP presence. 
Per-voxel evaluation of detection results against ground truth for CaP extent on MRI (obtained by spatially registering pre-operative MRI with available whole-mount histological specimens) reveals that EMPrAvISE yields a statistically significant improvement (AUC=0.77) over classifiers constructed from individual protocols (AUC=0.62, 0.62, 0.65, for T2w, DCE, DWI respectively) as well as one trained using multi-parametric feature concatenation (AUC=0.67).
Affiliation(s)
- Pratik Patel
- Rutgers, the State University of New Jersey, USA
- Neil Rofsky
- UT Southwestern Medical School, Dallas, Texas, USA
|
36
|
Madabhushi A, Agner S, Basavanhally A, Doyle S, Lee G. Computer-aided prognosis: predicting patient and disease outcome via quantitative fusion of multi-scale, multi-modal data. Comput Med Imaging Graph 2011; 35:506-14. [PMID: 21333490 DOI: 10.1016/j.compmedimag.2011.01.008] [Citation(s) in RCA: 85] [Impact Index Per Article: 6.1] [Received: 06/26/2010] [Revised: 12/16/2010] [Accepted: 01/10/2011] [Indexed: 12/31/2022]
Abstract
Computer-aided prognosis (CAP) is a new and exciting complement to the field of computer-aided diagnosis (CAD) and involves developing and applying computerized image analysis and multi-modal data fusion algorithms to digitized patient data (e.g. imaging, tissue, genomic) for helping physicians predict disease outcome and patient survival. While a number of data channels, ranging from the macro (e.g. MRI) to the nano-scales (proteins, genes) are now being routinely acquired for disease characterization, one of the challenges in predicting patient outcome and treatment response has been in our inability to quantitatively fuse these disparate, heterogeneous data sources. At the Laboratory for Computational Imaging and Bioinformatics (LCIB)(1) at Rutgers University, our team has been developing computerized algorithms for high dimensional data and image analysis for predicting disease outcome from multiple modalities including MRI, digital pathology, and protein expression. Additionally, we have been developing novel data fusion algorithms based on non-linear dimensionality reduction methods (such as Graph Embedding) to quantitatively integrate information from multiple data sources and modalities with the overarching goal of optimizing meta-classifiers for making prognostic predictions. In this paper, we briefly describe 4 representative and ongoing CAP projects at LCIB. 
These projects include (1) an Image-based Risk Score (IbRiS) algorithm for predicting outcome of Estrogen receptor positive breast cancer patients based on quantitative image analysis of digitized breast cancer biopsy specimens alone, (2) segmenting and determining extent of lymphocytic infiltration (identified as a possible prognostic marker for outcome in human epidermal growth factor amplified breast cancers) from digitized histopathology, (3) distinguishing patients with different Gleason grades of prostate cancer (grade being known to be correlated to outcome) from digitized needle biopsy specimens, and (4) integrating protein expression measurements obtained from mass spectrometry with quantitative image features derived from digitized histopathology for distinguishing between prostate cancer patients at low and high risk of disease recurrence following radical prostatectomy.
Affiliation(s)
- Anant Madabhushi
- Department of Biomedical Engineering, Rutgers University, Piscataway, NJ 08854, USA.
|
37
|
Zhang J, Zhang K, Feng J, Small M. Rhythmic dynamics and synchronization via dimensionality reduction: application to human gait. PLoS Comput Biol 2010; 6:e1001033. [PMID: 21187907 PMCID: PMC3002994 DOI: 10.1371/journal.pcbi.1001033] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Received: 06/10/2010] [Accepted: 11/11/2010] [Indexed: 11/18/2022]
Abstract
Reliable characterization of locomotor dynamics of human walking is vital to understanding the neuromuscular control of human locomotion and disease diagnosis. However, the inherent oscillation and ubiquity of noise in such non-strictly periodic signals pose great challenges to current methodologies. To this end, we exploit the state-of-the-art technology in pattern recognition and, specifically, dimensionality reduction techniques, and propose to reconstruct and characterize the dynamics accurately on the cycle scale of the signal. This is achieved by deriving a low-dimensional representation of the cycles through global optimization, which effectively preserves the topology of the cycles that are embedded in a high-dimensional Euclidian space. Our approach demonstrates a clear advantage in capturing the intrinsic dynamics and probing the subtle synchronization patterns from uni/bivariate oscillatory signals over traditional methods. Application to human gait data for healthy subjects and diabetics reveals a significant difference in the dynamics of ankle movements and ankle-knee coordination, but not in knee movements. These results indicate that the impaired sensory feedback from the feet due to diabetes does not influence the knee movement in general, and that normal human walking is not critically dependent on the feedback from the peripheral nervous system.
Affiliation(s)
- Jie Zhang
- Center for Computational Systems Biology, Fudan University, Shanghai, People's Republic of China.
|
38
|
Zhang Y, Xu G, Wang J, Liang L. An automatic patient-specific seizure onset detection method in intracranial EEG based on incremental nonlinear dimensionality reduction. Comput Biol Med 2010; 40:889-99. [PMID: 20951372 DOI: 10.1016/j.compbiomed.2010.09.010] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 03/29/2010] [Revised: 09/14/2010] [Accepted: 09/28/2010] [Indexed: 11/17/2022]
Abstract
Epileptic seizure features always include the morphology and spatial distribution of nonlinear waveforms in the electroencephalographic (EEG) signals. In this study, we propose a novel incremental learning scheme based on nonlinear dimensionality reduction for automatic patient-specific seizure onset detection. The method allows for identification of seizure onset times in long-term EEG signals acquired from epileptic patients. Firstly, a nonlinear dimensionality reduction (NDR) method called local tangent space alignment (LTSA) is used to reduce the dimensionality of the initial feature sets extracted with the continuous wavelet transform (CWT). A one-dimensional manifold that reflects the intrinsic dynamics of seizure onset is obtained. For each patient, IEEG recordings containing one seizure onset are sufficient to train the initial one-dimensional manifold. Secondly, an unsupervised incremental learning scheme is proposed to update the initial manifold as unlabelled EEG segments arrive sequentially. The incremental learning scheme can cluster the incoming samples into the trained patterns (containing or not containing seizure onsets). Intracranial EEG recordings from 21 patients, with a total duration of 193.8 h and 82 seizures, are used for the evaluation of the method. An average sensitivity of 98.8%, an average uninteresting false positive rate of 0.24/h, an average interesting false positive rate of 0.25/h, and an average detection delay of 10.8 s are obtained. Our method offers simple, accurate training with less human intervention and is well suited to off-line seizure detection. The unsupervised incremental learning scheme has the potential to identify novel IEEG classes (different onset patterns) within the data.
Affiliation(s)
- Yizhuo Zhang
- State Key Laboratory for Manufacturing Systems Engineering, Xi'an Jiaotong University, Xi'an 710049, PR China
|
39
|
Shi J, Luo Z. Nonlinear dimensionality reduction of gene expression data for visualization and clustering analysis of cancer tissue samples. Comput Biol Med 2010; 40:723-32. [PMID: 20637456 DOI: 10.1016/j.compbiomed.2010.06.007] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Received: 11/20/2009] [Revised: 06/24/2010] [Accepted: 06/30/2010] [Indexed: 11/17/2022]
Abstract
Gene expression data represent nonlinear interactions among genes and environmental factors. Computational analysis of these data is expected to yield knowledge of gene functions and disease mechanisms. Clustering is a classical exploratory technique for discovering similar expression patterns and function modules. However, gene expression data usually have high dimensionality and relatively few samples, which is the main difficulty in applying clustering algorithms. Principal component analysis (PCA) is usually used to reduce the data dimensions for further clustering analysis. However, PCA estimates the similarity between expression profiles based on the Euclidean distance, which cannot reveal the nonlinear connections between genes. This paper uses nonlinear dimensionality reduction (NDR) as a preprocessing strategy for feature selection and visualization, and then applies clustering algorithms to the reduced feature spaces. To estimate the effectiveness of NDR in capturing biologically relevant structures, a comparative analysis of NDR and PCA is carried out on five real cancer expression datasets. Results show that NDR can perform better than PCA in visualization and clustering analysis of complex gene expression data.
Affiliation(s)
- Jinlong Shi
- School of Computer, National University of Defense Technology, Changsha 410073, Hunan, China.
|
40
|
Spectral embedding based probabilistic boosting tree (ScEPTre): classifying high dimensional heterogeneous biomedical data. ACTA ACUST UNITED AC 2010. [PMID: 20426190 DOI: 10.1007/978-3-642-04271-3_102] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3]
Abstract
The major challenge with classifying high dimensional biomedical data is in identifying the appropriate feature representation to (a) overcome the curse of dimensionality, and (b) facilitate separation between the data classes. Another challenge is to integrate information from two disparate modalities, possibly existing in different dimensional spaces, for improved classification. In this paper, we present a novel data representation, integration and classification scheme, Spectral Embedding based Probabilistic boosting Tree (ScEPTre), which incorporates Spectral Embedding (SE) for data representation and integration and a Probabilistic Boosting Tree classifier for data classification. SE provides an alternate representation of the data by non-linearly transforming high dimensional data into a low dimensional embedding space such that the relative adjacencies between objects are preserved. We demonstrate the utility of ScEPTre to classify and integrate Magnetic Resonance (MR) Spectroscopy (MRS) and Imaging (MRI) data for prostate cancer detection. Area under the receiver operating Curve (AUC) obtained via randomized cross validation on 15 prostate MRI-MRS studies suggests that (a) ScEPTre on MRS significantly outperforms a Haar wavelets based classifier, (b) integration of MRI-MRS via ScEPTre performs significantly better compared to using MRI and MRS alone, and (c) data integration via ScEPTre yields superior classification results compared to combining decisions from individual classifiers (or modalities).
|
41
|
Cevallos-Cevallos JM, Reyes-De-Corcuera JI, Etxeberria E, Danyluk MD, Rodrick GE. Metabolomic analysis in food science: a review. Trends Food Sci Technol 2009. [DOI: 10.1016/j.tifs.2009.07.002] [Citation(s) in RCA: 379] [Impact Index Per Article: 23.7] [Indexed: 12/01/2022]
|
42
|
Tiwari P, Rosen M, Madabhushi A. A hierarchical spectral clustering and nonlinear dimensionality reduction scheme for detection of prostate cancer from magnetic resonance spectroscopy (MRS). Med Phys 2009; 36:3927-39. [PMID: 19810465 DOI: 10.1118/1.3180955] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Indexed: 11/07/2022] Open
Abstract
Magnetic resonance spectroscopy (MRS) has been shown to have great clinical potential as a supplement to magnetic resonance imaging in the detection of prostate cancer (CaP). MRS provides functional information in the form of changes in the relative concentration of specific metabolites including choline, creatine, and citrate which can be used to identify potential areas of CaP. With a view to assisting radiologists in interpretation and analysis of MRS data, some researchers have begun to develop computer-aided detection (CAD) schemes for CaP identification from spectroscopy. Most of these schemes have been centered on identifying and integrating the area under metabolite peaks which is then used to compute relative metabolite ratios. However, manual identification of metabolite peaks on the MR spectra, and especially via CAD, is a challenging problem due to low signal-to-noise ratio, baseline irregularity, peak overlap, and peak distortion. In this article the authors present a novel CAD scheme that integrates nonlinear dimensionality reduction (NLDR) with an unsupervised hierarchical clustering algorithm to automatically identify suspicious regions on the prostate using MRS and hence avoids the need to explicitly identify metabolite peaks. The methodology comprises two stages. In stage 1, a hierarchical spectral clustering algorithm is used to distinguish between extracapsular and prostatic spectra in order to localize the region of interest (ROI) corresponding to the prostate. Once the prostate ROI is localized, in stage 2, a NLDR scheme, in conjunction with a replicated clustering algorithm, is used to automatically discriminate between three classes of spectra (normal appearing, suspicious appearing, and indeterminate). The methodology was quantitatively and qualitatively evaluated on a total of 18 1.5 T in vivo prostate T2-weighted (w) and MRS studies obtained from the multisite, multi-institutional American College of Radiology (ACRIN) trial. 
In the absence of the precise ground truth for CaP extent on the MR imaging for most of the ACRIN studies, probabilistic quantitative metrics were defined based on partial knowledge on the quadrant location and size of the tumor. The scheme, when evaluated against this partial ground truth, was found to have a CaP detection sensitivity of 89.33% and specificity of 79.79%. The results obtained from randomized threefold and fivefold cross validation suggest that the NLDR based clustering scheme has a higher CaP detection accuracy compared to such commonly used MRS analysis schemes as z score and PCA. In addition, the scheme was found to be robust to changes in system parameters. For 6 of the 18 studies an expert radiologist laboriously labeled each of the individual spectra according to a five point scale, with 1/2 representing spectra that the expert considered normal and 3/4/5 being spectra the expert deemed suspicious. When evaluated on these expert annotated datasets, the CAD system yielded an average sensitivity (cluster corresponding to suspicious spectra being identified as the CaP class) and specificity of 81.39% and 64.71%, respectively.
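The sensitivity and specificity figures quoted above follow the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below shows that arithmetic on a toy label set; the function name and the example labels are hypothetical, and it does not reproduce the probabilistic partial-ground-truth metrics the authors defined for the ACRIN studies.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels, where 1 = suspicious."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical per-spectrum labels: 4 suspicious and 4 normal spectra.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```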
Affiliation(s)
- Pallavi Tiwari
- Department of Biomedical Engineering, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854, USA
|
43
|
The use of omic technologies for biomarker development to trace functions of anabolic agents. J Chromatogr A 2009; 1216:8192-9. [DOI: 10.1016/j.chroma.2009.01.094] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.4] [Received: 01/13/2009] [Revised: 01/27/2009] [Accepted: 01/30/2009] [Indexed: 12/25/2022]
|
44
|
Basavanhally AN, Ganesan S, Agner S, Monaco JP, Feldman MD, Tomaszewski JE, Bhanot G, Madabhushi A. Computerized image-based detection and grading of lymphocytic infiltration in HER2+ breast cancer histopathology. IEEE Trans Biomed Eng 2009; 57:642-53. [PMID: 19884074 DOI: 10.1109/tbme.2009.2035305] [Citation(s) in RCA: 189] [Impact Index Per Article: 11.8] [Indexed: 11/07/2022]
Abstract
The identification of phenotypic changes in breast cancer (BC) histopathology on account of corresponding molecular changes is of significant clinical importance in predicting disease outcome. One such example is the presence of lymphocytic infiltration (LI) in histopathology, which has been correlated with nodal metastasis and distant recurrence in HER2+ BC patients. In this paper, we present a computer-aided diagnosis (CADx) scheme to automatically detect and grade the extent of LI in digitized HER2+ BC histopathology. Lymphocytes are first automatically detected by a combination of region growing and Markov random field algorithms. Using the centers of individual detected lymphocytes as vertices, three graphs (Voronoi diagram, Delaunay triangulation, and minimum spanning tree) are constructed and a total of 50 image-derived features describing the arrangement of the lymphocytes are extracted from each sample. A nonlinear dimensionality reduction scheme, graph embedding (GE), is then used to project the high-dimensional feature vector into a reduced 3-D embedding space. A support vector machine classifier is used to discriminate samples with high and low LI in the reduced dimensional embedding space. A total of 41 HER2+ hematoxylin-and-eosin-stained images obtained from 12 patients were considered in this study. For more than 100 three-fold cross-validation trials, the architectural feature set successfully distinguished samples of high and low LI levels with a classification accuracy greater than 90%. The popular unsupervised Varma-Zisserman texton-based classification scheme was used for comparison and yielded a classification accuracy of only 60%. Additionally, the projection of the 50 image-derived features for all 41 tissue samples into a reduced dimensional space via GE allowed for the visualization of a smooth manifold that revealed a continuum between low, intermediate, and high levels of LI. 
Since it is known that extent of LI in BC biopsy specimens is a prognostic indicator, our CADx scheme will potentially help clinicians determine disease outcome and allow them to make better therapy recommendations for patients with HER2+ BC.
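The abstract describes building graphs on the detected lymphocyte centers and extracting arrangement features from them. As a rough illustration of one of the three graphs, the sketch below builds a Euclidean minimum spanning tree with Prim's algorithm and summarizes its edge lengths. The function names, the choice of summary statistics, and the toy coordinates are assumptions for illustration; they are not the 50 features used in the paper.

```python
import math
from statistics import mean, pstdev


def mst_edge_lengths(points):
    """Return the edge lengths of a Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each node to the tree
    best[0] = 0.0           # start the tree at the first point
    lengths = []
    for step in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:
            lengths.append(best[u])
        # Update connection costs for the remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths


def mst_features(points):
    """Summarize MST edge lengths (hypothetical feature set for illustration)."""
    lengths = mst_edge_lengths(points)
    return {
        "mean": mean(lengths),
        "std": pstdev(lengths),
        "min_max_ratio": min(lengths) / max(lengths),
    }


# Hypothetical lymphocyte centers in pixel coordinates.
centers = [(0, 0), (1, 0), (1, 1), (4, 1)]
lengths = mst_edge_lengths(centers)
features = mst_features(centers)
```

Tightly clustered cells give short, uniform edges, while scattered cells give long and variable ones, which is the kind of architectural cue such graph features are meant to capture.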
|
45
|
Algorithm for the Analysis of Tryptophan Fluorescence Spectra and Their Correlation with Protein Structural Parameters. ALGORITHMS 2009. [DOI: 10.3390/a2031155] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Indexed: 01/30/2023]
|
46
|
Consensus-locally linear embedding (C-LLE): application to prostate cancer detection on magnetic resonance spectroscopy. ACTA ACUST UNITED AC 2008; 11:330-8. [PMID: 18982622 DOI: 10.1007/978-3-540-85990-1_40] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 04/24/2023]
Abstract
Locally Linear Embedding (LLE) is a widely used non-linear dimensionality reduction (NLDR) method that projects multi-dimensional data into a low-dimensional embedding space while attempting to preserve object adjacencies from the original high-dimensional feature space. A limitation of LLE, however, is the presence of free parameters whose values can dramatically alter the low-dimensional representation of the data. In this paper, we present a novel Consensus-LLE (C-LLE) scheme which constructs a stable consensus embedding from multiple unstable low-dimensional LLE representations obtained by varying the parameter (kappa) controlling local linearity. The approach is analogous to Breiman's Bagging algorithm for generating ensemble classifiers by combining multiple weak predictors into a single predictor. In this paper we demonstrate the utility of C-LLE in creating a stable low-dimensional representation of Magnetic Resonance Spectroscopy (MRS) data for identifying prostate cancer. Results of quantitative evaluation demonstrate that our C-LLE scheme has higher cancer detection sensitivity (86.90%) and specificity (85.14%) compared to LLE and other state of the art schemes currently employed for analysis of MRS data.
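To make the consensus idea concrete, the sketch below combines several embeddings of the same objects into one pairwise distance matrix. It assumes the individual LLE embeddings (one per value of kappa) have already been computed by some other tool, and it assumes the median as the combination rule, which the abstract does not specify. The function names and toy coordinates are hypothetical. A final step that re-embeds the consensus distances (for example with multidimensional scaling) is omitted.

```python
import math
from statistics import median


def pairwise_distances(embedding):
    """Euclidean distance between every pair of embedded objects."""
    n = len(embedding)
    return [[math.dist(embedding[i], embedding[j]) for j in range(n)]
            for i in range(n)]


def consensus_distances(embeddings):
    """Combine per-embedding distance matrices entry by entry.

    Assumption: the median is used as a robust consensus estimate.
    """
    mats = [pairwise_distances(e) for e in embeddings]
    n = len(mats[0])
    return [[median(m[i][j] for m in mats) for j in range(n)]
            for i in range(n)]


# Three hypothetical 1-D embeddings of the same 3 objects, e.g. from
# three different neighborhood sizes. The third one is an outlier.
embeddings = [
    [(0.0,), (1.0,), (3.0,)],
    [(0.0,), (2.0,), (3.0,)],
    [(0.0,), (1.5,), (9.0,)],
]
consensus = consensus_distances(embeddings)
```

Because the median ignores the outlying third embedding for the distances involving the last object, a single unstable parameter setting has limited influence on the consensus, which mirrors the bagging analogy in the abstract.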
|