1. Carpenter CM, Gillenwater L, Bowler R, Kechris K, Ghosh D. TreeKernel: interpretable kernel machine tests for interactions between -omics and clinical predictors with applications to metabolomics and COPD phenotypes. BMC Bioinformatics 2023; 24:398. PMID: 37880571; PMCID: PMC10601228; DOI: 10.1186/s12859-023-05459-x.
Abstract
BACKGROUND: In this paper, we are interested in interactions between a high-dimensional -omics dataset and clinical covariates. The goal is to evaluate the relationship between a phenotype of interest and a high-dimensional omics pathway, where the effect of the omics data depends on subjects' clinical covariates (age, sex, smoking status, etc.). For instance, metabolic pathways can vary greatly between sexes, which may also change the relationship between certain metabolic pathways and a clinical phenotype of interest. We propose partitioning the clinical covariate space and performing a kernel association test within those partitions. To illustrate this idea, we focus on hierarchical partitions of the clinical covariate space and kernel tests on metabolic pathways.
RESULTS: Our proposed method outperforms competing methods in most simulation scenarios. It can identify different relationships among clinical groups with higher power in most scenarios while maintaining a proper Type I error rate. The simulation studies also show robustness to the grouping structure within the clinical space. We also apply the method to the COPDGene study and find several clinically meaningful interactions between metabolic pathways, the clinical space, and lung function.
CONCLUSION: TreeKernel provides a simple and interpretable process for testing for relationships between high-dimensional omics data and clinical outcomes in the presence of interactions within clinical cohorts. The method is broadly applicable to many studies.
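The core idea, running a kernel association test separately within partitions of the clinical covariate space, can be sketched in a few lines. The sketch below uses a biased HSIC-style statistic with an RBF kernel on simulated data; it is not the authors' exact test (TreeKernel uses hierarchical partitions and a formal kernel machine testing procedure), and all variable names and parameter values are illustrative.

```python
import numpy as np

def rbf_kernel(X, gamma):
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

def hsic(K, L):
    """Biased HSIC estimate: association between two kernel matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
n = 100
sex = rng.integers(0, 2, n)                  # a single clinical partition variable
X = rng.normal(size=(n, 20))                 # simulated metabolite pathway data
# phenotype depends on the pathway only in the sex == 1 group
y = X[:, 0] * sex + 0.1 * rng.normal(size=n)

scores = {}
for g in (0, 1):
    idx = sex == g
    K = rbf_kernel(X[idx], gamma=0.05)       # omics kernel within the partition
    L = np.outer(y[idx], y[idx])             # linear kernel on the phenotype
    scores[g] = hsic(K, L)
print(scores)                                # association shows up only in group 1
```

Testing within partitions rather than pooling is what lets the interaction surface: the pooled statistic would average away a group-specific effect.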
Affiliation(s)
- Charlie M Carpenter: Department of Biostatistics and Informatics, University of Colorado Denver, Anschutz Medical Campus, Denver, CO, USA
- Lucas Gillenwater: Computational Bioscience Program, University of Colorado Denver, Anschutz Medical Campus, Denver, CO, USA
- Russell Bowler: Department of Medicine, National Jewish Health, Denver, USA; University of Colorado Denver, Anschutz Medical Campus, Denver, CO, USA
- Katerina Kechris: Department of Biostatistics and Informatics, University of Colorado Denver, Anschutz Medical Campus, Denver, CO, USA
- Debashis Ghosh: Department of Biostatistics and Informatics, University of Colorado Denver, Anschutz Medical Campus, Denver, CO, USA
2. Briscik M, Dillies MA, Déjean S. Improvement of variables interpretability in kernel PCA. BMC Bioinformatics 2023; 24:282. PMID: 37438763; DOI: 10.1186/s12859-023-05404-y.
Abstract
BACKGROUND: Kernel methods have proven to be a powerful tool for the integration and analysis of data generated by high-throughput technologies. Kernels offer a nonlinear version of any linear algorithm based solely on dot products. The kernelized version of principal component analysis is a valid nonlinear alternative for tackling the nonlinearity of biological sample spaces. This paper proposes a novel methodology to obtain a data-driven feature importance based on the kernel PCA representation of the data.
RESULTS: The proposed method, kernel PCA Interpretable Gradient (KPCA-IG), provides a data-driven feature importance that is computationally fast and based solely on linear algebra calculations. It has been compared with existing methods on three benchmark datasets. The accuracy obtained using KPCA-IG-selected features is equal to or greater than the other methods' average, and its computational complexity demonstrates the high efficiency of the method. An exhaustive literature search was conducted on the genes selected from a publicly available hepatocellular carcinoma dataset to validate the retained features from a biological point of view. The results again confirm the appropriateness of the computed ranking.
CONCLUSIONS: The black-box nature of kernel PCA calls for new methods to interpret the original features. Our proposed methodology KPCA-IG proved to be a valid alternative for selecting influential variables in high-dimensional, high-throughput datasets, potentially unravelling new biological and medical biomarkers.
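The gradient-based idea can be illustrated with plain linear algebra: project onto the leading kernel PCA components of an RBF kernel, then rank input features by the magnitude of the projection's gradient with respect to each feature. This is a simplified sketch of a KPCA gradient importance, not the authors' exact KPCA-IG formulation; the data and parameters are synthetic, and the kernel-centering correction is dropped from the gradient for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 80, 5
# toy data: variation is concentrated on the first two features
X = rng.normal(size=(n, p)) * np.array([3.0, 3.0, 0.1, 0.1, 0.1])

gamma = 0.05

def rbf(X, gamma):
    sq = (X ** 2).sum(axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))

K = rbf(X, gamma)
H = np.eye(n) - 1.0 / n                    # centering matrix
vals, vecs = np.linalg.eigh(H @ K @ H)     # eigh returns ascending eigenvalues
alpha = vecs[:, -2:] / np.sqrt(vals[-2:])  # top-2 kernel PCA components

# gradient of the component score f_j(x) = sum_i alpha_ij k(x, x_i):
# for the RBF kernel, grad_x k(x, x_i) = 2 * gamma * k(x, x_i) * (x_i - x)
importance = np.zeros(p)
for x in X:
    kx = np.exp(-gamma * ((X - x) ** 2).sum(axis=1))
    grad = 2 * gamma * (alpha * kx[:, None]).T @ (X - x)  # shape (2, p)
    importance += np.abs(grad).sum(axis=0)
ranking = np.argsort(-importance)
print(ranking)                             # informative features rank first
```

The ranking falls out of linear algebra on quantities already computed for kernel PCA, which is the computational point the abstract emphasizes.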
Affiliation(s)
- Mitja Briscik: Institut de Mathématiques de Toulouse, UMR5219, CNRS, UPS, Université de Toulouse, Cedex 9, 31062, Toulouse, France
- Marie-Agnès Dillies: Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015, Paris, France
- Sébastien Déjean: Institut de Mathématiques de Toulouse, UMR5219, CNRS, UPS, Université de Toulouse, Cedex 9, 31062, Toulouse, France
3. Binatlı OC, Gönen M. MOKPE: drug-target interaction prediction via manifold optimization based kernel preserving embedding. BMC Bioinformatics 2023; 24:276. PMID: 37407927; DOI: 10.1186/s12859-023-05401-1.
Abstract
BACKGROUND: In many bioinformatics applications, data stem from distinct heterogeneous sources. One well-known example is the identification of drug-target interactions (DTIs), which is of significant importance in drug discovery. In this paper, we propose a novel framework, manifold optimization based kernel preserving embedding (MOKPE), to efficiently solve the problem of modeling heterogeneous data. Our model projects heterogeneous drug and target data into a unified embedding space by simultaneously preserving drug-target interactions and drug-drug and target-target similarities.
RESULTS: We performed ten replications of ten-fold cross-validation on four different drug-target interaction network datasets for predicting DTIs for previously unseen drugs. The classification evaluation metrics showed better or comparable performance relative to previous similarity-based state-of-the-art methods. We also evaluated MOKPE on predicting unknown DTIs of a given network. Our implementation of the proposed algorithm in R, together with the scripts that replicate the reported experiments, is publicly available at https://github.com/ocbinatli/mokpe.
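The objective, projecting drugs and targets into one embedding space so that inner products reproduce both the within-domain similarities and the interaction matrix, can be sketched with plain gradient descent. The paper optimizes over a manifold; this toy uses unconstrained Euclidean updates, and all sizes, similarities, and step sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
nd, nt, r = 20, 15, 4                              # drugs, targets, embedding dim

A = rng.normal(size=(nd, 8)); Sd = A @ A.T / 8     # toy drug-drug similarity
B = rng.normal(size=(nt, 8)); St = B @ B.T / 8     # toy target-target similarity
Y = (rng.random((nd, nt)) < 0.2).astype(float)     # toy interaction matrix

U = 0.1 * rng.normal(size=(nd, r))                 # drug embeddings
V = 0.1 * rng.normal(size=(nt, r))                 # target embeddings

def loss(U, V):
    return (np.sum((U @ U.T - Sd) ** 2) + np.sum((V @ V.T - St) ** 2)
            + np.sum((U @ V.T - Y) ** 2))

loss0 = loss(U, V)
lr = 0.002
for _ in range(2000):
    Ed, Et, Ey = U @ U.T - Sd, V @ V.T - St, U @ V.T - Y
    U -= lr * (4 * Ed @ U + 2 * Ey @ V)            # gradients of the squared errors
    V -= lr * (4 * Et @ V + 2 * Ey.T @ U)
loss1 = loss(U, V)
print(loss0, loss1)                                # the combined objective decreases
```

After fitting, a candidate DTI score for drug i and target j is simply `U[i] @ V[j]`, which is what makes the unified embedding useful for prediction.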
Affiliation(s)
- Oğuz C Binatlı: Graduate School of Sciences and Engineering, Koç University, 34450, Istanbul, Turkey
- Mehmet Gönen: Department of Industrial Engineering, College of Engineering, Koç University, 34450, Istanbul, Turkey; School of Medicine, Koç University, 34450, Istanbul, Turkey
4. De Santis E, Martino A, Rizzi A. On component-wise dissimilarity measures and metric properties in pattern recognition. PeerJ Comput Sci 2022; 8:e1106. PMID: 36262128; PMCID: PMC9575871; DOI: 10.7717/peerj-cs.1106.
Abstract
In many real-world applications of pattern recognition techniques, automatically learning the most appropriate dissimilarity measure for object comparison is of utmost importance. Real-world objects are often complex entities and need a specific representation grounded on a composition of different heterogeneous features, leading to a non-metric starting space where Machine Learning algorithms operate. However, in such unconventional spaces a family of dissimilarity measures can still be exploited: the set of component-wise dissimilarity measures, in which each component is treated with a specific sub-dissimilarity that depends on the nature of the data at hand. These dissimilarities are likely to be non-Euclidean; hence the underlying dissimilarity matrix is not isometrically embeddable in a standard Euclidean space because it may not be structurally rich enough. On the other hand, in many metric learning problems, a component-wise dissimilarity measure can be defined as a weighted linear convex combination whose weights can be suitably learned. This article, after introducing some hints on the relation between distances and the metric learning paradigm, provides a discussion, along with some experiments, on how weights, intended as mathematical operators, interact with the Euclidean behavior of dissimilarity matrices.
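A component-wise dissimilarity with convex weights is straightforward to write down. The sketch below, with invented objects mixing a numeric, a categorical, and a set-valued component, fixes the weights by hand, whereas a metric learning procedure would fit them.

```python
# each object is a tuple: (numeric value, category, set of tags)
def d_num(a, b):
    return abs(a - b)

def d_cat(a, b):
    return float(a != b)                    # simple 0/1 mismatch

def d_set(a, b):
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0   # Jaccard distance

def cw_dissimilarity(x, y, weights, subs=(d_num, d_cat, d_set)):
    """Weighted convex combination of component-wise sub-dissimilarities."""
    assert abs(sum(weights) - 1.0) < 1e-9   # convex combination
    return sum(w * d(xc, yc) for w, d, xc, yc in zip(weights, subs, x, y))

x = (1.5, "red", {"a", "b"})
y = (2.0, "blue", {"b", "c"})
print(cw_dissimilarity(x, y, weights=(0.5, 0.3, 0.2)))
```

Note that even if every sub-dissimilarity is a metric, the weighted combination over heterogeneous components need not embed isometrically in a Euclidean space, which is exactly the tension the article studies.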
Affiliation(s)
- Enrico De Santis: Department of Information Engineering, Electronics and Telecommunications, University of Roma "La Sapienza", Rome, Italy
- Alessio Martino: Department of Business and Management, LUISS University, Rome, Italy
- Antonello Rizzi: Department of Information Engineering, Electronics and Telecommunications, University of Roma "La Sapienza", Rome, Italy
5. Bernal-de-Lázaro JM, Cruz-Corona C, Silva-Neto AJ, Llanes-Santiago O. Criteria for optimizing kernel methods in fault monitoring process: A survey. ISA Trans 2022; 127:259-272. PMID: 34511263; DOI: 10.1016/j.isatra.2021.08.040.
Abstract
How to select the kernel function and how to tune its parameters to ensure high performance in fault diagnosis applications remain two open research issues. This paper provides a comprehensive literature survey of kernel-preprocessing methods in condition monitoring tasks, with emphasis on the procedures for selecting their parameters. Accordingly, twenty kernel optimization criteria and sixteen kernel functions are analyzed. A kernel evaluation framework is further provided to help in the selection and adjustment of kernel functions. The proposal is validated via a KPCA-based monitoring scheme and two well-known benchmark processes.
Affiliation(s)
- José M Bernal-de-Lázaro: Department of Automation and Computing, Universidad Tecnológica de La Habana "José Antonio Echeverría", CUJAE, Cuba
- Carlos Cruz-Corona: Department of Computer Science and Artificial Intelligence, University of Granada, Spain
- Antônio J Silva-Neto: Department of Mechanical Engineering, Universidade do Estado do Rio de Janeiro, IPRJ-UERJ, RJ, Brazil
- Orestes Llanes-Santiago: Department of Automation and Computing, Universidad Tecnológica de La Habana "José Antonio Echeverría", CUJAE, Cuba
6. Matabuena M, Félix P, García-Meixide C, Gude F. Kernel machine learning methods to handle missing responses with complex predictors. Application in modelling five-year glucose changes using distributional representations. Comput Methods Programs Biomed 2022; 221:106905. PMID: 35649295; DOI: 10.1016/j.cmpb.2022.106905.
Abstract
BACKGROUND AND OBJECTIVES: Missing data is a ubiquitous problem in longitudinal studies due to the number of patients lost to follow-up. Kernel methods have enriched the machine learning field by successfully managing non-vectorial predictors, such as graphs, strings, and probability distributions, and have emerged as a promising tool for the analysis of complex data stemming from modern healthcare. This paper proposes a new set of kernel methods to handle missing data in the response variables. These methods are applied to predict long-term changes in glycated haemoglobin (A1c), the primary biomarker used to diagnose and monitor the progression of diabetes mellitus, with emphasis on exploring the predictive potential of continuous glucose monitoring (CGM).
METHODS: We propose a new framework of non-linear kernel methods for testing statistical independence, selecting relevant predictors, and quantifying the uncertainty of the resulting predictive models. As a novelty in the clinical analysis, we used a distributional representation of CGM as a predictor and compared its performance with that of traditional diabetes biomarkers.
RESULTS: After the incorporation of CGM information, predictive ability increases from R² = 0.61 to R² = 0.71. In addition, uncertainty analysis is useful for characterising subpopulations where predictivity worsens and where a more personalised clinical follow-up is advisable according to the expected patient uncertainty in glucose values.
CONCLUSIONS: The proposed methods deal effectively with missing data. They also have the potential to improve the results of predictive tasks by including new complex objects as explanatory variables and modelling arbitrary dependence relations. The application of these methods to a longitudinal study of diabetes showed that the inclusion of a distributional representation of CGM data provides greater sensitivity in predicting five-year A1c changes than classical diabetes biomarkers and traditional CGM metrics.
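The "distributional representation" idea can be sketched concretely: summarize each subject's CGM series by its empirical quantile function and compare subjects with a kernel on those quantile functions. This is a minimal sketch of the representation, not the paper's estimator; the glucose values, quantile grid, and bandwidth are all made up.

```python
import numpy as np

qs = np.linspace(0.05, 0.95, 19)            # quantile grid (arbitrary choice)

def quantile_repr(series):
    """Distributional representation: the empirical quantile function."""
    return np.quantile(series, qs)

def dist_kernel(q1, q2, gamma=0.01):
    """RBF kernel on the L2 distance between two quantile functions."""
    return np.exp(-gamma * np.sum((q1 - q2) ** 2))

rng = np.random.default_rng(3)
a = rng.normal(100, 10, 288)   # subject A: one day of CGM at 5-min sampling
b = rng.normal(100, 10, 288)   # subject B: same glucose distribution as A
c = rng.normal(160, 30, 288)   # subject C: higher, more variable glucose

k_ab = dist_kernel(quantile_repr(a), quantile_repr(b))
k_ac = dist_kernel(quantile_repr(a), quantile_repr(c))
print(k_ab, k_ac)              # similar glucose distributions score higher
```

Unlike scalar CGM summaries (mean glucose, time in range), the quantile function retains the whole distribution, which is the source of the extra predictive sensitivity the abstract reports.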
Affiliation(s)
- Marcos Matabuena: CiTIUS (Centro Singular de Investigación en Tecnoloxías Intelixentes), Universidade de Santiago de Compostela, Santiago de Compostela 15782, Spain
- Paulo Félix: CiTIUS (Centro Singular de Investigación en Tecnoloxías Intelixentes), Universidade de Santiago de Compostela, Santiago de Compostela 15782, Spain
- Francisco Gude: Unidade de Epidemioloxía Clínica, Complexo Hospitalario Universidade de Santiago (CHUS), Travesía da Choupana, Santiago de Compostela 15706, Spain
7. Reinoso-Peláez EL, Gianola D, González-Recio O. Genome-Enabled Prediction Methods Based on Machine Learning. Methods Mol Biol 2022; 2467:189-218. PMID: 35451777; DOI: 10.1007/978-1-0716-2205-6_7.
Abstract
Growth of artificial intelligence and machine learning (ML) methodology has been explosive in recent years. In this class of procedures, algorithms learn from sets of examples and provide forecasts or classifications. Many ML studies have been carried out in genome-wide based prediction (GWP). This chapter describes the main semiparametric and nonparametric algorithms used in GWP in animals and plants. Thirty-four comparative ML studies conducted in the last decade were used in a meta-analysis based on a Thurstonian model to identify the algorithms with the best predictive qualities. Some kernel, Bayesian, and ensemble methods displayed greater robustness and predictive ability. However, the type of study and the data distribution must be considered in order to choose the most appropriate model for a given problem.
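One of the kernel methods routinely compared in such GWP studies is kernel (ridge) regression on a genomic relationship matrix, which with a linear kernel reduces to a GBLUP-style predictor. A minimal sketch on simulated genotypes follows; the marker counts, effect sizes, and ridge penalty are arbitrary, and the fit is shown in-sample for brevity (real GWP studies evaluate by cross-validation).

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 100, 500                                    # individuals, SNP markers
X = rng.integers(0, 3, size=(n, m)).astype(float)  # genotypes coded 0/1/2
beta = np.zeros(m)
beta[:20] = 0.5 * rng.normal(size=20)              # 20 causal markers
y = X @ beta + rng.normal(size=n)                  # phenotype = signal + noise

Xc = X - X.mean(axis=0)
G = Xc @ Xc.T / m                                  # genomic (linear) kernel
lam = 1.0                                          # ridge penalty (arbitrary)
alpha = np.linalg.solve(G + lam * np.eye(n), y - y.mean())
y_hat = y.mean() + G @ alpha                       # fitted genetic values
r = np.corrcoef(y, y_hat)[0, 1]
print(r)
```

Swapping `G` for a nonlinear kernel (e.g. Gaussian) gives the semiparametric RKHS regressions the chapter covers, without changing the solve.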
8. Muller B, Lengellé R. Cross-Gram matrices and their use in transfer learning: Application to automatic REM detection using heart rate. Comput Methods Programs Biomed 2021; 208:106280. PMID: 34333204; DOI: 10.1016/j.cmpb.2021.106280.
Abstract
BACKGROUND AND OBJECTIVES: While traditional sleep staging is achieved through visual, expert-based annotation of a polysomnography, it has the disadvantage of being impractical and expensive. Alternatives have been developed over the years to relieve sleep staging of its heavy requirements, through the collection of more easily assessable signals and automation using machine learning. However, these alternatives have their limitations, some due to variabilities among and within subjects, others inherent to their use of less discriminative signals. Many new solutions rely on evaluating the activation of the Autonomic Nervous System (ANS) through the assessment of the heart rate (HR); the latter is modulated by the aforementioned variabilities, which may result in data and concept shifts between what was learned and what we want to classify. Such adverse effects are usually tackled by transfer learning, which deals with problems where there are differences between what is known (source) and what we want to classify (target). In this paper, we propose two new kernel-based methods of transfer learning and assess their performance in Rapid-Eye-Movement (REM) sleep stage detection using solely the heart rate.
METHODS: Our first contribution is the introduction of Kernel-Cross Alignment (KCA), a measure of similarity between a source and a target, which is a direct extension of Kernel-Target Alignment (KTA). To our knowledge, KCA has not previously been studied in the literature. Our second contribution is two alignment-based methods of transfer learning: Kernel-Target Alignment Transfer Learning (KTATL) and Kernel-Cross Alignment Transfer Learning (KCATL). Both methods differ from KTA, whose traditional use is kernel tuning: in our methods, the kernel is fixed beforehand, and the objective is to improve the estimation of unknown target labels by taking into account how observations relate to each other, which allows knowledge to be transferred (transfer learning).
RESULTS: We compare performance with transfer learning (KCATL, KTATL) to performance without transfer using a fixed classifier (a Support Vector Classifier, SVC). In most cases, both transfer learning methods improve performance (higher detection rates for a fixed false-alarm rate). Our methods do not require iterative computations.
CONCLUSION: We observe improved performance using our transfer methods, which are computationally efficient, as they only require the computation of a kernel matrix and are non-iterative. However, some optimisation aspects are still under investigation.
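The established quantity these methods build on, kernel-target alignment, is easy to compute: the normalized Frobenius inner product between a Gram matrix and the ideal label kernel yy^T. The paper's KCA replaces the within-domain Gram matrix with a cross-Gram matrix between source and target, which is not reproduced here; the sketch below only shows KTA on simulated two-class data, contrasted with alignment under permuted labels.

```python
import numpy as np

def alignment(K1, K2):
    """Empirical alignment <K1, K2>_F / (||K1||_F ||K2||_F)."""
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

def rbf(X, gamma):
    sq = (X ** 2).sum(axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))

rng = np.random.default_rng(5)
y = np.repeat([1.0, -1.0], 30)
X = rng.normal(size=(60, 4))
X[:, 0] += 2.0 * y                           # separate the classes along one axis

K = rbf(X, gamma=0.1)
kta_true = alignment(K, np.outer(y, y))      # alignment with the true labels
y_perm = rng.permutation(y)
kta_perm = alignment(K, np.outer(y_perm, y_perm))
print(kta_true, kta_perm)                    # informative labels align better
```

A kernel that aligns well with the labels supports good classification, which is why alignment is a useful quantity both for kernel tuning (its traditional use) and, as here, for relating source and target observations.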
Affiliation(s)
- Bruno Muller: Institut Charles Delaunay, UTT, 12 rue Marie Curie, CS 42060, 10004 Troyes CEDEX, France; PPRS, 4E avenue du Général de Gaulle, 68000 Colmar, France
- Régis Lengellé: Institut Charles Delaunay, UTT, 12 rue Marie Curie, CS 42060, 10004 Troyes CEDEX, France
9. Tonin F, Patrinos P, Suykens JAK. Unsupervised learning of disentangled representations in deep restricted kernel machines with orthogonality constraints. Neural Netw 2021; 142:661-679. PMID: 34399376; DOI: 10.1016/j.neunet.2021.07.023.
Abstract
We introduce Constr-DRKM, a deep kernel method for the unsupervised learning of disentangled data representations. We propose augmenting the original deep restricted kernel machine formulation for kernel PCA by orthogonality constraints on the latent variables to promote disentanglement and to make it possible to carry out optimization without first defining a stabilized objective. After discussing a number of algorithms for end-to-end training, we quantitatively evaluate the proposed method's effectiveness in disentangled feature learning. We demonstrate on four benchmark datasets that this approach performs similarly overall to β-VAE on several disentanglement metrics when few training points are available while being less sensitive to randomness and hyperparameter selection than β-VAE. We also present a deterministic initialization of Constr-DRKM's training algorithm that significantly improves the reproducibility of the results. Finally, we empirically evaluate and discuss the role of the number of layers in the proposed methodology, examining the influence of each principal component in every layer and showing that components in lower layers act as local feature detectors capturing the broad trends of the data distribution, while components in deeper layers use the representation learned by previous layers and more accurately reproduce higher-level features.
Affiliation(s)
- Francesco Tonin: Department of Electrical Engineering, ESAT-STADIUS, KU Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
- Panagiotis Patrinos: Department of Electrical Engineering, ESAT-STADIUS, KU Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
- Johan A K Suykens: Department of Electrical Engineering, ESAT-STADIUS, KU Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
10. Lauriola I, Aiolli F, Lavelli A, Rinaldi F. Learning adaptive representations for entity recognition in the biomedical domain. J Biomed Semantics 2021; 12:10. PMID: 34001263; PMCID: PMC8127187; DOI: 10.1186/s13326-021-00238-0.
Abstract
Background: Named Entity Recognition is a common task in Natural Language Processing applications, whose purpose is to recognize named entities in textual documents. Several systems exist to solve this task in the biomedical domain, based on Natural Language Processing techniques and Machine Learning algorithms. A crucial step of these applications is the choice of the representation which describes the data. Several representations have been proposed in the literature, some of which are based on strong knowledge of the domain and consist of features manually defined by domain experts. Usually, these representations describe the problem well, but they require a lot of human effort and annotated data. On the other hand, general-purpose representations like word embeddings do not require human domain knowledge, but they can be too general for a specific task.
Results: This paper investigates methods to learn the best representation directly from data, by combining several knowledge-based representations and word embeddings. Two mechanisms have been considered to perform the combination: neural networks and Multiple Kernel Learning. To this end, we use a hybrid architecture for biomedical entity recognition which integrates dictionary look-up (also known as gazetteers) with machine learning techniques. Results on the CRAFT corpus clearly show the benefits of the proposed algorithm in terms of F1 score.
Conclusions: Our experiments show that the principled combination of general, domain-specific, word-, and character-level representations improves the performance of entity recognition. We also discuss the contribution of each representation to the final solution.
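The Multiple Kernel Learning route can be illustrated with a convex combination of Gram matrices built from two hypothetical representations of the same entities. Real MKL learns the combination weights; in this sketch they are fixed by hand, the features are simulated stand-ins for word- and character-level representations, and a simple kernel ridge classifier consumes the combined kernel in-sample.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 80
y = np.repeat([1.0, -1.0], 40)
# two hypothetical representations of the same entities
X1 = rng.normal(size=(n, 10)) + 0.8 * y[:, None]   # e.g. word-level features
X2 = rng.normal(size=(n, 30)) + 0.3 * y[:, None]   # e.g. character-level features

def gram(X):
    """Linear Gram matrix, trace-normalized so the kernels are comparable."""
    K = X @ X.T
    return K / np.trace(K)

mu = (0.6, 0.4)                       # fixed convex weights (MKL would learn these)
K = mu[0] * gram(X1) + mu[1] * gram(X2)

# kernel ridge classifier on the combined kernel (in-sample, for illustration)
alpha = np.linalg.solve(K + 1e-3 * np.eye(n), y)
acc = np.mean(np.sign(K @ alpha) == y)
print(acc)
```

Because kernels from different representations simply add, the combination step is cheap; the modelling work goes into choosing (or learning) the weights, which is what distinguishes MKL from naive feature concatenation.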
Affiliation(s)
- Ivano Lauriola: Department of Mathematics, University of Padova, Via Trieste 63, Padova, 35121, Italy; Fondazione Bruno Kessler, Via Sommarive 18, Trento, 38123, Italy
- Fabio Aiolli: Department of Mathematics, University of Padova, Via Trieste 63, Padova, 35121, Italy
- Alberto Lavelli: Fondazione Bruno Kessler, Via Sommarive 18, Trento, 38123, Italy
- Fabio Rinaldi: Fondazione Bruno Kessler, Via Sommarive 18, Trento, 38123, Italy; Dalle Molle Institute for Artificial Intelligence Research (IDSIA), Via Cantonale 2C, Manno, 6928, Switzerland; Department of Quantitative Biomedicine, University of Zurich, Andreasstrasse 15, Zürich, 8050, Switzerland; SIB, Swiss Institute of Bioinformatics, Génopode, Quartier UNIL-Sorge, bâtiment, Lausanne, 1015, Switzerland
11. Villa A, Mundanad Narayanan A, Van Huffel S, Bertrand A, Varon C. Utility metric for unsupervised feature selection. PeerJ Comput Sci 2021; 7:e477. PMID: 33981839; PMCID: PMC8080425; DOI: 10.7717/peerj-cs.477.
Abstract
Feature selection techniques are very useful approaches for dimensionality reduction in data analysis. They provide interpretable results by reducing the dimensions of the data to a subset of the original set of features. When the data lack annotations, unsupervised feature selectors are required for their analysis. Several algorithms for this aim exist in the literature, but despite their wide applicability, they can be inaccessible or cumbersome to use, mainly due to the need to tune non-intuitive parameters and their high computational demands. In this work, a publicly available, ready-to-use unsupervised feature selector is proposed, with results comparable to the state-of-the-art at a much lower computational cost. The suggested approach belongs to the methods known as spectral feature selectors. These methods generally consist of two stages: manifold learning and subset selection. In the first stage, the underlying structures in the high-dimensional data are extracted, while in the second stage a subset of the features is selected to replicate these structures. This paper makes two contributions to this field, one for each of the stages involved. In the manifold learning stage, the effect of non-linearities in the data is explored using a radial basis function (RBF) kernel, for which an alternative solution for estimating the kernel parameter is presented for cases with high-dimensional data. Additionally, a backwards greedy approach based on the least-squares utility metric is proposed for the subset selection stage. The combination of these new ingredients results in the utility metric for unsupervised feature selection (U2FS) algorithm. The proposed U2FS algorithm succeeds in selecting the correct features in a simulation environment. In addition, the performance of the method on benchmark datasets is comparable to the state-of-the-art, while requiring less computational time. Moreover, unlike the state-of-the-art, U2FS does not require any tuning of parameters.
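The two-stage recipe (RBF-based manifold learning, then least-squares subset selection) can be sketched as follows. For brevity this sketch ranks features by a per-feature least-squares fit to the Fiedler vector instead of the paper's backwards greedy utility-metric procedure, and the bandwidth uses a simple median rule rather than the paper's estimator; the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 60
labels = np.repeat([0, 1], 30)
X = 0.3 * rng.normal(size=(n, 6))
X[:, :2] += 3.0 * labels[:, None]      # structure lives in the first two features

# stage 1, manifold learning: RBF affinity with a median-distance bandwidth
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
W = np.exp(-D2 / np.median(D2))
L = np.diag(W.sum(axis=1)) - W         # unnormalized graph Laplacian
vals, vecs = np.linalg.eigh(L)
e = vecs[:, 1]                         # Fiedler vector encodes the structure

# stage 2, subset selection (simplified): per-feature least-squares fit to e
Xc = X - X.mean(axis=0)
ec = e - e.mean()
score = (Xc.T @ ec) ** 2 / (Xc ** 2).sum(axis=0)
ranking = np.argsort(-score)
print(ranking)                         # structural features rank first
```

The selection stage never sees labels: features are kept because they reproduce the learned embedding, which is what makes the procedure unsupervised.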
Affiliation(s)
- Amalia Villa: STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium; Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Abhijith Mundanad Narayanan: STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium; Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Sabine Van Huffel: STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium; Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Alexander Bertrand: STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium; Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Carolina Varon: STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium; Circuits and Systems (CAS) Group, Delft University of Technology, Delft, The Netherlands; e-Media Research Lab, Campus GroepT, KU Leuven, Leuven, Belgium
12. Lindenbaum O, Salhov M, Yeredor A, Averbuch A. Gaussian bandwidth selection for manifold learning and classification. Data Min Knowl Discov 2020; 34:1676-1712. PMID: 32837252; PMCID: PMC7330274; DOI: 10.1007/s10618-020-00692-x.
Abstract
Kernel methods play a critical role in many machine learning algorithms. They are useful in manifold learning, classification, clustering, and other data analysis tasks. Setting the kernel's scale parameter, also referred to as the kernel's bandwidth, highly affects the performance of the task at hand. We propose to set a scale parameter that is tailored to one of two types of tasks: classification and manifold learning. For manifold learning, we seek a scale that best captures the manifold's intrinsic dimension. For classification, we propose three methods for estimating the scale, which optimize the classification results in different senses. The proposed frameworks are evaluated on artificial and real datasets. The results show a high correlation between optimal classification rates and the estimated scales. Finally, we demonstrate the approach on a seismic event classification task.
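A common baseline that task-tailored estimators like these are compared against is the median heuristic: set the bandwidth to the median pairwise distance. A short sketch follows; the Gaussian kernel parameterisation below is one of several conventions, and the data are synthetic.

```python
import numpy as np

def median_heuristic(X):
    """Bandwidth = median pairwise Euclidean distance between points."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.sqrt(np.median(D2[np.triu_indices(len(X), k=1)]))

rng = np.random.default_rng(8)
X = rng.normal(size=(100, 3))
eps = median_heuristic(X)
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-D2 / (2 * eps ** 2))   # Gaussian kernel at the chosen scale
print(eps)
```

The heuristic guarantees a kernel that is neither saturated (all entries near 1) nor degenerate (all off-diagonal entries near 0), but it is task-agnostic; the paper's point is that scales chosen for the task (intrinsic dimension for manifold learning, classification criteria for classification) do measurably better.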
Affiliation(s)
- Ofir Lindenbaum: School of Electrical Engineering, Tel Aviv University, Tel Aviv, Israel
- Moshe Salhov: School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Arie Yeredor: School of Electrical Engineering, Tel Aviv University, Tel Aviv, Israel
- Amir Averbuch: School of Computer Science, Tel Aviv University, Tel Aviv, Israel
13. Milton P, Coupland H, Giorgi E, Bhatt S. Spatial analysis made easy with linear regression and kernels. Epidemics 2019; 29:100362. PMID: 31561884; DOI: 10.1016/j.epidem.2019.100362.
Abstract
Kernel methods are a popular technique for extending linear models to handle non-linear spatial problems via a mapping to an implicit, high-dimensional feature space. While kernel methods are computationally cheaper than an explicit feature mapping, they are still subject to cubic cost in the number of points. Given only a few thousand locations, this computational cost rapidly outstrips the currently available computational power. This paper provides an overview of kernel methods from first principles (with a focus on ridge regression) and progresses to a review of random Fourier features (RFF), a method that enables the scaling of kernel methods to big datasets. We show how the RFF method approximates the full kernel matrix, providing a significant computational speed-up for a negligible cost in accuracy, and can be incorporated into many existing spatial methods using only a few lines of code. We give an example of the implementation of RFFs on a simulated spatial dataset to illustrate these properties. Lastly, we summarise the main issues with RFFs and highlight some of the advanced techniques aimed at alleviating them. At each stage, the associated R code is provided.
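The RFF construction itself does fit in a few lines. The paper provides R code; the sketch below is an equivalent NumPy version: draw random frequencies from the Gaussian kernel's spectral density and approximate k(x, x') = exp(-γ‖x − x'‖²) by inner products of cosine features, so the n×n Gram matrix is replaced by an n×D feature matrix.

```python
import numpy as np

rng = np.random.default_rng(9)
n, p, D = 200, 3, 2000
X = rng.normal(size=(n, p))
gamma = 0.5                               # target kernel: exp(-gamma * ||x - x'||^2)

# exact Gram matrix: O(n^2) memory and O(n^3) downstream cost
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-gamma * D2)

# random Fourier features: z(x) = sqrt(2/D) * cos(W'x + b), with the rows of W
# drawn from the kernel's spectral density, here N(0, 2*gamma*I)
W = rng.normal(scale=np.sqrt(2 * gamma), size=(p, D))
b = rng.uniform(0, 2 * np.pi, D)
Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)
K_rff = Z @ Z.T                           # low-rank approximation of K

err = np.abs(K - K_rff).max()
print(err)                                # shrinks like 1/sqrt(D)
```

Downstream, any linear method run on `Z` (e.g. ridge regression at O(nD²) cost) behaves like its kernelized counterpart on `K`, which is the speed-up the paper exploits for spatial models.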
Affiliation(s)
- Philip Milton
- MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, Imperial College London, London, UK.
- Helen Coupland
- MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, Imperial College London, London, UK.
- Emanuele Giorgi
- CHICAS, Lancaster Medical School, Lancaster University, Lancaster, UK.
- Samir Bhatt
- MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, Imperial College London, London, UK.
14
Wu G, Zheng R, Tian Y, Liu D. Joint Ranking SVM and Binary Relevance with robust Low-rank learning for multi-label classification. Neural Netw 2019; 122:24-39. [PMID: 31675625; DOI: 10.1016/j.neunet.2019.10.002]
Abstract
Multi-label classification studies the task where each example belongs to multiple labels simultaneously. As a representative method, Ranking Support Vector Machine (Rank-SVM) aims to minimize the Ranking Loss and can also mitigate the negative influence of the class-imbalance issue. However, due to its stacking-style approach to thresholding, it may suffer from error accumulation, which reduces final classification performance. Binary Relevance (BR) is another typical method, which aims to minimize the Hamming Loss and needs only one-step learning. Nevertheless, it can suffer from the class-imbalance issue and does not take label correlations into account. To address these issues, we propose a novel multi-label classification model, which joins Ranking Support Vector Machine and Binary Relevance with robust Low-rank learning (RBRL). RBRL inherits the ranking-loss-minimization advantages of Rank-SVM, and thus overcomes the disadvantages of BR: susceptibility to class imbalance and disregard of label correlations. Meanwhile, it exploits the Hamming-loss-minimization and one-step-learning advantages of BR, and thus avoids Rank-SVM's additional thresholding learning step. In addition, a low-rank constraint is used to further exploit high-order label correlations under the assumption of a low-dimensional label space. Furthermore, to obtain nonlinear multi-label classifiers, we derive the kernelized RBRL. Two accelerated proximal gradient (APG) methods are used to solve the optimization problems efficiently. Extensive comparative experiments with several state-of-the-art methods demonstrate the highly competitive or superior performance of RBRL.
Affiliation(s)
- Guoqiang Wu
- School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China.
- Ruobing Zheng
- Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China.
- Yingjie Tian
- Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100190, China.
- Dalian Liu
- Department of Basic Course Teaching, Beijing Union University, Beijing 100101, China.
15
Abstract
BACKGROUND Advances in medical technology have allowed for customized prognosis, diagnosis, and treatment regimens that utilize multiple heterogeneous data sources. Multiple kernel learning (MKL) is well suited for the integration of multiple high-throughput data sources, yet it remains under-utilized by genomic researchers, partly due to the lack of unified guidelines for its use and of benchmark genomic datasets. RESULTS We provide three implementations of MKL in R. These methods are applied to simulated data to illustrate that MKL can select appropriate models. We also apply MKL to combine clinical information with miRNA gene expression data from an ovarian cancer study into a single analysis. Lastly, we show that MKL can identify gene sets that are known to play a role in prognostic prediction for 15 cancer types using gene expression data from The Cancer Genome Atlas, as well as identify new gene sets for future research. CONCLUSION Multiple kernel learning coupled with modern optimization techniques provides a promising learning tool for building predictive models based on multi-source genomic data. MKL also provides an automated scheme for kernel prioritization and parameter tuning. The methods used in the paper are implemented in an R package called RMKL, which is freely available for download through CRAN at https://CRAN.R-project.org/package=RMKL .
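To make the MKL idea concrete, here is a generic Python sketch (not the RMKL package's API; the alignment-based weighting rule and all names are our assumptions): weight each candidate kernel by its centered alignment with the outcome, then fit kernel ridge regression on the convex combination.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 120
X1 = rng.normal(size=(n, 5))    # e.g. clinical covariates (drive the outcome here)
X2 = rng.normal(size=(n, 40))   # e.g. an expression panel (pure noise here)
y = X1[:, 0] + np.sin(X1[:, 1]) + 0.1 * rng.normal(size=n)

def rbf(X, gamma):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def centered(K):
    # eye(n) - 1/n subtracts 1/n from every entry: the centering matrix H.
    H = np.eye(len(K)) - 1.0 / len(K)
    return H @ K @ H

def alignment(K, y):
    # Centered kernel-target alignment, in [-1, 1].
    Kc, Tc = centered(K), centered(np.outer(y, y))
    return (Kc * Tc).sum() / (np.linalg.norm(Kc) * np.linalg.norm(Tc))

kernels = [rbf(X1, 0.2), rbf(X2, 0.02)]
w = np.array([max(alignment(K, y), 0.0) for K in kernels])
w /= w.sum()                                  # convex combination weights
K = sum(wi * Ki for wi, Ki in zip(w, kernels))

alpha = np.linalg.solve(K + 0.1 * np.eye(n), y)   # kernel ridge fit
pred = K @ alpha
```

In this toy setting the informative source receives the larger weight; full MKL methods instead learn the weights jointly with the predictor.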
Affiliation(s)
- Christopher M. Wilson
- Department of Biostatistics and Bioinformatics at Moffitt Cancer Center, Tampa, FL USA
- Kaiqiao Li
- Department of Applied Mathematics and Statistics at Stony Brook University, Stony Brook, NY USA
- Xiaoqing Yu
- Department of Biostatistics and Bioinformatics at Moffitt Cancer Center, Tampa, FL USA
- Pei-Fen Kuan
- Department of Applied Mathematics and Statistics at Stony Brook University, Stony Brook, NY USA
- Xuefeng Wang
- Department of Biostatistics and Bioinformatics at Moffitt Cancer Center, Tampa, FL USA
16
Teso S, Masera L, Diligenti M, Passerini A. Combining learning and constraints for genome-wide protein annotation. BMC Bioinformatics 2019; 20:338. [PMID: 31208327; PMCID: PMC6580517; DOI: 10.1186/s12859-019-2875-5]
Abstract
Background The advent of high-throughput experimental techniques paved the way for genome-wide computational analysis and predictive annotation studies. When considering the joint annotation of a large set of related entities, such as all proteins of a certain genome, many candidate annotations could be inconsistent, or very unlikely, given the existing knowledge. A sound predictive framework capable of accounting for such constraints could substantially improve the quality of machine-generated annotations at a genomic scale. Results We present Ocelot, a predictive pipeline which simultaneously addresses functional and interaction annotation of all proteins of a given genome. The system combines sequence-based predictors for functional and protein-protein interaction (PPI) prediction with a consistency layer enforcing (soft) constraints as fuzzy logic rules. The enforced rules represent the available prior knowledge about the classification task, including taxonomic constraints over each GO hierarchy (e.g. a protein labeled with a GO term should also be labeled with all ancestor terms) as well as rules combining interaction and function prediction. An extensive experimental evaluation on the yeast genome shows that the integration of prior knowledge via rules substantially improves the quality of the predictions. The system largely outperforms GoFDR, the only high-ranking system at the last CAFA challenge with a readily available implementation, when GoFDR is given access to intra-genome information only (as Ocelot is), and has comparable or better results (depending on the hierarchy and performance measure) when GoFDR is allowed to use information from other genomes. Our system also compares favorably to recent methods based on deep learning.
Affiliation(s)
- Stefano Teso
- Computer Science Department, KULeuven, Celestijnenlaan 200 A bus 2402, Leuven, 3001, Belgium
- Luca Masera
- Department of Information Engineering and Computer Science, University of Trento, Via Sommarive, 5, Povo di Trento, 38123, Italy
- Michelangelo Diligenti
- Department of Information Engineering and Mathematics, University of Siena, San Niccolò, via Roma, 56, Siena, 53100, Italy
- Andrea Passerini
- Department of Information Engineering and Computer Science, University of Trento, Via Sommarive, 5, Povo di Trento, 38123, Italy.
17
Abstract
This paper introduces novel deep architectures using the hybrid neural-kernel core model as the first building block. The proposed models combine a neural-network-based architecture with a kernel-based model enriched with pooling layers. In particular, three kernel blocks with average, maxout, and convolutional pooling layers are introduced and examined. We start with a simple merging layer which averages the output of the previous representation layers. The maxout layer, on the other hand, triggers competition among different representations of the input. Thanks to this pooling layer, not only is the dimensionality of the output of multi-scale representations reduced, but multiple sub-networks are also formed within the same model. In the same context, a pointwise convolutional layer is employed to project the multi-scale representations onto a new space. Experimental results show an improvement over the core deep hybrid model as well as kernel-based models on several real-life datasets.
Affiliation(s)
- Siamak Mehrkanoon
- Department of Data Science and Knowledge Engineering, Maastricht University, The Netherlands.
18
Nascimento ACA, Prudêncio RBC, Costa IG. A Drug-Target Network-Based Supervised Machine Learning Repurposing Method Allowing the Use of Multiple Heterogeneous Information Sources. Methods Mol Biol 2019; 1903:281-289. [PMID: 30547449; DOI: 10.1007/978-1-4939-8955-3_17]
Abstract
Drug-target networks have an important role in pharmaceutical innovation, drug lead discovery, and recent drug repositioning tasks. Many different in silico approaches for the identification of new drug-target interactions have been proposed, many of them based on a particular class of machine learning algorithms called kernel methods. These pattern classification algorithms are able to incorporate previous knowledge in the form of similarity functions, i.e., a kernel, and they have been successful in a wide range of supervised learning problems. The selection of the right kernel function and its respective parameters can have a large influence on the performance of the classifier. Recently, multiple kernel learning algorithms have been introduced to address this problem, enabling one to combine multiple kernels into large drug-target interaction spaces in order to integrate multiple sources of biological information simultaneously. The Kronecker regularized least squares with multiple kernel learning (KronRLS-MKL) is a machine learning algorithm that aims at integrating heterogeneous information sources into a single chemogenomic space to predict new drug-target interactions. This chapter describes how to obtain data from heterogeneous sources and how to implement and use KronRLS-MKL to predict new interactions.
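The Kronecker structure at the heart of KronRLS can be made concrete with a small sketch (ours, not the authors' implementation; all names and sizes are assumptions): the pairwise kernel over (drug, target) pairs is Kd ⊗ Kt, and the regularized least-squares system can be solved from the two small eigendecompositions without ever forming the Kronecker matrix.

```python
import numpy as np

rng = np.random.default_rng(5)
nd, nt, lam = 8, 6, 0.1
Ad = rng.normal(size=(nd, nd))
Kd = Ad @ Ad.T                       # toy PSD drug kernel
At = rng.normal(size=(nt, nt))
Kt = At @ At.T                       # toy PSD target kernel
Y = rng.normal(size=(nd, nt))        # toy interaction labels

# KronRLS solves (Kd kron Kt + lam*I) vec(A) = vec(Y). With the
# eigendecompositions Kd = Vd diag(ld) Vd', Kt = Vt diag(lt) Vt',
# the solution needs only small-matrix products.
ld, Vd = np.linalg.eigh(Kd)
lt, Vt = np.linalg.eigh(Kt)
scale = 1.0 / (np.outer(ld, lt) + lam)
A = Vd @ (scale * (Vd.T @ Y @ Vt)) @ Vt.T

# Sanity check against the explicit (and much larger) Kronecker system.
a_explicit = np.linalg.solve(np.kron(Kd, Kt) + lam * np.eye(nd * nt), Y.ravel())
```

The eigendecomposition route costs O(nd^3 + nt^3) instead of O((nd*nt)^3), which is what makes pairwise chemogenomic prediction tractable at scale.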
Affiliation(s)
- Ivan G Costa
- Institute for Computational Genomics, Centre of Medical Technology (MTZ), RWTH Aachen University Medical School, Aachen, Germany
19
Sanz H, Valim C, Vegas E, Oller JM, Reverter F. SVM-RFE: selection and visualization of the most relevant features through non-linear kernels. BMC Bioinformatics 2018; 19:432. [PMID: 30453885; PMCID: PMC6245920; DOI: 10.1186/s12859-018-2451-4]
Abstract
Background Support vector machines (SVM) are a powerful tool for analyzing data with a number of predictors approximately equal to or larger than the number of observations. Originally, however, application of SVM to biomedical data was limited because SVM was not designed to evaluate the importance of predictor variables. Creating predictive models based on only the most relevant variables is essential in biomedical research. Substantial work has been done to allow assessment of variable importance in SVM models, but it has focused on SVM implemented with linear kernels, whereas the power of SVM as a prediction model comes from the flexibility afforded by non-linear kernels. Moreover, SVM has been extended to model survival outcomes. This paper extends the Recursive Feature Elimination (RFE) algorithm by proposing three approaches to rank variables based on non-linear SVM and SVM for survival analysis. Results The proposed algorithms allow visualization of each RFE iteration and, hence, identification of the most relevant predictors of the response variable. Using simulation studies based on time-to-event outcomes and three real datasets, we evaluate the three methods, based on pseudo-samples and kernel principal component analysis, and compare them with the original SVM-RFE algorithm for non-linear kernels. The three proposed algorithms generally performed better than the gold-standard RFE for non-linear kernels when comparing the truly most relevant variables with the variable ranks produced by each algorithm in simulation studies. Generally, RFE-pseudo-samples outperformed the other three methods, even when variables were assumed to be correlated in all tested scenarios. Conclusions The proposed approaches, particularly RFE-pseudo-samples, can be implemented with accuracy to select variables and assess the direction and strength of associations in analyses of biomedical data using SVM for categorical or time-to-event responses, and they perform better than the classical RFE of Guyon in realistic scenarios for the structure of biomedical data.
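The RFE backbone that this work extends can be sketched in a few lines (a generic illustration using a closed-form ridge scorer in place of an SVM; all names are ours): repeatedly fit, drop the feature with the smallest absolute weight, and read the elimination order as a relevance ranking.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 150, 10
X = rng.normal(size=(n, p))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=n)   # only features 0, 1 matter

def ridge_weights(X, y, lam=1e-2):
    # Closed-form ridge fit; stands in for the SVM weight vector.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

remaining = list(range(p))
elimination_order = []                      # first eliminated = least relevant
while len(remaining) > 1:
    w = ridge_weights(X[:, remaining], y)
    drop = remaining[int(np.argmin(np.abs(w)))]
    elimination_order.append(drop)
    remaining.remove(drop)

ranking = elimination_order + remaining     # last entry = most relevant feature
```

The paper's contribution is precisely the scoring step for non-linear kernels, where no explicit weight vector exists; the pseudo-sample and kernel-PCA approaches replace the |w| criterion used here.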
Affiliation(s)
- Hector Sanz
- Department of Genetics, Microbiology and Statistics, Faculty of Biology, Universitat de Barcelona, Diagonal, 643, 08028, Barcelona, Catalonia, Spain.
- Clarissa Valim
- Department of Osteopathic Medical Specialties, Michigan State University, 909 Fee Road, Room B 309 West Fee Hall, East Lansing, MI, 48824, USA; Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, 675 Huntington Ave, Boston, MA, 02115, USA
- Esteban Vegas
- Department of Genetics, Microbiology and Statistics, Faculty of Biology, Universitat de Barcelona, Diagonal, 643, 08028, Barcelona, Catalonia, Spain
- Josep M Oller
- Department of Genetics, Microbiology and Statistics, Faculty of Biology, Universitat de Barcelona, Diagonal, 643, 08028, Barcelona, Catalonia, Spain
- Ferran Reverter
- Department of Genetics, Microbiology and Statistics, Faculty of Biology, Universitat de Barcelona, Diagonal, 643, 08028, Barcelona, Catalonia, Spain; Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
20
Georga EI, Príncipe JC, Fotiadis DI. Short-term prediction of glucose in type 1 diabetes using kernel adaptive filters. Med Biol Eng Comput 2019; 57:27-46. [PMID: 29967934; DOI: 10.1007/s11517-018-1859-3]
Abstract
This study presents a nonlinear, recursive, multivariate prediction model of the subcutaneous glucose concentration in type 1 diabetes. Nonlinear regression is performed in a reproducing kernel Hilbert space, by either the fixed-budget quantized kernel least mean square (QKLMS-FB) or the approximate linear dependency kernel recursive least-squares (KRLS-ALD) algorithm, such that a sparse model structure is accomplished. A multivariate feature set (i.e., subcutaneous glucose, food carbohydrates, insulin regime and physical activity) is used, and its influence on short-term glucose prediction is investigated. The method is evaluated using data from 15 patients with type 1 diabetes in free-living conditions. When all the input variables are considered: (i) the average root mean squared error (RMSE) of QKLMS-FB increases from 13.1 mg dL-1 (mean absolute percentage error (MAPE) 6.6%) for a 15-min prediction horizon (PH) to 37.7 mg dL-1 (MAPE 20.8%) for a 60-min PH and (ii) the RMSE of KRLS-ALD, being predictably lower, increases from 10.5 mg dL-1 (MAPE 5.2%) for a 15-min PH to 31.8 mg dL-1 (MAPE 18.0%) for a 60-min PH. Multivariate data systematically improve both the regularity and the time lag of the predictions, reducing the errors in critical glucose value regions for a PH ≥ 30 min.
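The quantized kernel LMS family evaluated above admits a compact online sketch (ours, not the study's implementation; all parameter values are illustrative assumptions): each new sample either updates the coefficient of its nearest dictionary center or, if it is far from all centers, spawns a new one, which keeps the model sparse.

```python
import numpy as np

rng = np.random.default_rng(3)

def gauss(x, c, h=0.7):
    # Gaussian kernel between input x and dictionary center c.
    return np.exp(-np.sum((x - c) ** 2) / (2.0 * h * h))

centers, coefs = [], []          # the growing kernel dictionary
eta, eps = 0.5, 0.3              # LMS step size; quantization radius

def predict(x):
    return sum(a * gauss(x, c) for a, c in zip(coefs, centers))

# One online pass over noisy samples of a 1-D target function.
xs = rng.uniform(-3.0, 3.0, size=(400, 1))
ys = np.sin(xs[:, 0]) + 0.05 * rng.normal(size=400)
for x, y in zip(xs, ys):
    err = y - predict(x)
    if centers:
        dists = [float(np.linalg.norm(x - c)) for c in centers]
        j = int(np.argmin(dists))
    if not centers or dists[j] > eps:
        centers.append(x.copy())     # novel region: grow the dictionary
        coefs.append(eta * err)
    else:
        coefs[j] += eta * err        # quantize: update the nearest center

grid = np.linspace(-3.0, 3.0, 50)[:, None]
preds = np.array([predict(x) for x in grid])
rmse = np.sqrt(np.mean((preds - np.sin(grid[:, 0])) ** 2))
```

The quantization radius eps bounds the dictionary size regardless of how many samples stream in, which is what makes these filters practical for continuous glucose monitoring data.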
21
Abstract
Many unsupervised kernel methods rely on the estimation of the kernel covariance operator (kernel CO) or kernel cross-covariance operator (kernel CCO). Both are sensitive to contaminated data, even when bounded positive definite kernels are used. To the best of our knowledge, there are few well-founded robust kernel methods for statistical unsupervised learning. In addition, while the influence function (IF) of an estimator can characterize its robustness, asymptotic properties and standard error, the IF of standard kernel canonical correlation analysis (standard kernel CCA) has not previously been derived. To fill this gap, we first propose a robust kernel covariance operator (robust kernel CO) and a robust kernel cross-covariance operator (robust kernel CCO) based on a generalized loss function instead of the quadratic loss function. Second, we derive the IF for the robust kernel CCO and standard kernel CCA. Using the IF of the standard kernel CCA, we can detect influential observations from two sets of data. Finally, we propose a method based on the robust kernel CO and the robust kernel CCO, called robust kernel CCA, which is less sensitive to noise than the standard kernel CCA. The introduced principles can also be applied to many other kernel methods involving the kernel CO or kernel CCO. Our experiments on both synthetic and imaging genetics data demonstrate that the proposed IF of standard kernel CCA can identify outliers, and that the proposed robust kernel CCA method performs better for both ideal and contaminated data than the standard kernel CCA.
Affiliation(s)
- Md Ashad Alam
- Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118, USA
- Kenji Fukumizu
- The Institute of Statistical Mathematics, Tachikawa, Tokyo 190-8562, Japan
- Yu-Ping Wang
- Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118, USA
22
Tezuka T. Multineuron spike train analysis with R-convolution linear combination kernel. Neural Netw 2018; 102:67-77. [PMID: 29544140; DOI: 10.1016/j.neunet.2018.02.013]
Abstract
A spike train kernel provides an effective way of decoding information represented by a spike train. Some spike train kernels have been extended to multineuron spike trains, which are simultaneously recorded spike trains obtained from multiple neurons. However, most of these multineuron extensions were carried out in a kernel-specific manner. In this paper, a general framework is proposed for extending any single-neuron spike train kernel to multineuron spike trains, based on the R-convolution kernel. Special subclasses of the proposed R-convolution linear combination kernel are explored. These subclasses have a smaller number of parameters and make optimization tractable when the size of data is limited. The proposed kernel was evaluated using Gaussian process regression for multineuron spike trains recorded from an animal brain. It was compared with the sum kernel and the population Spikernel, which are existing ways of decoding multineuron spike trains using kernels. The results showed that the proposed approach performs better than these kernels and also other commonly used neural decoding methods.
Affiliation(s)
- Taro Tezuka
- Center for Artificial Intelligence Research, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Japan; Faculty of Library, Information, and Media Science, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Japan.
23
Zampieri G, Tran DV, Donini M, Navarin N, Aiolli F, Sperduti A, Valle G. Scuba: scalable kernel-based gene prioritization. BMC Bioinformatics 2018; 19:23. [PMID: 29370760; PMCID: PMC5785908; DOI: 10.1186/s12859-018-2025-5]
Abstract
BACKGROUND The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can help to cope with these problems. In particular, kernel-based methods are a powerful resource for the integration of heterogeneous biological knowledge; however, their practical implementation is often precluded by their limited scalability. RESULTS We propose Scuba, a scalable kernel-based method for gene prioritization. It implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimization of the margin distribution. Scuba is optimized to cope with strongly unbalanced settings where known disease genes are few and large-scale predictions are required. Importantly, it is able to deal efficiently both with a large number of candidate genes and with an arbitrary number of data sources. As a direct consequence of scalability, Scuba also integrates a new efficient strategy to select optimal kernel parameters for each data source. We performed cross-validation experiments and simulated a realistic usage setting, showing that Scuba outperforms a wide range of state-of-the-art methods. CONCLUSIONS Scuba achieves state-of-the-art performance and has enhanced scalability compared to existing kernel-based approaches for genomic data. This method can be useful to prioritize candidate genes, particularly when their number is large or when input data are highly heterogeneous. The code is freely available at https://github.com/gzampieri/Scuba .
Affiliation(s)
- Guido Zampieri
- CRIBI Biotechnology Center, University of Padova, viale G. Colombo, 3, Padova, Italy; Department of Women's and Children's Health, University of Padova, via Giustiniani, 3, Padova, Italy
- Dinh Van Tran
- Department of Mathematics, University of Padova, via Trieste, 63, Padova, Italy
- Michele Donini
- Istituto Italiano di Tecnologia, Via Morego, 30, Genoa, Italy
- Nicolò Navarin
- Department of Mathematics, University of Padova, via Trieste, 63, Padova, Italy
- Fabio Aiolli
- Department of Mathematics, University of Padova, via Trieste, 63, Padova, Italy
- Alessandro Sperduti
- Department of Mathematics, University of Padova, via Trieste, 63, Padova, Italy
- Giorgio Valle
- CRIBI Biotechnology Center, University of Padova, viale G. Colombo, 3, Padova, Italy; Department of Biology, University of Padova, viale G. Colombo, 3, Padova, Italy
24
Abstract
Identification of drug-target interactions is a crucial process in drug discovery. In this chapter, we present protocols for recent advancements in machine learning methods for predicting drug-target interactions from heterogeneous biological data in a chemogenomic framework, in which prediction is based on the chemical structure data of drug candidate compounds and translated genomic sequence data of target candidate proteins. Most existing methods are based on either linear modeling or kernel modeling. To illustrate linear modeling, we introduce sparsity-induced binary classifiers and sparse canonical correlation analysis. To illustrate kernel modeling, we introduce pairwise kernel-based support vector machines and kernel-based distance learning. Workflows for using these techniques are presented. We also discuss the characteristics of each method and suggest some directions for future research.
Affiliation(s)
- Yoshihiro Yamanishi
- Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan.
- PRESTO, Japan Science and Technology Agency, Kawaguchi, Saitama, Japan.
25
Gallego V, Luz Calle M, Oller R. Kernel-Based Measure of Variable Importance for Genetic Association Studies. Int J Biostat 2017. [PMID: 28628480; DOI: 10.1515/ijb-2016-0087]
Abstract
The identification of genetic variants that are associated with disease risk is an important goal of genetic association studies. Standard approaches perform univariate analysis, where each genetic variant, usually a Single Nucleotide Polymorphism (SNP), is tested for association with disease status. Though many genetic variants have been identified and validated using this univariate approach, for most complex diseases a large part of the genetic component is still unknown, the so-called missing heritability. We propose a kernel-based measure of variable importance (KVI) that provides the contribution of a SNP, or a group of SNPs, to the joint genetic effect of a set of genetic variants. KVI can be used to rank genetic markers individually, sets of markers that form blocks of linkage disequilibrium, or sets of genetic variants that lie in a gene or a genetic pathway. We prove that, unlike univariate analysis, KVI captures the relationship with other genetic variants in the analysis, even when measured for each genetic variable separately. This is especially relevant and powerful for detecting genetic interactions. We illustrate the results with data from an Alzheimer's disease study, and show through simulations that rankings based on KVI improve on rankings based on two measures of importance provided by the Random Forest and that KVI is very powerful for detecting genetic interactions.
26
Roche-Lima A. Implementation and comparison of kernel-based learning methods to predict metabolic networks. 2016; 5:26. [PMID: 27471658; DOI: 10.1007/s13721-016-0134-5]
Abstract
Metabolic pathways can be conceptualized as the biological equivalent of a data pipeline: in living cells, series of chemical reactions are carried out stepwise by proteins called enzymes. However, many pathways remain incompletely characterized, and in some of them not all enzyme components have been identified. Kernel methods are useful in many difficult problem areas, such as document classification and bioinformatics; in particular, they have recently been used to predict biological networks, such as protein-protein interaction networks and metabolic networks. In this paper, we implement and compare different methods and types of data to predict metabolic networks. The methods are Penalized Kernel Matrix Regression (PKMR) and pairwise Support Vector Machine (pSVM). We develop several experiments using these methods with sequence, non-sequence, and combined data. We obtain better accuracy when sequence data are used in both methods, whereas when the methods are compared on the same type of data, the pSVM approach shows better accuracy. The best results are obtained with pSVM using all kernels combined.
27
Abstract
The appearance of massive data has become increasingly common in contemporary scientific research. When sample size n is huge, classical learning methods become computationally costly for the regression purpose. Recently, the orthogonal greedy algorithm (OGA) has been revitalized as an efficient alternative in the context of kernel-based statistical learning. In a learning problem, accurate and fast prediction is often of interest. This makes an appropriate termination crucial for OGA. In this paper, we propose a new termination rule for OGA via investigating its predictive performance. The proposed rule is conceptually simple and convenient for implementation, which suggests an [Formula: see text] number of essential updates in an OGA process. It therefore provides an appealing route to conduct efficient learning for massive data. With a sample dependent kernel dictionary, we show that the proposed method is strongly consistent with an [Formula: see text] convergence rate to the oracle prediction. The promising performance of the method is supported by both simulation and real data examples.
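The greedy mechanics behind OGA can be illustrated with a small sketch (ours; the termination rule here is a fixed iteration budget rather than the paper's proposed rule, and all names are assumptions): each step selects the kernel dictionary atom most correlated with the current residual, then refits all selected atoms by least squares.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
X = rng.uniform(-2.0, 2.0, size=(n, 1))
y = np.sin(2.0 * X[:, 0]) + 0.1 * rng.normal(size=n)

# Sample-dependent dictionary: one Gaussian kernel atom per sample point,
# normalized to unit norm so correlations are comparable across atoms.
D = np.exp(-((X - X.T) ** 2) / (2.0 * 0.5**2))
D = D / np.linalg.norm(D, axis=0)

selected = []
residual = y.copy()
for _ in range(25):                     # fixed, small number of greedy updates
    scores = np.abs(D.T @ residual)
    scores[selected] = -np.inf          # never reselect an atom
    selected.append(int(np.argmax(scores)))
    A = D[:, selected]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # orthogonal refit step
    residual = y - A @ coef

fit_rmse = np.sqrt(np.mean(residual ** 2))   # small once the signal is captured
```

The point of the paper's termination rule is to choose how many of these updates are enough for near-oracle prediction; the fixed budget above stands in for that choice.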
Affiliation(s)
- Chen Xu
- The Pennsylvania State University
- Runze Li
- The Pennsylvania State University
28
Soguero-Ruiz C, Hindberg K, Mora-Jiménez I, Rojo-Álvarez JL, Skrøvseth SO, Godtliebsen F, Mortensen K, Revhaug A, Lindsetmo RO, Augestad KM, Jenssen R. Predicting colorectal surgical complications using heterogeneous clinical data and kernel methods. J Biomed Inform 2016; 61:87-96. [PMID: 26980235; DOI: 10.1016/j.jbi.2016.03.008]
Abstract
OBJECTIVE In this work, we developed a learning system capable of exploiting information conveyed by longitudinal Electronic Health Records (EHRs) for the prediction of a common postoperative complication, anastomosis leakage (AL), in a data-driven way, by fusing temporal population data from different, heterogeneous sources in the EHRs. MATERIAL AND METHODS We used linear and non-linear kernel methods individually for each data source, and multiple kernel learning for their effective combination. To validate the system, we used data from the EHR of the gastrointestinal department at a university hospital. RESULTS We first investigated the early prediction performance of each data source separately, computing Area Under the Curve values for processed free text (0.83), blood tests (0.74), and vital signs (0.65), respectively. When the heterogeneous data sources were combined using the composite kernel framework, the prediction capability increased considerably (0.92). Finally, posterior probabilities were evaluated for patient risk assessment, as an aid for clinicians to raise alertness at an early stage and act promptly to avoid AL complications. DISCUSSION Machine-learning statistical models built from EHR data can be useful for predicting surgical complications. The combination of EHR-extracted free text, blood sample values, and patient vital signs improves model performance. These results can serve as a framework for preoperative clinical decision support.
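The composite-kernel step — combining one kernel matrix per data source — reduces, in its simplest form, to a non-negative weighted sum of positive semi-definite matrices, which is again a valid kernel. A minimal sketch with fixed illustrative weights (in practice the weights would be learned, e.g. by multiple kernel learning):

```python
import numpy as np

def composite_kernel(kernels, weights=None):
    # Convex combination of per-source kernel matrices; a non-negative
    # weighted sum of PSD kernels is again PSD, hence a valid kernel.
    if weights is None:
        weights = np.ones(len(kernels))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))
```

Here each source (free text, blood tests, vital signs) would contribute its own kernel matrix over the same patient set; the combined matrix is then fed to a single kernel classifier.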
Affiliation(s)
- Cristina Soguero-Ruiz
- Dept. of Signal Theory and Communications, Telematics and Computing, Universidad Rey Juan Carlos, Fuenlabrada, Spain
- Kristian Hindberg
- Dept. of Mathematics and Statistics, University of Tromsø (UiT), Tromsø, Norway
- Inmaculada Mora-Jiménez
- Dept. of Signal Theory and Communications, Telematics and Computing, Universidad Rey Juan Carlos, Fuenlabrada, Spain
- José Luis Rojo-Álvarez
- Dept. of Signal Theory and Communications, Telematics and Computing, Universidad Rey Juan Carlos, Fuenlabrada, Spain
- Stein Olav Skrøvseth
- Norwegian Centre for Integrated Care and Telemedicine, Norway; University Hospital of North Norway (UNN), Norway; IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
- Fred Godtliebsen
- Dept. of Mathematics and Statistics, University of Tromsø (UiT), Tromsø, Norway
- Kim Mortensen
- Dept. of Gastrointestinal Surgery, UNN, Tromsø, Norway; Institute of Clinical Medicine, UiT, Tromsø, Norway
- Arthur Revhaug
- Dept. of Gastrointestinal Surgery, UNN, Tromsø, Norway; Clinic for Surgery, Cancer and Women's Health, UNN, Tromsø, Norway
- Rolv-Ole Lindsetmo
- Dept. of Gastrointestinal Surgery, UNN, Tromsø, Norway; Institute of Clinical Medicine, UiT, Tromsø, Norway
- Knut Magne Augestad
- Norwegian Centre for Integrated Care and Telemedicine, Norway; Dept. of Surgery, Hammerfest Hospital, Norway; Dept. of Colorectal Surgery, University Hospitals Case Medical Center, Cleveland, USA; Institute of Clinical Medicine, UiT, Tromsø, Norway
- Robert Jenssen
- Norwegian Centre for Integrated Care and Telemedicine, Norway; Dept. of Physics and Technology, UiT, Tromsø, Norway
29
Noh YK, Lee DD, Yang KA, Kim C, Zhang BT. Molecular learning with DNA kernel machines. Biosystems 2015; 137:73-83. [PMID: 26163381] [DOI: 10.1016/j.biosystems.2015.06.007] [Citation(s) in RCA: 0] [Received: 04/29/2015] [Revised: 06/23/2015] [Accepted: 06/25/2015]
Abstract
We present a computational learning method for bio-molecular classification, showing how to design biochemical operations for both learning and pattern classification. Unlike prior work, our molecular algorithm learns generic classes with in vitro realization in mind, via a sequence of molecular-biological operations on sets of DNA examples. Specifically, hybridization between DNA molecules is interpreted as computing the inner product between embedded vectors in a corresponding vector space, and our algorithm learns a binary classifier in this vector space. We analyze the thermodynamic behavior of these learning algorithms, present simulations on artificial and real datasets, and demonstrate preliminary wet-lab experimental results using gel electrophoresis.
Affiliation(s)
- Yung-Kyun Noh
- Department of Mechanical and Aerospace Engineering, Seoul National University, Republic of Korea
- Daniel D Lee
- Department of Electrical and Systems Engineering, University of Pennsylvania, USA
- Cheongtag Kim
- Department of Psychology, Seoul National University, Republic of Korea
- Byoung-Tak Zhang
- School of Computer Science and Engineering, Seoul National University, Republic of Korea
30
Fong Y, Datta S, Georgiev IS, Kwong PD, Tomaras GD. Kernel-based logistic regression model for protein sequence without vectorialization. Biostatistics 2014; 16:480-92. [PMID: 25532524] [DOI: 10.1093/biostatistics/kxu056] [Citation(s) in RCA: 3] [Received: 07/24/2014] [Accepted: 11/13/2014]
Abstract
Protein sequence data arise increasingly often in vaccine and infectious disease research. These data are discrete, high-dimensional, and complex. We propose to study the impact of protein sequences on binary outcomes using a kernel-based logistic regression model, which models the effect of the protein sequence through a random effect whose variance-covariance matrix is largely determined by a kernel function. We propose a novel, biologically motivated, profile hidden Markov model (HMM)-based mutual information (MI) kernel. Hypothesis testing can be carried out using the maximum of the score statistics and a parametric bootstrap procedure. To improve the power of the test, we propose intuitive modifications to the test statistic. We show through simulation studies that the profile HMM-based MI kernel can be substantially more powerful than competing kernels, and that the modified test statistics bring incremental gains in power. We use the proposed methods to investigate two problems from HIV-1 vaccine research: (1) identifying segments of the HIV-1 envelope (Env) protein that confer resistance to neutralizing antibodies and (2) identifying segments of Env associated with attenuation of the protective vaccine effect by antibodies of isotype A in the RV144 vaccine trial.
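The kernel-machine score test used in this line of work can be sketched in its simplest form. The snippet below assumes an intercept-only null model and a generic PSD kernel standing in for the profile HMM-based MI kernel, and omits the authors' covariate adjustment and test-statistic modifications:

```python
import numpy as np

def score_statistic(y, K):
    # Kernel-machine score statistic Q = (y - mu0)' K (y - mu0), where
    # mu0 is the fitted mean under the null model (intercept-only here).
    r = y - y.mean()
    return float(r @ K @ r)

def bootstrap_pvalue(y, K, B=200, seed=0):
    # Parametric bootstrap: simulate outcomes under the null model and
    # compare the observed statistic with the simulated null statistics.
    rng = np.random.default_rng(seed)
    q_obs = score_statistic(y, K)
    q_null = np.array([score_statistic(rng.binomial(1, y.mean(), len(y)), K)
                       for _ in range(B)])
    return (1 + np.sum(q_null >= q_obs)) / (B + 1)
```

Since K is positive semi-definite, Q is non-negative, and the bootstrap p-value always lies in (0, 1].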
Affiliation(s)
- Youyi Fong
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98006, USA
- Saheli Datta
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98006, USA
- Ivelin S Georgiev
- Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Peter D Kwong
- Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Georgia D Tomaras
- Duke Human Vaccine Institute, Duke University Medical Center, Durham, NC 27710, USA
31
Abstract
In online learning with kernels, it is vital to control the size (budget) of the support set because of the curse of kernelization. In this paper, we propose two simple and effective stochastic strategies for controlling the budget. Both algorithms have an expected regret that is sublinear in the horizon. Experimental results on a number of benchmark data sets demonstrate encouraging performance in terms of both efficacy and efficiency.
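One of the simplest stochastic budget-maintenance strategies — random removal of a support vector when the budget overflows, in an online kernel perceptron — can be sketched as below. This is a generic illustration of the budget idea, not the specific algorithms proposed in the paper:

```python
import numpy as np

def kernel(x, z, gamma=1.0):
    # RBF kernel between two vectors.
    return np.exp(-gamma * np.sum((x - z) ** 2))

class BudgetKernelPerceptron:
    # Online kernel perceptron with a stochastic budget: when the
    # support set exceeds the budget, a uniformly random support
    # vector is dropped.
    def __init__(self, budget=10, seed=0):
        self.budget, self.sv, self.alpha = budget, [], []
        self.rng = np.random.default_rng(seed)

    def predict(self, x):
        s = sum(a * kernel(v, x) for a, v in zip(self.alpha, self.sv))
        return 1 if s >= 0 else -1

    def update(self, x, y):
        if self.predict(x) != y:            # mistake-driven update
            self.sv.append(x)
            self.alpha.append(y)
            if len(self.sv) > self.budget:  # enforce the budget
                i = self.rng.integers(len(self.sv))
                del self.sv[i]
                del self.alpha[i]
```

The budget caps both memory and per-step prediction cost, which is exactly the "curse of kernelization" the abstract refers to.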
Affiliation(s)
- Wenwu He
- Department of Mathematics and Physics, Fujian University of Technology, Fuzhou, Fujian 350118, China; Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- James T Kwok
- Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
32
Bromuri S, Zufferey D, Hennebert J, Schumacher M. Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms. J Biomed Inform 2014; 51:165-75. [PMID: 24879897] [DOI: 10.1016/j.jbi.2014.05.010] [Citation(s) in RCA: 27] [Received: 10/30/2013] [Revised: 05/16/2014] [Accepted: 05/19/2014]
Abstract
OBJECTIVE This research is motivated by the problem of classifying illnesses of chronically ill patients for decision support in clinical settings. Our main objective is multi-label classification of multivariate time series contained in the medical records of chronically ill patients, by means of quantization methods such as bag of words (BoW) together with multi-label classification algorithms. Our second objective is to compare supervised dimensionality reduction techniques with state-of-the-art multi-label classification algorithms. The hypothesis is that kernel methods and locality preserving projections make such algorithms good candidates for studying multi-label medical time series. METHODS We combine BoW and supervised dimensionality reduction algorithms to perform multi-label classification on the health records of chronically ill patients. The considered algorithms are compared with state-of-the-art multi-label classifiers on two real-world datasets. The Portavita dataset contains 525 type 2 diabetes (DT2) patients, with co-morbidities of DT2 such as hypertension, dyslipidemia, and microvascular or macrovascular issues. The MIMIC II dataset contains 2635 patients affected by thyroid disease, diabetes mellitus, lipoid metabolism disease, fluid electrolyte disease, hypertensive disease, thrombosis, hypotension, chronic obstructive pulmonary disease (COPD), liver disease, and kidney disease. The algorithms are evaluated using multi-label metrics such as Hamming loss, one-error, coverage, ranking loss, and average precision. RESULTS Non-linear dimensionality reduction approaches behave well on medical time series quantized using the BoW algorithm, with results comparable to state-of-the-art multi-label classification algorithms. Chaining the projected features has a positive impact on performance with respect to pure binary relevance approaches.
CONCLUSIONS The evaluation highlights the feasibility of representing medical health records with BoW for multi-label classification tasks. The study also shows that dimensionality reduction algorithms based on kernel methods, locality preserving projections, or both are good candidates for multi-label classification of medical time series with many missing values and high label density.
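The BoW quantization step for a one-dimensional time series can be sketched as follows, assuming a pre-built codebook (in practice learned, e.g. by k-means over training windows; here it is random and purely illustrative):

```python
import numpy as np

def bag_of_words(series, codebook, window=8):
    # Quantise a 1-D time series into a bag-of-words histogram: slide a
    # window over the series, assign each window to its nearest codeword
    # (Euclidean distance), and count codeword occurrences.
    counts = np.zeros(len(codebook))
    for start in range(len(series) - window + 1):
        w = series[start:start + window]
        counts[np.argmin(np.linalg.norm(codebook - w, axis=1))] += 1
    return counts / counts.sum()   # normalised histogram
```

The resulting fixed-length histograms make variable-length medical time series comparable, so standard multi-label classifiers and dimensionality reduction can be applied on top.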
Affiliation(s)
- Stefano Bromuri
- University of Applied Sciences Western Switzerland, Institute of Business Information Systems, TechnoArk 3, CH-3960 Sierre, Switzerland
- Damien Zufferey
- University of Applied Sciences Western Switzerland, Institute of Business Information Systems, TechnoArk 3, CH-3960 Sierre, Switzerland
- Jean Hennebert
- University of Applied Sciences Western Switzerland, Institute of Information and Communication Technologies, Bd de Pérolles 80, CH-1705 Fribourg, Switzerland
- Michael Schumacher
- University of Applied Sciences Western Switzerland, Institute of Business Information Systems, TechnoArk 3, CH-3960 Sierre, Switzerland
33
Abstract
Kernel learning methods, whether Bayesian or frequentist, typically involve multiple levels of inference, with the coefficients of the kernel expansion being determined at the first level and the kernel and regularisation parameters carefully tuned at the second level, a process known as model selection. Model selection for kernel machines is commonly performed via optimisation of a suitable model selection criterion, often based on cross-validation or theoretical performance bounds. However, if there are a large number of kernel parameters, as for instance in the case of automatic relevance determination (ARD), there is a substantial risk of over-fitting the model selection criterion, resulting in poor generalisation performance. In this paper we investigate the possibility of learning the kernel, for the Least-Squares Support Vector Machine (LS-SVM) classifier, at the first level of inference, i.e. parameter optimisation. The kernel parameters and the coefficients of the kernel expansion are jointly optimised at the first level of inference, minimising a training criterion with an additional regularisation term acting on the kernel parameters. The key advantage of this approach is that the values of only two regularisation parameters need be determined in model selection, substantially alleviating the problem of over-fitting the model selection criterion. The benefits of this approach are demonstrated using a suite of synthetic and real-world binary classification benchmark problems, where kernel learning at the first level of inference is shown to be statistically superior to the conventional approach, improves on our previous work (Cawley and Talbot, 2007) and is competitive with Multiple Kernel Learning approaches, but with reduced computational expense.
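For context, plain LS-SVM training (the baseline before any first-level kernel learning) reduces to solving one linear system. The sketch below shows that baseline step only, not the paper's joint optimisation of kernel parameters:

```python
import numpy as np

def lssvm_fit(K, y, gamma=1.0):
    # LS-SVM training reduces to a single linear system:
    #   [ 0   1'          ] [ b     ]   [ 0 ]
    #   [ 1   K + I/gamma ] [ alpha ] = [ y ]
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]            # bias b and dual weights alpha

def lssvm_decision(K_test_train, b, alpha):
    # Decision values for test points given their kernel rows.
    return K_test_train @ alpha + b
```

Because training is a single solve, the kernel parameters can be tuned efficiently in an outer loop, which is what makes the LS-SVM attractive for the kernel-learning scheme the abstract describes.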
Affiliation(s)
- Gavin C Cawley
- School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK
- Nicola L C Talbot
- School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK
34
Wang YC, Deng N, Chen S, Wang Y. Computational Study of Drugs by Integrating Omics Data with Kernel Methods. Mol Inform 2013; 32:930-41. [PMID: 27481139] [DOI: 10.1002/minf.201300090] [Citation(s) in RCA: 10] [Received: 05/12/2013] [Accepted: 11/13/2013]
Abstract
With the rapid development of genomic and chemogenomic techniques, many omics data sources for drugs have become publicly available. These data sources illustrate a drug's biological function in the living cell at different levels and from different aspects. One straightforward idea is to learn understandable rules via computational models and algorithms that mine and integrate these data sources. Here, we review our recent efforts to develop kernel-based methods for integrating drug-related omics data sources. Three promising applications of our framework are shown: predicting drug targets, assigning drugs' ATC-code annotations, and revealing drug repositioning opportunities. We demonstrate that data integration does provide more information and improves accuracy by recovering more experimentally observed target proteins, ATC codes, and repositioned drugs. Importantly, data integration can yield novel predictions that are supported by database searches and functional annotation analysis and are worthy of further experimental validation. In conclusion, kernel methods can efficiently integrate heterogeneous data sources for the computational study of drugs, and will promote further research in drug discovery in a low-cost way.
Affiliation(s)
- Yongcui C Wang
- Key Laboratory of Adaptation and Evolution of Plateau Biota, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, No. 23, Xinning Road, Xining, Qinghai Province, P. R. China
- Naiyang Deng
- College of Science, China Agriculture University, No. 17, Qinghua East Road, Beijing, P. R. China
- Shilong Chen
- Key Laboratory of Adaptation and Evolution of Plateau Biota, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, No. 23, Xinning Road, Xining, Qinghai Province, P. R. China
- Yong Wang
- National Centre for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, No. 55, Zhongguancun East Road, Beijing, P. R. China; Molecular Profiling Research Center for Drug Discovery, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
35
Abstract
In this paper, we propose a recurrent kernel algorithm with selectively sparse updates for online learning. The algorithm introduces a linear recurrent term into the estimation of the current output, making past information reusable for updating the algorithm in the form of a recurrent gradient term. To ensure that reusing this recurrent gradient indeed accelerates convergence, a novel hybrid recurrent training scheme is proposed that switches learning of the recurrent information on or off according to the magnitude of the current training error. Furthermore, the algorithm includes a data-dependent adaptive learning rate that guarantees system weight convergence at each training iteration. The learning rate is set to zero when training would violate the derived convergence conditions, which makes the update process sparse. Theoretical analyses of the weight convergence are presented, and experimental results show the good performance of the proposed algorithm in terms of convergence speed and estimation accuracy.
Affiliation(s)
- Haijin Fan
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Qing Song
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore
36
Abstract
The support vector machine (SVM) methodology has become a popular and well-used component of present chemometric analysis. We assess a relatively recent development of the algorithm, multiple kernel learning (MKL), on published structure-property relationship (SPR) data. The MKL algorithm learns a weighting across multiple kernel-based representations of the data during supervised classifier creation and, thereby, may be used to describe the influence of distinct groups of structural descriptors upon a single structure-property classifier without explicitly omitting any of them. We observe a statistically significant performance improvement over a conventional, single kernel SVM on all three SPR data sets analysed. Furthermore, MKL output is observed to provide useful information regarding the relative influence of five distinct descriptor subsets present in each data set.
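MKL learns the kernel weighting during classifier training; a much simpler stand-in that conveys the idea of weighting descriptor-subset kernels is kernel-target alignment. The sketch below is that heuristic, not the MKL algorithm assessed in the paper:

```python
import numpy as np

def alignment(K, y):
    # Kernel-target alignment: cosine similarity (in the Frobenius
    # inner product) between a kernel matrix and the ideal kernel y y'.
    Y = np.outer(y, y)
    return float((K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y)))

def alignment_weights(kernels, y):
    # Weight each descriptor-subset kernel by its (clipped) alignment
    # with the labels, then normalise the weights to sum to one.
    a = np.array([max(alignment(K, y), 0.0) for K in kernels])
    return a / a.sum()
```

As in the MKL setting, the learned weighting indicates the relative influence of each descriptor subset on the final classifier.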
Affiliation(s)
- Nicholas C V Pilkington
- University of Cambridge Computer Laboratory, 15 JJ Thomson Avenue, Cambridge, CB3 0FD, UK
- Matthew W B Trotter
- Anne McLaren Laboratory for Regenerative Medicine & Department of Surgery, University of Cambridge, UK; Celgene Institute for Translational Research Europe (CITRE), Sevilla, Spain
- Sean B Holden
- University of Cambridge Computer Laboratory, 15 JJ Thomson Avenue, Cambridge, CB3 0FD, UK
37
Li X, Shu L. Kernel Based Nonlinear Dimensionality Reduction and Classification for Genomic Microarray. Sensors (Basel) 2008; 8:4186-4200. [PMID: 27879930] [PMCID: PMC3697169] [DOI: 10.3390/s8074186] [Citation(s) in RCA: 6] [Received: 06/04/2008] [Revised: 06/17/2008] [Accepted: 07/06/2008]
Abstract
Genomic microarrays are powerful research tools in bioinformatics and modern medicinal research because they enable massively parallel assays and simultaneous monitoring of the expression of thousands of genes in biological samples. However, even a simple microarray experiment produces very high-dimensional data, and this vast amount of information challenges researchers to extract the important features and reduce the dimensionality. In this paper, a kernel-based nonlinear dimensionality reduction method built on locally linear embedding (LLE) is proposed, and a fuzzy K-nearest neighbors algorithm that denoises datasets is introduced as a replacement for classical LLE's KNN step. In addition, a kernel-based support vector machine (SVM) is used to classify genomic microarray datasets. We demonstrate the application of these techniques on two published DNA microarray datasets. The experimental results confirm the superiority and high success rates of the presented method.
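The first stage of standard LLE — solving for local reconstruction weights over the k nearest neighbours of each point — can be sketched as below. This is classical LLE with a plain KNN step, not the paper's fuzzy-KNN, kernelised variant:

```python
import numpy as np

def lle_weights(X, k=5, reg=1e-3):
    # First stage of locally linear embedding: for each point, compute
    # the weights that best reconstruct it from its k nearest
    # neighbours, constrained to sum to one.
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        dist = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(dist)[1:k + 1]        # skip the point itself
        Z = X[nbrs] - X[i]                      # centre neighbours on x_i
        G = Z @ Z.T                             # local Gram matrix
        G += reg * np.trace(G) * np.eye(k)      # regularise for stability
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs] = w / w.sum()
    return W
```

The second LLE stage (not shown) embeds the points by taking the bottom eigenvectors of (I − W)ᵀ(I − W); the paper's contribution replaces the hard KNN selection above with a denoising fuzzy variant.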
Affiliation(s)
- Xuehua Li
- School of Applied Mathematics, University of Electronic Science and Technology of China, Chengdu, 610054, P.R. China
- Lan Shu
- School of Applied Mathematics, University of Electronic Science and Technology of China, Chengdu, 610054, P.R. China