1
Manome N, Shinohara S, Takahashi T, Chen Y, Chung UI. Self-incremental learning vector quantization with human cognitive biases. Sci Rep 2021; 11:3910. [PMID: 33594132] [PMCID: PMC7887244] [DOI: 10.1038/s41598-021-83182-4]
Abstract
Human beings have adaptively rational cognitive biases that let them acquire concepts efficiently from small datasets: with such inductive biases, humans can generalize a concept after seeing only a few samples. By incorporating human cognitive biases into learning vector quantization (LVQ), a prototype-based online machine learning method, we developed self-incremental LVQ (SILVQ) methods that are easy to interpret. We first describe a method that automatically adjusts the learning rate by incorporating human cognitive biases. Second, we describe SILVQ, which self-increases the number of prototypes based on this learning-rate adjustment. The performance of the proposed methods is evaluated in experiments on four real and two artificial datasets. Compared with the original LVQ algorithms, our methods not only effectively remove the need for parameter tuning but also achieve higher accuracy when learning from small numbers of instances. For larger numbers of instances, SILVQ still achieves accuracy equal to or better than that of representative existing LVQ algorithms. Furthermore, SILVQ can learn linearly inseparable conceptual structures with the required and sufficient number of prototypes, without overfitting.
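For orientation, the baseline that SILVQ extends is the classic LVQ1 online update: the nearest prototype is attracted toward a correctly labeled sample and repelled from a mislabeled one. The sketch below shows only this baseline with a fixed learning rate; the paper's cognitive-bias learning-rate schedule and prototype self-insertion are not reproduced here, and all names and values are illustrative.

```python
import numpy as np

def lvq1_step(prototypes, labels, x, y, lr=0.05):
    """One LVQ1 update: attract the winner if its label matches y, else repel it."""
    d = np.linalg.norm(prototypes - x, axis=1)   # distances to all prototypes
    w = int(np.argmin(d))                        # winner (nearest prototype)
    sign = 1.0 if labels[w] == y else -1.0       # attract or repel
    prototypes[w] += sign * lr * (x - prototypes[w])
    return w

# toy usage: one prototype per class
protos = np.array([[0.0, 0.0], [1.0, 1.0]])
proto_labels = np.array([0, 1])
winner = lvq1_step(protos, proto_labels, np.array([0.2, 0.1]), 0)
```

SILVQ's contribution, per the abstract, is precisely to remove the hand-tuned `lr` and the fixed prototype budget that this baseline requires.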
Affiliation(s)
- Nobuhito Manome
- Graduate School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan.
- Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan.
- Shuji Shinohara
- Graduate School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
- Tatsuji Takahashi
- Graduate School of Science and Engineering, Tokyo Denki University, Saitama, Japan
- Yu Chen
- Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Ung-Il Chung
- Graduate School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
- Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
2
Can Learning Vector Quantization be an Alternative to SVM and Deep Learning? Recent Trends and Advanced Variants of Learning Vector Quantization for Classification Learning. J Artif Intell Soft Comput Res 2016. [DOI: 10.1515/jaiscr-2017-0005]
Abstract
Learning vector quantization (LVQ), intuitively introduced by Kohonen, is one of the most powerful approaches for prototype-based classification of vector data. The prototype adaptation scheme relies on attraction and repulsion during learning, which gives both the learning process and the classification decision an easy geometric interpretation. Although deep learning architectures and support vector classifiers frequently achieve comparable or even better results, LVQ models are smart alternatives with low complexity and computational cost, making them attractive for many industrial applications like intelligent sensor systems or advanced driver assistance systems.
Nowadays, the mathematical theory developed for LVQ delivers sufficient justification of the algorithm, making it an appealing alternative to approaches like support vector machines and deep learning techniques.
This review article reports current developments and extensions of LVQ, starting from generalized LVQ (GLVQ), which is known as the most powerful cost-function-based realization of the original LVQ. The cost function minimized in GLVQ is a soft approximation of the standard classification error, allowing gradient descent learning techniques. The GLVQ variants considered in this contribution cover many aspects, such as border-sensitive learning, application of non-Euclidean metrics like kernel distances or divergences, relevance learning, and optimization of advanced statistical classification quality measures beyond accuracy, including sensitivity and specificity or the area under the ROC curve.
For each topic, the paper highlights the basic motivation for the variant or extension together with the mathematical prerequisites and the treatment needed for integration into the standard GLVQ scheme, and compares it to other machine learning approaches. For detailed descriptions and the mathematical theory behind them, the reader is referred to the respective original articles.
Thus, the intention of the paper is to provide a comprehensive overview of the state of the art, serving both as a starting point in the search for an appropriate LVQ variant for a given classification problem and as a reference to recently developed variants and improvements of the basic GLVQ scheme.
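To make the GLVQ cost function concrete: with d+ the squared distance to the closest prototype of the correct class and d- the squared distance to the closest prototype of any wrong class, GLVQ descends on a monotone function of mu(x) = (d+ - d-)/(d+ + d-), which is negative exactly when x is classified correctly. A minimal sketch under simplifying assumptions (identity transfer function, plain Euclidean metric, one correct and one incorrect prototype; names are illustrative):

```python
import numpy as np

def glvq_mu(x, w_plus, w_minus):
    """GLVQ classifier function: negative iff x is classified correctly."""
    d_plus = np.sum((x - w_plus) ** 2)    # squared distance to closest correct prototype
    d_minus = np.sum((x - w_minus) ** 2)  # squared distance to closest incorrect prototype
    return (d_plus - d_minus) / (d_plus + d_minus)

def glvq_step(x, w_plus, w_minus, lr=0.1):
    """One stochastic gradient descent step on mu (identity transfer function)."""
    d_plus = np.sum((x - w_plus) ** 2)
    d_minus = np.sum((x - w_minus) ** 2)
    s = (d_plus + d_minus) ** 2
    w_plus += lr * (4 * d_minus / s) * (x - w_plus)    # attract the correct prototype
    w_minus -= lr * (4 * d_plus / s) * (x - w_minus)   # repel the incorrect prototype
```

The factors 4*d_minus/s and 4*d_plus/s come from differentiating mu with respect to the two prototypes; they automatically emphasize samples near the decision border, which is the hook exploited by the border-sensitive variants discussed in the review.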

5
Kaden M, Riedel M, Hermann W, Villmann T. Border-sensitive learning in generalized learning vector quantization: an alternative to support vector machines. Soft Comput 2014. [DOI: 10.1007/s00500-014-1496-1]

7
Fouad S, Tino P. Adaptive Metric Learning Vector Quantization for Ordinal Classification. Neural Comput 2012; 24:2825-51. [DOI: 10.1162/neco_a_00358]
Abstract
Many pattern analysis problems require classification of examples into naturally ordered classes. In such cases, nominal classification schemes will ignore the class order relationships, which can have a detrimental effect on classification accuracy. This article introduces two novel ordinal learning vector quantization (LVQ) schemes, with metric learning, specifically designed for classifying data items into ordered classes. In ordinal LVQ, unlike in nominal LVQ, the class order information is used during training in selecting the class prototypes to be adapted, as well as in determining the exact manner in which the prototypes get updated. Prototype-based models in general are more amenable to interpretations and can often be constructed at a smaller computational cost than alternative nonlinear classification models. Experiments demonstrate that the proposed ordinal LVQ formulations compare favorably with their nominal counterparts. Moreover, our methods achieve competitive performance against existing benchmark ordinal regression models.
Affiliation(s)
- Shereen Fouad
- School of Computer Science, University of Birmingham, Birmingham B15 2TT, U.K
- Peter Tino
- School of Computer Science, University of Birmingham, Birmingham B15 2TT, U.K

8
Huber MB, Bunte K, Nagarajan MB, Biehl M, Ray LA, Wismüller A. Texture feature ranking with relevance learning to classify interstitial lung disease patterns. Artif Intell Med 2012; 56:91-7. [PMID: 23010586] [DOI: 10.1016/j.artmed.2012.07.001]
Abstract
OBJECTIVE: Generalized matrix learning vector quantization (GMLVQ) is used to estimate the relevance of texture features for classifying interstitial lung disease patterns in high-resolution computed tomography images. METHODOLOGY: After stochastic gradient descent, the GMLVQ algorithm provides a discriminative distance measure of relevance factors, which can account for pairwise correlations between different texture features and their importance for the classification of healthy and diseased patterns. 65 texture features were extracted from gray-level co-occurrence matrices (GLCMs). These features were ranked and selected according to their relevance obtained by GMLVQ and, for comparison, according to a mutual information (MI) criterion. The classification performance for different feature subsets was calculated for a k-nearest-neighbor classifier (kNN), a random forests classifier (RanForest), and support vector machines with a linear and a radial basis function kernel (SVMlin and SVMrbf). RESULTS: For all classifiers, feature sets selected by the GMLVQ relevance ranking had significantly better classification performance (p<0.05) than those selected by the MI approach for many texture feature sets. For kNN, RanForest, and SVMrbf, some of these feature subsets also performed significantly better than the set consisting of all features (p<0.05). CONCLUSION: While this approach estimates the relevance of single features, future extensions of GMLVQ should include pairwise correlations in the feature ranking, e.g. to reduce the redundancy of two equally relevant features.
Affiliation(s)
- Markus B Huber
- Departments of Imaging Sciences and Biomedical Engineering, University of Rochester, NY, United States.

9
Kästner M, Hammer B, Biehl M, Villmann T. Functional relevance learning in generalized learning vector quantization. Neurocomputing 2012. [DOI: 10.1016/j.neucom.2011.11.029]

10
Schleif FM, Villmann T, Hammer B, Schneider P. Efficient kernelized prototype based classification. Int J Neural Syst 2012; 21:443-57. [PMID: 22131298] [DOI: 10.1142/s012906571100295x]
Abstract
Prototype-based classifiers are effective algorithms for modeling classification problems and have been applied in multiple domains. While many supervised learning algorithms have been successfully extended with kernels to improve their discrimination power, prototype-based classifiers are typically still used with Euclidean distance measures, and existing kernelized variants are too complex to be applied to larger data sets. Here we propose an extension of kernelized generalized learning vector quantization (KGLVQ) employing a sparsity and approximation technique to reduce the learning complexity. We provide generalization error bounds and experimental results on real-world data, showing that the extended approach is comparable to SVM on different public data sets.
Affiliation(s)
- F-M Schleif
- Department of Techn., Univ. of Bielefeld, Universitätsstrasse 21-23, 33615 Bielefeld, Germany.

11
Bunte K, Schneider P, Hammer B, Schleif FM, Villmann T, Biehl M. Limited Rank Matrix Learning, discriminative dimension reduction and visualization. Neural Netw 2011; 26:159-73. [PMID: 22041220] [DOI: 10.1016/j.neunet.2011.10.001]
Abstract
We present an extension of the recently introduced generalized matrix learning vector quantization algorithm. In the original scheme, adaptive square matrices of relevance factors parameterize a discriminative distance measure. We extend the scheme to matrices of limited rank, corresponding to low-dimensional representations of the data. This makes it possible to incorporate prior knowledge of the intrinsic dimension and to reduce the number of adaptive parameters efficiently. In particular, for very high-dimensional data, limiting the rank can reduce computation time and memory requirements significantly. Furthermore, two- or three-dimensional representations constitute an efficient visualization method for labeled data sets. The identification of a suitable projection is not treated as a pre-processing step but as an integral part of the supervised training. Several real-world data sets serve as an illustration and demonstrate the usefulness of the suggested method.
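The rank limitation can be illustrated with a rectangular transformation matrix: a 2xN matrix Omega both defines the discriminative distance and projects the data into a plane for visualization. In this sketch Omega is a fixed toy matrix, whereas in the method it is adapted during supervised training; the names are illustrative.

```python
import numpy as np

# Limited-rank sketch: a 2 x 3 Omega measures distance in a learned 2-D
# space and doubles as a discriminative visualization. Here Omega is a
# fixed example (rank 2: the third feature is ignored), not a learned one.
omega = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])

def liram_distance(x, w):
    """d(x, w) = ||Omega (x - w)||^2, a rank-limited quadratic distance."""
    diff = omega @ (x - w)
    return float(diff @ diff)

def project(X):
    """Map data to the low-dimensional space, e.g. for a 2-D scatter plot."""
    return X @ omega.T
```

Because the same Omega is used for classification and projection, the resulting 2-D view is discriminative by construction rather than a separate pre-processing step, matching the abstract's point.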
Affiliation(s)
- Kerstin Bunte
- University of Groningen, Johann Bernoulli Institute for Mathematics and Computer Science, The Netherlands.

12
Bunte K, Hammer B, Wismüller A, Biehl M. Adaptive local dissimilarity measures for discriminative dimension reduction of labeled data. Neurocomputing 2010. [DOI: 10.1016/j.neucom.2009.11.017]

13
Schneider P, Biehl M, Hammer B. Hyperparameter learning in probabilistic prototype-based models. Neurocomputing 2010. [DOI: 10.1016/j.neucom.2009.11.021]

14
Marchiori E. Class conditional nearest neighbor for large margin instance selection. IEEE Trans Pattern Anal Mach Intell 2010; 32:364-370. [PMID: 20075464] [DOI: 10.1109/tpami.2009.164]
Abstract
This paper presents a relational framework for studying properties of labeled data points related to proximity and labeling information, in order to improve the performance of the 1NN rule. Specifically, the class conditional nearest neighbor (ccnn) relation over pairs of points in a labeled training set is introduced. For a given class label c, this relation associates to each point a its nearest neighbor computed among only those points with class label c (excluding a itself). A characterization of ccnn in terms of two graphs is given. These graphs are used to define a novel scoring function over instances, by applying an information-theoretic divergence measure to the degree distributions of the graphs. The scoring function is employed to develop an effective large margin instance selection method, which is empirically demonstrated to improve the storage and accuracy performance of the 1NN rule on artificial and real-life data sets.
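The ccnn relation itself is straightforward to state in code. A small sketch (names illustrative) that returns, for a query point a = X[i] and class label c, the index of a's nearest neighbor among the points labeled c, excluding a itself; the paper's two graphs and degree-distribution scores are built on top of this relation:

```python
import numpy as np

def ccnn(X, y, i, c):
    """Class conditional nearest neighbor of X[i] for class c.

    Returns the index of the nearest point among those with label c,
    with X[i] itself excluded from the candidates (assumes at least
    one other point of class c exists).
    """
    mask = (y == c)
    mask[i] = False                              # never match the query point itself
    idx = np.where(mask)[0]                      # candidate indices of class c
    d = np.linalg.norm(X[idx] - X[i], axis=1)    # distances to the candidates
    return int(idx[np.argmin(d)])
```

Note that c need not equal y[i]: evaluating ccnn for the point's own class versus the other classes is exactly what induces the two graphs the paper analyzes.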
Affiliation(s)
- Elena Marchiori
- Institute for Computing and Information Sciences (ICIS), Faculty of Science, Radboud University, Toernooiveld 1, NL 6525 ED Nijmegen, The Netherlands.

15
Schneider P, Biehl M, Hammer B. Adaptive relevance matrices in learning vector quantization. Neural Comput 2009; 21:3532-61. [PMID: 19764875] [DOI: 10.1162/neco.2009.11-08-908]
Abstract
We propose a new matrix learning scheme that extends relevance learning vector quantization (RLVQ), an efficient prototype-based classification algorithm, toward a general adaptive metric. By introducing a full matrix of relevance factors into the distance measure, correlations between different features and their importance for the classification scheme can be taken into account, and general metric adaptation takes place automatically during training. In comparison to the weighted Euclidean metric used in RLVQ and its variations, a full matrix can represent the internal structure of the data more appropriately. Large margin generalization bounds can be transferred to this case, leading to bounds that are independent of the input dimensionality. This also holds for local metrics attached to each prototype, which correspond to piecewise quadratic decision boundaries. The algorithm is tested against alternative learning vector quantization schemes on an artificial data set, a benchmark multiclass problem from the UCI repository, and a problem from bioinformatics, the recognition of splice sites in C. elegans.
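The core of the scheme is the quadratic form d(x, w) = (x - w)^T Lambda (x - w) with Lambda = Omega^T Omega, a parameterization that keeps Lambda positive semi-definite by construction while Omega is adapted by gradient descent. A minimal sketch (Omega is supplied here rather than learned; names are illustrative):

```python
import numpy as np

def gmlvq_distance(x, w, omega):
    """Matrix distance d(x, w) = (x - w)^T Lambda (x - w), Lambda = Omega^T Omega.

    Computed as ||Omega (x - w)||^2, which is equivalent and keeps the
    distance non-negative for any real Omega.
    """
    diff = omega @ (x - w)
    return float(diff @ diff)

def feature_relevances(omega):
    """diag(Lambda): per-feature relevances; off-diagonal entries of Lambda
    capture pairwise feature correlations."""
    return np.diag(omega.T @ omega)
```

Attaching one Omega per prototype instead of a single global one yields the local metrics, and hence the piecewise quadratic decision boundaries, mentioned in the abstract.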
Affiliation(s)
- Petra Schneider
- Institute of Mathematics and Computing Science, University of Groningen, 9700 AK Groningen, The Netherlands.

16
Schneider P, Biehl M, Hammer B. Distance learning in discriminative vector quantization. Neural Comput 2009; 21:2942-69. [PMID: 19635012] [DOI: 10.1162/neco.2009.10-08-892]
Abstract
Discriminative vector quantization schemes such as learning vector quantization (LVQ) and extensions thereof offer efficient and intuitive classifiers based on the representation of classes by prototypes. The original methods, however, rely on the Euclidean distance corresponding to the assumption that the data can be represented by isotropic clusters. For this reason, extensions of the methods to more general metric structures have been proposed, such as relevance adaptation in generalized LVQ (GLVQ) and matrix learning in GLVQ. In these approaches, metric parameters are learned based on the given classification task such that a data-driven distance measure is found. In this letter, we consider full matrix adaptation in advanced LVQ schemes. In particular, we introduce matrix learning to a recent statistical formalization of LVQ, robust soft LVQ, and we compare the results on several artificial and real-life data sets to matrix learning in GLVQ, a derivation of LVQ-like learning based on a (heuristic) cost function. In all cases, matrix adaptation allows a significant improvement of the classification accuracy. Interestingly, however, the principled behavior of the models with respect to prototype locations and extracted matrix dimensions shows several characteristic differences depending on the data sets.
Affiliation(s)
- Petra Schneider
- Institute of Mathematics and Computing Science, University of Groningen, 9700 AK Groningen, The Netherlands.

17
Schleif FM, Villmann T, Kostrzewa M, Hammer B, Gammerman A. Cancer informatics by prototype networks in mass spectrometry. Artif Intell Med 2009; 45:215-28. [DOI: 10.1016/j.artmed.2008.07.018]

18
Mendenhall M, Merenyi E. Relevance-based feature extraction for hyperspectral images. IEEE Trans Neural Netw 2008; 19:658-72. [DOI: 10.1109/tnn.2007.914156]

20
Villmann T, Hammer B, Schleif F, Geweniger T, Herrmann W. Fuzzy classification by fuzzy labeled neural gas. Neural Netw 2006; 19:772-9. [PMID: 16815673] [DOI: 10.1016/j.neunet.2006.05.026]
Abstract
We extend the neural gas for supervised fuzzy classification. In this way we are able to learn crisp as well as fuzzy clustering, given labeled data. Based on the neural gas cost function, we propose three different ways to incorporate the additional class information into the learning algorithm. We demonstrate the effect on the location of the prototypes and the classification accuracy. Further, we show that relevance learning can be easily included.
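A rough sketch of the rank-based adaptation underlying such a scheme: every prototype is updated with a neighborhood weight that decays with its distance rank (the neural gas rule), and each prototype additionally carries a fuzzy label vector adapted toward the sample's fuzzy label. This is a loose illustration under assumed names and parameters, not any of the paper's three variants verbatim.

```python
import numpy as np

def flng_step(protos, proto_labels, x, y_fuzzy, lr=0.05, lam=1.0):
    """One fuzzy-labeled neural gas step (illustrative).

    protos:       (k, d) prototype positions
    proto_labels: (k, c) fuzzy label vector per prototype
    x, y_fuzzy:   one sample and its fuzzy class membership vector
    """
    d = np.linalg.norm(protos - x, axis=1)
    ranks = np.argsort(np.argsort(d))                  # neighborhood rank of each prototype
    h = np.exp(-ranks / lam)[:, None]                  # neural gas neighborhood function
    protos += lr * h * (x - protos)                    # move prototypes toward the sample
    proto_labels += lr * h * (y_fuzzy - proto_labels)  # adapt fuzzy label vectors alongside
```

With crisp one-hot labels the label update converges to crisp prototype labels, so the same sketch covers both the crisp and the fuzzy clustering cases the abstract mentions.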
Affiliation(s)
- Th Villmann
- University Leipzig, Clinic for Psychotherapy, Karl-Tauchnitz-Str. 25, 04107 Leipzig, Germany.

21
Abstract
Learning vector quantization (LVQ) constitutes a powerful and intuitive method for adaptive nearest prototype classification. However, original LVQ has been introduced based on heuristics and numerous modifications exist to achieve better convergence and stability. Recently, a mathematical foundation by means of a cost function has been proposed which, as a limiting case, yields a learning rule similar to classical LVQ2.1. It also motivates a modification which shows better stability. However, the exact dynamics as well as the generalization ability of many LVQ algorithms have not been thoroughly investigated so far. Using concepts from statistical physics and the theory of on-line learning, we present a mathematical framework to analyse the performance of different LVQ algorithms in a typical scenario in terms of their dynamics, sensitivity to initial conditions, and generalization ability. Significant differences in the algorithmic stability and generalization ability can be found already for slightly different variants of LVQ. We study five LVQ algorithms in detail: Kohonen's original LVQ1, unsupervised vector quantization (VQ), a mixture of VQ and LVQ, LVQ2.1, and a variant of LVQ which is based on a cost function. Surprisingly, basic LVQ1 shows very good performance in terms of stability, asymptotic generalization ability, and robustness to initializations and model parameters which, in many cases, is superior to recent alternative proposals.
Affiliation(s)
- Anarta Ghosh
- Rijksuniversiteit Groningen, Mathematics and Computing Science, P.O. Box 800, NL-9700 AV Groningen, The Netherlands.

22
Villmann T, Schleif F, Hammer B. Comparison of relevance learning vector quantization with other metric adaptive classification methods. Neural Netw 2006; 19:610-22. [PMID: 16343848] [DOI: 10.1016/j.neunet.2005.07.013]
Abstract
The paper deals with the concept of relevance learning in learning vector quantization and classification. Recent machine learning approaches that offer metric adaptation but are based on different concepts are compared with variants of relevance learning vector quantization. We compare these methods with respect to their theoretical motivation, and we demonstrate the differences in their behavior on several real-world data sets.
Affiliation(s)
- Th Villmann
- Clinic for Psychotherapy, University Leipzig, Karl-Tauchnitz-Str. 25, 04107 Leipzig, Germany.

23
Biehl M, Ghosh A, Hammer B. Learning vector quantization: The dynamics of winner-takes-all algorithms. Neurocomputing 2006. [DOI: 10.1016/j.neucom.2005.12.007]

24
Strickert M, Seiffert U, Sreenivasulu N, Weschke W, Villmann T, Hammer B. Generalized relevance LVQ (GRLVQ) with correlation measures for gene expression analysis. Neurocomputing 2006. [DOI: 10.1016/j.neucom.2005.12.004]