1. Xia S, Zheng Y, Wang G, He P, Li H, Chen Z. Random Space Division Sampling for Label-Noisy Classification or Imbalanced Classification. IEEE Transactions on Cybernetics 2022; 52:10444-10457. [PMID: 33909577] [DOI: 10.1109/tcyb.2021.3070005]
Abstract
This article presents a simple sampling method for classification, called "random space division sampling" (RSDS), which is very easy to implement and builds on the idea of random space division. It extracts boundary points as the sampled result by efficiently distinguishing label-noise points, inner points, and boundary points. This makes it the first general sampling method for classification that can not only reduce the data size but also enhance the classification accuracy of a classifier, especially in label-noisy classification. Here, "general" means that it is not restricted to any specific classifier or dataset (regardless of whether the dataset is linear). Furthermore, the RSDS can accelerate most classifiers online because its time complexity is lower than that of most classifiers. Moreover, the RSDS can be used as an undersampling method for imbalanced classification. Experimental results on benchmark datasets demonstrate its effectiveness and efficiency. The code of the RSDS and the comparison algorithms is available at: https://github.com/syxiaa/RSDS.
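A rough sketch of the space-division idea in this abstract (an illustrative toy only, not the authors' RSDS implementation: the projection-based division, the purity cutoff, and the vote thresholds are all assumptions):

```python
import numpy as np

def rsds_sketch(X, y, n_trials=20, n_cells=8, seed=0):
    """Toy random-space-division sampling: repeatedly divide the space at
    random, then vote each point as 'boundary' (its cell is label-mixed)
    or 'noise' (it disagrees with an otherwise pure cell)."""
    rng = np.random.default_rng(seed)
    boundary_votes = np.zeros(len(X))
    noise_votes = np.zeros(len(X))
    for _ in range(n_trials):
        # one random division: project onto a random direction, bin by quantiles
        proj = X @ rng.normal(size=X.shape[1])
        edges = np.quantile(proj, np.linspace(0, 1, n_cells + 1)[1:-1])
        cell = np.digitize(proj, edges)
        for c in np.unique(cell):
            idx = np.where(cell == c)[0]
            labels, counts = np.unique(y[idx], return_counts=True)
            if counts.max() / len(idx) < 0.9:    # mixed cell -> boundary region
                boundary_votes[idx] += 1
            else:                                # pure cell: dissenters look noisy
                noise_votes[idx[y[idx] != labels[np.argmax(counts)]]] += 1
    # keep frequent boundary points that were rarely flagged as noise
    return (boundary_votes >= 0.3 * n_trials) & (noise_votes < 0.2 * n_trials)
```

The returned boolean mask is the sampled subset; raising the noise threshold trades recall of boundary points against tolerance of label noise.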
2. Yu Z, Lan K, Liu Z, Han G. Progressive Ensemble Kernel-Based Broad Learning System for Noisy Data Classification. IEEE Transactions on Cybernetics 2022; 52:9656-9669. [PMID: 33784632] [DOI: 10.1109/tcyb.2021.3064821]
Abstract
The broad learning system (BLS) is an algorithm that facilitates feature representation learning and data classification. Although the weights of BLS are obtained by analytical computation, which brings better generalization and higher efficiency, BLS suffers from two drawbacks: 1) its performance depends on the number of hidden nodes, which requires manual tuning, and 2) the double random mappings introduce uncertainty, which leads to poor resistance to noisy data and unpredictable effects on performance. To address these issues, a kernel-based BLS (KBLS) method is proposed that projects the feature nodes obtained from the first random mapping into a kernel space. This manipulation reduces the uncertainty, which contributes to performance improvements with a fixed number of hidden nodes and means that manual tuning is no longer needed. Moreover, to further improve the stability and noise resistance of KBLS, a progressive ensemble framework is proposed, in which the residual of the previous base classifiers is used to train the following base classifier. We conduct comparative experiments against existing state-of-the-art hierarchical learning methods on multiple noisy real-world datasets. The experimental results indicate that our approaches achieve the best, or at least comparable, performance in terms of accuracy.
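The residual-driven progressive ensemble described above can be sketched in a few lines (a toy stand-in: ridge regression on random tanh features substitutes for the kernel-based BLS base learner, and the feature count and regularization constant are assumptions):

```python
import numpy as np

def progressive_residual_ensemble(X, y, n_stages=3, n_feats=16, lam=1e-2, seed=0):
    """Each stage fits a base learner to the residual left by the stages
    before it, so later learners correct earlier mistakes."""
    rng = np.random.default_rng(seed)
    models, residual = [], y.astype(float).copy()
    for _ in range(n_stages):
        # random feature map standing in for BLS's mapped/enhancement nodes
        W = rng.normal(size=(X.shape[1], n_feats))
        H = np.tanh(X @ W)
        beta = np.linalg.solve(H.T @ H + lam * np.eye(n_feats), H.T @ residual)
        models.append((W, beta))
        residual -= H @ beta              # next stage trains on what is left
    def predict(Xq):
        return sum(np.tanh(Xq @ W) @ b for W, b in models)
    return predict
```

Because every stage solves a small ridge system analytically, training stays closed-form, mirroring the "analytical computation" property the abstract attributes to BLS.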
3. Yang Y, Hu Y, Zhang X, Wang S. Two-Stage Selective Ensemble of CNN via Deep Tree Training for Medical Image Classification. IEEE Transactions on Cybernetics 2022; 52:9194-9207. [PMID: 33705343] [DOI: 10.1109/tcyb.2021.3061147]
Abstract
Medical image classification is an important task in computer-aided diagnosis systems. Its performance is critically determined by the descriptiveness and discriminative power of the features extracted from images. With the rapid development of deep learning, deep convolutional neural networks (CNNs) have been widely used to learn optimal high-level features from the raw pixels of images for a given classification task. However, due to the limited amount of labeled medical images with certain quality distortions, such techniques crucially suffer from training difficulties, including overfitting, local optima, and vanishing gradients. To solve these problems, in this article we propose a two-stage selective ensemble of CNN branches via a novel training strategy called deep tree training (DTT). In our approach, DTT jointly trains a series of networks constructed from the hidden layers of a CNN in a hierarchical manner. This mitigates vanishing gradients by supplementing gradients for the hidden layers, and intrinsically yields base classifiers on the middle-level features with minimal computational burden for an ensemble solution. Moreover, the CNN branches serving as base learners are combined into the optimal classifier via the proposed two-stage selective ensemble approach based on both accuracy and diversity criteria. Extensive experiments on the CIFAR-10 benchmark and two specific medical image datasets illustrate that our approach achieves better performance in terms of accuracy, sensitivity, specificity, and F1 score.
4. Xu Y, Yu Z, Cao W, Chen CLP. Adaptive Dense Ensemble Model for Text Classification. IEEE Transactions on Cybernetics 2022; 52:7513-7526. [PMID: 34990374] [DOI: 10.1109/tcyb.2021.3133106]
Abstract
Text classification has been widely explored in natural language processing. In this article, we propose a novel adaptive dense ensemble model (AdaDEM) for text classification, which includes a local ensemble stage (LES) and a global dense ensemble stage (GDES). To strengthen the classification ability and robustness of the enhanced layer, we propose a selective ensemble model based on enhanced attention convolutional neural networks (EnCNNs). To increase the diversity of the ensemble system, these EnCNNs are generated in two ways: 1) from different sample subsets and 2) with different granularity kernels. An evaluation criterion that considers both accuracy and diversity is then proposed in LES to obtain effective integration results. Furthermore, to make better use of information flow, we develop an adaptive dense ensemble structure with multiple enhanced layers in GDES to mitigate the issue that a cascade structure may contain redundant or invalid enhanced layers. We conducted extensive experiments against state-of-the-art methods on multiple real-world datasets, including long and short texts, which verify the effectiveness and generality of our method.
5. Sheng B, Li P, Ali R, Chen CLP. Improving Video Temporal Consistency via Broad Learning System. IEEE Transactions on Cybernetics 2022; 52:6662-6675. [PMID: 34077381] [DOI: 10.1109/tcyb.2021.3079311]
Abstract
Applying image-based processing methods to original videos on a framewise level breaks the temporal consistency between consecutive frames. Traditional video temporal consistency methods reconstruct an original frame containing flickers from corresponding nonflickering frames, but the inaccurate correspondence produced by optical flow restricts their practical use. In this article, we propose a temporally broad learning system (TBLS), an approach that enforces temporal consistency between frames. We establish the TBLS as a flat network whose input data, represented as mapped features in feature nodes, comprise an original frame of the source video, the corresponding frame of the temporally inconsistent video produced by the image-based technique, and the output frame for the previous original frame. Then, we refine the extracted features by enhancing the mapped features as enhancement nodes with randomly generated weights. We then connect all extracted features to the output layer with a target weight vector. With the target weight vector, we can minimize the temporal information loss between consecutive frames and the video fidelity loss in the output videos. Finally, we remove the temporal inconsistency in the processed video and output a temporally consistent video. In addition, we propose an alternative incremental learning algorithm based on the increment of mapped feature nodes, enhancement nodes, or input data to improve learning accuracy by broad expansion. We demonstrate the superiority of our proposed TBLS by conducting extensive experiments.
6. An approach of classifiers fusion based on hierarchical modifications. Appl Intell 2022. [DOI: 10.1007/s10489-021-02777-6]
7. A multi-level consensus function clustering ensemble. Soft Comput 2021. [DOI: 10.1007/s00500-021-06092-7]
8. Mao S, Lin W, Jiao L, Gou S, Chen JW. End-to-End Ensemble Learning by Exploiting the Correlation Between Individuals and Weights. IEEE Transactions on Cybernetics 2021; 51:2835-2846. [PMID: 31425063] [DOI: 10.1109/tcyb.2019.2931071]
Abstract
Ensemble learning performs better than a single classifier in most tasks due to the diversity among multiple classifiers. However, enhancing diversity generally comes at the expense of reducing the accuracies of the individual classifiers; thus, how to balance diversity and accuracy is crucial for improving ensemble performance. In this paper, we propose a new ensemble method that exploits the correlation between individual classifiers and their corresponding weights by constructing a joint optimization model to achieve a tradeoff between diversity and accuracy. Specifically, the proposed framework can be modeled as a shallow network and efficiently trained in an end-to-end manner. In the proposed ensemble method, not only can high overall classification performance be achieved by the weighted classifiers, but each individual classifier can also be updated based on the error of the optimized weighted classifier ensemble. Furthermore, a sparsity constraint is imposed on the weights to enforce that only a subset of the individual classifiers is selected for the final classification. Finally, experimental results on the UCI datasets demonstrate that the proposed method effectively improves classification performance compared with relevant existing ensemble methods.
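The idea of learning sparsity-constrained classifier weights can be illustrated with a proximal-gradient toy (a sketch under squared-error loss, not the paper's joint model; the score matrix `P`, the step size, and the L1 strength are assumptions):

```python
import numpy as np

def fit_sparse_weights(P, y, lam=0.05, lr=0.1, steps=500):
    """Learn one weight per base classifier so the weighted vote matches y,
    with an L1 penalty (proximal step) pushing some weights to exactly zero.
    P[i, j] is classifier j's score for sample i; y is in {-1, +1}."""
    w = np.full(P.shape[1], 1.0 / P.shape[1])
    for _ in range(steps):
        grad = -P.T @ (y - P @ w) / len(y)                      # gradient of 0.5*MSE
        w = w - lr * grad                                       # gradient step
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # L1 prox (soft-threshold)
    return w
```

The soft-threshold step is what makes the selection explicit: classifiers whose weights are driven to zero simply drop out of the final vote.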
9.
10. Alharthi AM, Lee MH, Algamal ZY. Gene selection and classification of microarray gene expression data based on a new adaptive L1-norm elastic net penalty. Informatics in Medicine Unlocked 2021. [DOI: 10.1016/j.imu.2021.100622]
11. Jan ZM, Verma B. Multiple Elimination of Base Classifiers in Ensemble Learning Using Accuracy and Diversity Comparisons. ACM Trans Intell Syst Technol 2020. [DOI: 10.1145/3405790]
Abstract
When generating ensemble classifiers, selecting the best set of classifiers from the base classifier pool is considered a combinatorial problem and an efficient classifier selection methodology must be utilized. Different researchers have used different strategies such as evolutionary algorithms, genetic algorithms, rule-based algorithms, simulated annealing, and so forth to select the best set of classifiers that can maximize overall ensemble classifier accuracy. In this article, we present a novel classifier selection approach to generate an ensemble classifier. The proposed approach selects classifiers in multiple rounds of elimination. In each round, a classifier is given a chance to be selected to become a part of the ensemble, if it can contribute to the overall ensemble accuracy or diversity; otherwise, it is put back into the pool. Each classifier is given multiple opportunities to participate in rounds of selection and they are discarded only if they have no remaining chances. The process is repeated until no classifier in the pool has any chance left to participate in the round of selection. To test the efficacy of the proposed approach, 13 benchmark datasets from the UCI repository are used and results are compared with single classifier models and existing state-of-the-art ensemble classifier approaches. Statistical significance testing is conducted to further validate the results, and an analysis is provided.
Affiliation(s)
- Zohaib Md. Jan, Center for Intelligent Systems, Central Queensland University, Brisbane, Queensland, Australia
- Brijesh Verma, Center for Intelligent Systems, Central Queensland University, Brisbane, Queensland, Australia
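The multi-round elimination procedure described in the abstract can be sketched as follows (a simplified version: a single `ensemble_score` callback stands in for the paper's separate accuracy and diversity comparisons, and the number of chances is an assumption):

```python
from collections import deque

def multi_round_selection(pool, ensemble_score, chances=2):
    """Each candidate is admitted if it improves the ensemble's score;
    otherwise it goes back in the queue and loses one chance. Candidates
    with no chances left are discarded, and selection stops when the
    queue is empty."""
    queue = deque((clf, chances) for clf in pool)
    ensemble = []
    best = ensemble_score(ensemble)
    while queue:
        clf, left = queue.popleft()
        trial = ensemble_score(ensemble + [clf])
        if trial > best:
            ensemble.append(clf)       # contributes: keep it
            best = trial
        elif left > 1:
            queue.append((clf, left - 1))  # put back for a later round
    return ensemble
```

A rejected classifier gets another chance because the ensemble it is compared against grows between rounds, which is exactly why re-testing can succeed later.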
12. Chu F, Liang T, Chen CLP, Wang X, Ma X. Weighted Broad Learning System and Its Application in Nonlinear Industrial Process Modeling. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:3017-3031. [PMID: 31514158] [DOI: 10.1109/tnnls.2019.2935033]
Abstract
The broad learning system (BLS) is a novel neural network with effective and efficient learning ability that has attracted increasing attention from many scholars owing to its excellent performance. This article proposes a weighted BLS (WBLS) to tackle noise and outliers in industrial processes. WBLS provides a unified framework for easily using different methods of calculating the weighted penalty factor. Using the weighted penalty factor to constrain the contribution of each sample to the model, normal and abnormal samples are allocated higher and lower weights to increase and decrease their contributions, respectively. Hence, the WBLS can eliminate the adverse effect of noise and outliers on the modeling. The weighted ridge regression algorithm is used to compute the solution. Weighted incremental learning algorithms are also developed that use the weighted penalty factor to tackle noise and outliers in additional samples and to quickly add nodes or samples without retraining. The proposed weighted incremental learning algorithms provide a unified framework for using different methods of computing the weights. We test the feasibility of the proposed algorithms on public data sets and a real-world application. Experimental results show that our method has better generalization and robustness.
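The weighted ridge regression at the core of this idea has a standard closed form, beta = (XᵀWX + λI)⁻¹XᵀWy; a minimal sketch (the weight values and λ here are illustrative assumptions, not the paper's weighting schemes):

```python
import numpy as np

def weighted_ridge(X, y, weights, lam=1e-2):
    """Weighted ridge regression: down-weighted (suspected outlier) samples
    contribute less to the fit. Solves (X^T W X + lam*I) beta = X^T W y."""
    W = np.diag(weights)
    d = X.shape[1]
    return np.linalg.solve(X.T @ W @ X + lam * np.eye(d), X.T @ W @ y)
```

Giving a near-zero weight to a sample effectively removes it from the normal equations, which is how a weighting scheme can suppress outliers without discarding data outright.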
13. Mahmoudi MR, Akbarzadeh H, Parvin H, Nejatian S, Rezaie V, Alinejad-Rokny H. Consensus function based on cluster-wise two level clustering. Artif Intell Rev 2020. [DOI: 10.1007/s10462-020-09862-1]
14. Shi Y, Yu Z, Chen CLP, You J, Wong HS, Wang Y, Zhang J. Transfer Clustering Ensemble Selection. IEEE Transactions on Cybernetics 2020; 50:2872-2885. [PMID: 30596592] [DOI: 10.1109/tcyb.2018.2885585]
Abstract
Clustering ensemble (CE) takes multiple clustering solutions into consideration in order to effectively improve the accuracy and robustness of the final result. To reduce redundancy as well as noise, a CE selection (CES) step is added to further enhance performance. Quality and diversity are two important metrics of CES. However, most of the CES strategies adopt heuristic selection methods or a threshold parameter setting to achieve tradeoff between quality and diversity. In this paper, we propose a transfer CES (TCES) algorithm which makes use of the relationship between quality and diversity in a source dataset, and transfers it into a target dataset based on three objective functions. Furthermore, a multiobjective self-evolutionary process is designed to optimize these three objective functions. Finally, we construct a transfer CE framework (TCE-TCES) based on TCES to obtain better clustering results. The experimental results on 12 transfer clustering tasks obtained from the 20newsgroups dataset show that TCE-TCES can find a better tradeoff between quality and diversity, as well as obtaining more desirable clustering results.
15.
Abstract
In the field of machine learning, an ensemble approach is often utilized as an effective means of improving on the accuracy of multiple weak base classifiers. A concern associated with these ensemble algorithms is that they can suffer from the Curse of Conflict, where a classifier's true prediction is negated by another classifier's false prediction during the consensus period. Another concern is that the ensemble technique cannot effectively mitigate the problem of Imbalanced Classification, where an ensemble classifier usually presents a similar magnitude of bias toward the same class as its imbalanced base classifiers. We propose an improved ensemble algorithm called "Sieve" that overcomes the aforementioned shortcomings through the novel concept of Global Consensus. The proposed Sieve ensemble approach was benchmarked against various ensemble classifiers trained using different ensemble algorithms with the same base classifiers. The results demonstrate that better accuracy and stability were achieved.
16. Krishnan R, Jagannathan S, Samaranayake VA. Direct Error-Driven Learning for Deep Neural Networks With Applications to Big Data. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:1763-1770. [PMID: 31329564] [DOI: 10.1109/tnnls.2019.2920964]
Abstract
In this brief, heterogeneity and noise in big data are shown to increase the generalization error of a traditional learning regime used for deep neural networks (deep NNs). To reduce this error while overcoming the issue of vanishing gradients, a direct error-driven learning (EDL) scheme is proposed. First, to reduce the impact of heterogeneity and data noise, the concept of a neighborhood is introduced. Using this neighborhood, an approximation of the generalization error is obtained, and an overall error, comprising the learning error and the approximate generalization error, is defined. A novel NN weight-tuning law is obtained through a layer-wise performance measure, enabling the direct use of the overall error for learning. Additional constraints are introduced into the layer-wise performance measure to guide and improve the learning process in the presence of noisy dimensions. The proposed direct EDL scheme effectively addresses heterogeneity and noise while mitigating vanishing gradients and noisy dimensions. A comprehensive simulation study is presented in which the proposed approach is shown to mitigate the vanishing-gradient problem while improving generalization by 6%.
17.
18. Wang Z, Cao C. Cascade interpolation learning with double subspaces and confidence disturbance for imbalanced problems. Neural Netw 2019; 118:17-31. [DOI: 10.1016/j.neunet.2019.06.003]
19. Wang Y, Li X, Ruiz R. Weighted General Group Lasso for Gene Selection in Cancer Classification. IEEE Transactions on Cybernetics 2019; 49:2860-2873. [PMID: 29993764] [DOI: 10.1109/tcyb.2018.2829811]
Abstract
Relevant gene selection is crucial for analyzing cancer gene expression datasets including two types of tumors in cancer classification. Intrinsic interactions among selected genes cannot be fully identified by most existing gene selection methods. In this paper, we propose a weighted general group lasso (WGGL) model to select cancer genes in groups. A gene grouping heuristic method is presented based on weighted gene co-expression network analysis. To determine the importance of genes and groups, a method for calculating gene and group weights is presented in terms of joint mutual information. To implement the complex calculation process of WGGL, a gene selection algorithm is developed. Experimental results on both random and three cancer gene expression datasets demonstrate that the proposed model achieves better classification performance than two existing state-of-the-art gene selection methods.
20. Nguyen TT, Dang MT, Liew AW, Bezdek JC. A weighted multiple classifier framework based on random projection. Inf Sci 2019. [DOI: 10.1016/j.ins.2019.03.067]
21. Nguyen TT, Pham XC, Liew AWC, Pedrycz W. Aggregation of Classifiers: A Justifiable Information Granularity Approach. IEEE Transactions on Cybernetics 2019; 49:2168-2177. [PMID: 29993920] [DOI: 10.1109/tcyb.2018.2821679]
Abstract
In this paper, we introduce a new approach for combining multiple classifiers in a heterogeneous ensemble system. Instead of combining numerical membership values, we construct interval membership values for each class prediction from the meta-data of an observation by using the concept of an information granule. In the proposed method, the uncertainty (diversity) of the predictions produced by the base classifiers is quantified by interval-based information granules. The decision model is then generated by considering both the bounds and the length of the intervals. Extensive experimentation using the UCI datasets has demonstrated the superior performance of our algorithm over other algorithms, including six fixed combining methods, one trainable combining method, AdaBoost, bagging, and random subspace.
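The interval-valued aggregation idea can be caricatured in a few lines (a toy sketch: the midpoint-minus-width score is an assumed stand-in for the paper's decision model over interval bounds and lengths):

```python
def interval_aggregate(probas):
    """For each class, form an information granule [min, max] over the base
    classifiers' membership values, then rank classes by the interval
    midpoint penalized by the interval width (wide interval = uncertain)."""
    n_classes = len(probas[0])
    scores = []
    for c in range(n_classes):
        vals = [p[c] for p in probas]
        lo, hi = min(vals), max(vals)
        scores.append((lo + hi) / 2 - 0.5 * (hi - lo))
    return max(range(n_classes), key=scores.__getitem__)
```

Note how a class on which the base classifiers disagree wildly can lose to a class with lower but more consistent support; that is the behavior an interval (rather than averaged) representation buys.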
22. Yang Y, Jiang J. Adaptive Bi-Weighting Toward Automatic Initialization and Model Selection for HMM-Based Hybrid Meta-Clustering Ensembles. IEEE Transactions on Cybernetics 2019; 49:1657-1668. [PMID: 29994293] [DOI: 10.1109/tcyb.2018.2809562]
Abstract
Temporal data clustering can provide underpinning techniques for the discovery of intrinsic structures, which has proved important for condensing or summarizing information in various fields of information science, ranging from time series analysis to sequential data understanding. In this paper, we propose a novel hidden Markov model (HMM)-based hybrid meta-clustering ensemble with a bi-weighting scheme to solve the initialization and model-selection problems associated with temporal data clustering. To improve the performance of the ensemble techniques, the proposed bi-weighting scheme adaptively examines the partition process and hence optimizes the fusion of consensus functions. Specifically, three consensus functions are used to combine the input partitions, generated by HMM-based K-models under different initializations, into a robust consensus partition. An optimal consensus partition is then selected from the three candidates by a normalized mutual information-based objective function. Finally, the optimal consensus partition is further refined by the HMM-based agglomerative clustering algorithm in association with a dendrogram-based similarity partitioning algorithm, with the advantage that the number of clusters can be automatically and adaptively determined. Extensive experiments on synthetic data, time series, and real-world motion trajectory datasets illustrate that our proposed approach outperforms all the selected benchmarks, providing promising potential for developing improved clustering tools for information analysis and management.
23. Yu Z, Wang D, Zhao Z, Chen CLP, You J, Wong HS, Zhang J. Hybrid Incremental Ensemble Learning for Noisy Real-World Data Classification. IEEE Transactions on Cybernetics 2019; 49:403-416. [PMID: 29990215] [DOI: 10.1109/tcyb.2017.2774266]
Abstract
Traditional ensemble learning approaches explore either the feature space or the sample space, which prevents them from constructing more powerful learning models for noisy real-world dataset classification. The random subspace method searches only over the selection of features, while the bagging approach searches only over the selection of samples. To overcome these limitations, we propose the hybrid incremental ensemble learning (HIEL) approach, which takes the feature space and the sample space into consideration simultaneously to handle noisy datasets. Specifically, HIEL first adopts the bagging technique and linear discriminant analysis to remove noisy attributes, and generates a set of bootstraps and the corresponding ensemble members in the subspaces. Then, classifiers are selected incrementally based on a classifier-specific criterion function and an ensemble criterion function, and the corresponding classifier weights are assigned during the same process. Finally, the final label is determined by a weighted voting scheme, which serves as the result of the classification. We also explore various classifier-specific criterion functions based on different newly proposed similarity measures, which alleviate the effect of noisy samples on the distance functions. In addition, the computational cost of HIEL is analyzed theoretically. A set of nonparametric tests is adopted to compare HIEL and other algorithms over several datasets. The experimental results show that HIEL performs well on noisy datasets, outperforming most of the compared classifier ensemble methods on 14 out of 24 noisy real-world UCI and KEEL datasets.
24. Nguyen TT, Nguyen MP, Pham XC, Liew AWC, Pedrycz W. Combining heterogeneous classifiers via granular prototypes. Appl Soft Comput 2018. [DOI: 10.1016/j.asoc.2018.09.021]
25. Li J, Dong W, Meng D. Grouped Gene Selection of Cancer via Adaptive Sparse Group Lasso Based on Conditional Mutual Information. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:2028-2038. [PMID: 29028206] [DOI: 10.1109/tcbb.2017.2761871]
Abstract
This paper deals with the problems of cancer classification and grouped gene selection. A weighted gene co-expression network on cancer microarray data is employed to identify modules corresponding to biological pathways, based on which a strategy for dividing genes into groups is presented. Using the conditional mutual information within each group, an integrated criterion is proposed and data-driven weights are constructed; these are shown to evaluate both the significance of an individual gene and its influence in improving the correlation of all other pairwise genes in its group. Furthermore, an adaptive sparse group lasso is proposed, for which an improved blockwise descent algorithm is developed. Results on four cancer data sets demonstrate that the proposed adaptive sparse group lasso can effectively perform classification and grouped gene selection.
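Blockwise descent for a sparse group lasso is built around a two-stage proximal step, elementwise soft-thresholding followed by groupwise shrinkage; a sketch (uniform penalties here stand in for the paper's adaptive, data-driven weights):

```python
import numpy as np

def group_soft_threshold(beta, groups, lam_group, lam_l1):
    """Sparse-group-lasso proximal step: an elementwise soft-threshold
    (L1 part) followed by a groupwise L2-norm shrink (group part), which
    can zero out entire gene groups at once."""
    out = np.sign(beta) * np.maximum(np.abs(beta) - lam_l1, 0.0)  # L1 part
    for g in groups:
        norm = np.linalg.norm(out[g])
        scale = max(0.0, 1 - lam_group / norm) if norm > 0 else 0.0
        out[g] = scale * out[g]                                    # group part
    return out
```

The group step is what produces grouped selection: when a group's residual norm falls below `lam_group`, every coefficient in that group is set to zero together.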
26. Yang E, Deng C, Li C, Liu W, Li J, Tao D. Shared Predictive Cross-Modal Deep Quantization. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:5292-5303. [PMID: 29994640] [DOI: 10.1109/tnnls.2018.2793863]
Abstract
With explosive growth of data volume and ever-increasing diversity of data modalities, cross-modal similarity search, which conducts nearest neighbor search across different modalities, has been attracting increasing interest. This paper presents a deep compact code learning solution for efficient cross-modal similarity search. Many recent studies have proven that quantization-based approaches perform generally better than hashing-based approaches on single-modal similarity search. In this paper, we propose a deep quantization approach, which is among the early attempts of leveraging deep neural networks into quantization-based cross-modal similarity search. Our approach, dubbed shared predictive deep quantization (SPDQ), explicitly formulates a shared subspace across different modalities and two private subspaces for individual modalities, and representations in the shared subspace and the private subspaces are learned simultaneously by embedding them to a reproducing kernel Hilbert space, where the mean embedding of different modality distributions can be explicitly compared. In addition, in the shared subspace, a quantizer is learned to produce the semantics preserving compact codes with the help of label alignment. Thanks to this novel network architecture in cooperation with supervised quantization training, SPDQ can preserve intramodal and intermodal similarities as much as possible and greatly reduce quantization error. Experiments on two popular benchmarks corroborate that our approach outperforms state-of-the-art methods.
27. Li J, Wang Y, Jiang T, Xiao H, Song X. Grouped gene selection and multi-classification of acute leukemia via new regularized multinomial regression. Gene 2018; 667:18-24. [DOI: 10.1016/j.gene.2018.05.012]
28.
29. Liu Z, Pan Q, Dezert J, Han JW, He Y. Classifier Fusion With Contextual Reliability Evaluation. IEEE Transactions on Cybernetics 2018; 48:1605-1618. [PMID: 28613193] [DOI: 10.1109/tcyb.2017.2710205]
Abstract
Classifier fusion is an efficient strategy for improving classification performance in complex pattern recognition problems. In practice, the multiple classifiers to be combined can have different reliabilities, and proper reliability evaluation plays an important role in the fusion process for achieving the best classification performance. We propose a new method for classifier fusion with contextual reliability evaluation (CF-CRE) based on inner reliability and relative reliability concepts. The inner reliability, represented by a matrix, characterizes the probability of an object belonging to one class when it is classified to another class. The elements of this matrix are estimated from the K-nearest neighbors of the object. A cautious discounting rule is developed under the belief functions framework to revise the classification result according to the inner reliability. The relative reliability is evaluated based on a new incompatibility measure, which reduces the level of conflict between the classifiers by applying the classical evidence discounting rule to each classifier before their combination. The inner reliability and relative reliability capture different aspects of the classification reliability. The discounted classification results are combined with Dempster-Shafer's rule to support the final class decision. The performance of CF-CRE has been evaluated and compared with that of mainstream classical fusion methods on real datasets. The experimental results show that CF-CRE generally produces substantially higher accuracy than other fusion methods. Moreover, CF-CRE is robust to changes in the number of nearest neighbors chosen for estimating the reliability matrix, which is appealing for applications.
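The final combination step named in the abstract is Dempster-Shafer's rule. A minimal sketch of Dempster's rule for two mass functions over the same frame of discernment (an illustrative textbook version, not the full CF-CRE pipeline with its discounting steps):

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset of labels -> mass)
    with Dempster's rule of combination."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    # Normalize by the non-conflicting mass
    return {s: w / (1.0 - conflict) for s, w in combined.items()}
```

Discounting a classifier before combination, as CF-CRE does, amounts to shifting part of each focal element's mass to the whole frame so that an unreliable source asserts less.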
|
30
|
Yuen PC, Chellappa R. Learning Common and Feature-Specific Patterns: A Novel Multiple-Sparse-Representation-Based Tracker. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 27:2022-2037. [PMID: 29989985 DOI: 10.1109/tip.2017.2777183] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
The use of multiple features has been shown to be an effective strategy for visual tracking because of their complementary contributions to appearance modeling. The key problem is how to learn a fused representation from multiple features for appearance modeling. Different features extracted from the same object should share some commonalities in their representations, while each feature should also have feature-specific representation patterns that reflect its complementarity in appearance modeling. Different from existing multi-feature sparse trackers, which only consider the commonalities among the sparsity patterns of multiple features, this paper proposes a novel multiple sparse representation framework for visual tracking that jointly exploits the shared and feature-specific properties of different features by decomposing multiple sparsity patterns. Moreover, we introduce a novel online multiple metric learning method to efficiently and adaptively incorporate the appearance proximity constraint, which ensures that the learned commonalities of multiple features are more representative. Experimental results on tracking benchmark videos and other challenging videos demonstrate the effectiveness of the proposed tracker.
|
31
|
Chen CLP, Liu Z. Broad Learning System: An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:10-24. [PMID: 28742048 DOI: 10.1109/tnnls.2017.2716952] [Citation(s) in RCA: 280] [Impact Index Per Article: 46.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
The Broad Learning System (BLS), which aims to offer an alternative to learning in deep structures, is proposed in this paper. Deep structures and their learning suffer from a time-consuming training process because of the large number of connecting parameters in filters and layers. Moreover, they require complete retraining if the structure is not sufficient to model the system. The BLS is established in the form of a flat network, where the original inputs are transferred and placed as "mapped features" in feature nodes and the structure is expanded in width through the "enhancement nodes." Incremental learning algorithms are developed for fast remodeling in broad expansion, without retraining, when the network is deemed to need expansion. Two incremental learning algorithms are given, for the increment of the feature nodes (or filters in a deep structure) and for the increment of the enhancement nodes. The designed model and algorithms are very versatile for selecting a model rapidly. In addition, another incremental learning algorithm is developed for the case in which an already modeled system encounters new incoming inputs; specifically, the system can be remodeled incrementally without retraining from the beginning. Model reduction using singular value decomposition is also conducted, with satisfactory results, to simplify the final structure. Compared with existing deep neural networks, experimental results on the Modified National Institute of Standards and Technology (MNIST) database and the NYU NORB object recognition benchmark dataset demonstrate the effectiveness of the proposed BLS.
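The flat architecture the abstract describes, with random "mapped features", nonlinear "enhancement nodes", and analytically computed output weights, can be sketched as follows. This is a minimal illustration with assumed layer sizes and a tanh activation, not the paper's full model or its incremental-learning algorithms.

```python
import numpy as np

def bls_fit(X, Y, n_feature=20, n_enhance=50, reg=1e-3, seed=0):
    """Minimal Broad Learning System sketch: random feature nodes,
    nonlinear enhancement nodes, ridge-regularized analytical output weights."""
    rng = np.random.default_rng(seed)
    Wf = rng.standard_normal((X.shape[1], n_feature))  # feature-node mapping
    Z = X @ Wf                                         # "mapped features"
    We = rng.standard_normal((n_feature, n_enhance))   # enhancement mapping
    H = np.tanh(Z @ We)                                # enhancement nodes
    A = np.hstack([Z, H])                              # flat broad layer
    # Output weights by regularized least squares -- no backpropagation
    W = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ Y)
    return Wf, We, W

def bls_predict(X, Wf, We, W):
    Z = X @ Wf
    A = np.hstack([Z, np.tanh(Z @ We)])
    return A @ W
```

The incremental algorithms in the paper extend exactly this picture: adding feature or enhancement nodes widens `A`, and the output weights are updated from the previous solution rather than recomputed from scratch.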
|
32
|
Yu Z, Wang Z, You J, Zhang J, Liu J, Wong HS, Han G. A New Kind of Nonparametric Test for Statistical Comparison of Multiple Classifiers Over Multiple Datasets. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:4418-4431. [PMID: 28113414 DOI: 10.1109/tcyb.2016.2611020] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Nonparametric statistical analysis, such as the Friedman test (FT), is gaining more and more attention due to its useful applications in many experimental studies. However, the traditional FT for comparing multiple learning algorithms on different datasets adopts a naive ranking approach. The ranking is based on the average accuracy values obtained by the set of learning algorithms on the datasets, which neither considers the differences among the results obtained by the learning algorithms on each dataset nor takes into account the performance of the learning algorithms in each run. In this paper, we first propose three kinds of ranking approaches: the weighted ranking approach, the global ranking approach (GRA), and the weighted GRA. Then, a theoretical analysis is performed to explore the properties of the proposed ranking approaches. Next, a set of modified FTs based on the proposed ranking approaches is designed for the comparison of the learning algorithms. Finally, the modified FTs are evaluated through six classifier ensemble approaches on 34 real-world datasets. The experiments show the effectiveness of the modified FTs.
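The classical Friedman statistic that the abstract's ranking approaches build on ranks the algorithms within each dataset and sums the ranks. A minimal version of that baseline (assuming no ties within a dataset; the weighted and global rankings proposed in the paper replace the ranking step):

```python
def friedman_statistic(scores):
    """Friedman chi-square over a [datasets x algorithms] score table.

    Each row is ranked ascending (rank 1 = worst score); the statistic is
    12 / (n*k*(k+1)) * sum(R_j^2) - 3*n*(k+1) over rank sums R_j.
    """
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j])  # ascending ranks
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
```

Because only per-dataset ranks enter the statistic, all information about the margins between algorithms on a dataset is discarded, which is precisely the weakness the abstract points out.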
|
33
|
Yu Z, Zhu X, Wong HS, You J, Zhang J, Han G. Distribution-Based Cluster Structure Selection. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:3554-3567. [PMID: 27254876 DOI: 10.1109/tcyb.2016.2569529] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
The objective of cluster structure ensemble is to find a unified cluster structure from multiple cluster structures obtained from different datasets. Unfortunately, not all the cluster structures contribute to the unified cluster structure. This paper investigates the problem of how to select the suitable cluster structures in the ensemble, which will be summarized into a more representative cluster structure. Specifically, the cluster structure is first represented by a mixture of Gaussian distributions, the parameters of which are estimated using the expectation-maximization algorithm. Then, several distribution-based distance functions are designed to evaluate the similarity between two cluster structures. Based on the similarity comparison results, we propose a new approach, referred to as the distribution-based cluster structure ensemble (DCSE) framework, to find the most representative unified cluster structure. We then design a new technique, the distribution-based cluster structure selection strategy (DCSSS), to select a subset of cluster structures. Finally, we propose using a distribution-based normalized hypergraph cut algorithm to generate the final result. In our experiments, a nonparametric test is adopted to evaluate the difference between DCSE and its competitors. We adopt 20 real-world datasets obtained from the University of California, Irvine (UCI) and Knowledge Extraction based on Evolutionary Learning (KEEL) repositories, and a number of cancer gene expression profiles, to evaluate the performance of the proposed methods. The experimental results show that: 1) DCSE works well on the real-world datasets and 2) DCSE based on DCSSS can further improve the performance of the algorithm.
|
34
|
Liu C, Wang W, Tu G, Xiang Y, Wang S, Lv F. A new Centroid-Based Classification model for text categorization. Knowl Based Syst 2017. [DOI: 10.1016/j.knosys.2017.08.020] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
35
|
Lu H, Chen J, Yan K, Jin Q, Xue Y, Gao Z. A hybrid feature selection algorithm for gene expression data classification. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.07.080] [Citation(s) in RCA: 177] [Impact Index Per Article: 25.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
|
36
|
Lu H, Yang L, Yan K, Xue Y, Gao Z. A cost-sensitive rotation forest algorithm for gene expression data classification. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.09.077] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
37
|
Forestier G, Wemmert C. Semi-supervised learning using multiple clusterings with limited labeled data. Inf Sci (N Y) 2016. [DOI: 10.1016/j.ins.2016.04.040] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
38
|
Liao Z, Ju Y, Zou Q. Prediction of G Protein-Coupled Receptors with SVM-Prot Features and Random Forest. SCIENTIFICA 2016; 2016:8309253. [PMID: 27529053 PMCID: PMC4978840 DOI: 10.1155/2016/8309253] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/26/2016] [Revised: 06/26/2016] [Accepted: 06/30/2016] [Indexed: 06/06/2023]
Abstract
G protein-coupled receptors (GPCRs) are the largest receptor superfamily. In this paper, we employ physicochemical properties, which come from SVM-Prot, to represent GPCRs. Random Forest was utilized as the classifier for distinguishing them from other protein sequences. The MEME suite was used to detect the 10 most significant conserved motifs of human GPCRs. On the testing datasets, the average accuracy was 91.61% and the average AUC was 0.9282. The MEME discovery analysis showed that many motifs aggregate in the seven hydrophobic transmembrane helix regions, consistent with the characteristic structure of GPCRs. All of the above indicates that our machine-learning method can successfully distinguish GPCRs from non-GPCRs.
Affiliation(s)
- Zhijun Liao: School of Basic Medical Sciences, Fujian Medical University, Fuzhou, Fujian 350108, China; School of Computer Science and Technology, Tianjin University, Tianjin 300350, China
- Ying Ju: School of Information Science and Technology, Xiamen University, Xiamen, Fujian 361005, China
- Quan Zou: School of Computer Science and Technology, Tianjin University, Tianjin 300350, China; State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin 300071, China
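The classification setup the abstract describes, feeding per-sequence physicochemical feature vectors to a Random Forest, follows the standard pattern below. The synthetic features are a toy stand-in for the SVM-Prot descriptors, and all numbers are illustrative assumptions, not the paper's data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Toy stand-in: rows are protein sequences described by 8 hypothetical
# physicochemical descriptors; labels mark GPCR (1) vs non-GPCR (0).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, (50, 8)),   # non-GPCR feature vectors
    rng.normal(1.5, 1.0, (50, 8)),   # GPCR feature vectors
])
y = np.array([0] * 50 + [1] * 50)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)  # mean accuracy over 5 folds
```

Cross-validated accuracy, as reported in the abstract, is the natural summary here because the forest's out-of-bag behavior alone can be optimistic on small datasets.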
|