1
|
Qi F, Guo J, Li J, Liao Y, Liao W, Cai H, Chen J. Multi-Kernel Clustering with Tensor Fusion on Grassmann Manifold for High-dimensional Genomic Data. Methods 2024:S1046-2023(24)00213-5. [PMID: 39396747 DOI: 10.1016/j.ymeth.2024.09.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Revised: 09/21/2024] [Accepted: 09/26/2024] [Indexed: 10/15/2024]
Abstract
The high dimensionality and noise of genomic data make clustering difficult for traditional methods. Existing multi-kernel clustering methods aim to improve the quality of the affinity matrix by learning from a set of base kernels, thereby enhancing clustering performance. However, learning directly from the original base kernels makes it hard to handle errors and redundancies in high-dimensional data, and feasible multi-kernel fusion strategies are still lacking. To address these issues, we propose a Multi-Kernel Clustering method with Tensor fusion on Grassmann manifolds, called MKCTM. Specifically, we maximize the clustering consensus among base kernels by imposing tensor low-rank constraints to eliminate noise and redundancy. Unlike traditional kernel fusion approaches, our method fuses the learned base kernels on the Grassmann manifold, resulting in a final consensus matrix for clustering. We integrate the tensor learning and fusion processes into a unified optimization model and propose an effective iterative optimization algorithm for solving it. Experimental results on ten datasets, comparing against 12 popular baseline clustering methods, confirm the superiority of our approach. Our code is available at https://github.com/foureverfei/MKCTM.git.
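The idea of fusing base kernels on the Grassmann manifold can be illustrated with a minimal sketch. This is not the MKCTM algorithm: the tensor low-rank step is omitted, and the Gaussian base kernels, the bandwidths, the toy data, and the projection-metric averaging rule are all assumptions made for illustration. Each kernel is reduced to the subspace spanned by its k leading eigenvectors (a point on the Grassmann manifold), and the subspaces are fused by taking the leading eigenvectors of the summed projection matrices.

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Gaussian base kernel; gamma is a hypothetical bandwidth, not a value from the paper.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def top_k_subspace(K, k):
    # Orthonormal basis of the k leading eigenvectors of a symmetric matrix.
    # np.linalg.eigh returns eigenvalues in ascending order, so take the last k columns.
    _, vecs = np.linalg.eigh(K)
    return vecs[:, -k:]

def grassmann_fuse(bases, k):
    # Assumed fusion rule: average under the projection metric, i.e. the
    # k leading eigenvectors of the sum of projection matrices U U^T.
    P = sum(U @ U.T for U in bases)
    return top_k_subspace(P, k)

# Toy data: two well-separated groups of 10 points in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (10, 5)), rng.normal(3.0, 0.3, (10, 5))])

bases = [top_k_subspace(rbf_kernel(X, g), 2) for g in (0.01, 0.1, 1.0)]
U = grassmann_fuse(bases, 2)
consensus = U @ U.T  # consensus matrix that a spectral clustering step could consume
```

The projection-metric mean is used here only because it has a closed form; other fusion rules on the Grassmann manifold would slot into `grassmann_fuse` in the same way.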
Affiliation(s)
- Fei Qi: Data Science and Information Engineering, Guizhou Minzu University, Guiyang, 550025, Guizhou, China; Computer Science and Technology, South China University of Technology, Guangzhou, 510006, Guangdong, China
- Jin Guo: Big Data and Information Engineering, Guiyang Institute of Humanities and Technology, Guiyang, 550025, Guizhou, China
- Junyu Li: Computer Science and Technology, South China University of Technology, Guangzhou, 510006, Guangdong, China
- Yi Liao: Computer Science and Technology, South China University of Technology, Guangzhou, 510006, Guangdong, China
- Wenxiong Liao: Computer Science and Technology, South China University of Technology, Guangzhou, 510006, Guangdong, China
- Hongmin Cai: Computer Science and Technology, South China University of Technology, Guangzhou, 510006, Guangdong, China
- Jiazhou Chen: Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
|
2
|
Wen J, Liu C, Deng S, Liu Y, Fei L, Yan K, Xu Y. Deep Double Incomplete Multi-View Multi-Label Learning With Incomplete Labels and Missing Views. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:11396-11408. [PMID: 37030862 DOI: 10.1109/tnnls.2023.3260349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Missing views and missing labels are two challenging problems in applications of multi-view multi-label classification. In the past years, many efforts have been made to address either the incomplete multi-view learning problem or the incomplete multi-label learning problem. However, few works can handle the challenging case in which both kinds of incompleteness occur simultaneously. In this article, we propose a new incomplete multi-view multi-label learning network to address this issue. The proposed method is composed of four major parts: a view-specific deep feature extraction network, a weighted representation fusion module, a classification module, and a view-specific deep decoder network. By integrating the view-missing information into the weighted fusion module and the label-missing information into the classification module, the proposed method can effectively reduce the negative influence of both kinds of incompleteness and sufficiently explore the available data and label information to obtain the most discriminative feature extractor and classifier. Furthermore, our method can be trained in both supervised and semi-supervised manners, which has important implications for flexible deployment. Experimental results on five benchmarks in supervised and semi-supervised cases demonstrate that the proposed method can greatly enhance the classification performance on the difficult incomplete multi-view multi-label classification tasks with missing labels and missing views.
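The two masking ideas in this abstract can be sketched in a few lines. This is a minimal illustration under assumptions, not the authors' network: the function names `fuse_views` and `masked_bce`, the fixed per-view weights, and the use of binary cross-entropy are all hypothetical choices. A 0/1 view mask removes missing views from the weighted fusion, and a 0/1 label mask removes unknown labels from the loss.

```python
import math

def fuse_views(view_feats, view_mask, view_weights):
    # Weighted average of per-view feature vectors for one sample,
    # skipping views whose mask entry is 0 (missing view).
    dim = len(view_feats[0])
    total = sum(w * m for w, m in zip(view_weights, view_mask))
    if total == 0:
        raise ValueError("sample has no observed view")
    fused = [0.0] * dim
    for feat, m, w in zip(view_feats, view_mask, view_weights):
        if m:
            for j in range(dim):
                fused[j] += w * feat[j] / total
    return fused

def masked_bce(probs, labels, label_mask, eps=1e-12):
    # Binary cross-entropy averaged over known labels only (mask entry 1).
    known = [(p, y) for p, y, m in zip(probs, labels, label_mask) if m]
    if not known:
        return 0.0
    loss = 0.0
    for p, y in known:
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return loss / len(known)

# The second view is missing, so its (placeholder) features are ignored.
fused = fuse_views([[1.0, 2.0], [100.0, 100.0], [3.0, 4.0]], [1, 0, 1], [1.0, 1.0, 1.0])
# The second label is unknown, so only the first contributes to the loss.
loss = masked_bce([0.5, 0.9], [1, 0], [1, 0])
```

In a trained model the view weights would typically be learned rather than fixed; they are passed in as plain numbers here to keep the sketch self-contained.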
|
3
|
Lan W, Yang T, Chen Q, Zhang S, Dong Y, Zhou H, Pan Y. Multiview Subspace Clustering via Low-Rank Symmetric Affinity Graph. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:11382-11395. [PMID: 37015132 DOI: 10.1109/tnnls.2023.3260258] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/19/2023]
Abstract
Multiview subspace clustering (MVSC) has been used to explore the internal structure of multiview datasets by revealing unique information from different views. Most existing methods ignore the consistent information and angular information of different views. In this article, we propose a novel MVSC via low-rank symmetric affinity graph (LSGMC) to tackle these problems. Specifically, considering the consistent information, we pursue a consistent low-rank structure across views by decomposing the coefficient matrix into three factors. Then, the symmetry constraint is utilized to guarantee weight consistency for each pair of data samples. In addition, considering the angular information, we utilize the fusion mechanism to capture the inherent structure of data. Furthermore, to alleviate the effect of noise and highly redundant data, the Schatten p-norm is employed to obtain a low-rank coefficient matrix. Finally, an adaptive information reduction strategy is designed to generate a high-quality similarity matrix for spectral clustering. Experimental results on 11 datasets demonstrate the superiority of LSGMC in clustering performance compared with ten state-of-the-art multiview clustering methods.
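As a small illustration of the low-rank regularizer mentioned above, the Schatten p-norm can be evaluated from the singular values of a matrix. The sketch below computes the p-th power of the norm, i.e. the sum of singular values raised to p, which is the quantity usually placed in an objective; it is not the LSGMC solver. The helper `symmetrize` is a hypothetical post-processing step shown only to illustrate what a symmetric affinity looks like, not the constraint used in the paper.

```python
import numpy as np

def schatten_p(Z, p):
    # Sum of singular values raised to the power p.
    # p = 1 gives the nuclear norm; as p approaches 0 the value approaches rank(Z).
    s = np.linalg.svd(Z, compute_uv=False)
    return float(np.sum(s ** p))

def symmetrize(Z):
    # Hypothetical helper: build a symmetric, nonnegative affinity from a
    # coefficient matrix so that each pair of samples shares one weight.
    A = np.abs(Z)
    return (A + A.T) / 2.0

Z = np.diag([3.0, 4.0, 0.0])      # rank-2 toy coefficient matrix
nuclear = schatten_p(Z, 1.0)      # 3 + 4
near_rank = schatten_p(Z, 0.01)   # close to the rank, 2
A = symmetrize(np.array([[0.0, 1.0], [3.0, 0.0]]))
```

The comparison between `nuclear` and `near_rank` shows why values of p below 1 are often preferred as rank surrogates: large singular values are penalized less heavily than under the nuclear norm.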
|
4
|
Cui J, Fu Y, Huang C, Wen J. Low-Rank Graph Completion-Based Incomplete Multiview Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:8064-8074. [PMID: 36449580 DOI: 10.1109/tnnls.2022.3224058] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/17/2023]
Abstract
To reduce the negative effect of missing data on clustering, incomplete multiview clustering (IMVC) has become an important research topic in machine learning. At present, graph-based methods are widely used in IMVC, but these methods still have some defects. First, some of the methods overlook potential relationships across views. Second, most of the methods depend on local structure information and ignore the global structure information. Third, most of the methods cannot use both global structure information and potential information across views to adaptively recover the incomplete relationship structure. To address the above issues, we propose a unified optimization framework to learn reasonable affinity relationships, called low-rank graph completion-based IMVC (LRGR_IMVC). 1) Our method introduces adaptive graph embedding to effectively explore the potential relationship among views; 2) we append a low-rank constraint to adequately exploit the global structure information among views; and 3) this method unites related information within views, potential information across views, and global structure information to adaptively recover the incomplete graph structure and obtain complete affinity relationships. Experimental results on several commonly used datasets show that the proposed method achieves significantly better clustering performance than some of the most advanced methods.
|
5
|
Wan X, Xiao B, Liu X, Liu J, Liang W, Zhu E. Fast Continual Multi-View Clustering With Incomplete Views. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2024; 33:2995-3008. [PMID: 38640047 DOI: 10.1109/tip.2024.3388974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/21/2024]
Abstract
Multi-view clustering (MVC) has attracted broad attention due to its capacity to exploit consistent and complementary information across views. This paper focuses on a challenging issue in MVC called the incomplete continual data problem (ICDP). Specifically, most existing algorithms assume that views are available in advance and overlook the scenarios where data observations of views are accumulated over time. Due to privacy considerations or memory limitations, previous views cannot be stored in these situations. Some works have proposed ways to handle this problem, but all of them fail to address incomplete views. Such an incomplete continual data problem (ICDP) in MVC is difficult to solve since incomplete information with continual data increases the difficulty of extracting consistent and complementary knowledge among views. We propose Fast Continual Multi-View Clustering with Incomplete Views (FCMVC-IV) to address this issue. Specifically, the method maintains a scalable consensus coefficient matrix and updates its knowledge with the incoming incomplete view rather than storing and recomputing all the data matrices. Considering that the given views are incomplete, the newly collected view might contain samples that have yet to appear; two indicator matrices and a rotation matrix are developed to match matrices with different dimensions. In addition, we design a three-step iterative algorithm to solve the resultant problem with linear complexity and proven convergence. Comprehensive experiments conducted on various datasets demonstrate the superiority of FCMVC-IV over the competing approaches. The code is publicly available at https://github.com/wanxinhang/FCMVC-IV.
|
6
|
Du M, Zhao J, Sun J, Dong Y. M3W: Multistep Three-Way Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:5627-5640. [PMID: 36173778 DOI: 10.1109/tnnls.2022.3208418] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/16/2023]
Abstract
Three-way clustering has been an active research topic in the field of cluster analysis in recent years. Some efforts are focused on the technique due to its feasibility and rationality. We observe, however, that the existing three-way clustering algorithms struggle to obtain more information and limit the fault tolerance excessively. Moreover, although the one-step three-way allocation based on a pair of fixed, global thresholds is the most straightforward way to generate the three-way cluster representations, the clusters derived from a pair of global thresholds cannot exactly reveal the inherent clustering structure of the dataset, and the threshold values are often difficult to determine beforehand. Inspired by sequential three-way decisions, we propose an algorithm, called multistep three-way clustering (M3W), to address these issues. Specifically, we first use a progressive erosion strategy to construct a multilevel structure of data, so that lower levels (or external layers) can gather more available information from higher levels (or internal layers). Then, we further propose a multistep three-way allocation strategy, which sufficiently considers the neighborhood information of every eroded instance. We use the allocation strategy in combination with the multilevel structure to ensure that more information is gradually obtained to increase the probability of being assigned correctly, capturing adaptively the inherent clustering structure of the dataset. The proposed algorithm is compared with eight competitors using 18 benchmark datasets. Experimental results show that M3W achieves superior performance, verifying its advantages and effectiveness.
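For context, the one-step allocation with a pair of fixed, global thresholds that this abstract describes as the most straightforward approach can be sketched as follows. This is the baseline scheme, not the multistep M3W algorithm; the membership scores, the threshold names `alpha` and `beta`, and their values are hypothetical. Each instance is placed in the core region of a cluster, in its fringe region, or outside it.

```python
def three_way_allocate(memberships, alpha, beta):
    # memberships: dict mapping an instance id to its membership degree in one cluster.
    # alpha, beta: global thresholds with alpha > beta (assumed names and semantics).
    if not alpha > beta:
        raise ValueError("alpha must be greater than beta")
    core, fringe, outside = set(), set(), set()
    for x, m in memberships.items():
        if m >= alpha:
            core.add(x)        # definitely belongs to the cluster
        elif m > beta:
            fringe.add(x)      # possibly belongs to the cluster
        else:
            outside.add(x)     # definitely does not belong
    return core, fringe, outside

core, fringe, outside = three_way_allocate({"a": 0.9, "b": 0.5, "c": 0.1}, alpha=0.7, beta=0.3)
```

The result depends entirely on the choice of `alpha` and `beta`, which is the difficulty with fixed global thresholds that the abstract points out.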
|
7
|
Wang S, Liu J, Yu G, Liu X, Zhou S, Zhu E, Yang Y, Yin J, Yang W. Multiview Deep Anomaly Detection: A Systematic Exploration. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:1651-1665. [PMID: 35767484 DOI: 10.1109/tnnls.2022.3184723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Anomaly detection (AD), which models a given normal class and distinguishes it from the rest of abnormal classes, has been a long-standing topic with ubiquitous applications. As modern scenarios often deal with massive high-dimensional complex data spawned by multiple sources, it is natural to consider AD from the perspective of multiview deep learning. However, it has not been formally discussed in the literature and remains underexplored. Motivated by this gap, this article makes four contributions: First, to the best of our knowledge, this is the first work that formally identifies and formulates the multiview deep AD problem. Second, we take recent advances in relevant areas into account and systematically devise various baseline solutions, which lays the foundation for multiview deep AD research. Third, to remedy the problem that limited benchmark datasets are available for multiview deep AD, we extensively collect the existing public data and process them into more than 30 multiview benchmark datasets via multiple means, so as to provide a better evaluation platform for multiview deep AD. Finally, by comprehensively evaluating the devised solutions on different types of multiview deep AD benchmark datasets, we conduct a thorough analysis of the effectiveness of the designed baselines and hopefully provide other researchers with beneficial guidance and insight into the new multiview deep AD topic.
|
8
|
Zhang DJ, Gao YL, Zhao JX, Zheng CH, Liu JX. A New Graph Autoencoder-Based Consensus-Guided Model for scRNA-seq Cell Type Detection. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:2473-2483. [PMID: 35857730 DOI: 10.1109/tnnls.2022.3190289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) technology is famous for providing a microscopic view to help capture cellular heterogeneity. This characteristic has advanced the field of genomics by enabling the delicate differentiation of cell types. However, the properties of single-cell datasets, such as high dropout events, noise, and high dimensionality, are still a research challenge in the single-cell field. To utilize single-cell data more efficiently and to better explore the heterogeneity among cells, a new graph autoencoder (GAE)-based consensus-guided model (scGAC) is proposed in this article. The data are preprocessed into multiple top-level feature datasets. Then, feature learning is performed by using GAEs to generate new feature matrices, followed by similarity learning based on distance fusion methods. The learned similarity matrices are fed back to the GAEs to guide their feature learning process. Finally, the abovementioned steps are iterated continuously to integrate the final consistent similarity matrix and perform other related downstream analyses. The scGAC model can accurately identify critical features and effectively preserve the internal structure of the data. This can further improve the accuracy of cell type identification.
|
9
|
Gao X, Ma X, Zhang W, Huang J, Li H, Li Y, Cui J. Multi-View Clustering With Self-Representation and Structural Constraint. IEEE TRANSACTIONS ON BIG DATA 2022. [DOI: 10.1109/tbdata.2021.3128906] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 02/02/2023]
Affiliation(s)
- Xiaowei Gao: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, China
- Xiaoke Ma: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, China
- Wensheng Zhang: Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Jianbin Huang: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, China
- He Li: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, China
- Yanni Li: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, China
- Jiangtao Cui: School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, China
|
10
|
Chen H, Wang W, Luo S. Coupled block diagonal regularization for multi-view subspace clustering. Data Min Knowl Discov 2022. [DOI: 10.1007/s10618-022-00852-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
11
|
Multi-View Graph Clustering by Adaptive Manifold Learning. MATHEMATICS 2022. [DOI: 10.3390/math10111821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/04/2022]
Abstract
Graph-oriented methods have been widely adopted in multi-view clustering because of their efficiency in learning heterogeneous relationships and complex structures hidden in data. However, existing methods are typically investigated based on a Euclidean structure instead of a more suitable manifold topological structure. Hence, it is expected that a more suitable manifold topological structure will be adopted to carry out intrinsic similarity learning. In this paper, we explore the implied adaptive manifold for multi-view graph clustering. Specifically, our model seamlessly integrates multiple adaptive graphs into a consensus graph with the manifold topological structure considered. We further manipulate the consensus graph with a useful rank constraint so that its connected components precisely correspond to distinct clusters. As a result, our model is able to directly achieve a discrete clustering result without any post-processing. For clustering quality, our method achieves the best performance in 22 out of 24 cases (four evaluation metrics on six datasets), which demonstrates the effectiveness of the proposed model. For computational performance, our optimization algorithm is generally faster than or in line with other state-of-the-art algorithms, which validates the efficiency of the proposed algorithm.
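The claim that a rank-constrained consensus graph yields clusters without post-processing rests on a standard graph fact: the number of connected components of a graph equals the multiplicity of the zero eigenvalue of its Laplacian, so forcing that multiplicity to equal the number of clusters makes each component a cluster. The sketch below shows only the final read-out step, assuming a consensus graph with this structure has already been learned; the function name, the tolerance parameter, and the toy matrix are illustrative assumptions.

```python
from collections import deque

def connected_components(W, tol=0.0):
    # W: square affinity matrix as a list of lists; an edge exists between
    # i and j when W[i][j] or W[j][i] exceeds tol.
    n = len(W)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        queue = deque([start])
        while queue:  # breadth-first search over one component
            i = queue.popleft()
            for j in range(n):
                if labels[j] == -1 and (W[i][j] > tol or W[j][i] > tol):
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels

# Toy consensus graph with two blocks: nodes {0, 1} and nodes {2, 3, 4}.
W = [
    [0.0, 0.8, 0.0, 0.0, 0.0],
    [0.8, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.6, 0.0],
    [0.0, 0.0, 0.6, 0.0, 0.7],
    [0.0, 0.0, 0.0, 0.7, 0.0],
]
labels = connected_components(W)
```

Each label is a component index, so the cluster assignment is read directly from the graph with no k-means or spectral rounding step.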
|
12
|
Zhang X, Liu X. Multiview Clustering of Adaptive Sparse Representation Based on Coupled P Systems. ENTROPY 2022; 24:e24040568. [PMID: 35455231 PMCID: PMC9028410 DOI: 10.3390/e24040568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Revised: 04/12/2022] [Accepted: 04/15/2022] [Indexed: 12/10/2022]
Abstract
Multiview clustering (MVC) has become a significant technique for addressing data mining problems. Most existing studies on this topic adopt a fixed number of neighbors when constructing the similarity matrix of each view, as in single-view clustering. However, this may degrade clustering quality because multiview data sources are diverse. Moreover, most MVC methods use iterative optimization to obtain clustering results, which consumes a significant amount of time. Therefore, this paper proposes a multiview clustering of adaptive sparse representation based on coupled P system (MVCS-CP) without iteration. The whole algorithm flow runs in the coupled P system. First, a parameter-free natural neighbor search algorithm automatically determines the number of neighbors for each view. Then, manifold learning and sparse representation are employed to construct the similarity matrix, which preserves the internal geometry of the views. Next, a soft thresholding operator is introduced to form the unified graph from which the clustering results are obtained. The experimental results on nine real datasets indicate that MVCS-CP outperforms other state-of-the-art comparison algorithms.
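The soft thresholding operator mentioned above is a standard shrinkage function: values with magnitude at most tau are set to zero, and all other values are moved toward zero by tau. The sketch below shows the operator applied entrywise to a small similarity matrix. How MVCS-CP chooses the threshold and combines the views is not stated in the abstract, so the threshold value and the toy matrix here are assumptions.

```python
def soft_threshold(x, tau):
    # Shrink x toward zero by tau; values within [-tau, tau] become 0.
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0

def soft_threshold_matrix(S, tau):
    # Entrywise application to a matrix stored as a list of lists.
    return [[soft_threshold(v, tau) for v in row] for row in S]

# Toy similarity matrix; tau = 0.25 is a hypothetical threshold.
S = [[1.0, 0.125], [-0.5, 0.25]]
sparse_S = soft_threshold_matrix(S, 0.25)
```

Because weak similarities are zeroed out, the operator produces a sparser graph, which is one common reason for using it when forming a unified graph.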
|
13
|
Chen H, Tai X, Wang W. Multi-view subspace clustering with inter-cluster consistency and intra-cluster diversity among views. APPL INTELL 2022. [DOI: 10.1007/s10489-021-02895-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
|
14
|
Multi-view Subspace Clustering with Joint Tensor Representation and Indicator Matrix Learning. ARTIF INTELL 2022. [DOI: 10.1007/978-3-031-20500-2_37] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/02/2023]
|
15
|
Hu S, Lou Z, Ye Y. View-Wise Versus Cluster-Wise Weight: Which Is Better for Multi-View Clustering? IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 31:58-71. [PMID: 34807826 DOI: 10.1109/tip.2021.3128323] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/13/2023]
Abstract
Weighted multi-view clustering (MVC) aims to combine the complementary information of multi-view data (such as image data with different types of features) in a weighted manner to obtain a consistent clustering result. However, when the cluster-wise weights across views are vastly different, most existing weighted MVC methods may fail to fully utilize the complementary information, because they are based on view-wise weight learning and cannot learn the fine-grained cluster-wise weights. Additionally, most of them need extra parameters to control the sparsity or smoothness of the weight distribution, and these parameters are hard to tune without prior knowledge. To address these issues, in this paper we propose a novel and effective Cluster-weighted mUlti-view infoRmation bottlEneck (CURE) clustering algorithm, which can automatically learn the cluster-wise weights to discover the discriminative clusters across multiple views and thus can enhance the clustering performance by properly exploiting the cluster-level complementary information. To learn the cluster-wise weights, we design a new weight learning scheme by exploring the relation between the mutual information of the joint distribution of a specific cluster (containing a group of data samples) and the weight of this cluster. Finally, a novel draw-and-merge method is presented to solve the optimization problem. Experimental results on various multi-view datasets show the superiority and effectiveness of our cluster-wise weighted CURE over several state-of-the-art methods.
|