1
|
Shang R, Zhong J, Zhang W, Xu S, Li Y. Multilabel Feature Selection via Shared Latent Sublabel Structure and Simultaneous Orthogonal Basis Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:5288-5303. [PMID: 38656846 DOI: 10.1109/tnnls.2024.3382911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Multilabel feature selection addresses the dimensionality problem of high-dimensional multilabel data by selecting an optimal subset of features. Noisy and incomplete labels in raw multilabel data hinder the acquisition of label-guided information. In existing approaches, mapping the label space to a low-dimensional latent space by semantic decomposition is considered an effective strategy for mitigating label noise. However, the decomposed latent label space contains redundant label information, which misleads the capture of potential label relevance. To eliminate the effect of redundant information on the extraction of latent label correlations, a novel multilabel feature selection method named SLOFS, based on a shared latent sublabel structure and simultaneous orthogonal basis clustering, is proposed. First, a latent orthogonal base structure shared (LOBSS) term is engineered to guide the construction of a redundancy-free latent sublabel space via the separated latent clustering center structure. The LOBSS term simultaneously retains latent sublabel information and latent clustering center structure. Moreover, the structure and relevance information of nonredundant latent sublabels are fully explored. The introduction of graph regularization ensures structural consistency between the data space and the latent sublabels, thus helping the feature selection process. SLOFS employs a dynamic sublabel graph to obtain a high-quality sublabel space and uses regularization to constrain label correlations on dynamic sublabel projections. Finally, an effective optimization scheme with provable convergence is proposed to solve the SLOFS objective. Experimental studies on 18 datasets demonstrate that the presented method performs consistently better than previous feature selection methods.
|
2
|
Hong X, Wang W, Yan S, Shen X, Zhang Y, Ye X. Related Factors Mining of Diabetes Complications Based on Manifold-Constrained Multi-Label Feature Selection. IEEE J Biomed Health Inform 2025; 29:643-656. [PMID: 38805335 DOI: 10.1109/jbhi.2024.3406135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2024]
Abstract
The primary cause of mortality among individuals with diabetes is complications. Identifying related factors for these complications holds immense potential for early prevention. Previous research predominantly employed traditional machine-learning techniques to establish prediction models that use medical indicators for related-factor selection. However, uncovering the intricate correlations among complication labels and identifying similar characteristics among medical indicators has been challenging. We propose a novel embedded multi-label feature selection approach called LCFSM (Label Cosine and Feature Similar Manifold) to address the issue. LCFSM introduces manifold constraints into the objective function to uncover risk factors associated with diabetes complications. Label cosine similarity is used to optimize feature weights, forming label manifold constraints. Similarly, feature manifold constraints are established so that feature kernel similarity is used in optimizing feature weights. LCFSM formulates an objective function based on regularized least squares and the preceding manifold constraints, employing the Sylvester equation for convergence assurance. The experimental evaluation compares LCFSM against eight baselines, demonstrating superior performance in top-10 feature selection and feature stacking. LCFSM is applied to identify primary risk factors for diabetes complications. Related factors include Electromyogram, Urine Routine Protein Positive, etc., offering valuable insights for early treatment.
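The label cosine similarity mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `label_cosine_similarity`, the toy label matrix, and the choice to treat each label as a binary column vector over samples are assumptions made for illustration only. The abstract does not say how the similarity enters the objective beyond forming a label manifold constraint.

```python
import math


def label_cosine_similarity(Y):
    """Pairwise cosine similarity between label columns.

    Y is a list of rows, one per sample, each a list of 0/1 label
    indicators (an assumed encoding, not taken from the paper).
    Returns a square list-of-lists S with S[i][j] in [0, 1].
    """
    n_labels = len(Y[0])
    # Transpose so that each entry is one label's indicator vector.
    cols = [[row[j] for row in Y] for j in range(n_labels)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    S = [[0.0] * n_labels for _ in range(n_labels)]
    for i in range(n_labels):
        for j in range(n_labels):
            # A label that never occurs has zero norm; leave similarity at 0.
            if norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(cols[i], cols[j]))
                S[i][j] = dot / (norms[i] * norms[j])
    return S


# Hypothetical toy data: 3 samples, 3 complication labels.
Y_toy = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
S_toy = label_cosine_similarity(Y_toy)
```

In this toy matrix the first two labels co-occur in one sample and so receive a positive similarity, while the third label never co-occurs with the others and receives zero. A similarity matrix of this kind is what a graph Laplacian for a label manifold term would typically be built from.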
|
3
|
Ruan J, Wang M, Liu D, Chen M, Gao X. Multi-Label Feature Selection with Feature-Label Subgraph Association and Graph Representation Learning. ENTROPY (BASEL, SWITZERLAND) 2024; 26:992. [PMID: 39593936 PMCID: PMC11592953 DOI: 10.3390/e26110992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2024] [Revised: 11/06/2024] [Accepted: 11/16/2024] [Indexed: 11/28/2024]
Abstract
In multi-label data, a sample is associated with multiple labels at the same time, and the computational complexity is manifested in the high-dimensional feature space as well as the interdependence and unbalanced distribution of labels, which leads to challenges regarding feature selection. As a result, a multi-label feature selection method based on feature-label subgraph association with graph representation learning (SAGRL) is proposed to represent the complex correlations of features and labels, especially the relationships between features and labels. Specifically, features and labels are mapped to nodes in the graph structure, and the connections between nodes are established to form feature and label sets, respectively, which increase intra-class correlation and decrease inter-class correlation. Further, feature-label subgraphs are constructed by feature and label sets to provide abundant feature combinations. The relationship between each subgraph is adjusted by graph representation learning, the crucial features in different label sets are selected, and the optimal feature subset is obtained by ranking. Experimental studies on 11 datasets show the superior performance of the proposed method with six evaluation metrics over some state-of-the-art multi-label feature selection methods.
Affiliation(s)
- Jinghou Ruan
  - School of Computer Science, Hubei University of Technology, Wuhan 430068, China; (J.R.); (D.L.)
- Mingwei Wang
  - School of Computer Science, Hubei University of Technology, Wuhan 430068, China; (J.R.); (D.L.)
- Deqing Liu
  - School of Computer Science, Hubei University of Technology, Wuhan 430068, China; (J.R.); (D.L.)
- Maolin Chen
  - School of Smart City, Chongqing Jiaotong University, Chongqing 400074, China
- Xianjun Gao
  - School of Geosciences, Yangtze University, Wuhan 430100, China
|
4
|
Liu J, Yang S, Zhang H, Sun Z, Du J. Online Multi-Label Streaming Feature Selection Based on Label Group Correlation and Feature Interaction. ENTROPY (BASEL, SWITZERLAND) 2023; 25:1071. [PMID: 37510018 PMCID: PMC10377943 DOI: 10.3390/e25071071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Revised: 07/10/2023] [Accepted: 07/14/2023] [Indexed: 07/30/2023]
Abstract
Multi-label streaming feature selection has received widespread attention in recent years because the dynamic acquisition of features is more in line with the needs of practical application scenarios. Most previous methods either assume that the labels are independent of each other, or, although label correlation is explored, the relationship between related labels and features is difficult to understand or specify. In real applications, both situations may occur where the labels are correlated and the features may belong specifically to some labels. Moreover, these methods treat features individually without considering the interaction between features. Based on this, we present a novel online streaming feature selection method based on label group correlation and feature interaction (OSLGC). In our design, we first divide labels into multiple groups with the help of graph theory. Then, we integrate label weight and mutual information to accurately quantify the relationships between features under different label groups. Subsequently, a novel feature selection framework using sliding windows is designed, including online feature relevance analysis and online feature interaction analysis. Experiments on ten datasets show that the proposed method outperforms some mature MFS algorithms in terms of predictive performance, statistical analysis, stability analysis, and ablation experiments.
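The idea of combining label weights with mutual information to score a feature against a label group can be sketched as follows. This is an illustrative assumption, not the OSLGC algorithm: the function names `mutual_information` and `group_relevance`, the weighted-sum form, and the toy data are hypothetical, and the abstract does not specify how label weights are derived or how the sliding window is applied.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def group_relevance(feature, group_labels, weights):
    """Weighted sum of MI between one feature and each label in a group.

    The weighted-sum form and the weights themselves are assumptions
    made for this sketch.
    """
    return sum(
        w * mutual_information(feature, label)
        for label, w in zip(group_labels, weights)
    )


# Hypothetical toy data: one discretized feature and a group of two labels.
feature = [0, 0, 1, 1]
label_a = [0, 0, 1, 1]  # fully determined by the feature
label_b = [0, 1, 0, 1]  # independent of the feature
score = group_relevance(feature, [label_a, label_b], [0.5, 0.5])
```

In a streaming setting, a score of this kind would be recomputed for each arriving feature and compared against the features currently retained. That comparison step is outside the scope of this sketch.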
Affiliation(s)
- Jinghua Liu
  - Department of Computer Science and Technology, Huaqiao University, Xiamen 361021, China
  - Xiamen Key Laboratory of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen 361021, China
  - Fujian Key Laboratory of Big Data Intelligence and Security, Huaqiao University, Xiamen 361021, China
- Songwei Yang
  - Department of Computer Science and Technology, Huaqiao University, Xiamen 361021, China
  - Xiamen Key Laboratory of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen 361021, China
  - Fujian Key Laboratory of Big Data Intelligence and Security, Huaqiao University, Xiamen 361021, China
- Hongbo Zhang
  - Department of Computer Science and Technology, Huaqiao University, Xiamen 361021, China
  - Xiamen Key Laboratory of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen 361021, China
  - Fujian Key Laboratory of Big Data Intelligence and Security, Huaqiao University, Xiamen 361021, China
- Zhenzhen Sun
  - Department of Computer Science and Technology, Huaqiao University, Xiamen 361021, China
  - Xiamen Key Laboratory of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen 361021, China
  - Fujian Key Laboratory of Big Data Intelligence and Security, Huaqiao University, Xiamen 361021, China
- Jixiang Du
  - Department of Computer Science and Technology, Huaqiao University, Xiamen 361021, China
  - Xiamen Key Laboratory of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen 361021, China
  - Fujian Key Laboratory of Big Data Intelligence and Security, Huaqiao University, Xiamen 361021, China
|
5
|
Miao J, Wang Y, Cheng Y, Chen F. Parallel dual-channel multi-label feature selection. Soft comput 2023. [DOI: 10.1007/s00500-023-07916-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2023]
|
6
|
Dai L, Zhang J, Du G, Li C, Wei R, Li S. Toward embedding-based multi-label feature selection with label and feature collaboration. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07924-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
|
7
|
Sparse multi-label feature selection via dynamic graph manifold regularization. INT J MACH LEARN CYB 2022. [DOI: 10.1007/s13042-022-01679-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2022]
|
8
|
Robust multi-label feature selection with shared label enhancement. Knowl Inf Syst 2022. [DOI: 10.1007/s10115-022-01747-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
|
10
|
Hashemi A, Bagher Dowlatshahi M, Nezamabadi-pour H. An efficient Pareto-based feature selection algorithm for multi-label classification. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.09.052] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 01/17/2023]
|
11
|
Fan Y, Liu J, Wu S. Exploring instance correlations with local discriminant model for multi-label feature selection. APPL INTELL 2021. [DOI: 10.1007/s10489-021-02799-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|