1. Labory J, Njomgue-Fotso E, Bottini S. Benchmarking feature selection and feature extraction methods to improve the performances of machine-learning algorithms for patient classification using metabolomics biomedical data. Comput Struct Biotechnol J 2024; 23:1274-1287. [PMID: 38560281] [PMCID: PMC10979063] [DOI: 10.1016/j.csbj.2024.03.016]
Abstract
Objective Classification tasks are an open challenge in biomedicine. While several machine-learning techniques exist to accomplish this objective, several peculiarities of biomedical data, especially omics measurements, prevent their use or limit their performance. Omics approaches aim to understand a complex biological system through systematic analysis of its content at the molecular level. However, omics data are heterogeneous, sparse and affected by the classical "curse of dimensionality" problem, i.e. having far fewer samples (n) than omics features (p). Furthermore, a major problem with multi-omics data is imbalance at either the class or the feature level. The objective of this work is to study whether feature extraction and/or feature selection techniques can improve the performance of classification machine-learning algorithms on omics measurements. Methods Among all omics, metabolomics has emerged as a powerful tool in cancer research, facilitating a deeper understanding of the complex metabolic landscape associated with tumorigenesis and tumor progression. We therefore selected three publicly available metabolomics datasets, applied several linear and non-linear feature extraction techniques, coupled or not with feature selection methods, and evaluated patient-classification performance in the different configurations for the three datasets. Results We provide a general workflow and guidelines on when to use those techniques depending on the characteristics of the available data. To further test the extension of our approach to other omics data, we also included a transcriptomics and a proteomics dataset. Overall, for all datasets, we showed that applying supervised feature selection improves the performance of feature extraction methods for classification purposes.
Scripts used to perform all analyses are available at: https://github.com/Plant-Net/Metabolomic_project/.
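The selection-then-extraction workflow this abstract evaluates can be illustrated with a minimal numpy sketch on synthetic "omics" data with n << p. The t-like scoring rule, the dimensions, and the nearest-centroid classifier below are illustrative assumptions, not the methods benchmarked in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "omics" matrix: n=40 samples, p=200 features (n << p), binary labels.
n, p = 40, 200
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[y == 1, :10] += 2.0                    # only the first 10 features are informative

# Step 1 -- supervised feature selection: rank features by a two-sample t-like score.
mu0, mu1 = X[y == 0].mean(0), X[y == 1].mean(0)
score = np.abs(mu0 - mu1) / (X.std(0) + 1e-8)
keep = np.argsort(score)[::-1][:20]      # keep the top-20 features
Xs = X[:, keep]

# Step 2 -- feature extraction: PCA via SVD on the selected features.
Xc = Xs - Xs.mean(0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:5].T                        # project onto the first 5 components

# Step 3 -- classify with a nearest-centroid rule on the extracted components.
c0, c1 = Z[y == 0].mean(0), Z[y == 1].mean(0)
pred = (np.linalg.norm(Z - c1, axis=1) < np.linalg.norm(Z - c0, axis=1)).astype(int)
accuracy = (pred == y).mean()
```

Swapping the order of steps 1 and 2 (extraction without prior supervised selection) is exactly the kind of configuration the benchmark compares.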
Affiliation(s)
- Justine Labory
- Université Côte d′Azur, Center of Modeling Simulation and Interactions, Nice, France
- INRAE, Université Côte d′Azur, CNRS, Institut Sophia Agrobiotech, Sophia-Antipolis, France
- Université Côte d′Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Nice, France
- Silvia Bottini
- Université Côte d′Azur, Center of Modeling Simulation and Interactions, Nice, France
- INRAE, Université Côte d′Azur, CNRS, Institut Sophia Agrobiotech, Sophia-Antipolis, France
2. Song Z, Yang X, Xu Z, King I. Graph-Based Semi-Supervised Learning: A Comprehensive Review. IEEE Trans Neural Netw Learn Syst 2023; 34:8174-8194. [PMID: 35302941] [DOI: 10.1109/tnnls.2022.3155478]
Abstract
Semi-supervised learning (SSL) has tremendous value in practice due to its utilization of both labeled and unlabeled data. An essential class of SSL methods, referred to as graph-based semi-supervised learning (GSSL) methods in the literature, first represents each sample as a node in an affinity graph and then infers the label information of unlabeled samples from the structure of the constructed graph. GSSL methods have demonstrated their advantages in various domains due to the uniqueness of their structure, the universality of their applications, and their scalability to large-scale data. Focusing on GSSL methods only, this work aims to provide both researchers and practitioners with a solid and systematic understanding of relevant advances as well as the underlying connections among them. The concentration on one class of SSL makes this article distinct from recent surveys that cover a more general and broader picture of SSL methods yet often neglect the fundamental understanding of GSSL methods. In particular, a significant contribution of this article lies in a newly generalized taxonomy for GSSL under a unified framework, with the most up-to-date references and valuable resources such as codes, datasets, and applications. Furthermore, we present several potential research directions as future work with our insights into this rapidly growing field.
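The GSSL idea described above can be sketched in a few lines of numpy, assuming an RBF affinity graph and the classic iterative label-propagation scheme with clamping of the labeled nodes (one of many variants such a review covers); the data, bandwidth, and iteration count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two Gaussian clusters; only one labeled sample per class.
n = 60
X = np.vstack([rng.normal(0, 0.5, (n // 2, 2)),
               rng.normal(3, 0.5, (n // 2, 2))])
y_true = np.repeat([0, 1], n // 2)
labeled = np.array([0, n // 2])          # indices of the two labeled points

# Affinity graph: RBF weights, then a row-normalized transition matrix.
D2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
W = np.exp(-D2 / (2 * 0.5 ** 2))
np.fill_diagonal(W, 0.0)
P = W / W.sum(1, keepdims=True)

# Iterative label propagation: spread label mass along the graph,
# clamping the labeled nodes back to their known labels after every step.
F = np.zeros((n, 2))
F[labeled, y_true[labeled]] = 1.0
for _ in range(200):
    F = P @ F
    F[labeled] = 0.0
    F[labeled, y_true[labeled]] = 1.0

pred = F.argmax(1)
accuracy = (pred == y_true).mean()
```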
3. Li RY, Guo Y, Zhang B. Adaptive Kernel Graph Nonnegative Matrix Factorization. Information 2023. [DOI: 10.3390/info14040208]
Abstract
Nonnegative matrix factorization (NMF) is an efficient method for feature learning in the fields of machine learning and data mining. To investigate the nonlinear characteristics of datasets, kernel-method-based NMF (KNMF) and its graph-regularized extensions have received much attention from researchers due to their promising performance. However, the graph similarity matrix of the existing methods is often predefined in the original data space and kept unchanged during the matrix-factorization procedure, which leads to non-optimal graphs. To address these problems, we propose a kernel-graph-learning-based, nonlinear, nonnegative matrix-factorization method, termed adaptive kernel graph nonnegative matrix factorization (AKGNMF). In order to automatically capture the manifold structure of the data in the nonlinear feature space, AKGNMF learns an adaptive similarity graph. We formulate a unified objective function in which global similarity graph learning is optimized jointly with the matrix-decomposition process, and a local graph Laplacian is further imposed on the learned feature subspace representation. The proposed method relies on both a factorization that respects geometric structure and the mapped high-dimensional subspace feature representations. In addition, an efficient iterative solution is derived to update all variables in the resultant objective problem in turn. Experiments on a synthetic dataset visually demonstrate the ability of AKGNMF to separate nonlinear data with high clustering accuracy. Experiments on real-world datasets verify the effectiveness of AKGNMF in three aspects: clustering performance, parameter sensitivity and convergence. Comprehensive experimental findings indicate that, compared with various classic and state-of-the-art methods, the proposed AKGNMF algorithm is effective and superior.
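For contrast with the adaptive-graph approach above, the classical graph-regularized NMF baseline with a predefined, fixed affinity graph can be sketched as follows; the multiplicative update rules are the standard ones, while the dimensions, the RBF graph, and the regularization weight are illustrative assumptions. AKGNMF's point of departure is precisely that the graph `W` here stays fixed instead of being learned jointly.

```python
import numpy as np

rng = np.random.default_rng(2)

# Nonnegative data matrix X (m features x n samples).
m, n, k, lam = 30, 50, 4, 0.1
X = rng.random((m, n))

# Predefined RBF affinity on the columns of X (fixed during factorization;
# AKGNMF instead learns this graph jointly with the decomposition).
D2 = ((X.T[:, None] - X.T[None, :]) ** 2).sum(-1)
W = np.exp(-D2 / D2.mean())
np.fill_diagonal(W, 0.0)
Dg = np.diag(W.sum(1))                       # degree matrix of the graph

# Graph-regularized NMF, X ~ U @ V.T, via multiplicative updates.
U = rng.random((m, k)) + 0.1
V = rng.random((n, k)) + 0.1
eps = 1e-9
err0 = np.linalg.norm(X - U @ V.T)           # reconstruction error at init
for _ in range(200):
    U *= (X @ V) / (U @ V.T @ V + eps)
    V *= (X.T @ U + lam * W @ V) / (V @ (U.T @ U) + lam * Dg @ V + eps)
err = np.linalg.norm(X - U @ V.T)            # error after updating
```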
4. Huang D, Zhang Q, Li Z. Semi-supervised attribute reduction for partially labeled categorical data based on predicted label. Int J Approx Reason 2023. [DOI: 10.1016/j.ijar.2022.12.014]
5. Feature selection for distance-based regression: An umbrella review and a one-shot wrapper. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.11.023]
6. Lv S, Wei L, Zhang Q, Liu B, Xu Z. Improved Inference for Imputation-Based Semisupervised Learning Under Misspecified Setting. IEEE Trans Neural Netw Learn Syst 2022; 33:6346-6359. [PMID: 34029195] [DOI: 10.1109/tnnls.2021.3077312]
Abstract
Semisupervised learning (SSL) has been extensively studied in the literature. Despite its success, many existing learning algorithms for semisupervised problems require specific distributional assumptions, such as the "cluster assumption" and the "low-density assumption," which are often hard to verify in practice. We are interested in quantifying the effect of SSL based on kernel methods under a misspecified setting. The misspecified setting means that the target function is not contained in the hypothesis space under which a specific learning algorithm works. Practically, this assumption is mild and standard for various kernel-based approaches. Under this misspecified setting, this article attempts to provide a theoretical justification of when and how unlabeled data can be exploited to improve inference in a learning task. Our theoretical justification is given from the viewpoint of the asymptotic variance of our proposed two-step estimation. It is shown that the proposed pointwise nonparametric estimator has a smaller asymptotic variance than the supervised estimator using the labeled data alone. Several simulated experiments are implemented to support our theoretical results.
7. Binary dwarf mongoose optimizer for solving high-dimensional feature selection problems. PLoS One 2022; 17:e0274850. [PMID: 36201524] [PMCID: PMC9536540] [DOI: 10.1371/journal.pone.0274850]
Abstract
Selecting an appropriate feature subset is a vital task in machine learning. Its main goal is to remove noisy, irrelevant, and redundant features that could negatively impact the learning model's accuracy, and to improve classification performance without information loss. Therefore, increasingly advanced optimization methods have been employed to locate the optimal subset of features. This paper presents a binary version of the dwarf mongoose optimization algorithm, called BDMO, to solve the high-dimensional feature selection problem. The effectiveness of this approach was validated on 18 high-dimensional datasets from the Arizona State University feature selection repository, and the efficacy of BDMO was compared with other well-known feature selection techniques in the literature. The results show that BDMO outperforms the other methods, producing the lowest average fitness value in 14 of 18 datasets (77.77%). BDMO also demonstrated stability, returning the lowest standard deviation (SD) in 13 of 18 datasets (72.22%). Furthermore, it achieved higher validation accuracy than the other methods in 15 of the 18 datasets (83.33%), and yielded the highest attainable validation accuracy on the COIL20 and Leukemia datasets, which vividly portrays the superiority of BDMO.
8. Fan M, Zhang X, Hu J, Gu N, Tao D. Adaptive Data Structure Regularized Multiclass Discriminative Feature Selection. IEEE Trans Neural Netw Learn Syst 2022; 33:5859-5872. [PMID: 33882003] [DOI: 10.1109/tnnls.2021.3071603]
Abstract
Feature selection (FS), which aims to identify the most informative subset of input features, is an important approach to dimensionality reduction. In this article, a novel FS framework is proposed for both unsupervised and semisupervised scenarios. To make efficient use of the data distribution to evaluate features, the framework combines data structure learning (also referred to as data distribution modeling) and FS in a unified formulation, such that the data structure learning improves the results of FS and vice versa. Moreover, two types of data structures, namely the soft and hard data structures, are learned and used in the proposed FS framework. The soft data structure refers to the pairwise weights among data samples, and the hard data structure refers to the estimated labels obtained from clustering or semisupervised classification. Both of these data structures are naturally formulated as regularization terms in the proposed framework. In the optimization process, the soft and hard data structures are learned from the data represented by the selected features, and then the most informative features are reselected by referring to the data structures. In this way, the framework uses the interactions between data structure learning and FS to select the most discriminative and informative features. Following the proposed framework, a new semisupervised FS (SSFS) method is derived and studied in depth. Experiments on real-world datasets demonstrate the effectiveness of the proposed method.
9. Büyükkeçeci M, Okur MC. A Comprehensive Review of Feature Selection and Feature Selection Stability in Machine Learning. Gazi University Journal of Science 2022. [DOI: 10.35378/gujs.993763]
Abstract
Feature selection is a data preprocessing method used to reduce the number of features in a dataset. Feature selection techniques search the entire feature space to find an optimal feature set that is free of redundant and irrelevant features. Reducing the dimensionality of a dataset by removing redundant and irrelevant features plays a pivotal role in improving the performance, i.e., accuracy, of inductive learners and in building simple models. Thus, feature selection is an imperative task of machine learning. The apparent need for feature selection has raised considerable interest and become an important research topic in a wide range of fields, including bioinformatics, text classification, image recognition, and computer vision. As a result, a large pool of feature selection methods has been proposed, and a considerable amount of literature has been published on feature selection. The quality of feature selection algorithms is measured not only by the performance of the features they prefer but also by their stability. Therefore, this study focuses on two topics: feature selection and feature selection stability. In the pages that follow, general concepts and methods of feature selection are discussed, and then an overview of feature selection stability and stability measures is given.
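The notion of selection stability discussed above can be made concrete with a small sketch: run a selector on bootstrap resamples of the data and average the pairwise Jaccard overlap of the chosen feature subsets. The correlation-based selector, the toy data, and the number of resamples are illustrative assumptions, not a specific measure from the review.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data: 100 samples, 50 features, only the first 5 drive the target.
n, p, k = 100, 50, 5
X = rng.normal(size=(n, p))
y = X[:, :5].sum(1) + 0.3 * rng.normal(size=n)

def select_top_k(Xb, yb, k):
    """Rank features by absolute Pearson correlation with the target."""
    Xc = Xb - Xb.mean(0)
    yc = yb - yb.mean()
    r = (Xc * yc[:, None]).sum(0) / (
        np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12)
    return set(np.argsort(np.abs(r))[::-1][:k])

# Run the selector on B bootstrap resamples of the data.
B = 10
subsets = [select_top_k(X[idx], y[idx], k)
           for idx in (rng.integers(0, n, n) for _ in range(B))]

# Stability = mean pairwise Jaccard overlap of the selected subsets (in [0, 1]).
pairs = [(i, j) for i in range(B) for j in range(i + 1, B)]
stability = np.mean([len(subsets[i] & subsets[j]) / len(subsets[i] | subsets[j])
                     for i, j in pairs])
```

A stability near 1 means the selector keeps choosing the same features as the training data is perturbed, which is the property the review's stability measures formalize.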
10. Akinola OO, Ezugwu AE, Agushaka JO, Zitar RA, Abualigah L. Multiclass feature selection with metaheuristic optimization algorithms: a review. Neural Comput Appl 2022; 34:19751-19790. [PMID: 36060097] [PMCID: PMC9424068] [DOI: 10.1007/s00521-022-07705-4]
Abstract
Selecting relevant feature subsets is vital in machine learning, and multiclass feature selection is harder to perform since most classification methods are designed for binary problems. The feature selection problem aims to reduce the dimension of the feature set while maintaining model accuracy. Datasets can be classified using various methods; nevertheless, metaheuristic algorithms attract substantial attention for solving different optimization problems. For this reason, this paper presents a systematic survey of the literature on metaheuristic algorithms for multiclass feature selection, which can help classifiers select optimal or near-optimal features faster and more accurately. Metaheuristic algorithms are presented in four primary behavior-based categories, i.e., evolutionary-based, swarm-intelligence-based, physics-based, and human-based, even though some works in the literature present more categories; lists of metaheuristic algorithms are introduced within these categories. Only articles on metaheuristic algorithms used for multiclass feature selection from 2000 to 2022 were reviewed, with their different categories and detailed descriptions. We also consider application areas for some of the metaheuristic algorithms applied to multiclass feature selection, together with their variations, and examine popular multiclass classifiers used for feature selection. Moreover, we present the challenges of metaheuristic algorithms for feature selection and identify gaps for further research.
Affiliation(s)
- Olatunji O. Akinola
- School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Avenue, Pietermaritzburg Campus, Pietermaritzburg, 3201 KwaZulu-Natal South Africa
- Absalom E. Ezugwu
- School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Avenue, Pietermaritzburg Campus, Pietermaritzburg, 3201 KwaZulu-Natal South Africa
- Jeffrey O. Agushaka
- School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Avenue, Pietermaritzburg Campus, Pietermaritzburg, 3201 KwaZulu-Natal South Africa
- Raed Abu Zitar
- Sorbonne Center of Artificial Intelligence, Sorbonne University-Abu Dhabi, 38044 Abu Dhabi, United Arab Emirates
- Laith Abualigah
- Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, 19328 Jordan
- Faculty of Information Technology, Middle East University, Amman, 11831 Jordan
11. Wang C, Chen X, Yuan G, Nie F, Yang M. Semisupervised Feature Selection With Sparse Discriminative Least Squares Regression. IEEE Trans Cybern 2022; 52:8413-8424. [PMID: 33872166] [DOI: 10.1109/tcyb.2021.3060804]
Abstract
In the era of big data, selecting informative features has become an urgent need. However, due to the huge cost of obtaining enough labeled data for supervised tasks, researchers have turned their attention to semisupervised learning, which exploits both labeled and unlabeled data. In this article, we propose a sparse discriminative semisupervised feature selection (SDSSFS) method. In this method, the ε-dragging technique for supervised tasks is extended to the semisupervised task and used to enlarge the distance between classes in order to obtain a discriminative solution. The flexible ℓ2,p norm is implicitly used as regularization in the new model; therefore, we can obtain a sparser solution by setting a smaller p. An iterative method is proposed to simultaneously learn the regression coefficients and the ε-dragging matrix and to predict the unknown class labels. Experimental results on ten real-world datasets show the superiority of our proposed method.
12. Balasubramanian K, N.P. A. Correlation-based feature selection using bio-inspired algorithms and optimized KELM classifier for glaucoma diagnosis. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109432]
13. Chen X, Chen R, Wu Q, Nie F, Yang M, Mao R. Semisupervised Feature Selection via Structured Manifold Learning. IEEE Trans Cybern 2022; 52:5756-5766. [PMID: 33635817] [DOI: 10.1109/tcyb.2021.3052847]
Abstract
Recently, semisupervised feature selection has gained more attention in many real applications due to the high cost of obtaining labeled data. However, existing methods cannot solve the "multimodality" problem, in which samples in some classes lie in several separate clusters. To solve the multimodality problem, this article proposes a new feature selection method for the semisupervised task, namely, semisupervised structured manifold learning (SSML). The new method learns a structured graph which consists of more clusters than the known classes. Meanwhile, we propose to exploit the submanifolds in both labeled and unlabeled data by using the nearest neighbors of each object among both labeled and unlabeled objects. An iterative optimization algorithm is proposed to solve the new model. A series of experiments was conducted on both synthetic and real-world datasets, and the experimental results verify the ability of the new method to solve the multimodality problem and its superior performance compared with state-of-the-art methods.
14. Robust dual-graph regularized and minimum redundancy based on self-representation for semi-supervised feature selection. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.03.004]
15. Yuan A, You M, He D, Li X. Convex Non-Negative Matrix Factorization With Adaptive Graph for Unsupervised Feature Selection. IEEE Trans Cybern 2022; 52:5522-5534. [PMID: 33237876] [DOI: 10.1109/tcyb.2020.3034462]
Abstract
Unsupervised feature selection (UFS) aims to remove redundant information and select the most representative feature subset from the original data, so it occupies a core position in high-dimensional data preprocessing. Many proposed approaches use self-expression to explore the correlation between data samples or use pseudolabel matrix learning to learn the mapping between the data and labels. Furthermore, existing methods have tried to add constraints to either of these two modules to reduce redundancy, but no prior literature embeds them into a joint model to select the most representative features by the computed top ranking scores. To address this issue, this article presents a novel UFS method via convex non-negative matrix factorization with an adaptive graph constraint (CNAFS). Through convex matrix factorization with the adaptive graph constraint, it can uncover the correlation between the data and keep the local manifold structure of the data. To our knowledge, this is the first work that integrates pseudolabel matrix learning into the self-expression module and optimizes them simultaneously for the UFS solution. Besides, two different manifold regularizations are constructed, for the pseudolabel matrix and for the encoding matrix, to keep the local geometrical structure. Eventually, extensive experiments on benchmark datasets are conducted to prove the effectiveness of our method. The source code is available at: https://github.com/misteru/CNAFS.
16. Zhang H, Gong M, Nie F, Li X. Unified Dual-label Semi-supervised Learning with Top-k Feature Selection. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.05.090]
17. Zhang R, Zhang H, Li X, Yang S. Unsupervised Feature Selection With Extended OLSDA via Embedding Nonnegative Manifold Structure. IEEE Trans Neural Netw Learn Syst 2022; 33:2274-2280. [PMID: 33382663] [DOI: 10.1109/tnnls.2020.3045053]
Abstract
In unsupervised learning, most discriminative information is encoded in the cluster labels. Unsupervised feature selection methods usually utilize spectral clustering to generate such pseudo labels. Nonetheless, two related disadvantages exist: 1) the performance of feature selection highly depends on the constructed Laplacian matrix, and 2) the pseudo labels are obtained with mixed signs, while the real ones should be nonnegative. To address this problem, a novel approach for unsupervised feature selection is proposed by extending orthogonal least square discriminant analysis (OLSDA) to the unsupervised case, such that nonnegative pseudo labels can be achieved. Additionally, an orthogonal constraint is imposed on the class indicator to hold the manifold structure. Furthermore, ℓ2,1 regularization, proved to be equivalent to ℓ2,0 regularization, is imposed to ensure that the projection matrix is row sparse for efficient feature selection. Finally, extensive experiments on nine benchmark datasets are conducted to demonstrate the effectiveness of the proposed approach.
18. Dai J, Liu Q. Semi-supervised attribute reduction for interval data based on misclassification cost. Int J Mach Learn Cybern 2021. [DOI: 10.1007/s13042-021-01483-6]
19. Pintas JT, Fernandes LAF, Garcia ACB. Feature selection methods for text classification: a systematic literature review. Artif Intell Rev 2021. [DOI: 10.1007/s10462-021-09970-6]
20. Huang S, Liu Z, Jin W, Mu Y. Broad learning system with manifold regularized sparse features for semi-supervised classification. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.08.052]
21. Wu X, Chen H, Li T, Wan J. Semi-supervised feature selection with minimal redundancy based on local adaptive. Appl Intell 2021. [DOI: 10.1007/s10489-021-02288-4]
23. Huang Y, Shen Z, Cai F, Li T, Lv F. Adaptive graph-based generalized regression model for unsupervised feature selection. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107156]
24. Zhong W, Chen X, Nie F, Huang JZ. Adaptive discriminant analysis for semi-supervised feature selection. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.02.035]
25. Xue C, Zhang T, Xiao D. Output-Related and -Unrelated Fault Monitoring with an Improvement Prototype Knockoff Filter and Feature Selection Based on Laplacian Eigen Maps and Sparse Regression. ACS Omega 2021; 6:10828-10839. [PMID: 34056237] [PMCID: PMC8153765] [DOI: 10.1021/acsomega.1c00506]
Abstract
In the process industry, fault monitoring related to output is an important step to ensure product quality and improve economic benefits. In order to distinguish the influence of input variables on the output more accurately, this paper introduces a subalgorithm of fault-unrelated block partition into the prototype knockoff filter (PKF) algorithm to improve it. The improved PKF algorithm can divide the input data into three blocks: a fault-unrelated block, an output-related block, and an output-unrelated block. Removing the data of the fault-unrelated block can greatly reduce the difficulty of fault monitoring. This paper proposes a feature selection method based on Laplacian Eigenmaps and sparse regression for the output-unrelated block. The algorithm can detect faults caused by variables with a small contribution to variance, and its descent property is proved from a theoretical point of view. The output-related block is monitored by the Broyden-Fletcher-Goldfarb-Shanno method. Finally, the effectiveness of the proposed fault detection method is verified on the well-known Tennessee Eastman process data.
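The Laplacian Eigenmaps step underlying the proposed feature selection can be sketched with numpy as follows, using an unnormalized graph Laplacian on toy two-cluster data; this covers only the embedding, not the paper's sparse-regression or knockoff-filter components, and the RBF bandwidth is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two well-separated clusters; the eigenmap should embed them apart.
n = 40
X = np.vstack([rng.normal(0, 0.3, (n // 2, 5)),
               rng.normal(4, 0.3, (n // 2, 5))])

# RBF affinity, then the unnormalized graph Laplacian L = D - W.
D2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
W = np.exp(-D2 / D2.mean())
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(1)) - W

# Embedding = eigenvectors of L for the smallest nonzero eigenvalues
# (the very first eigenvector is the constant vector and is skipped).
vals, vecs = np.linalg.eigh(L)
embedding = vecs[:, 1:3]

# The first nontrivial coordinate (the Fiedler vector) puts the two
# clusters on opposite sides of zero.
coord = embedding[:, 0]
side_product = coord[: n // 2].mean() * coord[n // 2:].mean()
```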
Affiliation(s)
- Cuiping Xue
- College of Science, Northeastern University, Shenyang 110819, China
- Tie Zhang
- College of Science, Northeastern University, Shenyang 110819, China
- Dong Xiao
- College of Information Science and Engineering and Liaoning Key Laboratory of Intelligent Diagnosis and Safety for Metallurgical Industry, Northeastern University, Shenyang 110819, China
26. Joint local structure preservation and redundancy minimization for unsupervised feature selection. Appl Intell 2020. [DOI: 10.1007/s10489-020-01800-6]
27. Shang R, Xu K, Jiao L. Subspace learning for unsupervised feature selection via adaptive structure learning and rank approximation. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.06.111]
28. Zhou P, Chen J, Fan M, Du L, Shen YD, Li X. Unsupervised feature selection for balanced clustering. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2019.105417]
29. Liu Y, Ye D, Li W, Wang H, Gao Y. Robust neighborhood embedding for unsupervised feature selection. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2019.105462]
30. Huang S, Xu Z, Kang Z, Ren Y. Regularized nonnegative matrix factorization with adaptive local structure learning. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.11.070]
31. Shang R, Song J, Jiao L, Li Y. Double feature selection algorithm based on low-rank sparse non-negative matrix factorization. Int J Mach Learn Cybern 2020. [DOI: 10.1007/s13042-020-01079-6]
32. Ahmadizadeh C, Pousett B, Menon C. Investigation of Channel Selection for Gesture Classification for Prosthesis Control Using Force Myography: A Case Study. Front Bioeng Biotechnol 2019; 7:331. [PMID: 31921794] [PMCID: PMC6914858] [DOI: 10.3389/fbioe.2019.00331]
Abstract
Background: Various human machine interfaces (HMIs) are used to control prostheses, such as robotic hands. One promising HMI is force myography (FMG). Previous research has shown the potential of high-density FMG (HD-FMG) to improve the accuracy of prosthesis control. Motivation: The more sensors an FMG-controlled system uses, the more complicated and costly it becomes. This study proposes a design method that can produce powered prostheses with performance comparable to that of HD-FMG-controlled systems while using fewer sensors. An HD-FMG apparatus would be used to collect information from the user only in the design phase. Channel selection would then be applied to the collected data to determine the number and locations of the sensors that are vital to the performance of the device. This study assessed the use of multiple channel selection (CS) methods for this purpose. Methods: In this case study, three datasets were used. These datasets were collected from force-sensitive resistors embedded in the inner socket of a subject with transradial amputation. Sensor data were collected as the subject carried out five repetitions of six gestures. The collected data were then used to assess five CS methods: sequential forward selection (SFS) with two different stopping criteria, minimum redundancy-maximum relevance (mRMR), a genetic algorithm (GA), and Boruta. Results: Three of the five methods (mRMR, GA, and Boruta) decreased the number of channels significantly while maintaining classification accuracy in all datasets. None of the three outperformed the other two across all datasets; however, GA produced the smallest channel subset in all three datasets. The three selected methods were also compared in terms of stability, i.e., the consistency of the channel subset chosen by the method as new training data were introduced or some training data were removed (Chandrashekar and Sahin, 2014).
Boruta and mRMR were more stable than GA when applied to the datasets of this study. Conclusion: This study shows the feasibility of the proposed design method, which can produce prosthetic systems that are simpler than HD-FMG systems but offer comparable performance.
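The mRMR criterion used above can be sketched in a few lines. This is a minimal, dependency-free illustration, not the authors' implementation: it greedily picks channels that score high on relevance to the gesture labels minus average redundancy with already-selected channels, and it substitutes Pearson correlation for the mutual-information measures mRMR normally uses. All names and the toy data are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def mrmr_select(channels, labels, k):
    """Greedily pick k channels: high label relevance, low redundancy."""
    remaining = list(range(len(channels)))
    selected = []
    while len(selected) < k and remaining:
        def score(i):
            relevance = abs(pearson(channels[i], labels))
            redundancy = (sum(abs(pearson(channels[i], channels[j]))
                              for j in selected) / len(selected)) if selected else 0.0
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example: channel 0 tracks the labels, channel 1 duplicates it,
# channel 2 is weakly related noise. mRMR picks 0 first, then prefers
# 2 over the redundant duplicate.
labels = [0, 0, 1, 1, 0, 1]
channels = [
    [0.1, 0.0, 1.0, 0.9, 0.2, 1.1],   # informative
    [0.1, 0.0, 1.0, 0.9, 0.2, 1.1],   # redundant duplicate
    [0.5, 0.4, 0.5, 0.6, 0.5, 0.4],   # weakly related noise
]
picked = mrmr_select(channels, labels, 2)  # [0, 2]
```

The redundancy term is what separates mRMR from a plain relevance ranking: a duplicated channel scores as well as the original on relevance alone but is rejected once one copy is in the subset.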
Affiliation(s)
- Chakaveh Ahmadizadeh
- Menrva Research Group, Schools of Mechatronic Systems Engineering and Engineering Science, Simon Fraser University, Metro Vancouver, BC, Canada
- Carlo Menon
- Menrva Research Group, Schools of Mechatronic Systems Engineering and Engineering Science, Simon Fraser University, Metro Vancouver, BC, Canada
33. Ma J, Wu J, Zhao J, Jiang J, Zhou H, Sheng QZ. Nonrigid Point Set Registration With Robust Transformation Learning Under Manifold Regularization. IEEE Trans Neural Netw Learn Syst 2019; 30:3584-3597. [PMID: 30371389] [DOI: 10.1109/tnnls.2018.2872528]
Abstract
This paper addresses nonrigid point set registration by designing a robust transformation-learning scheme. The principle is to iteratively establish point correspondences and learn the nonrigid transformation between two given sets of points. In particular, local feature descriptors are used to search for correspondences, which inevitably introduces some unknown outliers. To learn the underlying transformation precisely from noisy correspondences, we cast point set registration as a semisupervised learning problem, in which a set of indicator variables helps distinguish outliers in a mixture model. To exploit the intrinsic structure of a point set, we constrain the transformation with manifold regularization, which plays the role of prior knowledge. Moreover, the transformation is modeled in a reproducing kernel Hilbert space, and a sparsity-induced approximation is used to boost efficiency. We apply the proposed method to learning motion flows between image pairs of similar scenes for visual homing, a specific type of mobile robot navigation. Extensive experiments on several publicly available datasets reveal the superiority of the proposed method over state-of-the-art competitors, particularly on degenerate data.
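The alternation the abstract describes (fit a transformation, then down-weight correspondences that look like outliers, then re-fit) can be illustrated with a drastically simplified sketch. This is not the paper's method: it replaces the RKHS nonrigid model with a 1-D affine map and the mixture-model indicator variables with Cauchy-style soft weights, keeping only the iterative reweighting structure.

```python
def fit_affine(xs, ys, w):
    """Weighted least-squares fit of y ~ a*x + b."""
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    num = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    den = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    a = num / den
    return a, my - a * mx

def robust_affine(xs, ys, iters=20):
    """Alternate between outlier down-weighting and re-fitting."""
    w = [1.0] * len(xs)
    a = b = 0.0
    for _ in range(iters):
        a, b = fit_affine(xs, ys, w)
        # Cauchy-style weights stand in for the paper's mixture-model
        # outlier indicators: large residual -> weight near 0.
        w = [1.0 / (1.0 + (a * x + b - y) ** 2) for x, y in zip(xs, ys)]
    return a, b

# Inlier correspondences follow y = 2x + 1; the last pair is a gross
# outlier that an unweighted fit would be dragged toward.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 3, 5, 7, 9, 40]
a, b = robust_affine(xs, ys)  # close to a = 2, b = 1
```

A single unweighted least-squares pass over this data gives a slope above 6; the reweighting loop recovers the inlier transformation because the outlier's residual keeps its weight near zero.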
34. Correlation-Based Ensemble Feature Selection Using Bioinspired Algorithms and Classification Using Backpropagation Neural Network. Comput Math Methods Med 2019; 2019:7398307. [PMID: 31662787] [PMCID: PMC6778924] [DOI: 10.1155/2019/7398307]
Abstract
A framework for clinical diagnosis that uses bioinspired algorithms for feature selection and a gradient-descent backpropagation neural network for classification has been designed and implemented. The clinical data are subjected to data preprocessing, feature selection, and classification. Hot-deck imputation is used to handle missing values, and min-max normalization is used for data transformation. A wrapper approach that employs bioinspired algorithms, namely Differential Evolution, Lion Optimization, and Glowworm Swarm Optimization, with the accuracy of an AdaBoostSVM classifier as the fitness function, is used for feature selection. Each bioinspired algorithm selects a subset of features, yielding three feature subsets. Correlation-based ensemble feature selection is then performed to select the optimal features from the three subsets. The optimal features are used to train a gradient-descent backpropagation neural network, and ten-fold cross-validation is used to train and test the classifier. The Hepatitis and Wisconsin Diagnostic Breast Cancer (WDBC) datasets from the University of California, Irvine (UCI) Machine Learning Repository are used to evaluate classification accuracy. An accuracy of 98.47% is obtained on the WDBC dataset and 95.51% on the Hepatitis dataset. The proposed framework can be tailored to develop clinical decision-making systems for any health disorder to assist physicians in clinical diagnosis.
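The ensemble step described above (pool the three searchers' subsets, then re-rank the pooled features by their correlation with the class) can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the candidate subsets, toy data, and function names are hypothetical, and absolute Pearson correlation stands in for whatever correlation measure the authors used.

```python
def abs_corr(x, y):
    """Absolute Pearson correlation between a feature column and labels."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return abs(cov / (vx * vy)) if vx and vy else 0.0

def ensemble_select(feature_cols, labels, candidate_subsets, k):
    """Pool candidate subsets, rank pooled features by class correlation."""
    pooled = sorted(set().union(*candidate_subsets))
    ranked = sorted(pooled, key=lambda i: abs_corr(feature_cols[i], labels),
                    reverse=True)
    return ranked[:k]

labels = [0, 0, 1, 1]
feature_cols = [
    [1.0, 1.1, 2.0, 2.1],   # strongly class-correlated
    [0.3, 0.4, 0.3, 0.4],   # noise
    [5.0, 5.2, 6.1, 6.0],   # strongly class-correlated
]
# Hypothetical subsets, as if returned by DE, Lion, and Glowworm runs.
subsets = [{0, 1}, {1, 2}, {0, 2}]
chosen = ensemble_select(feature_cols, labels, subsets, 2)  # [0, 2]
```

The ensemble keeps features that correlate with the class even if only one searcher found them, and drops a feature that every searcher happened to include if it carries no class signal.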
35. Fast unsupervised feature selection based on the improved binary ant system and mutation strategy. Neural Comput Appl 2019. [DOI: 10.1007/s00521-018-03991-z]
36.
37. Liu H, Hu QV, He L. Term-Based Personalization for Feature Selection in Clinical Handover Form Auto-Filling. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:1219-1230. [PMID: 30296238] [DOI: 10.1109/tcbb.2018.2874237]
Abstract
Feature learning and selection have been widely applied in many research areas because of their good performance and low complexity. Traditional methods usually treat all terms with the same feature set, so performance can suffer when noisy information is introduced via wrong features for a given term. In this paper, we propose a term-based personalization approach to finding the best features for each term. First, features are given as the input, so that we can focus on selection strategies. Second, the importance of each feature subset to a given term is evaluated by a term-feature probabilistic relevance model. Because evaluating all possible feature subsets is computationally intensive, we present a feature-searching method to generate candidate feature subsets for each term. Finally, we obtain the personalized feature set for each term as a subset of all features. Experiments conducted on the NICTA Synthetic Nursing Handover dataset show that our approach is promising and effective.
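The per-term search the abstract describes (grow a candidate feature subset while a term-specific relevance score keeps improving, instead of exhausting all subsets) can be sketched greedily. Everything here is hypothetical: the utility table, the size penalty, and the term and feature names are invented for illustration and are not the paper's relevance model.

```python
# Hypothetical per-term feature utilities; a real system would derive
# these from a probabilistic relevance model over handover records.
utilities = {"blood_pressure": {"numeric": 0.9, "unit": 0.6, "negation": -0.2}}

def relevance(term, subset):
    """Toy term-feature relevance: summed utility minus a size penalty."""
    return sum(utilities[term][f] for f in subset) - 0.1 * len(subset)

def personalize(term, features, relevance, k):
    """Greedy per-term search: grow the subset while relevance improves."""
    chosen, best = [], float("-inf")
    remaining = list(features)
    while remaining:
        cand = max(remaining, key=lambda f: relevance(term, chosen + [f]))
        score = relevance(term, chosen + [cand])
        if score <= best or len(chosen) >= k:
            break
        chosen.append(cand)
        best = score
        remaining.remove(cand)
    return chosen

picked = personalize("blood_pressure", ["numeric", "unit", "negation"],
                     relevance, 3)   # ["numeric", "unit"]
```

The point of the greedy stop is that the "negation" feature, whose utility for this term is negative, is never admitted even though the budget k would allow a third feature.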
38. Ordozgoiti B, Mozo A, López de Lacalle JG. Regularized greedy column subset selection. Inf Sci (N Y) 2019. [DOI: 10.1016/j.ins.2019.02.039]
39.
40. Zhang Y, Zhou Y, Zhang D, Song W. A Stroke Risk Detection: Improving Hybrid Feature Selection Method. J Med Internet Res 2019; 21:e12437. [PMID: 30938684] [PMCID: PMC6466481] [DOI: 10.2196/12437]
Abstract
Background: Stroke is one of the most common causes of mortality. Detecting an individual's risk of stroke is critical yet challenging because of the large number of risk factors. Objective: This study aimed to address the ineffective feature selection in existing research on stroke risk detection. We propose a new feature selection method, weighting- and ranking-based hybrid feature selection (WRHFS), to select important risk factors for detecting ischemic stroke. Methods: WRHFS integrates the strengths of various filter algorithms by following the principle of a wrapper approach. We employed a variety of filter-based feature selection models as the candidate set, including standard deviation, Pearson correlation coefficient, Fisher score, information gain, the Relief algorithm, and the chi-square test, and used sensitivity, specificity, accuracy, and the Youden index as performance metrics to evaluate the proposed method. Results: This study chose 792 samples from the electronic records of 13,421 patients in a community hospital. Each sample included 28 features (24 blood-test features and 4 demographic features). The proposed method selected 9 important features out of the original 28, with a cumulative contribution of 0.51, and significantly outperformed baseline methods. Using only these top 9 features, WRHFS achieved a sensitivity of 82.7% (329/398), specificity of 80.4% (317/394), classification accuracy of 81.5% (645/792), and Youden index of 0.63. We also present a chart for visualizing the risk of ischemic stroke. Conclusions: This study proposes, develops, and evaluates a new feature selection method for identifying the most important features for building effective and parsimonious models for stroke risk detection. The findings provide several novel research contributions and practical implications.
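The weighting-and-ranking idea behind WRHFS can be sketched as rank aggregation across filters. This is a minimal illustration under assumptions: only two filters with equal, fixed weights, whereas the paper combines six filters and evaluates candidates wrapper-style; the scores below are invented toy values.

```python
def ranks(scores):
    """Convert filter scores to ranks (best score gets rank 1)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    out = [0] * len(scores)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out

def wrhfs_select(filter_scores, weights, k):
    """Weighted rank aggregation across filters; lower total rank wins."""
    per_feature = zip(*(ranks(s) for s in filter_scores))
    total = [sum(w * r for w, r in zip(weights, fr)) for fr in per_feature]
    return sorted(range(len(total)), key=lambda i: total[i])[:k]

# Toy run: two filters score three features; feature 0 ranks first
# under both filters, feature 2 second, feature 1 last.
std_scores = [0.9, 0.1, 0.5]    # e.g. a standard-deviation filter
corr_scores = [0.8, 0.2, 0.6]   # e.g. a correlation filter
top = wrhfs_select([std_scores, corr_scores], [0.5, 0.5], 2)  # [0, 2]
```

Working on ranks rather than raw scores is what lets heterogeneous filters (standard deviation, chi-square, information gain, and so on) be combined without putting their incommensurable score scales on a common footing first.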
Affiliation(s)
- Yonglai Zhang
- Medical Big Data Institute, Software School, North University of China, Taiyuan, China
- Yaojian Zhou
- Medical Big Data Institute, Software School, North University of China, Taiyuan, China
- Dongsong Zhang
- Department of Business Information Systems and Operations Research, Belk School of Business, University of North Carolina, Charlotte, NC, United States
- Wenai Song
- Medical Big Data Institute, Software School, North University of China, Taiyuan, China
41.
42. Lin Q, Xue Y, Wen J, Zhong P. A sharing multi-view feature selection method via Alternating Direction Method of Multipliers. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.12.043]
43. Shi C, Duan C, Gu Z, Tian Q, An G, Zhao R. Semi-supervised feature selection analysis with structured multi-view sparse regularization. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.10.027]
44.
45.
46.
47.
48. Constructing effective personalized policies using counterfactual inference from biased data sets with many features. Mach Learn 2018. [DOI: 10.1007/s10994-018-5768-3]
49. Zhao M, Lin M, Chiu B, Zhang Z, Tang XS. Trace Ratio Criterion based Discriminative Feature Selection via l2,p-norm regularization for supervised learning. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.08.040]
50. Sheikhpour R, Sarram MA, Sheikhpour E. Semi-supervised sparse feature selection via graph Laplacian based scatter matrix for regression problems. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2018.08.035]