1
|
Yang X, Che H, Leung MF, Wen S. Self-paced regularized adaptive multi-view unsupervised feature selection. Neural Netw 2024; 175:106295. [PMID: 38614023 DOI: 10.1016/j.neunet.2024.106295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 03/14/2024] [Accepted: 04/05/2024] [Indexed: 04/15/2024]
Abstract
Multi-view unsupervised feature selection (MUFS) is an efficient approach for dimensionality reduction of heterogeneous data. However, existing MUFS approaches mostly assign the same weight to all samples, so the diversity of samples is not utilized efficiently. Additionally, due to the presence of various regularizations, the resulting MUFS problems are often non-convex, making it difficult to find the optimal solutions. To address these issues, a novel MUFS method named Self-paced Regularized Adaptive Multi-view Unsupervised Feature Selection (SPAMUFS) is proposed. Specifically, the proposed approach first trains the MUFS model with simple samples, and gradually learns complex samples by using a self-paced regularizer. l2,p-norm (0
Affiliation(s)
- Xuanhao Yang
  - College of Electronic and Information Engineering, Southwest University, Chongqing, 400715, China.
- Hangjun Che
  - College of Electronic and Information Engineering, Southwest University, Chongqing, 400715, China; Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, Chongqing, 400715, China.
- Man-Fai Leung
  - School of Computing and Information Science, Faculty of Science and Engineering, Anglia Ruskin University, Cambridge, UK.
- Shiping Wen
  - Faculty of Engineering and Information Technology, Australian Artificial Intelligence Institute, University of Technology Sydney, Sydney, NSW 2007, Australia.
|
2
|
Lu L, Tan Y, Oetomo D, Mareels I, Clifton DA. Weak Monotonicity With Trend Analysis for Unsupervised Feature Evaluation. IEEE Transactions on Cybernetics 2023; 53:6883-6895. [PMID: 35500079 DOI: 10.1109/tcyb.2022.3166766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Performance in an engineering system tends to degrade over time due to a variety of wearing or ageing processes. In supervisory controlled processes there are typically many signals being monitored that may help to characterize performance degradation. It is preferable to select the least amount of information needed to obtain high-quality predictive analysis from a large amount of collected data, for which labeling is not always feasible. To this end, a novel unsupervised feature selection method, robust with respect to significant measurement disturbances, is proposed using the notion of "weak monotonicity" (WM). The robustness of this notion makes it very attractive for identifying the common trend in the collected data in the presence of measurement noise and population variation. Based on WM, a novel suitability indicator is proposed to evaluate the performance of each feature. This new indicator is then used to select the key features that contribute to the WM of a family of processes when noise and variations among processes exist. To evaluate the performance of the proposed framework of WM and suitability, a comparative study with nine other state-of-the-art unsupervised feature evaluation and selection methods is carried out on well-known benchmark datasets. The results show a promising performance of the proposed framework on unsupervised feature evaluation in the presence of measurement noise and population variations.
|
3
|
Wang R, Bian J, Nie F, Li X. Nonlinear Feature Selection Neural Network via Structured Sparse Regularization. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:9493-9505. [PMID: 36395136 DOI: 10.1109/tnnls.2022.3209716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Feature selection is an important and effective data preprocessing method, which can remove noisy and redundant features while retaining the relevant and discriminative features in high-dimensional data. In real-world applications, the relationships between data samples and their labels are usually nonlinear. However, most existing feature selection models focus on learning a linear transformation matrix, which cannot capture such a nonlinear structure in practice and will degrade the performance of downstream tasks. To address this issue, we propose a novel nonlinear feature selection method to select the most relevant and discriminative features in high-dimensional datasets. Specifically, our method learns the nonlinear structure of high-dimensional data with a neural network trained with a cross-entropy loss function, and then uses a structured sparsity norm such as the l2,p-norm to regularize the weight matrix connecting the input layer and the first hidden layer of the network, so as to learn the weight of each feature. A structured sparse weight matrix is thus obtained by conducting nonlinear learning based on a neural network with structured sparsity regularization. Then, we use the gradient descent method to solve the proposed model. Experimental results on several synthetic and real-world datasets show the effectiveness and superiority of the proposed nonlinear feature selection model.
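The structured sparsity norm mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common definition of the l2,p-norm as the p-th root of the sum of p-th powers of the row-wise l2 norms, and it assumes a weight matrix `W` of shape (n_features, n_hidden) so that each row corresponds to one input feature. The function names `l2p_norm` and `rank_features` are hypothetical and are not taken from the paper.

```python
import numpy as np


def l2p_norm(W: np.ndarray, p: float) -> float:
    """Return (sum_i ||w_i||_2^p)^(1/p) over the rows w_i of W.

    Assumed definition; for 0 < p < 1 this is a quasi-norm.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    row_norms = np.linalg.norm(W, axis=1)  # one l2 norm per feature row
    return float(np.sum(row_norms ** p) ** (1.0 / p))


def rank_features(W: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k rows of W with the largest l2 norm.

    Illustrative selection rule (an assumption): features whose rows were
    shrunk toward zero by the regularizer are treated as unimportant.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return np.argsort(-row_norms, kind="stable")[:k]


if __name__ == "__main__":
    # Toy first-layer weights: 3 features, 2 hidden units.
    W = np.array([[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]])
    print(l2p_norm(W, 1.0))
    print(rank_features(W, 2))
```

With p = 1 the expression reduces to the familiar l2,1-norm, the sum of the row norms.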
|
4
|
Karami S, Saberi-Movahed F, Tiwari P, Marttinen P, Vahdati S. Unsupervised feature selection based on variance-covariance subspace distance. Neural Netw 2023; 166:188-203. [PMID: 37499604 DOI: 10.1016/j.neunet.2023.06.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Revised: 03/04/2023] [Accepted: 06/12/2023] [Indexed: 07/29/2023]
Abstract
Subspace distance is an invaluable tool exploited in a wide range of feature selection methods. The power of subspace distance is that it can identify a representative subspace, including a group of features that can efficiently approximate the space of original features. On the other hand, employing intrinsic statistical information of data can play a significant role in a feature selection process. Nevertheless, most of the existing feature selection methods founded on the subspace distance are limited in properly fulfilling this objective. To fill this void, we propose a framework that takes into account a subspace distance called the "Variance-Covariance subspace distance". The approach gains advantages from the correlation of information included in the features of data, and thus determines all the feature subsets whose corresponding Variance-Covariance matrix has the minimum norm property. Consequently, a novel yet efficient unsupervised feature selection framework is introduced based on the Variance-Covariance distance to handle both the dimensionality reduction and subspace learning tasks. The proposed framework has the ability to exclude those features that have the least variance from the original feature set. Moreover, an efficient update algorithm is provided along with its associated convergence analysis to solve the optimization side of the proposed approach. Extensive experiments on nine benchmark datasets are also conducted to assess the performance of our method, and the results demonstrate its superiority over a variety of state-of-the-art unsupervised feature selection methods. The source code is available at https://github.com/SaeedKarami/VCSDFS.
Affiliation(s)
- Saeed Karami
  - Department of Mathematics, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 45137-66731, Iran
- Farid Saberi-Movahed
  - Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran.
- Prayag Tiwari
  - School of Information Technology, Halmstad University, Sweden; Department of Computer Science, Aalto University, Espoo, Finland.
- Pekka Marttinen
  - Department of Computer Science, Aalto University, Espoo, Finland
- Sahar Vahdati
  - Nature-Inspired Machine Intelligence Group at InfAI, Dresden, Germany
|
5
|
Li W, Chen H, Li T, Yin T, Luo C. Robust unsupervised feature selection via dual space latent representation learning and adaptive structure learning. Int J Mach Learn Cyb 2023. [DOI: 10.1007/s13042-023-01818-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2023]
|
6
|
Wu JS, Liu JX, Wu JY, Huang W. Dictionary learning for unsupervised feature selection via dual sparse regression. Appl Intell 2023. [DOI: 10.1007/s10489-023-04480-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/13/2023]
|
7
|
Shang R, Zhang W, Li Z, Wang C, Jiao L. Attribute community detection based on latent representation learning and graph regularized non-negative matrix factorization. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2022.109932] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
|
8
|
Shang R, Kong J, Wang L, Zhang W, Wang C, Li Y, Jiao L. Unsupervised feature selection via discrete spectral clustering and feature weights. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2022.10.053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
|
9
|
Dual Regularized Unsupervised Feature Selection Based on Matrix Factorization and Minimum Redundancy with application in gene selection. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
10
|
Multi-view multi-manifold learning with local and global structure preservation. Appl Intell 2022. [DOI: 10.1007/s10489-022-04101-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
11
|
Robust unsupervised feature selection via sparse and minimum-redundant subspace learning with dual regularization. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.09.074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
12
|
Semi-supervised feature selection based on pairwise constraint-guided dual space latent representation learning and double sparse graphs discriminant. Appl Intell 2022. [DOI: 10.1007/s10489-022-04040-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
13
|
Compression and reinforce variation with convolutional neural networks for hyperspectral image classification. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
15
|
Saberi-Movahed F, Mohammadifard M, Mehrpooya A, Rezaei-Ravari M, Berahmand K, Rostami M, Karami S, Najafzadeh M, Hajinezhad D, Jamshidi M, Abedi F, Mohammadifard M, Farbod E, Safavi F, Dorvash M, Mottaghi-Dastjerdi N, Vahedi S, Eftekhari M, Saberi-Movahed F, Alinejad-Rokny H, Band SS, Tavassoly I. Decoding clinical biomarker space of COVID-19: Exploring matrix factorization-based feature selection methods. Comput Biol Med 2022; 146:105426. [PMID: 35569336 PMCID: PMC8979841 DOI: 10.1016/j.compbiomed.2022.105426] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 12/04/2021] [Revised: 03/01/2022] [Accepted: 03/18/2022] [Indexed: 02/06/2023]
Abstract
One of the most critical challenges in managing complex diseases like COVID-19 is to establish an intelligent triage system that can optimize the clinical decision-making at the time of a global pandemic. The clinical presentation and patients' characteristics are usually utilized to identify those patients who need more critical care. However, the clinical evidence shows an unmet need to determine more accurate and optimal clinical biomarkers to triage patients under a condition like the COVID-19 crisis. Here we have presented a machine learning approach to find a group of clinical indicators from the blood tests of a set of COVID-19 patients that are predictive of poor prognosis and morbidity. Our approach consists of two interconnected schemes: Feature Selection and Prognosis Classification. The former is based on different Matrix Factorization (MF)-based methods, and the latter is performed using Random Forest algorithm. Our model reveals that Arterial Blood Gas (ABG) O2 Saturation and C-Reactive Protein (CRP) are the most important clinical biomarkers determining the poor prognosis in these patients. Our approach paves the path of building quantitative and optimized clinical management systems for COVID-19 and similar diseases.
Affiliation(s)
- Adel Mehrpooya
  - School of Mathematical Sciences, Science and Engineering Faculty, Queensland University of Technology (QUT), Brisbane, Australia
- Kamal Berahmand
  - School of Computer Science, Faculty of Science, Queensland University of Technology (QUT), Brisbane, Australia
- Mehrdad Rostami
  - Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Oulu, Finland
- Saeed Karami
  - Department of Mathematics, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 45137-66731, Iran
- Mohammad Najafzadeh
  - Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran
- Mina Jamshidi
  - Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran
- Farshid Abedi
  - Infectious Diseases Research Center, Birjand University of Medical Sciences, Birjand, Iran
- Elnaz Farbod
  - Baruch College, City University of New York, New York, USA
- Farinaz Safavi
  - Neuroimmunology and Neurovirology Branch, National Institute of Neurological Disorders and Stroke, National Institute of Health, Bethesda, MD, USA
- Mohammadreza Dorvash
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Viewbank, VIC, Australia
- Negar Mottaghi-Dastjerdi
  - Department of Pharmacognosy and Pharmaceutical Biotechnology, School of Pharmacy, Iran University of Medical Sciences, Tehran, Iran
- Mahdi Eftekhari
  - Department of Computer Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
- Farid Saberi-Movahed
  - Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran (corresponding author)
- Hamid Alinejad-Rokny
  - BioMedical Machine Learning Lab, The Graduate School of Biomedical Engineering, UNSW Sydney, Sydney, NSW, 2052, Australia
- Shahab S. Band
  - Future Technology Research Center, College of Future, National Yunlin University of Science and Technology, 123 University Road, Section 3, Douliou, Yunlin, 64002, Taiwan
- Iman Tavassoly
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA (corresponding author)
|
16
|
Gong X, Yu L, Wang J, Zhang K, Bai X, Pal NR. Unsupervised feature selection via adaptive autoencoder with redundancy control. Neural Netw 2022; 150:87-101. [DOI: 10.1016/j.neunet.2022.03.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2021] [Revised: 01/21/2022] [Accepted: 03/03/2022] [Indexed: 10/18/2022]
|
17
|
Zhang H, Gong M, Nie F, Li X. Unified Dual-label Semi-supervised Learning with Top-k Feature Selection. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.05.090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
18
|
Liu P, Luo J, Chen X. miRCom: Tensor Completion Integrating Multi-View Information to Deduce the Potential Disease-Related miRNA-miRNA Pairs. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1747-1759. [PMID: 33180730 DOI: 10.1109/tcbb.2020.3037331] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/11/2023]
Abstract
MicroRNAs (miRNAs) are consistently capable of regulating gene expression synergistically in a combination mode and play a key role in various biological processes associated with the initiation and development of human diseases, which indicates that comprehending the synergistic molecular mechanism of miRNAs may facilitate understanding the pathogenesis of diseases or even overcoming them. However, most existing computational methods have an incomplete understanding of the miRNA synergistic effect on the pathogenesis of complex diseases, or are hard to extend to a large-scale prediction task of miRNA synergistic combinations for different diseases. In this article, we propose a novel tensor completion framework integrating multi-view miRNA and disease information, called miRCom, for the discovery of potential disease-associated miRNA-miRNA pairs. We first construct an incomplete three-order association tensor and several types of similarity matrices based on existing biological knowledge. Then, we formulate an objective function by performing the factorizations of the coupled tensor and matrices simultaneously. Finally, we build an optimization scheme by adopting the ADMM algorithm. After that, we obtain the prediction of miRNA-miRNA pairs for different diseases from the full tensor. Comparative experiments against other approaches verified that miRCom effectively identifies the potential disease-related miRNA-miRNA pairs. Moreover, case study results further illustrated that miRNA-miRNA pairs have more biological significance and prognostic value than single miRNAs.
|
19
|
Sparse and low-dimensional representation with maximum entropy adaptive graph for feature selection. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.02.038] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/23/2022]
|
20
|
Wang J, Wang LH, Liu JX, Kong XZ, Li SJ. Multi-view Random-walk Graph regularization Low-Rank Representation for cancer clustering and Differentially Expressed Gene Selection. IEEE J Biomed Health Inform 2022; 26:3578-3589. [PMID: 35157604 DOI: 10.1109/jbhi.2022.3151333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Cancer genome data generally consist of multiple views from different sources. These views provide different levels of information about gene activity, as well as more comprehensive cancer information. The low-rank representation (LRR) method, as a powerful subspace clustering method, has been extended and applied in cancer data research. However, most methods based on low-rank representation only study single-view data in cancer genome data, such as gene expression data. Methods based on single-view genome data usually ignore the complementary relationship between the views, which is not conducive to further study of cancer. Therefore, this paper proposes a new method named Multi-view Random-walk Graph regularization Low-Rank Representation (MRGLRR) to comprehensively analyze multi-view genomics data. This method uses a multi-view model to find the common centroid of the views. By constructing a joint affinity matrix to learn the low-rank subspace representation of multiple sets of data, the hidden information of each view is fully obtained. In addition, this method introduces a random walk graph regularization constraint to obtain more accurate similarity between samples. Unlike the traditional graph regularization constraint, after constructing the KNN graph, we use the random walk algorithm to obtain the weight matrix. The random walk algorithm can retain more local geometric information and better learn the topological structure of the data. Moreover, a feature gene selection strategy suitable for the multi-view model is proposed to find more differentially expressed genes with research value. Experimental results show that our method is better than other representative methods in terms of clustering and feature gene selection for cancer multi-omics data.
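The step of deriving a weight matrix from a KNN graph by a random walk can be illustrated generically. The sketch below is not the MRGLRR algorithm: it assumes a symmetrized binary KNN graph under Euclidean distance and takes the t-step transition probabilities of a simple random walk as the weights. The names `knn_adjacency` and `random_walk_weights`, and the parameters `k` and `steps`, are hypothetical.

```python
import numpy as np


def knn_adjacency(X: np.ndarray, k: int) -> np.ndarray:
    """Symmetric binary KNN adjacency matrix under Euclidean distance."""
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(A, A.T)  # keep an edge if either endpoint chose it


def random_walk_weights(A: np.ndarray, steps: int = 2) -> np.ndarray:
    """t-step transition probabilities of a simple random walk on A."""
    P = A / A.sum(axis=1, keepdims=True)  # row-stochastic one-step matrix
    return np.linalg.matrix_power(P, steps)


if __name__ == "__main__":
    # Two well-separated pairs of points.
    X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
    W = random_walk_weights(knn_adjacency(X, k=1), steps=2)
    print(W)
```

Because multi-step walks stay mostly inside densely connected neighborhoods, weights of this kind reflect local graph structure rather than a single pairwise distance, which is the intuition the abstract appeals to.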
|
21
|
Lu H, Chen H, Li T, Chen H, Luo C. Multi-label feature selection based on manifold regularization and imbalance ratio. Appl Intell 2022. [DOI: 10.1007/s10489-021-03141-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
22
|
Mehrpooya A, Saberi-Movahed F, Azizizadeh N, Rezaei-Ravari M, Saberi-Movahed F, Eftekhari M, Tavassoly I. High dimensionality reduction by matrix factorization for systems pharmacology. Brief Bioinform 2022; 23:bbab410. [PMID: 34891155 PMCID: PMC8898012 DOI: 10.1093/bib/bbab410] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/01/2021] [Revised: 08/20/2021] [Accepted: 09/07/2021] [Indexed: 12/13/2022] Open
Abstract
The extraction of predictive features from complex high-dimensional multi-omic data is necessary for decoding and overcoming the therapeutic responses in systems pharmacology. Developing computational methods to reduce the high-dimensional space of features in in vitro, in vivo and clinical data is essential to discover the evolution and mechanisms of drug responses and drug resistance. In this paper, we have utilized matrix factorization (MF) as a modality for high dimensionality reduction in systems pharmacology. In this respect, we have proposed three novel feature selection methods using the mathematical conception of a basis for features. We have applied these techniques as well as three other MF methods to analyze eight different gene expression datasets to investigate and compare their performance for feature selection. Our results show that these methods are capable of reducing the feature spaces and finding predictive features in terms of phenotype determination. The three proposed techniques outperform the other methods used and can extract a 2-gene signature predictive of a tyrosine kinase inhibitor treatment response in the Cancer Cell Line Encyclopedia.
Affiliation(s)
- Adel Mehrpooya
  - School of Mathematical Sciences, Science and Engineering Faculty, Queensland University of Technology (QUT), Brisbane, Australia
  - Department of Computer Engineering, Faculty of Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
- Farid Saberi-Movahed
  - Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran
- Najmeh Azizizadeh
  - Department of Applied Mathematics, Faculty of Mathematics and Computer, Shahid Bahonar University of Kerman, Iran
- Mohammad Rezaei-Ravari
  - Department of Computer Engineering, Faculty of Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
- Mahdi Eftekhari
  - Department of Computer Engineering, Faculty of Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
- Iman Tavassoly
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
|
24
|
Symmetric positive definite manifold learning and its application in fault diagnosis. Neural Netw 2021; 147:163-174. [PMID: 35038622 DOI: 10.1016/j.neunet.2021.12.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2020] [Revised: 10/07/2021] [Accepted: 12/20/2021] [Indexed: 11/24/2022]
Abstract
Locally linear embedding (LLE) is an effective tool to extract the significant features from a dataset. However, most of the relevant existing algorithms assume that the original dataset resides in a Euclidean space, whereas nearly all original data spaces are non-Euclidean. In addition, the original LLE does not use the discriminant information of the dataset, which degrades its performance in feature extraction. To address these problems of the conventional LLE, we first employ the original dataset to construct a symmetric positive definite manifold, and then estimate the tangent space of this manifold. Furthermore, the local and global discriminant information are integrated into the LLE, and the improved LLE is operated in the tangent space to extract the important features. We use the Iris dataset to analyze the capability of the proposed method to extract features. Finally, several experiments are performed on five machinery datasets, and experimental results indicate that our proposed method can extract excellent low-dimensional representations of the original dataset. Compared with the state-of-the-art methods, the proposed algorithm shows a strong capability for fault diagnosis.
|
25
|
Zheng X, Zhang C. Gene selection for microarray data classification via dual latent representation learning. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.07.047] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/26/2022]
|
26
|
Saberi-Movahed F, Mohammadifard M, Mehrpooya A, Rezaei-Ravari M, Berahmand K, Rostami M, Karami S, Najafzadeh M, Hajinezhad D, Jamshidi M, Abedi F, Mohammadifard M, Farbod E, Safavi F, Dorvash M, Vahedi S, Eftekhari M, Saberi-Movahed F, Tavassoly I. Decoding Clinical Biomarker Space of COVID-19: Exploring Matrix Factorization-based Feature Selection Methods. medRxiv: The Preprint Server for Health Sciences 2021:2021.07.07.21259699. [PMID: 34268522 PMCID: PMC8282111 DOI: 10.1101/2021.07.07.21259699] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/16/2022]
Abstract
One of the most critical challenges in managing complex diseases like COVID-19 is to establish an intelligent triage system that can optimize the clinical decision-making at the time of a global pandemic. The clinical presentation and patients’ characteristics are usually utilized to identify those patients who need more critical care. However, the clinical evidence shows an unmet need to determine more accurate and optimal clinical biomarkers to triage patients under a condition like the COVID-19 crisis. Here we have presented a machine learning approach to find a group of clinical indicators from the blood tests of a set of COVID-19 patients that are predictive of poor prognosis and morbidity. Our approach consists of two interconnected schemes: Feature Selection and Prognosis Classification. The former is based on different Matrix Factorization (MF)-based methods, and the latter is performed using Random Forest algorithm. Our model reveals that Arterial Blood Gas (ABG) O2 Saturation and C-Reactive Protein (CRP) are the most important clinical biomarkers determining the poor prognosis in these patients. Our approach paves the path of building quantitative and optimized clinical management systems for COVID-19 and similar diseases.
Affiliation(s)
- Adel Mehrpooya
  - School of Mathematical Sciences, Science and Engineering Faculty, Queensland University of Technology (QUT), Brisbane, Australia
- Kamal Berahmand
  - School of Computer Sciences, Science and Engineering Faculty, Queensland University of Technology (QUT), Brisbane, Australia
- Saeed Karami
  - Department of Mathematics, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 45137-66731, Iran
- Mohammad Najafzadeh
  - Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran
- Mina Jamshidi
  - Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran
- Farshid Abedi
  - Infectious Diseases Research Center, Birjand University of Medical Sciences, Birjand, Iran
- Elnaz Farbod
  - Baruch College, City University of New York, New York, USA
- Farinaz Safavi
  - Neuroimmunology and Neurovirology Branch, National Institute of Neurological Disorders and Stroke, National Institute of Health, Bethesda, Maryland, USA
- Mohammadreza Dorvash
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Viewbank, VIC, Australia
- Mahdi Eftekhari
  - Department of Computer Engineering, University of Kerman, Kerman, Iran
- Farid Saberi-Movahed
  - Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran
- Iman Tavassoly
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029
|
27
|
Fu S, Liu W, Zhang K, Zhou Y, Tao D. Semi-supervised classification by graph p-Laplacian convolutional networks. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.01.075] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 11/30/2022]
|
28
|
Jiang Y, Luo Q, Wei Y, Abualigah L, Zhou Y. An efficient binary Gradient-based optimizer for feature selection. Mathematical Biosciences and Engineering: MBE 2021; 18:3813-3854. [PMID: 34198414 DOI: 10.3934/mbe.2021192] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 06/13/2023]
Abstract
Feature selection (FS) is a classic and challenging optimization task in the field of machine learning and data mining. Gradient-based optimizer (GBO) is a recently developed population-based metaheuristic inspired by the gradient-based Newton's method. It uses two main operators, the gradient search rule (GSR) and the local escape operator (LEO), together with a set of vectors to explore the search space for solving continuous problems. This article presents a binary GBO (BGBO) algorithm for feature selection problems. Eight independent GBO variants are proposed, and eight transfer functions divided into two families, S-shaped and V-shaped, are evaluated to map the continuous search space to a discrete search space. To verify the performance of the proposed binary GBO algorithm, 18 well-known UCI datasets and 10 high-dimensional datasets are tested and compared with other advanced FS methods. The experimental results show that among the proposed binary GBO algorithms has the best comprehensive performance and has better performance than other well known metaheuristic algorithms in terms of the performance measures.
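The role of S-shaped and V-shaped transfer functions can be illustrated with one representative of each family. The sketch below uses the logistic sigmoid and |tanh(x)|, which are commonly used examples of the two families; whether these exact functions are among the eight evaluated in the paper is an assumption, and all function names are hypothetical. Under the usual convention, an S-shaped value is read as the probability that a bit is set to 1, while a V-shaped value is read as the probability that the current bit is flipped.

```python
import numpy as np


def s_shaped(x: np.ndarray) -> np.ndarray:
    """Logistic sigmoid, a typical S-shaped transfer function."""
    return 1.0 / (1.0 + np.exp(-x))


def v_shaped(x: np.ndarray) -> np.ndarray:
    """|tanh(x)|, a typical V-shaped transfer function."""
    return np.abs(np.tanh(x))


def binarize_s(x: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Set each bit to 1 with probability s_shaped(x)."""
    return (rng.random(x.shape) < s_shaped(x)).astype(int)


def binarize_v(x: np.ndarray, bits: np.ndarray,
               rng: np.random.Generator) -> np.ndarray:
    """Flip each current bit with probability v_shaped(x)."""
    flip = rng.random(x.shape) < v_shaped(x)
    return np.where(flip, 1 - bits, bits)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    position = np.array([-2.0, 0.0, 0.5, 3.0])  # continuous search position
    mask = binarize_s(position, rng)            # 1 = feature selected
    print(mask)
    print(binarize_v(position, mask, rng))
```

The resulting 0/1 vector acts as a feature mask: a value of 1 keeps the corresponding feature in the candidate subset.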
Affiliation(s)
- Yugui Jiang
  - College of Artificial Intelligence, Guangxi University for Nationalities, Nanning 530006, China
  - Guangxi Key Laboratories of Hybrid Computation and IC Design Analysis, Nanning 530006, China
- Qifang Luo
  - College of Artificial Intelligence, Guangxi University for Nationalities, Nanning 530006, China
  - Guangxi Key Laboratories of Hybrid Computation and IC Design Analysis, Nanning 530006, China
- Yuanfei Wei
  - Xiangsihu College of Guangxi University for Nationalities, Nanning, Guangxi 532100, China
- Laith Abualigah
  - Faculty of Computer Sciences and Informatics, Amman Arab University, Amman 11953, Jordan
- Yongquan Zhou
  - College of Artificial Intelligence, Guangxi University for Nationalities, Nanning 530006, China
  - Guangxi Key Laboratories of Hybrid Computation and IC Design Analysis, Nanning 530006, China
|
29
|
Jing P, Su Y, Li Z, Nie L. Learning robust affinity graph representation for multi-view clustering. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2020.06.068] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 10/23/2022]
|
30
|
Zhang Z, Chen L, Xu P, Xing L, Hong Y, Chen P. Gene correlation network analysis to identify regulatory factors in sepsis. J Transl Med 2020; 18:381. [PMID: 33032623 PMCID: PMC7545567 DOI: 10.1186/s12967-020-02561-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 07/26/2020] [Accepted: 10/03/2020] [Indexed: 02/07/2023]
Abstract
BACKGROUND AND OBJECTIVES: Sepsis is a leading cause of mortality and morbidity in the intensive care unit. The regulatory mechanisms underlying disease progression and prognosis are largely unknown. The study aimed to identify master regulators of mortality-related modules, providing potential therapeutic targets for further translational experiments. METHODS: The dataset GSE65682 from the Gene Expression Omnibus (GEO) database was used for bioinformatic analysis. Consensus weighted gene co-expression network analysis (WGCNA) was performed to identify modules of sepsis. The modules most significantly associated with mortality were further analyzed to identify master regulators among transcription factors and miRNAs. RESULTS: A total of 682 subjects with various causes of sepsis were included in the consensus WGCNA, which identified 27 modules. The network was well preserved across different causes of sepsis. Two modules, designated black and light yellow, were found to be associated with mortality outcome. The key regulators of the black and light yellow modules were the transcription factors CEBPB (normalized enrichment score [NES] = 5.53) and ETV6 (NES = 6), respectively. The five miRNAs regulating the largest number of genes were hsa-miR-335-5p (n = 59), hsa-miR-26b-5p (n = 57), hsa-miR-16-5p (n = 44), hsa-miR-17-5p (n = 42), and hsa-miR-124-3p (n = 38). Clustering analysis in a two-dimensional space derived from manifold learning identified two subclasses of sepsis, which showed a significant association with survival in a Cox proportional hazards model (p = 0.018). CONCLUSIONS: The present study showed that the black and light yellow modules were significantly associated with mortality outcome. Master regulators of the modules included the transcription factors CEBPB and ETV6. Analysis of miRNA-target interactions identified significantly enriched miRNAs.
Affiliation(s)
- Zhongheng Zhang
  - Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, No 3, East Qingchun Road, Hangzhou, 310016, Zhejiang Province, China
- Lin Chen
  - Department of Critical Care Medicine, Affiliated Jinhua Hospital, Zhejiang University School of Medicine, Jinhua, China
- Ping Xu
  - Emergency Department, Zigong Fourth People’s Hospital, 19 Tanmulin Road, Zigong, Sichuan, China
- Lifeng Xing
  - Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, No 3, East Qingchun Road, Hangzhou, 310016, Zhejiang Province, China
- Yucai Hong
  - Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, No 3, East Qingchun Road, Hangzhou, 310016, Zhejiang Province, China
- Pengpeng Chen
  - Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, No 3, East Qingchun Road, Hangzhou, 310016, Zhejiang Province, China
|
31
|
|
32
|
Kang Z, Lu X, Lu Y, Peng C, Chen W, Xu Z. Structure learning with similarity preserving. Neural Netw 2020; 129:138-148. [DOI: 10.1016/j.neunet.2020.05.030] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/29/2019] [Revised: 02/15/2020] [Accepted: 05/26/2020] [Indexed: 02/07/2023]
|
33
|
Hu J, Li Y, Gao W, Zhang P. Robust multi-label feature selection with dual-graph regularization. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106126] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 02/06/2023]
|
34
|
Kang Z, Lu X, Liang J, Bai K, Xu Z. Relation-Guided Representation Learning. Neural Netw 2020; 131:93-102. [PMID: 32763763 DOI: 10.1016/j.neunet.2020.07.014] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/15/2020] [Revised: 06/12/2020] [Accepted: 07/10/2020] [Indexed: 11/20/2022]
Abstract
Deep auto-encoders (DAEs) have achieved great success in learning data representations thanks to the representational power of neural networks. However, most DAEs focus only on the most dominant structures, those able to reconstruct the data from a latent space, and neglect rich latent structural information. In this work, we propose a new representation learning method that explicitly models and leverages sample relations, which in turn are used as supervision to guide the representation learning. Unlike previous work, our framework preserves the relations between samples well. Since predicting pairwise relations is itself a fundamental problem, our model adaptively learns them from data. This provides considerable flexibility in encoding the real data manifold. The important role of relation and representation learning is evaluated on the clustering task. Extensive experiments on benchmark data sets demonstrate the superiority of our approach. By seeking to embed samples into a subspace, we further show that our method can address the large-scale and out-of-sample problems. Our source code is publicly available at: https://github.com/nbShawnLu/RGRL.
Affiliation(s)
- Zhao Kang
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Sichuan, China; Trusted Cloud Computing and Big Data Key Laboratory of Sichuan Province, China
- Xiao Lu
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Sichuan, China
- Jian Liang
  - Cloud and Smart Industries Group, Tencent, Beijing, China
- Kun Bai
  - Cloud and Smart Industries Group, Tencent, Beijing, China
- Zenglin Xu
  - School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China; Center for Artificial Intelligence, Peng Cheng Lab, Shenzhen, China
|
35
|
Unsupervised feature selection via adaptive hypergraph regularized latent representation learning. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.10.018] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 10/25/2022]
|