1
|
Saberi-Movahed F, Mohammadifard M, Mehrpooya A, Rezaei-Ravari M, Berahmand K, Rostami M, Karami S, Najafzadeh M, Hajinezhad D, Jamshidi M, Abedi F, Mohammadifard M, Farbod E, Safavi F, Dorvash M, Mottaghi-Dastjerdi N, Vahedi S, Eftekhari M, Saberi-Movahed F, Alinejad-Rokny H, Band SS, Tavassoly I. Decoding clinical biomarker space of COVID-19: Exploring matrix factorization-based feature selection methods. Comput Biol Med 2022; 146:105426. [PMID: 35569336 PMCID: PMC8979841 DOI: 10.1016/j.compbiomed.2022.105426] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2021] [Revised: 03/01/2022] [Accepted: 03/18/2022] [Indexed: 02/06/2023]
Abstract
One of the most critical challenges in managing complex diseases like COVID-19 is to establish an intelligent triage system that can optimize the clinical decision-making at the time of a global pandemic. The clinical presentation and patients' characteristics are usually utilized to identify those patients who need more critical care. However, the clinical evidence shows an unmet need to determine more accurate and optimal clinical biomarkers to triage patients under a condition like the COVID-19 crisis. Here we have presented a machine learning approach to find a group of clinical indicators from the blood tests of a set of COVID-19 patients that are predictive of poor prognosis and morbidity. Our approach consists of two interconnected schemes: Feature Selection and Prognosis Classification. The former is based on different Matrix Factorization (MF)-based methods, and the latter is performed using Random Forest algorithm. Our model reveals that Arterial Blood Gas (ABG) O2 Saturation and C-Reactive Protein (CRP) are the most important clinical biomarkers determining the poor prognosis in these patients. Our approach paves the path of building quantitative and optimized clinical management systems for COVID-19 and similar diseases.
Collapse
Affiliation(s)
| | | | - Adel Mehrpooya
- School of Mathematical Sciences, Science and Engineering Faculty, Queensland University of Technology (QUT), Brisbane, Australia
| | | | - Kamal Berahmand
- School of Computer Science, Faculty of Science, Queensland University of Technology (QUT), Brisbane, Australia
| | - Mehrdad Rostami
- Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Oulu, Finland
| | - Saeed Karami
- Department of Mathematics, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 45137-66731, Iran
| | - Mohammad Najafzadeh
- Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran
| | | | - Mina Jamshidi
- Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran
| | - Farshid Abedi
- Infectious Diseases Research Center, Birjand University of Medical Sciences, Birjand, Iran
| | | | - Elnaz Farbod
- Baruch College, City University of New York, New York, USA
| | - Farinaz Safavi
- Neuroimmunology and Neurovirology Branch, National Institute of Neurological Disorders and Stroke, National Institute of Health, Bethesda, MD, USA
| | - Mohammadreza Dorvash
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Viewbank, VIC, Australia
| | - Negar Mottaghi-Dastjerdi
- Department of Pharmacognosy and Pharmaceutical Biotechnology, School of Pharmacy, Iran University of Medical Sciences, Tehran, Iran
| | | | - Mahdi Eftekhari
- Department of Computer Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
| | - Farid Saberi-Movahed
- Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran,Corresponding author
| | - Hamid Alinejad-Rokny
- BioMedical Machine Learning Lab, The Graduate School of Biomedical Engineering, UNSW Sydney, Sydney, NSW, 2052, Australia
| | - Shahab S. Band
- Future Technology Research Center, College of Future, National Yunlin University of Science and Technology, 123 University Road, Section 3, Douliou, Yunlin, 64002, Taiwan
| | - Iman Tavassoly
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY10029, USA,Corresponding author
| |
Collapse
|
2
|
Chen K, Che H, Li X, Leung MF. Graph non-negative matrix factorization with alternative smoothed $$L_0$$ regularizations. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07200-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
3
|
Zhang LX, Yan H, Liu Y, Xu J, Song J, Yu DJ. Enhancing Characteristic Gene Selection and Tumor Classification by the Robust Laplacian Supervised Discriminative Sparse PCA. J Chem Inf Model 2022; 62:1794-1807. [PMID: 35353532 DOI: 10.1021/acs.jcim.1c01403] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Abstract
Characteristic gene selection and tumor classification of gene expression data play major roles in genomic research. Due to the characteristics of a small sample size and high dimensionality of gene expression data, it is a common practice to perform dimensionality reduction prior to the use of machine learning-based methods to analyze the expression data. In this context, classical principal component analysis (PCA) and its improved versions have been widely used. Recently, methods based on supervised discriminative sparse PCA have been developed to improve the performance of data dimensionality reduction. However, such methods still have limitations: most of them have not taken into consideration the improvement of robustness to outliers and noise, label information, sparsity, as well as capturing intrinsic geometrical structures in one objective function. To address this drawback, in this study, we propose a novel PCA-based method, known as the robust Laplacian supervised discriminative sparse PCA, termed RLSDSPCA, which enforces the L2,1 norm on the error function and incorporates the graph Laplacian into supervised discriminative sparse PCA. To evaluate the efficacy of the proposed RLSDSPCA, we applied it to the problems of characteristic gene selection and tumor classification problems using gene expression data. The results demonstrate that the proposed RLSDSPCA method, when used in combination with other related methods, can effectively identify new pathogenic genes associated with diseases. In addition, RLSDSPCA has also achieved the best performance compared with the state-of-the-art methods on tumor classification in terms of major performance metrics. The codes and data sets used in the study are freely available at http://csbio.njust.edu.cn/bioinf/rlsdspca/.
Collapse
Affiliation(s)
- Lu-Xing Zhang
- School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
| | - He Yan
- School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
| | - Yan Liu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
| | - Jian Xu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
| | - Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia.,Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, Victoria 3800, Australia
| | - Dong-Jun Yu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
| |
Collapse
|
4
|
Liu R, Liu L, Zhou Y. m6Adecom: analysis of m6A profile matrix based on graph regularized non-negative matrix factorization. Methods 2022; 203:322-327. [DOI: 10.1016/j.ymeth.2022.01.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2021] [Revised: 01/12/2022] [Accepted: 01/21/2022] [Indexed: 01/07/2023] Open
|
5
|
Mokhtia M, Eftekhari M, Saberi-Movahed F. Dual-manifold regularized regression models for feature selection based on hesitant fuzzy correlation. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107308] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
6
|
Saberi-Movahed F, Mohammadifard M, Mehrpooya A, Rezaei-Ravari M, Berahmand K, Rostami M, Karami S, Najafzadeh M, Hajinezhad D, Jamshidi M, Abedi F, Mohammadifard M, Farbod E, Safavi F, Dorvash M, Vahedi S, Eftekhari M, Saberi-Movahed F, Tavassoly I. Decoding Clinical Biomarker Space of COVID-19: Exploring Matrix Factorization-based Feature Selection Methods. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2021:2021.07.07.21259699. [PMID: 34268522 PMCID: PMC8282111 DOI: 10.1101/2021.07.07.21259699] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
Abstract
One of the most critical challenges in managing complex diseases like COVID-19 is to establish an intelligent triage system that can optimize the clinical decision-making at the time of a global pandemic. The clinical presentation and patients’ characteristics are usually utilized to identify those patients who need more critical care. However, the clinical evidence shows an unmet need to determine more accurate and optimal clinical biomarkers to triage patients under a condition like the COVID-19 crisis. Here we have presented a machine learning approach to find a group of clinical indicators from the blood tests of a set of COVID-19 patients that are predictive of poor prognosis and morbidity. Our approach consists of two interconnected schemes: Feature Selection and Prognosis Classification. The former is based on different Matrix Factorization (MF)-based methods, and the latter is performed using Random Forest algorithm. Our model reveals that Arterial Blood Gas (ABG) O 2 Saturation and C-Reactive Protein (CRP) are the most important clinical biomarkers determining the poor prognosis in these patients. Our approach paves the path of building quantitative and optimized clinical management systems for COVID-19 and similar diseases.
Collapse
Affiliation(s)
| | | | - Adel Mehrpooya
- School of Mathematical Sciences, Science and Engineering Faculty, Queensland University of Technology (QUT), Brisbane, Australia
| | | | - Kamal Berahmand
- School of Computer Sciences, Science and Engineering Faculty, Queensland University of Technology (QUT), Brisbane Australia
| | | | - Saeed Karami
- Department of Mathematics, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 45137-66731, Iran
| | - Mohammad Najafzadeh
- Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran
| | | | - Mina Jamshidi
- Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran
| | - Farshid Abedi
- Infectious Diseases Research Center, Birjand University of Medical Sciences, Birjand, Iran
| | | | - Elnaz Farbod
- Baruch College, City University of New York, New York, USA
| | - Farinaz Safavi
- Neuroimmunology and Neurovirology Branch, National Institute of Neurological Disorders and Stroke, National Institute of Health, Bethesda, Maryland, USA
| | - Mohammadreza Dorvash
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Viewbank, VIC, Australia
| | | | - Mahdi Eftekhari
- Department of Computer Engineering, University of Kerman, Kerman, Iran
| | - Farid Saberi-Movahed
- Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran
| | - Iman Tavassoly
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY10029
| |
Collapse
|
7
|
Wang CY, Liu JX, Yu N, Zheng CH. Sparse Graph Regularization Non-Negative Matrix Factorization Based on Huber Loss Model for Cancer Data Analysis. Front Genet 2019; 10:1054. [PMID: 31824556 PMCID: PMC6882287 DOI: 10.3389/fgene.2019.01054] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2019] [Accepted: 10/01/2019] [Indexed: 12/02/2022] Open
Abstract
Non-negative matrix factorization (NMF) is a matrix decomposition method based on the square loss function. To exploit cancer information, cancer gene expression data often uses the NMF method to reduce dimensionality. Gene expression data usually have some noise and outliers, while the original NMF loss function is very sensitive to non-Gaussian noise. To improve the robustness and clustering performance of the algorithm, we propose a sparse graph regularization NMF based on Huber loss model for cancer data analysis (Huber-SGNMF). Huber loss is a function between L1-norm and L2-norm that can effectively handle non-Gaussian noise and outliers. Taking into account the sparsity matrix and data geometry information, sparse penalty and graph regularization terms are introduced into the model to enhance matrix sparsity and capture data manifold structure. Before the experiment, we first analyzed the robustness of Huber-SGNMF and other models. Experiments on The Cancer Genome Atlas (TCGA) data have shown that Huber-SGNMF performs better than other most advanced methods in sample clustering and differentially expressed gene selection.
Collapse
Affiliation(s)
- Chuan-Yuan Wang
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
| | - Jin-Xing Liu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
- *Correspondence: Jin-Xing Liu,
| | - Na Yu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
| | - Chun-Hou Zheng
- School of Software Engineering, Qufu Normal University, Qufu, China
| |
Collapse
|
8
|
Yu N, Gao YL, Liu JX, Wang J, Shang J. Robust hypergraph regularized non-negative matrix factorization for sample clustering and feature selection in multi-view gene expression data. Hum Genomics 2019; 13:46. [PMID: 31639067 PMCID: PMC6805321 DOI: 10.1186/s40246-019-0222-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open
Abstract
Background As one of the most popular data representation methods, non-negative matrix decomposition (NMF) has been widely concerned in the tasks of clustering and feature selection. However, most of the previously proposed NMF-based methods do not adequately explore the hidden geometrical structure in the data. At the same time, noise and outliers are inevitably present in the data. Results To alleviate these problems, we present a novel NMF framework named robust hypergraph regularized non-negative matrix factorization (RHNMF). In particular, the hypergraph Laplacian regularization is imposed to capture the geometric information of original data. Unlike graph Laplacian regularization which captures the relationship between pairwise sample points, it captures the high-order relationship among more sample points. Moreover, the robustness of the RHNMF is enhanced by using the L2,1-norm constraint when estimating the residual. This is because the L2,1-norm is insensitive to noise and outliers. Conclusions Clustering and common abnormal expression gene (com-abnormal expression gene) selection are conducted to test the validity of the RHNMF model. Extensive experimental results on multi-view datasets reveal that our proposed model outperforms other state-of-the-art methods.
Collapse
Affiliation(s)
- Na Yu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China
| | - Ying-Lian Gao
- Library of Qufu Normal University, Qufu Normal University, Rizhao, 276826, China
| | - Jin-Xing Liu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China.
| | - Juan Wang
- School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China.
| | - Junliang Shang
- School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China
| |
Collapse
|
9
|
Sun X, Zhang L, Wang Z, Chang J, Yao Y, Li P, Zimmermann R. Scene Categorization Using Deeply Learned Gaze Shifting Kernel. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:2156-2167. [PMID: 29993760 DOI: 10.1109/tcyb.2018.2820731] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Accurately recognizing sophisticated sceneries from a rich variety of semantic categories is an indispensable component in many intelligent systems, e.g., scene parsing, video surveillance, and autonomous driving. Recently, there have emerged a large quantity of deep architectures for scene categorization, wherein promising performance has been achieved. However, these models cannot explicitly encode human visual perception toward different sceneries, i.e., the sequence of humans sequentially allocates their gazes. To solve this problem, we propose deep gaze shifting kernel to distinguish sceneries from different categories. Specifically, we first project regions from each scenery into the so-called perceptual space, which is established by combining color, texture, and semantic features. Then, a novel non-negative matrix factorization algorithm is developed which decomposes the regions' feature matrix into the product of the basis matrix and the sparse codes. The sparse codes indicate the saliency level of different regions. In this way, the gaze shifting path from each scenery is derived and an aggregation-based convolutional neural network is designed accordingly to learn its deep representation. Finally, the deep representations of gaze shifting paths from all the scene images are incorporated into an image kernel, which is further fed into a kernel SVM for scene categorization. Comprehensive experiments on six scenery data sets have demonstrated the superiority of our method over a series of shallow/deep recognition models. Besides, eye tracking experiments have shown that our predicted gaze shifting paths are 94.6% consistent with the real human gaze allocations.
Collapse
|
10
|
Sun X, Hu Y, Zhang L, Chen Y, Li P, Xie Z, Liu Z. Camera-Assisted Video Saliency Prediction and Its Applications. IEEE TRANSACTIONS ON CYBERNETICS 2018; 48:2520-2530. [PMID: 29990269 DOI: 10.1109/tcyb.2017.2741498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Video saliency prediction is an indispensable yet challenging technique which can facilitate various applications, such as video surveillance, autonomous driving, and realistic rendering. Based on the popularity of embedded cameras, we in this paper predict region-level saliency from videos by leveraging human gaze locations recorded using a camera, (e.g., those equipped on an iMAC and laptop PC). Our proposed camera-assisted mechanism improves saliency prediction by discovering human attended regions inside a video clip. It is orthogonal to the current saliency models, i.e., any existing video/image saliency model can be boosted by our mechanism. First of all, the spatial-and temporal-level visual features are exploited collaboratively for calculating an initial saliency map. We notice that the current saliency models are not sufficiently adaptable to the variations in lighting, different view angles, and complicated backgrounds. Therefore, assisted by a camera tracking human gaze movements, a non-negative matrix factorization algorithm is designed to accurately localize the semantically/visually salient video regions perceived by humans. Finally, the learned human gaze locations as well as the initial saliency map are integrated to optimize video saliency calculation. Empirical results thoroughly demonstrated that: 1) our approach achieves the state-of-the-art video saliency prediction accuracy by outperforming 11 mainstream algorithms considerably and 2) our method can conveniently and successfully enhance video retargeting, quality estimation, and summarization.
Collapse
|
11
|
Liu J, Cheng Y, Wang X, Cui X, Kong Y, Du J. Low Rank Subspace Clustering via Discrete Constraint and Hypergraph Regularization for Tumor Molecular Pattern Discovery. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:1500-1512. [PMID: 29993749 DOI: 10.1109/tcbb.2018.2834371] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Tumor clustering is a powerful approach for cancer class discovery which is crucial to the effective treatment of cancer. Many traditional clustering methods such as NMF-based models, have been widely used to identify tumors. However, they cannot achieve satisfactory results. Recently, subspace clustering approaches have been proposed to improve the performance by dividing the original space into multiple low-dimensional subspaces. Among them, low rank representation is becoming a popular approach to attain subspace clustering. In this paper, we propose a novel Low Rank Subspace Clustering model via Discrete Constraint and Hypergraph Regularization (DHLRS). The proposed method learns the cluster indicators directly by using discrete constraint, which makes the clustering task simple. For each subspace, we adopt Schatten -norm to better approximate the low rank constraint. Moreover, Hypergraph Regularization is adopted to infer the complex relationship between genes and intrinsic geometrical structure of gene expression data in each subspace. Finally, the molecular pattern of tumor gene expression data sets is discovered according to the optimized cluster indicators. Experiments on both synthetic data and real tumor gene expression data sets prove the effectiveness of proposed DHLRS.
Collapse
|
12
|
Liu J, Cheng Y, Wang X, Zhang L, Wang ZJ. Cancer Characteristic Gene Selection via Sample Learning Based on Deep Sparse Filtering. Sci Rep 2018; 8:8270. [PMID: 29844511 PMCID: PMC5974408 DOI: 10.1038/s41598-018-26666-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2017] [Accepted: 05/17/2018] [Indexed: 11/09/2022] Open
Abstract
Identification of characteristic genes associated with specific biological processes of different cancers could provide insights into the underlying cancer genetics and cancer prognostic assessment. It is of critical importance to select such characteristic genes effectively. In this paper, a novel unsupervised characteristic gene selection method based on sample learning and sparse filtering, Sample Learning based on Deep Sparse Filtering (SLDSF), is proposed. With sample learning, the proposed SLDSF can better represent the gene expression level by the transformed sample space. Most unsupervised characteristic gene selection methods did not consider deep structures, while a multilayer structure may learn more meaningful representations than a single layer, therefore deep sparse filtering is investigated here to implement sample learning in the proposed SLDSF. Experimental studies on several microarray and RNA-Seq datasets demonstrate that the proposed SLDSF is more effective than several representative characteristic gene selection methods (e.g., RGNMF, GNMF, RPCA and PMD) for selecting cancer characteristic genes.
Collapse
Affiliation(s)
- Jian Liu
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Yuhu Cheng
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Xuesong Wang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China.
| | - Lin Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Z Jane Wang
- Electrical and Computer Engineering Department, University of British Columbia, V6T 1Z4, Vancouver, BC, Canada
| |
Collapse
|
13
|
Liu JX, Wang D, Gao YL, Zheng CH, Xu Y, Yu J. Regularized Non-Negative Matrix Factorization for Identifying Differentially Expressed Genes and Clustering Samples: A Survey. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:974-987. [PMID: 28186906 DOI: 10.1109/tcbb.2017.2665557] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Non-negative Matrix Factorization (NMF), a classical method for dimensionality reduction, has been applied in many fields. It is based on the idea that negative numbers are physically meaningless in various data-processing tasks. Apart from its contribution to conventional data analysis, the recent overwhelming interest in NMF is due to its newly discovered ability to solve challenging data mining and machine learning problems, especially in relation to gene expression data. This survey paper mainly focuses on research examining the application of NMF to identify differentially expressed genes and to cluster samples, and the main NMF models, properties, principles, and algorithms with its various generalizations, extensions, and modifications are summarized. The experimental results demonstrate the performance of the various NMF algorithms in identifying differentially expressed genes and clustering samples.
Collapse
|
14
|
Liu JX, Wang DQ, Zheng CH, Gao YL, Wu SS, Shang JL. Identifying drug-pathway association pairs based on L 2,1-integrative penalized matrix decomposition. BMC SYSTEMS BIOLOGY 2017; 11:119. [PMID: 29297378 PMCID: PMC5770056 DOI: 10.1186/s12918-017-0480-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]
Abstract
BACKGROUND Traditional drug identification methods follow the "one drug-one target" thought. But those methods ignore the natural characters of human diseases. To overcome this limitation, many identification methods of drug-pathway association pairs have been developed, such as the integrative penalized matrix decomposition (iPaD) method. The iPaD method imposes the L1-norm penalty on the regularization term. However, lasso-type penalties have an obvious disadvantage, that is, the sparsity produced by them is too dispersive. RESULTS Therefore, to improve the performance of the iPaD method, we propose a novel method named L2,1-iPaD to identify paired drug-pathway associations. In the L2,1-iPaD model, we use the L2,1-norm penalty to replace the L1-norm penalty since the L2,1-norm penalty can produce row sparsity. CONCLUSIONS By applying the L2,1-iPaD method to the CCLE and NCI-60 datasets, we demonstrate that the performance of L2,1-iPaD method is superior to existing methods. And the proposed method can achieve better enrichment in terms of discovering validated drug-pathway association pairs than the iPaD method by performing permutation test. The results on the two real datasets prove that our method is effective.
Collapse
Affiliation(s)
- Jin-Xing Liu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
| | - Dong-Qin Wang
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
| | - Chun-Hou Zheng
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China.
| | - Ying-Lian Gao
- Library of Qufu Normal University, Qufu Normal University, Rizhao, China.
| | - Sha-Sha Wu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
| | - Jun-Liang Shang
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
| |
Collapse
|