1
Zabihian A, Asghari J, Hooshmand M, Gharaghani S. A comparative analysis of computational drug repurposing approaches: proposing a novel tensor-matrix-tensor factorization method. Mol Divers 2024:10.1007/s11030-024-10851-7. [PMID: 38683487 DOI: 10.1007/s11030-024-10851-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Efficient drug discovery relies on drug repurposing, an important and open research field. This work presents a novel factorization method and a practical comparison of different approaches for drug repurposing. First, we propose a novel tensor-matrix-tensor (TMT) formulation as a new data array method with a gradient-based factorization procedure. Second, this paper examines and contrasts four computational drug repurposing approaches: factorization-based methods, machine learning methods, deep learning methods, and graph neural networks. We test the strategies on two datasets and assess each approach's performance, drawbacks, problems, and benefits based on the results. The results demonstrate that deep learning techniques work better than the other strategies and that their results might be more reliable. Finally, graph neural methods need to operate in an inductive manner to give reliable predictions.
Affiliation(s)
- Arash Zabihian
  - Department of Bioinformatics, Kish International Campus, University of Tehran, Kish, Iran
- Javad Asghari
  - Department of Computer Science and Information Technology, Institute of Advanced Studies in Basic Sciences, Zanjan, Iran
- Mohsen Hooshmand
  - Department of Computer Science and Information Technology, Institute of Advanced Studies in Basic Sciences, Zanjan, Iran
- Sajjad Gharaghani
  - Laboratory of Bioinformatics and Drug Design, University of Tehran, Tehran, Iran
2
Qin W, Wang H, Zhang F, Ma W, Wang J, Huang T. Nonconvex Robust High-Order Tensor Completion Using Randomized Low-Rank Approximation. IEEE TRANSACTIONS ON IMAGE PROCESSING: A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2024; 33:2835-2850. [PMID: 38598373 DOI: 10.1109/tip.2024.3385284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2024]
Abstract
Within the tensor singular value decomposition (T-SVD) framework, existing robust low-rank tensor completion approaches have made great achievements in various areas of science and engineering. Nevertheless, these methods involve the T-SVD based low-rank approximation, which suffers from high computational costs when dealing with large-scale tensor data. Moreover, most of them are only applicable to third-order tensors. To address these issues, in this article, two efficient low-rank tensor approximation approaches fusing random projection techniques are first devised under the order-d (d ≥ 3) T-SVD framework. Theoretical results on error bounds for the proposed randomized algorithms are provided. On this basis, we then further investigate the robust high-order tensor completion problem, in which a double nonconvex model along with its corresponding fast optimization algorithms with convergence guarantees are developed. Experimental results on large-scale synthetic and real tensor data illustrate that the proposed method outperforms other state-of-the-art approaches in terms of both computational efficiency and estimated precision.
3
Li L, Yan S, Bakker BM, Hoefsloot H, Chawes B, Horner D, Rasmussen MA, Smilde AK, Acar E. Analyzing postprandial metabolomics data using multiway models: a simulation study. BMC Bioinformatics 2024; 25:94. [PMID: 38438850 PMCID: PMC10913623 DOI: 10.1186/s12859-024-05686-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 01/31/2024] [Indexed: 03/06/2024] Open
Abstract
BACKGROUND Analysis of time-resolved postprandial metabolomics data can improve the understanding of metabolic mechanisms, potentially revealing biomarkers for early diagnosis of metabolic diseases and advancing precision nutrition and medicine. Postprandial metabolomics measurements at several time points from multiple subjects can be arranged as a subjects by metabolites by time points array. Traditional analysis methods are limited in terms of revealing subject groups, related metabolites, and temporal patterns simultaneously from such three-way data. RESULTS We introduce an unsupervised multiway analysis approach based on the CANDECOMP/PARAFAC (CP) model for improved analysis of postprandial metabolomics data guided by a simulation study. Because of the lack of ground truth in real data, we generate simulated data using a comprehensive human metabolic model. This allows us to assess the performance of CP models in terms of revealing subject groups and underlying metabolic processes. We study three analysis approaches: analysis of fasting-state data using principal component analysis, T0-corrected data (i.e., data corrected by subtracting fasting-state data) using a CP model and full-dynamic (i.e., full postprandial) data using CP. Through extensive simulations, we demonstrate that CP models capture meaningful and stable patterns from simulated meal challenge data, revealing underlying mechanisms and differences between diseased versus healthy groups. CONCLUSIONS Our experiments show that it is crucial to analyze both fasting-state and T0-corrected data for understanding metabolic differences among subject groups. Depending on the nature of the subject group structure, the best group separation may be achieved by CP models of T0-corrected or full-dynamic data. This study introduces an improved analysis approach for postprandial metabolomics data while also shedding light on the debate about correcting baseline values in longitudinal data analysis.
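The T0 correction mentioned in this abstract (subtracting fasting-state data from the postprandial measurements) can be illustrated with a minimal sketch. It assumes the three-way array is stored as nested lists indexed `data[subject][metabolite][time]` with the fasting-state measurement at time index 0; the function name `t0_correct` and the toy values are hypothetical and are not taken from the paper.

```python
def t0_correct(data):
    """Subtract the fasting-state (time index 0) value from every time point.

    `data` is a nested list indexed as data[subject][metabolite][time].
    The layout and the choice to keep the (now zero) first time point are
    assumptions made for this illustration.
    """
    return [
        [[value - series[0] for value in series] for series in subject]
        for subject in data
    ]


# Toy array: 2 subjects x 2 metabolites x 3 time points (made-up values).
data = [
    [[5.0, 7.0, 6.0], [1.0, 1.5, 1.25]],
    [[4.0, 4.5, 5.0], [2.0, 3.0, 2.5]],
]
corrected = t0_correct(data)
print(corrected[0][0])  # change from baseline for subject 0, metabolite 0
```

A CP model would then be fitted to the corrected array; that step is omitted here because it requires a tensor decomposition library.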
Affiliation(s)
- Lu Li
  - Department of Data Science and Knowledge Discovery, Simula Metropolitan Center for Digital Engineering, Oslo, Norway
- Shi Yan
  - Department of Data Science and Knowledge Discovery, Simula Metropolitan Center for Digital Engineering, Oslo, Norway
- Barbara M Bakker
  - Laboratory of Pediatrics, Section Systems Medicine and Metabolic Signalling, Center for Liver, Digestive and Metabolic Disease, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Huub Hoefsloot
  - Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
- Bo Chawes
  - Copenhagen Prospective Studies on Asthma in Childhood (COPSAC), Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
- David Horner
  - Copenhagen Prospective Studies on Asthma in Childhood (COPSAC), Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
- Morten A Rasmussen
  - Copenhagen Prospective Studies on Asthma in Childhood (COPSAC), Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
  - Department of Food Science, University of Copenhagen, Copenhagen, Denmark
- Age K Smilde
  - Department of Data Science and Knowledge Discovery, Simula Metropolitan Center for Digital Engineering, Oslo, Norway
  - Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
- Evrim Acar
  - Department of Data Science and Knowledge Discovery, Simula Metropolitan Center for Digital Engineering, Oslo, Norway
4
Guo L, Yu H, Li Y, Zhang C, Kharbach M. Tensor methods in data analysis of chromatography/mass spectroscopy-based plant metabolomics. PLANT METHODS 2023; 19:130. [PMID: 37990220 PMCID: PMC10662285 DOI: 10.1186/s13007-023-01105-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 11/06/2023] [Indexed: 11/23/2023]
Abstract
Plant metabolomics is an important research area in plant science. Chemometrics is a useful tool for plant metabolomic data analysis and processing. Within chemometrics, high-order methods represented by tensor modeling provide a new and promising technical approach to the analysis of complex multi-way plant metabolomics data. This paper systematically reviews different tensor methods widely applied to the analysis of complex plant metabolomic data. The advantages and disadvantages as well as the latest methodological advances of tensor models are reviewed and summarized. At the same time, applications of different tensor methods in solving plant science problems are also reviewed and discussed. The reviewed applications of tensor methods in plant metabolomics cover a wide range of important plant science topics including plant gene mutation and phenotype, plant disease and resistance, plant pharmacology and nutrition analysis, and plant products ingredient characterization and quality evaluation. It is evident from the review that tensor methods significantly promote the automated and intelligent process of plant metabolomics analysis and profoundly affect the paradigm of plant science research. To the best of our knowledge, this is the first review to systematically summarize the tensor analysis methods in plant metabolomic data analysis.
Affiliation(s)
- Lili Guo
  - Weifang University of Science and Technology, Shouguang, 262700, China
- Huiwen Yu
  - Shenzhen Hospital, Southern Medical University, Shenzhen, 518005, China
  - Chemometrics Group, Faculty of Science, University of Copenhagen, Frederiksberg, 1958, Denmark
- Yuan Li
  - Northwest Land and Resources Research Center, Shaanxi Normal University, Xi'an, 710062, China
- Chenxi Zhang
  - Weifang University of Science and Technology, Shouguang, 262700, China
- Mourad Kharbach
  - Department of Food and Nutrition, University of Helsinki, Helsinki, 00014, Finland
  - Department of Computer Sciences, University of Helsinki, Helsinki, 00560, Finland
5
Luo Q, Yang M, Li W, Xiao M. Hyper-Laplacian Regularized Multi-View Clustering with Exclusive L21 Regularization and Tensor Log-Determinant Minimization Approach. ACM T INTEL SYST TEC 2023. [DOI: 10.1145/3587034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]
Abstract
Multi-view clustering aims to capture the multiple views inherent information by identifying the data clustering that reflects distinct features of datasets. Being a consensus in literature that different views of a dataset share a common latent structure, most existing multi-view subspace learning methods rely on the nuclear norm to seek the low-rank representation of the underlying subspace. However, the nuclear norm often fails to distinguish the variance of features for each cluster due to its convexity nature and data tends to fall in multiple non-linear subspaces for multi-dimensional datasets. To address these problems, we propose a novel multi-view clustering method (HL-L21-TLD-MSC) that unifies the Hyper-Laplacian (HL) and exclusive $\ell_{2,1}$ (L21) regularization with Tensor Log-Determinant Rank Minimization (TLD) setting. Specifically, the hyper-Laplacian regularization maintains the local geometrical structure that makes the estimation prune to nonlinearities, and the mixed $\ell_{2,1}$ and $\ell_{1,2}$ regularization provides the joint sparsity within-cluster as well as the exclusive sparsity between-cluster. Furthermore, a log-determinant function is used as a tighter tensor rank approximation to discriminate the dimension of features. An efficient alternating algorithm is then derived to optimize the proposed model, and the construction of a convergent sequence to the Karush-Kuhn-Tucker (KKT) critical point solution is mathematically validated in detail. Extensive experiments are conducted on ten well-known datasets to demonstrate that the proposed approach outperforms the existing state-of-the-art approaches in various scenarios, in which six of them achieve perfect results under the framework developed in this paper, demonstrating the high effectiveness of the proposed approach.
Affiliation(s)
- Qilun Luo
  - South China Normal University, China
- Wen Li
  - South China Normal University, China
6
Su L, Liu J, Zhang J, Tian X, Zhang H, Ma C. Smooth low-rank representation with a Grassmann manifold for tensor completion. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2023.110510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/03/2023]
7
Randomized sampling techniques based low-tubal-rank plus sparse tensor recovery. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2022.110198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
8
Pasricha RS, Gujral E, Papalexakis EE. Adaptive granularity in tensors: A quest for interpretable structure. Front Big Data 2022; 5:929511. [PMID: 36505975 PMCID: PMC9727254 DOI: 10.3389/fdata.2022.929511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Accepted: 09/05/2022] [Indexed: 09/19/2023] Open
Abstract
Data collected at very frequent intervals is usually extremely sparse and has no structure that is exploitable by modern tensor decomposition algorithms. Thus, the utility of such tensors is low, in terms of the amount of interpretable and exploitable structure that one can extract from them. In this paper, we introduce the problem of finding a tensor of adaptive aggregated granularity that can be decomposed to reveal meaningful latent concepts (structures) from datasets that, in their original form, are not amenable to tensor analysis. Such datasets fall under the broad category of sparse point processes that evolve over space and/or time. To the best of our knowledge, this is the first work that explores adaptive granularity aggregation in tensors. Furthermore, we formally define the problem and discuss different definitions of "good structure" that are in practice and show that the optimal solution is of prohibitive combinatorial complexity. Subsequently, we propose an efficient and effective greedy algorithm called ICEBREAKER, which follows a number of intuitive decision criteria that locally maximize the "goodness of structure," resulting in high-quality tensors. We evaluate our method on synthetic, semi-synthetic, and real datasets. In all the cases, our proposed method constructs tensors that have a very high structure quality.
Affiliation(s)
- Ravdeep S. Pasricha
  - Department of Computer Science and Engineering, University of California, Riverside, Riverside, CA, United States
9
Liu P, Luo J, Chen X. miRCom: Tensor Completion Integrating Multi-View Information to Deduce the Potential Disease-Related miRNA-miRNA Pairs. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:1747-1759. [PMID: 33180730 DOI: 10.1109/tcbb.2020.3037331] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/11/2023]
Abstract
MicroRNAs (miRNAs) are consistently capable of regulating gene expression synergistically in a combination mode and play a key role in various biological processes associated with the initiation and development of human diseases, which indicates that comprehending the synergistic molecular mechanism of miRNAs may facilitate understanding the pathogenesis of diseases or even overcoming it. However, most existing computational methods had an incomplete understanding of the miRNA synergistic effect on the pathogenesis of complex diseases, or were hard to extend to a large-scale prediction task of miRNA synergistic combinations for different diseases. In this article, we propose a novel tensor completion framework integrating multi-view miRNAs and diseases information, called miRCom, for the discovery of potential disease-associated miRNA-miRNA pairs. We first construct an incomplete three-order association tensor and several types of similarity matrices based on existing biological knowledge. Then, we formulate an objective function by performing the factorizations of the coupled tensor and matrices simultaneously. Finally, we build an optimization schema by adopting the ADMM algorithm. After that, we obtain the prediction of miRNA-miRNA pairs for different diseases from the full tensor. The comparative experimental results with other approaches verified that miRCom effectively identifies the potential disease-related miRNA-miRNA pairs. Moreover, case study results further illustrated that miRNA-miRNA pairs have more biological significance and prognostic value than single miRNAs.
10
Outlier Reconstruction of NDVI for Vegetation-Cover Dynamic Analyses. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12094412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
The normalized difference vegetation index (NDVI) contains important data for providing vegetation-cover information and supporting environmental analyses. However, understanding long-term vegetation cover dynamics remains challenging due to data outliers that are found in cloudy regions. In this article, we propose a sliding-window-based tensor stream analysis algorithm (SWTSA) for reconstructing outliers in NDVI from multitemporal optical remote-sensing images. First, we constructed a tensor stream of NDVI that was calculated from clear-sky optical remote-sensing images corresponding to seasons on the basis of the acquired date. Second, we conducted tensor decomposition and reconstruction by SWTSA. Landsat series remote-sensing images were used in experiments to demonstrate the applicability of the SWTSA. Experiments were carried out successfully on the basis of data from the estuary area of Salween River in Southeast Asia. Compared with random forest regression (RFR), SWTSA has higher accuracy and better reconstruction capabilities. Results show that SWTSA is reliable and suitable for reconstructing outliers of NDVI from multitemporal optical remote-sensing images.
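For readers unfamiliar with the index itself, the standard NDVI definition is (NIR - Red) / (NIR + Red), computed per pixel from near-infrared and red reflectance. The abstract does not restate this formula, and the sketch below is only an illustration of that standard definition: the function name `ndvi`, the handling of a zero denominator, and the sample reflectance values are all assumptions, and it does not reproduce any part of the SWTSA algorithm.

```python
def ndvi(nir, red):
    """Standard NDVI for one pixel: (NIR - Red) / (NIR + Red).

    Returns None when both reflectances are zero, since the index is
    undefined there (this convention is an assumption for the sketch).
    """
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator


# One row of made-up reflectance values: vegetated, bare, and empty pixels.
nir_row = [0.5, 0.3, 0.0]
red_row = [0.1, 0.3, 0.0]
ndvi_row = [ndvi(n, r) for n, r in zip(nir_row, red_row)]
print(ndvi_row)
```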
11
Zhang JZ, Xu W, Hu P. Tightly Integrated Multiomics-based Deep Tensor Survival Model for Time-to-Event Prediction. Bioinformatics 2022; 38:3259-3266. [PMID: 35445698 DOI: 10.1093/bioinformatics/btac286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2021] [Revised: 03/12/2022] [Accepted: 04/18/2022] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Multiomics cancer profiles provide essential signals for predicting cancer survival. It is challenging to reveal the complex patterns from multiple types of data and link them to survival outcomes. We aim to develop a new deep learning-based algorithm to integrate three types of high-dimensional omics data measured on the same individuals to improve cancer survival outcome prediction. RESULTS We built a three-dimensional tensor to integrate multi-omics cancer data and factorized it into two-dimensional matrices of latent factors, which were fed into neural network-based survival networks. The new algorithm and other multi-omics-based algorithms, as well as individual genomic-based survival analysis algorithms, were applied to the breast cancer data and the colon and rectal cancer data from The Cancer Genome Atlas (TCGA) program. We evaluated the goodness-of-fit using the concordance index (C-index) and Integrated Brier Score (IBS). We demonstrated that the proposed tight integration framework has better survival prediction performance than the models using individual genomic data and other conventional data integration methods. AVAILABILITY https://github.com/jasperzyzhang/DeepTensorSurvival. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
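The concordance index used for evaluation here can be illustrated with a small sketch of Harrell's C-index: among all comparable pairs (the subject with the shorter time must have an observed event), it counts the fraction in which the model assigns the higher risk to the subject who fails first. This is a generic textbook formulation, not the evaluation code of the cited repository; the function name, the tie handling (0.5 credit for tied risks), and the toy data are assumptions.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i has the shorter time and an event.
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up example: four subjects, the third one is censored.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.7, 0.8, 0.1],
)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of the comparable pairs.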
Affiliation(s)
- Jasper Zhongyuan Zhang
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario M5T 3M7, Canada
- Wei Xu
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario M5T 3M7, Canada
  - Biostatistics Department, Princess Margaret Cancer Centre, Toronto, Ontario M5G 2M9, Canada
- Pingzhao Hu
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario M5T 3M7, Canada
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba, R3E 0J9, Canada
  - CancerCare Manitoba Research Institute, CancerCare Manitoba, Winnipeg, Manitoba, R3E 0V9, Canada
  - Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, R3T 2N2, Canada
12
Integration of Multimodal Data from Disparate Sources for Identifying Disease Subtypes. BIOLOGY 2022; 11:biology11030360. [PMID: 35336734 PMCID: PMC8945377 DOI: 10.3390/biology11030360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Revised: 02/17/2022] [Accepted: 02/23/2022] [Indexed: 11/17/2022]
Abstract
Simple Summary: The diagnostic and treatment strategies of cancer remain generally suboptimal resulting in over-diagnosis or under-treatment. Though many attempts on optimizing treatment decisions by early prediction of disease progression have been undertaken, these efforts yielded only modest success so far due to the heterogeneity of cancer with multifactorial etiology. Here, we propose a deep-learning based data integration model capable of predicting disease progression by integrating collective information available through multiple studies with different cohorts and heterogeneous data types. The results have shown that the proposed data integration pipeline is able to identify disease progression with higher accuracy and robustness compared to using a single cohort, by offering a more complete picture of the specific disease on patients with brain, blood, and pancreatic cancers.
Abstract: Studies over the past decade have generated a wealth of molecular data that can be leveraged to better understand cancer risk, progression, and outcomes. However, understanding the progression risk and differentiating long- and short-term survivors cannot be achieved by analyzing data from a single modality due to the heterogeneity of disease. Using a scientifically developed and tested deep-learning approach that leverages aggregate information collected from multiple repositories with multiple modalities (e.g., mRNA, DNA Methylation, miRNA) could lead to a more accurate and robust prediction of disease progression. Here, we propose an autoencoder based multimodal data fusion system, in which a fusion encoder flexibly integrates collective information available through multiple studies with partially coupled data. Our results on a fully controlled simulation-based study have shown that inferring the missing data through the proposed data fusion pipeline allows a predictor that is superior to other baseline predictors with missing modalities. Results have further shown that short- and long-term survivors of glioblastoma multiforme, acute myeloid leukemia, and pancreatic adenocarcinoma can be successfully differentiated with an AUC of 0.94, 0.75, and 0.96, respectively.
13
Borhani N, Ghaisari J, Abedi M, Kamali M, Gheisari Y. A deep learning approach to predict inter-omics interactions in multi-layer networks. BMC Bioinformatics 2022; 23:53. [PMID: 35081903 PMCID: PMC8793231 DOI: 10.1186/s12859-022-04569-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/15/2021] [Accepted: 01/07/2022] [Indexed: 11/24/2022] Open
Abstract
Background Despite enormous achievements in the production of high-throughput datasets, constructing comprehensive maps of interactions remains a major challenge. Lack of sufficient experimental evidence on interactions is more significant for heterogeneous molecular types. Hence, developing strategies to predict inter-omics connections is essential to construct holistic maps of disease. Results Here, as a novel nonlinear deep learning method, Data Integration with Deep Learning (DIDL) was proposed to predict inter-omics interactions. It consisted of an encoder that performs automatic feature extraction for biomolecules according to existing interactions coupled with a predictor that predicts unforeseen interactions. Applicability of DIDL was assessed on different networks, namely drug–target protein, transcription factor-DNA element, and miRNA–mRNA. Also, validity of the novel predictions was evaluated by literature surveys. According to the results, the DIDL outperformed state-of-the-art methods. For all three networks, the areas under the curve and the precision–recall curve exceeded 0.85 and 0.83, respectively. Conclusions DIDL offers several advantages like automatic feature extraction from raw data, end-to-end training, and robustness to network sparsity. In addition, reliance solely on existing inter-layer interactions and independence of biochemical features of interacting molecules make this algorithm applicable for a wide variety of networks. DIDL paves the way to understand the underlying mechanisms of complex disorders through constructing integrative networks. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04569-2.
Affiliation(s)
- Niloofar Borhani
  - Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, 84156-83111, Iran
- Jafar Ghaisari
  - Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, 84156-83111, Iran
- Maryam Abedi
  - Regenerative Medicine Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Marzieh Kamali
  - Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, 84156-83111, Iran
- Yousof Gheisari
  - Regenerative Medicine Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
  - Department of Genetics and Molecular Biology, Isfahan University of Medical Sciences, Isfahan, Iran
14
Lakizadeh A, Babaei M. Detection of polypharmacy side effects by integrating multiple data sources and convolutional neural networks. Mol Divers 2022; 26:3193-3203. [PMID: 35072838 DOI: 10.1007/s11030-022-10382-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2021] [Accepted: 01/11/2022] [Indexed: 11/30/2022]
Abstract
The consumption of drug combinations, named polypharmacy, is common when treating patients with several diseases or those with complex conditions. However, the main drawback of polypharmacy is the increased probability of harmful side effects. The polypharmacy side effects are caused by an interaction between two medications. It means that the drug-drug interaction causes changes in their activities due to interfering in each other's performance. Therefore, discovering these side effects is one of the most challenging and important aspects of drug production and consumption as it is associated with human health. In this paper, a method has been introduced for predicting the polypharmacy side effects, called PSECNN. It is a multi-label multi-class deep learning method that combines various basic features of drugs to predict the polypharmacy side effects. Firstly, PSECNN collects five basic features of drugs, namely individual drug side effects, drug-protein interactions, chemical substructures, targets, and enzymes, in order to create a novel combination of drug features. A feature extraction module creates five feature vectors with the same dimension for each drug based on the Jaccard similarity index. Based on the feature vectors, a unique representative is then created for each drug. These representative vectors are given in pairs as input to the deep neural network to predict the occurrence probability of side effects. According to the experimental evaluations, PSECNN could outperform the state-of-the-art polypharmacy side effects prediction methods by up to 74%. It has been found that PSECNN performs better on polypharmacy side effects that have a molecular basis, due to the novel combination of basic drug features.
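The Jaccard-based feature extraction step can be sketched as follows. The abstract only states that same-dimension feature vectors are built "based on the Jaccard similarity index", so the construction below (one vector per drug holding its Jaccard similarity to every drug in the dataset, for a single feature type) is an assumption made for illustration, as are the function names and the toy target sets.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_vectors(feature_sets):
    """Map each drug to its vector of Jaccard similarities to all drugs.

    `feature_sets` maps a drug name to the set of items it has for one
    feature type (for example, its protein targets). Drugs are ordered
    alphabetically so that every vector has the same dimension and layout.
    """
    names = sorted(feature_sets)
    return {
        name: [jaccard(feature_sets[name], feature_sets[other]) for other in names]
        for name in names
    }


# Hypothetical target sets for three drugs.
targets = {
    "drugA": {"t1", "t2", "t3"},
    "drugB": {"t2", "t3", "t4"},
    "drugC": {"t5"},
}
vectors = similarity_vectors(targets)
print(vectors["drugA"])
```

Repeating this for each of the five feature types would give five vectors of equal length per drug, which is consistent with the description above but not confirmed by it.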
Affiliation(s)
- Amir Lakizadeh
  - Computer Engineering Department, University of Qom, Qom, Iran
- Mahdi Babaei
  - Computer Engineering Department, University of Qom, Qom, Iran
15
Luo X, Wu H, Wang Z, Wang J, Meng D. A Novel Approach to Large-Scale Dynamically Weighted Directed Network Representation. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; PP:9756-9773. [PMID: 34898429 DOI: 10.1109/tpami.2021.3132503] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/14/2023]
Abstract
A dynamically weighted directed network (DWDN) is frequently encountered in various big data-related applications, such as the terminal interaction pattern analysis system (TIPAS) considered in this study. It consists of large-scale dynamic interactions among numerous nodes. As the number of involved nodes increases drastically, it becomes impossible to observe their full interactions at each time slot, making the resultant DWDN High Dimensional and Incomplete (HDI). An HDI DWDN, in spite of its incompleteness, contains rich knowledge regarding the involved nodes' various behavior patterns. To extract such knowledge from an HDI DWDN, this paper proposes a novel Alternating direction method of multipliers (ADMM)-based Nonnegative Latent-factorization of Tensors (ANLT) model. It adopts three-fold ideas: a) building a data density-oriented augmented Lagrangian function for efficiently handling an HDI tensor's incompleteness and nonnegativity; b) splitting the optimization task in each iteration into an elaborately designed subtask series where each one is solved based on the previously solved ones following the ADMM principle to achieve fast convergence; and c) theoretically proving that its convergence is guaranteed with its efficient learning scheme. Experimental results on six DWDNs from real applications demonstrate that the proposed ANLT outperforms state-of-the-art models significantly in both computational efficiency and prediction accuracy.
16
Afrakhteh S, Behnam H. Efficient synthetic transmit aperture ultrasound based on tensor completion. ULTRASONICS 2021; 117:106553. [PMID: 34454358 DOI: 10.1016/j.ultras.2021.106553] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/26/2021] [Revised: 08/05/2021] [Accepted: 08/10/2021] [Indexed: 06/13/2023]
Abstract
One of the most important methods in medical ultrasound imaging is the synthetic transmit aperture (STA). Despite the image quality improvement in the STA, this method suffers from several limitations, including a limited data acquisition rate and an increase in the overall time to form a single frame. Tensor completion (TC) is a powerful technique that uses rank minimization to recover missing information from a low-rank tensor. This paper provides a novel random synthetic transmit aperture (RSTA) method based on using only a randomly selected part (a fraction) of the linear array elements in the transmit mode to increase the data acquisition rate and then applying the tensor completion (TC) to improve the image quality. By the proposed method, as it is not necessary to transmit all elements sequentially, the data acquisition rate is improved and the overall time for creating an image is also significantly reduced. We investigated the proposed idea by using several simulated and experimental phantoms. Results showed that the proposed method could increase the data acquisition rate up to three times with the image quality difference of less than 6% compared to the original STA method.
Affiliation(s)
- Sajjad Afrakhteh: School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran.
- Hamid Behnam: School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran.
17
Luo J, Liu Y, Liu P, Lai Z, Wu H. Data Integration Using Tensor Decomposition for The Prediction of miRNA-Disease Associations. IEEE J Biomed Health Inform 2021; 26:2370-2378. [PMID: 34748505 DOI: 10.1109/jbhi.2021.3125573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Dysfunction of miRNAs has an important relationship with diseases by impacting their target genes. Identifying disease-related miRNAs is of great significance to prevent and treat diseases. Integrating information of genes related miRNAs and/or diseases in calculational methods for miRNA-disease association studies is meaningful because of the complexity of biological mechanisms. Therefore, in this study, we propose a novel method based on tensor decomposition, termed TDMDA, to integrate multi-type data for identifying pathogenic miRNAs. First, we construct a three-order association tensor to express the associations of miRNA-disease pairs, the associations of miRNA-gene pairs, and the associations of gene-disease pairs simultaneously. Then, a tensor decomposition-based method with auxiliary information is applied to reconstruct the association tensor for predicting miRNA-disease associations, and the auxiliary information includes biological similarity information and adjacency information. The performance of TDMDA is compared with other advanced methods under 5-fold cross-validations. The experimental results indicate the TDMDA is a competitive method.
18
Masumshah R, Aghdam R, Eslahchi C. A neural network-based method for polypharmacy side effects prediction. BMC Bioinformatics 2021; 22:385. [PMID: 34303360 PMCID: PMC8305591 DOI: 10.1186/s12859-021-04298-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/30/2021] [Accepted: 07/14/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Polypharmacy is a type of treatment that involves the concurrent use of multiple medications. Drugs may interact when they are used simultaneously. So, understanding and mitigating polypharmacy side effects are critical for patient safety and health. Since the known polypharmacy side effects are rare and they are not detected in clinical trials, computational methods are developed to model polypharmacy side effects.
RESULTS: We propose a neural network-based method for polypharmacy side effects prediction (NNPS) by using novel feature vectors based on mono side effects, and drug-protein interaction information. The proposed method is fast and efficient which allows the investigation of large numbers of polypharmacy side effects. Our novelty is defining new feature vectors for drugs and combining them with a neural network architecture to apply for the context of polypharmacy side effects prediction. We compare NNPS on a benchmark dataset to predict 964 polypharmacy side effects against 5 well-established methods and show that NNPS achieves better results than the results of all 5 methods in terms of accuracy, complexity, and running time speed. NNPS outperforms about 9.2% in Area Under the Receiver-Operating Characteristic, 12.8% in Area Under the Precision-Recall Curve, 8.6% in F-score, 10.3% in Accuracy, and 18.7% in Matthews Correlation Coefficient with 5-fold cross-validation against the best algorithm among other well-established methods (Decagon method). Also, the running time of the Decagon method which is 15 days for one fold of cross-validation is reduced to 8 h by the NNPS method.
CONCLUSIONS: The performance of NNPS is benchmarked against 5 well-known methods, Decagon, Concatenated drug features, Deep Walk, DEDICOM, and RESCAL, for 964 polypharmacy side effects. We adopt the 5-fold cross-validation for 50 iterations and use the average of the results to assess the performance of the NNPS method. The evaluation of the NNPS against five well-known methods, in terms of accuracy, complexity, and running time speed shows the performance of the presented method for an essential and challenging problem in pharmacology. Datasets and code for NNPS algorithm are freely accessible at https://github.com/raziyehmasumshah/NNPS.
Affiliation(s)
- Raziyeh Masumshah: Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
- Rosa Aghdam: School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran.
- Changiz Eslahchi: Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran; School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran.
19
Jeng PY, Wei CS, Jung TP, Wang LC. Low-Dimensional Subject Representation-Based Transfer Learning in EEG Decoding. IEEE J Biomed Health Inform 2021; 25:1915-1925. [PMID: 32960770 DOI: 10.1109/jbhi.2020.3025865] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/05/2022]
Abstract
Recently, the advances in passive brain-computer interfaces (BCIs) based on electroencephalogram (EEG) have shed light on real-world neuromonitoring technologies. However, human variability in the EEG activities hinders the development of practical applications of EEG-based BCI. To tackle this problem, many transfer-learning techniques perform supervised calibration. This kind of calibration approach requires task-relevant data, which is impractical in real-life scenarios such as drowsiness during driving. This study presents a transfer-learning framework for EEG decoding based on the low-dimensional representations of subjects learned from the pre-trial EEG. Tensor decomposition was applied to the pre-trial EEG of subjects to extract the underlying characteristics in subject, spatial, and spectral domains. Then, the proposed framework assessed the characteristics to obtain the low-dimensional subject representations such that the subjects with similar brain dynamics can be identified. This method can leverage the existing data from other users, and a small number of data from a rapid, non-task, unsupervised calibration from a new user to build an accurate BCI. Our results demonstrated that, in terms of prediction accuracy, the proposed low-dimensional subject representation-based transfer learning (LDSR-TL) framework outperformed the random selection, and the Riemannian manifold approach in cognitive-state tracking, while requiring fewer training data. The results can greatly improve the practicability, and usability of EEG-based BCI in the real world.
20
Liu T, Cui J, Zhuang H, Wang H. Modeling polypharmacy effects with heterogeneous signed graph convolutional networks. APPL INTELL 2021. [DOI: 10.1007/s10489-021-02296-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022]
21
Ahmadi-Asl S, Cichocki A, Huy Phan A, Asante-Mensah MG, Musavian Ghazani M, Tanaka T, Oseledets I. Randomized algorithms for fast computation of low rank tensor ring model. MACHINE LEARNING: SCIENCE AND TECHNOLOGY 2020. [DOI: 10.1088/2632-2153/abad87] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022] Open
22
Sun W, Braatz RD. Opportunities in tensorial data analytics for chemical and biological manufacturing processes. Comput Chem Eng 2020. [DOI: 10.1016/j.compchemeng.2020.107099] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/01/2023]
23
Dutta A, Breloff SP, Dai F, Sinsel EW, Carey RE, Warren CM, Wu JZ. Fusing imperfect experimental data for risk assessment of musculoskeletal disorders in construction using canonical polyadic decomposition. AUTOMATION IN CONSTRUCTION 2020; 119:10.1016/j.autcon.2020.103322. [PMID: 33897107 PMCID: PMC8064735 DOI: 10.1016/j.autcon.2020.103322] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/12/2023]
Abstract
Field or laboratory data collected for work-related musculoskeletal disorder (WMSD) risk assessment in construction often becomes unreliable as a large amount of data go missing due to technology-induced errors, instrument failures or sometimes at random. Missing data can adversely affect the assessment conclusions. This study proposes a method that applies Canonical Polyadic Decomposition (CPD) tensor decomposition to fuse multiple sparse risk-related datasets and fill in missing data by leveraging the correlation among multiple risk indicators within those datasets. Two knee WMSD risk-related datasets-3D knee rotation (kinematics) and electromyography (EMG) of five knee postural muscles-collected from previous studies were used for the validation and demonstration of the proposed method. The analysis results revealed that for a large portion of missing values (40%), the proposed method can generate a fused dataset that provides reliable risk assessment results highly consistent (70%-87%) with those obtained from the original experimental datasets. This signified the usefulness of the proposed method for use in WMSD risk assessment studies when data collection is affected by a significant amount of missing data, which will facilitate reliable assessment of WMSD risks among construction workers. In the future, findings of this study will be implemented to explore whether, and to what extent, the fused dataset outperforms the datasets with missing values by comparing consistencies of the risk assessment results obtained from these datasets for further investigation of the fusion performance.
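The idea of filling in missing values with a CP (canonical polyadic) decomposition can be illustrated with a small NumPy sketch. This is not the authors' pipeline: the function name `cp_impute`, the alternating-least-squares solver with iterative re-imputation, the rank, the tensor sizes, and the synthetic rank-1 data are all assumptions chosen only to show the mechanism on a three-way array.

```python
import numpy as np

def khatri_rao(B, C):
    # Column-wise Kronecker product: (J, R) and (K, R) -> (J*K, R).
    return (B[:, None, :] * C[None, :, :]).reshape(-1, B.shape[1])

def unfold(T, mode):
    # Mode-n unfolding; the remaining axes keep their original order.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def cp_impute(X, mask, rank, n_iter=300, seed=0):
    """Hypothetical helper: fit a rank-`rank` CP model to the observed
    entries (mask == True) of a 3-way array and fill in the rest."""
    rng = np.random.default_rng(seed)
    factors = [rng.uniform(0.5, 1.5, (n, rank)) for n in X.shape]
    # Start by filling the gaps with the mean of the observed entries.
    filled = np.where(mask, X, X[mask].mean())
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            # Least-squares update of one factor with the others fixed.
            factors[mode] = unfold(filled, mode) @ np.linalg.pinv(kr).T
        recon = np.einsum('ir,jr,kr->ijk', *factors)
        # Keep observed values, replace missing ones with the model.
        filled = np.where(mask, X, recon)
    return filled, recon

# Synthetic demonstration (assumed data, not from the study).
rng = np.random.default_rng(1)
A, B, C = (rng.uniform(0.5, 1.5, (n, 1)) for n in (8, 7, 6))
truth = np.einsum('ir,jr,kr->ijk', A, B, C)
mask = rng.random(truth.shape) > 0.4      # roughly 40% of entries missing
filled, _ = cp_impute(truth, mask, rank=1)
rel_err = (np.linalg.norm((filled - truth)[~mask])
           / np.linalg.norm(truth[~mask]))
print(f"relative error on missing entries: {rel_err:.2e}")
```

The sketch recovers missing entries only because the synthetic tensor is exactly low rank; on real measurements the rank has to be chosen and the result validated, as the abstract's consistency figures suggest.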
Affiliation(s)
- Amrita Dutta: Department of Civil and Environmental Engineering, West Virginia University, P.O. Box 6103, Morgantown, WV 26506, United States of America
- Scott P. Breloff: National Institute for Occupational Safety and Health, 1095 Willowdale Road, Morgantown, WV 26505, United States of America
- Fei Dai: Department of Civil and Environmental Engineering, West Virginia University, P.O. Box 6103, Morgantown, WV 26506, United States of America
- Erik W. Sinsel: National Institute for Occupational Safety and Health, 1095 Willowdale Road, Morgantown, WV 26505, United States of America
- Robert E. Carey: National Institute for Occupational Safety and Health, 1095 Willowdale Road, Morgantown, WV 26505, United States of America
- Christopher M. Warren: National Institute for Occupational Safety and Health, 1095 Willowdale Road, Morgantown, WV 26505, United States of America
- John Z. Wu: National Institute for Occupational Safety and Health, 1095 Willowdale Road, Morgantown, WV 26505, United States of America
24
Fernandes S, Fanaee-T H, Gama J. Tensor decomposition for analysing time-evolving social networks: an overview. Artif Intell Rev 2020. [DOI: 10.1007/s10462-020-09916-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 02/02/2023]
25
Fanaee-T H, Thoresen M. Multi-insight visualization of multi-omics data via ensemble dimension reduction and tensor factorization. Bioinformatics 2020; 35:1625-1633. [PMID: 30295701 DOI: 10.1093/bioinformatics/bty847] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/21/2018] [Revised: 08/15/2018] [Accepted: 10/04/2018] [Indexed: 01/12/2023] Open
Abstract
MOTIVATION: Visualization of high-dimensional data is an important step in exploratory data analysis and knowledge discovery. However, it is challenging, because the interpretation is highly subjective. If we see dimensionality reduction (DR) techniques as the main tool for data visualization, they are like multiple cameras that look into the data from different perspectives or angles. We can hardly prescribe one single perspective for all datasets and problems. One snapshot of data cannot reveal all the relevant aspects of the data in higher dimensions. The reason is that each of these methods has its own specific strategy, normally based on well-established mathematical theories to obtain a low-dimensional projection of the data, which sometimes is totally different from the others. Therefore, relying only on one single projection can be risky, because it can close our eyes to important parts of the full knowledge space.
RESULTS: We propose the first framework for multi-insight data visualization of multi-omics data. This approach, contrary to single-insight approaches, is able to uncover the majority of data features through multiple insights. The main idea behind the methodology is to combine several DR methods via tensor factorization and group the solutions into an optimal number of clusters (or insights). The experimental evaluation with low-dimensional synthetic data, simulated multi-omics data related to ovarian cancer, as well as real multi-omics data related to breast cancer show the competitive advantage over state-of-the-art methods.
AVAILABILITY AND IMPLEMENTATION: https://folk.uio.no/hadift/MIV/ [user/pass via hadift@medisin.uio.no].
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hadi Fanaee-T: Department of Biostatistics, University of Oslo, Oslo, Norway
- Magne Thoresen: Department of Biostatistics, University of Oslo, Oslo, Norway
26
Wang S, Yang J, Chen Z, Yuan H, Geng J, Hai Z. Global and Local Tensor Factorization for Multi-criteria Recommender System. PATTERNS 2020; 1:100023. [PMID: 33205096 PMCID: PMC7660452 DOI: 10.1016/j.patter.2020.100023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/18/2019] [Revised: 03/19/2020] [Accepted: 03/30/2020] [Indexed: 11/17/2022]
Abstract
In multi-criteria recommender systems, matrix factorization characterizes users and items via latent factor vectors inferred from user-item rating patterns. However, two-dimensional matrix factorization models may not be able to cope with the recommendation problem that involves additional criterion-specific rating data. This study introduces a tensor factorization method to handle three-dimensional user-item-criterion rating data. Moreover, we observe that using single global tensor factorization alone may not be sufficient to characterize diverse preferences among different groups of users, and a combined global and local tensor factorization method (GLTF) for multi-criteria recommendation is thus proposed. One key benefit of the GLTF is that it can leverage global user-item-criterion rating patterns while also exploiting local user-subset specific rating behaviors to jointly infer the latent factor representations for users, items, and specific item criteria. Experimental results, which used real-life data available to the public, demonstrated that the GLTF is superior to well-established baseline methods.
- A global and local tensor factorization is created for multi-criteria recommendation
- The method can learn a global predictive model and multiple local ones
- It discovers the structure of rating tensor and user-rating behaviors in subtensors
- It leverages user-item-criterion ratings for better recommendations in e-commerce
We propose a global and local tensor factorization method (GLTF) to solve the multi-criteria recommendation problem commonly experienced when e-commerce systems recommend products to users based on multiple different ratings. The method uses additional criterion-specific ratings in addition to existing user-item rating data for better recommendations. It can jointly learn a global predictive model and multiple local predictive models, not only by discovering the overall structure of the entire rating tensor but also by capturing diverse rating behaviors of users in individual subtensors. The GLTF can take advantage of the user's multi-criteria rating information to discover the user's behavior, predict the information and products that the user is interested in, and obtain more accurate recommendation results. In the future, we plan to apply the GLTF in a much larger dataset for evaluation and will improve the model to mitigate the bottleneck caused by the data sparsity problem.
Affiliation(s)
- Shuliang Wang (corresponding author): School of Computer Science, Beijing Institute of Technology, Beijing 100081, China; Institute of E-Government, Beijing Institute of Technology, Beijing 100081, China
- Jingting Yang: College of Computer & Network Engineering, Shanxi Datong University, Datong 037009, China
- Zhengyu Chen: College of Computer Science & Technology, Zhejiang University, Hangzhou 310007, China
- Hanning Yuan (corresponding author): School of Computer Science, Beijing Institute of Technology, Beijing 100081, China
- Jing Geng (corresponding author): School of Computer Science, Beijing Institute of Technology, Beijing 100081, China; Institute of E-Government, Beijing Institute of Technology, Beijing 100081, China
- Zhen Hai: Institute for Infocomm Research, 1 Fusionopolis Way, Singapore 138632, Singapore
27
Luo X, Wu H, Yuan H, Zhou M. Temporal Pattern-Aware QoS Prediction via Biased Non-Negative Latent Factorization of Tensors. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:1798-1809. [PMID: 30969935 DOI: 10.1109/tcyb.2019.2903736] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Indexed: 06/09/2023]
Abstract
Quality-of-service (QoS) data vary over time, making it vital to capture the temporal patterns hidden in such dynamic data for predicting missing ones with high accuracy. However, currently latent factor (LF) analysis-based QoS-predictors are mostly defined on static QoS data without the consideration of such temporal dynamics. To address this issue, this paper presents a biased non-negative latent factorization of tensors (BNLFTs) model for temporal pattern-aware QoS prediction. Its main idea is fourfold: 1) incorporating linear biases into the model for describing QoS fluctuations; 2) constraining the model to be non-negative for describing QoS non-negativity; 3) deducing a single LF-dependent, non-negative, and multiplicative update scheme for training the model; and 4) incorporating an alternating direction method into the model for faster convergence. The empirical studies on two dynamic QoS datasets from real applications show that compared with the state-of-the-art QoS-predictors, BNLFT represents temporal patterns more precisely with high computational efficiency, thereby achieving the most accurate predictions for missing QoS data.
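A non-negative multiplicative update on observed entries only, the general mechanism behind point 3 of the abstract, can be sketched as follows. This is a plain masked non-negative CP factorization, not the BNLFT model itself: it omits the linear biases and the alternating direction step, and the function name `masked_nn_cp`, the rank, the sizes, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def masked_nn_cp(X, mask, rank, n_iter=200, eps=1e-12, seed=0):
    """Hypothetical helper: non-negative CP factors of a 3-way array,
    fitted to observed entries only via multiplicative updates."""
    rng = np.random.default_rng(seed)
    U, S, T = (rng.uniform(0.1, 1.0, (n, rank)) for n in X.shape)
    W = mask.astype(float)
    Xo = np.where(mask, X, 0.0)           # observed data, zeros elsewhere

    def model():
        # Current reconstruction, restricted to observed positions.
        return W * np.einsum('ir,jr,kr->ijk', U, S, T)

    losses = []
    for _ in range(n_iter):
        # Each factor is scaled by a ratio of non-negative terms,
        # so non-negative factors stay non-negative.
        U *= (np.einsum('ijk,jr,kr->ir', Xo, S, T)
              / (np.einsum('ijk,jr,kr->ir', model(), S, T) + eps))
        S *= (np.einsum('ijk,ir,kr->jr', Xo, U, T)
              / (np.einsum('ijk,ir,kr->jr', model(), U, T) + eps))
        T *= (np.einsum('ijk,ir,jr->kr', Xo, U, S)
              / (np.einsum('ijk,ir,jr->kr', model(), U, S) + eps))
        losses.append(float(np.sum((Xo - model()) ** 2)))
    return (U, S, T), losses

# Synthetic user x service x time tensor (assumed data).
rng = np.random.default_rng(1)
true = [rng.uniform(0.0, 1.0, (n, 2)) for n in (12, 10, 6)]
X = np.einsum('ir,jr,kr->ijk', *true)
mask = rng.random(X.shape) < 0.3          # only ~30% of entries observed
factors, losses = masked_nn_cp(X, mask, rank=2)
print(f"loss: first iteration {losses[0]:.4f}, last {losses[-1]:.4f}")
```

Because every update multiplies a factor by a non-negative ratio, no projection or clipping step is needed to keep the model non-negative, which is the practical appeal of this family of update rules.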
28
A Low-Rank Tensor Factorization Using Implicit Similarity in Trust Relationships. Symmetry (Basel) 2020. [DOI: 10.3390/sym12030439] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022] Open
Abstract
Low-rank tensor factorization can not only mine the implicit relationships between data but also fill in the missing data when working with complex data. Compared with the traditional collaborative filtering (CF) algorithm, the changes are essentially proposed, from traditional matrix analysis to three-dimensional spatial analysis. Based on low-rank tensor factorization, this paper proposes a recommendation model that comprehensively considers local information and global information, in other words, combining the similarity between trust users and low-rank tensor factorization. First, the similarity between trusted users is measured to capture local information between users by trusting similar preferences of users when selecting items. Then, the users’ similarity is integrated into the tensor, and the low-rank tensor factorization is used to better maintain and describe the internal structure of the data to obtain global information. Furthermore, based on the idea of the alternating least squares method, the conjugate gradient (CG) optimization algorithm for the model of this paper is designed. The local and global information is used to generate the optimal expected result in an iterative process. Finally, we conducted a large number of comparative experiments on the Ciao dataset and the FilmTrust dataset. Experimental results show that the algorithm has less precision loss under the data set with lower density. Thus, not only can a perfect compromise between accuracy and coverage be achieved, but also the computational complexity can be reduced to meet the need for real-time results.
29
Wei Z, Zhao H, Zhao L, Yan H. Multiscale co-clustering for tensor data based on canonical polyadic decomposition and slice-wise factorization. Inf Sci (N Y) 2019. [DOI: 10.1016/j.ins.2019.06.044] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022]
30
A momentum-incorporated latent factorization of tensors model for temporal-aware QoS missing data prediction. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.08.026] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/21/2022]
31
Imbalanced data classification algorithm with support vector machine kernel extensions. EVOLUTIONARY INTELLIGENCE 2019. [DOI: 10.1007/s12065-018-0182-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/27/2022]
32
Zitnik M, Agrawal M, Leskovec J. Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics 2019; 34:i457-i466. [PMID: 29949996 PMCID: PMC6022705 DOI: 10.1093/bioinformatics/bty294] [Citation(s) in RCA: 392] [Impact Index Per Article: 78.4] [Indexed: 12/21/2022] Open
Abstract
Motivation: The use of drug combinations, termed polypharmacy, is common to treat patients with complex diseases or co-existing conditions. However, a major consequence of polypharmacy is a much higher risk of adverse side effects for the patient. Polypharmacy side effects emerge because of drug-drug interactions, in which activity of one drug may change, favorably or unfavorably, if taken with another drug. The knowledge of drug interactions is often limited because these complex relationships are rare, and are usually not observed in relatively small clinical testing. Discovering polypharmacy side effects thus remains an important challenge with significant implications for patient mortality and morbidity.
Results: Here, we present Decagon, an approach for modeling polypharmacy side effects. The approach constructs a multimodal graph of protein-protein interactions, drug-protein target interactions and the polypharmacy side effects, which are represented as drug-drug interactions, where each side effect is an edge of a different type. Decagon is developed specifically to handle such multimodal graphs with a large number of edge types. Our approach develops a new graph convolutional neural network for multirelational link prediction in multimodal networks. Unlike approaches limited to predicting simple drug-drug interaction values, Decagon can predict the exact side effect, if any, through which a given drug combination manifests clinically. Decagon accurately predicts polypharmacy side effects, outperforming baselines by up to 69%. We find that it automatically learns representations of side effects indicative of co-occurrence of polypharmacy in patients. Furthermore, Decagon models particularly well polypharmacy side effects that have a strong molecular basis, while on predominantly non-molecular side effects, it achieves good performance because of effective sharing of model parameters across edge types. Decagon opens up opportunities to use large pharmacogenomic and patient population data to flag and prioritize polypharmacy side effects for follow-up analysis via formal pharmacological studies.
Availability and implementation: Source code and preprocessed datasets are at: http://snap.stanford.edu/decagon.
Affiliation(s)
- Marinka Zitnik: Department of Computer Science, Stanford University, Stanford, CA, USA
- Monica Agrawal: Department of Computer Science, Stanford University, Stanford, CA, USA
- Jure Leskovec: Department of Computer Science, Stanford University, Stanford, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA
33
Hériché JK, Alexander S, Ellenberg J. Integrating Imaging and Omics: Computational Methods and Challenges. Annu Rev Biomed Data Sci 2019. [DOI: 10.1146/annurev-biodatasci-080917-013328] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 11/09/2022]
Abstract
Fluorescence microscopy imaging has long been complementary to DNA sequencing- and mass spectrometry–based omics in biomedical research, but these approaches are now converging. On the one hand, omics methods are moving from in vitro methods that average across large cell populations to in situ molecular characterization tools with single-cell sensitivity. On the other hand, fluorescence microscopy imaging has moved from a morphological description of tissues and cells to quantitative molecular profiling with single-molecule resolution. Recent technological developments underpinned by computational methods have started to blur the lines between imaging and omics and have made their direct correlation and seamless integration an exciting possibility. As this trend continues rapidly, it will allow us to create comprehensive molecular profiles of living systems with spatial and temporal context and subcellular resolution. Key to achieving this ambitious goal will be novel computational methods and successfully dealing with the challenges of data integration and sharing as well as cloud-enabled big data analysis.
Affiliation(s)
- Jean-Karim Hériché: Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
- Stephanie Alexander: Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
- Jan Ellenberg: Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
34
Fang Z, Yang X, Han L, Liu X. A Sequentially Truncated Higher Order Singular Value Decomposition-Based Algorithm for Tensor Completion. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:1956-1967. [PMID: 29993938 DOI: 10.1109/tcyb.2018.2817630] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/08/2023]
Abstract
The problem of recovering missing data of an incomplete tensor has drawn more and more attentions in the fields of pattern recognition, machine learning, data mining, computer vision, and signal processing. Researches on this problem usually share a common assumption that the original tensor is of low-rank. One of the important ways to capture the low-rank structure of the incomplete tensor is based on tensor factorization. For the traditional tensor factorization algorithms, the tensor ranks should be specified ahead, which is not reasonable in real applications. To overcome this drawback, an adaptive algorithm is first presented based on sequentially truncated higher order singular value decomposition (ST-HOSVD) for fast low-rank approximation of complete tensor, in which the tensor ranks can be obtained adaptively. Then for tensor with missing data, we use adaptive ST-HOSVD and the average operator of low-rank approximation to improve the accuracy of the fulfilled tensor. Convergence analysis of the proposed algorithm is also given in this paper. The experimental results on 14 image datasets and three video datasets show that the proposed method outperforms the state-of-the-art methods in terms of running time and the accuracy.
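A sequentially truncated HOSVD for a complete tensor, with the ranks picked from the data instead of fixed in advance, can be sketched in NumPy as below. The truncation rule used here (keep singular values above a relative tolerance `rel_tol`) is an assumption made for illustration and is not necessarily the adaptive criterion of the paper; the function names and test tensor are likewise hypothetical, and the missing-data extension described in the abstract is not covered.

```python
import numpy as np

def st_hosvd(X, rel_tol=1e-8):
    """Hypothetical helper: sequentially truncated HOSVD.
    Each mode is compressed in turn, and later modes work on the
    already-shrunken core, which is what makes the method cheap."""
    core = X.copy()
    factors = []
    for mode in range(X.ndim):
        unfolded = np.moveaxis(core, mode, 0).reshape(core.shape[mode], -1)
        U, s, _ = np.linalg.svd(unfolded, full_matrices=False)
        # Assumed rule: keep directions whose singular value is
        # non-negligible relative to the largest one.
        r = max(1, int(np.sum(s > rel_tol * s[0])))
        U = U[:, :r]
        factors.append(U)
        # Project the current core onto the retained subspace.
        core = np.moveaxis(np.tensordot(U.T, core, axes=(1, mode)), 0, mode)
    return core, factors

def reconstruct(core, factors):
    # Multiply the core by each factor matrix along its mode.
    out = core
    for mode, U in enumerate(factors):
        out = np.moveaxis(np.tensordot(U, out, axes=(1, mode)), 0, mode)
    return out

# Synthetic tensor with multilinear rank (2, 3, 2) (assumed data).
rng = np.random.default_rng(0)
G = rng.standard_normal((2, 3, 2))
mats = [rng.standard_normal((n, r)) for n, r in ((10, 2), (9, 3), (8, 2))]
X = reconstruct(G, mats)

core, factors = st_hosvd(X)
rel_err = np.linalg.norm(reconstruct(core, factors) - X) / np.linalg.norm(X)
print("detected ranks:", core.shape)
print(f"relative reconstruction error: {rel_err:.2e}")
```

On noisy data a looser tolerance, or an energy-based threshold, would be the natural knob for trading approximation error against rank.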
35
36
Mokhtari F, Laurienti PJ, Rejeski WJ, Ballard G. Dynamic Functional Magnetic Resonance Imaging Connectivity Tensor Decomposition: A New Approach to Analyze and Interpret Dynamic Brain Connectivity. Brain Connect 2018; 9:95-112. [PMID: 30318906 DOI: 10.1089/brain.2018.0605] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022] Open
Abstract
There is a growing interest in using so-called dynamic functional connectivity, as the conventional static brain connectivity models are being questioned. Brain network analyses yield complex network data that are difficult to analyze and interpret. To deal with the complex structures, decomposition/factorization techniques that simplify the data are often used. For dynamic network analyses, data simplification is of even greater importance, as dynamic connectivity analyses result in a time series of complex networks. A new challenge that must be faced when using these decomposition/factorization techniques is how to interpret the resulting connectivity patterns. Connectivity patterns resulting from decomposition analyses are often visualized as networks in brain space, in the same way that pairwise correlation networks are visualized. This elevates the risk of conflating connections between nodes that represent correlations between nodes' time series with connections between nodes that result from decomposition analyses. Moreover, dynamic connectivity data may be represented with three-dimensional or four-dimensional (4D) tensors and decomposition results require unique interpretations. Thus, the primary goal of this article is to (1) address the issues that must be considered when interpreting the connectivity patterns from decomposition techniques and (2) show how the data structure and decomposition method interact to affect this interpretation. The outcome of our analyses is summarized as follows. (1) The edge strength in decomposition connectivity patterns represents complex relationships, not pairwise interactions between the nodes. (2) The structure of the data significantly alters the connectivity patterns, for example, 4D data result in connectivity patterns with higher regional connections. (3) Orthogonal decomposition methods perform better in feature reduction applications, whereas nonorthogonal decomposition methods are better for mechanistic interpretation.
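To make the data structure concrete, the following is a minimal sketch of how a three-dimensional dynamic connectivity tensor (nodes x nodes x time windows) might be assembled. The sliding-window correlation construction, the function name `dynamic_connectivity_tensor`, and the window and step sizes are all assumptions for illustration; the abstract does not state how the authors built their tensors.

```python
import numpy as np

def dynamic_connectivity_tensor(ts, window, step):
    """Stack sliding-window correlation matrices into a 3-way tensor.

    ts: array of shape (time, nodes); returns (nodes, nodes, windows).
    Hypothetical helper, not taken from the cited article.
    """
    ts = np.asarray(ts, dtype=float)
    n_time, _ = ts.shape
    starts = range(0, n_time - window + 1, step)
    # One pairwise-correlation network per window (columns are nodes).
    mats = [np.corrcoef(ts[s:s + window], rowvar=False) for s in starts]
    return np.stack(mats, axis=2)

# Synthetic stand-in for regional time series: 120 time points, 5 nodes.
rng = np.random.default_rng(0)
ts = rng.standard_normal((120, 5))
tensor = dynamic_connectivity_tensor(ts, window=30, step=10)
```

Each frontal slice `tensor[:, :, w]` is an ordinary pairwise correlation network, which is exactly the kind of object the abstract warns should not be conflated with the connectivity patterns a decomposition of the whole tensor would return.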
Affiliation(s)
- Fatemeh Mokhtari: Laboratory for Complex Brain Networks, Department of Radiology, Wake Forest University School of Medicine, Winston-Salem, North Carolina; Virginia Tech-Wake Forest University School of Biomedical Engineering and Sciences, Wake Forest University School of Medicine, Winston-Salem, North Carolina
- Paul J Laurienti: Laboratory for Complex Brain Networks, Department of Radiology, Wake Forest University School of Medicine, Winston-Salem, North Carolina; Translational Science Center, Wake Forest University, Winston-Salem, North Carolina
- W Jack Rejeski: Laboratory for Complex Brain Networks, Department of Radiology, Wake Forest University School of Medicine, Winston-Salem, North Carolina; Translational Science Center, Wake Forest University, Winston-Salem, North Carolina; Department of Health and Exercise Science, Wake Forest University, Winston-Salem, North Carolina
- Grey Ballard: Department of Computer Science, Wake Forest University, Winston-Salem, North Carolina
|
37
|
Gao R, Li J, Li X, Song C, Chang J, Liu D, Wang C. STSCR: Exploring spatial-temporal sequential influence and social information for location recommendation. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.07.041] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 10/28/2022]
|
38
|
Abstract
Multiplayer online battle arena is a genre of online games that has become extremely popular. Due to their success, these games also drew the attention of our research community, because they provide a wealth of information about human online interactions and behaviors. A crucial problem is the extraction of activity patterns that characterize this type of data, in an interpretable way. Here, we leverage the Non-negative Tensor Factorization to detect hidden correlated behaviors of playing in a well-known game: League of Legends. To this aim, we collect the entire gaming history of a group of about 1000 players, which accounts for roughly 100K matches. By applying our framework we are able to separate players into different groups. We show that each group exhibits similar features and playing strategies, as well as similar temporal trajectories, i.e., behavioral progressions over the course of their gaming history. We surprisingly discover that playing strategies are stable over time and we provide an explanation for this observation.
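As a rough illustration of the technique named in this abstract, the sketch below factorizes a small non-negative three-way tensor with multiplicative updates and assigns each row of the first mode to its dominant component. The update rule, the function names, the tensor layout (players x features x time) and the synthetic data are assumptions chosen for illustration, not details taken from the cited work.

```python
import numpy as np

def ntf_multiplicative(X, rank, n_iter=200, seed=0, eps=1e-9):
    """Non-negative CP factorization of a 3-way tensor X >= 0.

    Uses Lee-Seung style multiplicative updates (an assumed choice);
    returns factor matrices A (I x r), B (J x r), C (K x r).
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.random((I, rank)) + 0.1
    B = rng.random((J, rank)) + 0.1
    C = rng.random((K, rank)) + 0.1
    for _ in range(n_iter):
        # Each factor is rescaled elementwise, so it stays non-negative.
        A *= np.einsum("ijk,jr,kr->ir", X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum("ijk,ir,kr->jr", X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum("ijk,ir,jr->kr", X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C

def relative_error(X, A, B, C):
    """Frobenius-norm reconstruction error relative to the norm of X."""
    X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)

# Synthetic rank-2 tensor standing in for players x features x time.
rng = np.random.default_rng(1)
X = np.einsum("ir,jr,kr->ijk", rng.random((30, 2)), rng.random((6, 2)), rng.random((8, 2)))

rel_err_init = relative_error(X, *ntf_multiplicative(X, rank=2, n_iter=0))
A, B, C = ntf_multiplicative(X, rank=2)
rel_err = relative_error(X, A, B, C)
groups = A.argmax(axis=1)  # dominant component per player (hypothetical grouping rule)
```

Because every factor is non-negative, each component can be read as an additive part: a set of players (column of `A`), the features they load on (`B`), and when the pattern is active (`C`). That additivity is what makes this family of methods attractive for interpretable behavior grouping.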
|
39
|
Abstract
Breathing signal monitoring can provide important clues for health problems. Compared to existing techniques that require wearable devices and special equipment, a more desirable approach is to provide contact-free and long-term breathing rate monitoring by exploiting wireless signals. In this article, we propose TensorBeat, a system to employ channel state information (CSI) phase difference data to intelligently estimate breathing rates for multiple persons with commodity WiFi devices. The main idea is to leverage the tensor decomposition technique to handle the CSI phase difference data. The proposed TensorBeat scheme first obtains CSI phase difference data between pairs of antennas at the WiFi receiver to create CSI tensors. Then canonical polyadic (CP) decomposition is applied to obtain the desired breathing signals. A stable signal matching algorithm is developed to identify the decomposed signal pairs, and a peak detection method is applied to estimate the breathing rates for multiple persons. Our experimental study shows that TensorBeat can achieve high accuracy under different environments for multiperson breathing rate monitoring.
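The last stage of the pipeline described above, turning a recovered breathing signal into a rate by peak detection, can be sketched as follows. This covers only that final step; the CSI tensor construction, CP decomposition and signal matching are not shown. The function name `breathing_rate_bpm`, the simple local-maximum rule, and the synthetic 0.25 Hz test signal are assumptions for illustration rather than the TensorBeat implementation.

```python
import numpy as np

def breathing_rate_bpm(signal, fs):
    """Estimate breaths per minute from the mean spacing of local maxima.

    signal: 1-D array holding one (already separated) breathing waveform.
    fs: sampling rate in Hz. Hypothetical helper for illustration.
    """
    x = np.asarray(signal, dtype=float)
    interior = x[1:-1]
    # A peak is greater than its left neighbor and not less than its right one.
    peaks = np.where((interior > x[:-2]) & (interior >= x[2:]))[0] + 1
    if len(peaks) < 2:
        raise ValueError("need at least two peaks to estimate a rate")
    mean_period_s = np.mean(np.diff(peaks)) / fs
    return 60.0 / mean_period_s

# Noise-free synthetic waveform: 0.25 Hz breathing sampled at 20 Hz for 60 s.
fs = 20.0
t = np.arange(1200) / fs
rate = breathing_rate_bpm(np.sin(2 * np.pi * 0.25 * t), fs)
```

On real measurements a rule this simple would react to noise, so some smoothing or a minimum peak distance would likely be needed before counting peaks.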
|
40
|
A Tensor-Based Structural Damage Identification and Severity Assessment. SENSORS 2018; 18:s18010111. [PMID: 29301314 PMCID: PMC5795348 DOI: 10.3390/s18010111] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 11/09/2017] [Revised: 12/16/2017] [Accepted: 12/19/2017] [Indexed: 11/24/2022]
Abstract
Early damage detection is critical for a large set of global ageing infrastructure. Structural Health Monitoring systems provide a sensor-based quantitative and objective approach to continuously monitor these structures, as opposed to traditional engineering visual inspection. Analysing these sensed data is one of the major Structural Health Monitoring (SHM) challenges. This paper presents a novel algorithm to detect and assess damage in structures such as bridges. This method applies tensor analysis for data fusion and feature extraction, and further uses one-class support vector machine on this feature to detect anomalies, i.e., structural damage. To evaluate this approach, we collected acceleration data from a sensor-based SHM system, which we deployed on a real bridge and on a laboratory specimen. The results show that our tensor method outperforms a state-of-the-art approach using the wavelet energy spectrum of the measured data. In the specimen case, our approach succeeded in detecting 92.5% of induced damage cases, as opposed to 61.1% for the wavelet-based approach. While our method was applied to bridges, its algorithm and computation can be used on other structures or sensor-data analysis problems, which involve large series of correlated data from multiple sensors.
|
41
|
|