1
|
Hyperbolic matrix factorization improves prediction of drug-target associations. Sci Rep 2023; 13:959. [PMID: 36653463 PMCID: PMC9849222 DOI: 10.1038/s41598-023-27995-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/24/2022] [Accepted: 01/11/2023] [Indexed: 01/19/2023]
Abstract
Past research in computational systems biology has focused more on the development and applications of advanced statistical and numerical optimization techniques and much less on understanding the geometry of the biological space. By representing biological entities as points in a low dimensional Euclidean space, state-of-the-art methods for drug-target interaction (DTI) prediction implicitly assume the flat geometry of the biological space. In contrast, recent theoretical studies suggest that biological systems exhibit tree-like topology with a high degree of clustering. As a consequence, embedding a biological system in a flat space leads to distortion of distances between biological objects. Here, we present a novel matrix factorization methodology for drug-target interaction prediction that uses hyperbolic space as the latent biological space. When benchmarked against classical, Euclidean methods, hyperbolic matrix factorization exhibits superior accuracy while lowering embedding dimension by an order of magnitude. We see this as additional evidence that the hyperbolic geometry underpins large biological networks.
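The contrast between flat and hyperbolic latent spaces can be illustrated with a minimal sketch. It assumes the Poincaré ball model of hyperbolic space, which may differ from the model used in the paper; the function names and the logistic scoring rule with a `radius` parameter are hypothetical and chosen only for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(arg)


def euclidean_distance(u, v):
    """Ordinary flat-space distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def interaction_probability(drug, target, radius=1.0):
    """Hypothetical score: logistic function of (radius - hyperbolic distance)."""
    return 1.0 / (1.0 + math.exp(poincare_distance(drug, target) - radius))


if __name__ == "__main__":
    near_center = ([0.1, 0.0], [-0.1, 0.0])
    near_boundary = ([0.9, 0.0], [-0.9, 0.0])
    for u, v in (near_center, near_boundary):
        print(euclidean_distance(u, v), poincare_distance(u, v))
```

Distances grow much faster near the boundary of the ball than in the Euclidean case, which is the property that lets tree-like structures embed with low distortion in few dimensions.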
|
2
|
Matrix factorization with denoising autoencoders for prediction of drug–target interactions. Mol Divers 2022:10.1007/s11030-022-10492-8. [DOI: 10.1007/s11030-022-10492-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2022] [Accepted: 07/01/2022] [Indexed: 11/25/2022]
|
3
|
Yazdani-Jahromi M, Yousefi N, Tayebi A, Kolanthai E, Neal CJ, Seal S, Garibay OO. AttentionSiteDTI: an interpretable graph-based model for drug-target interaction prediction using NLP sentence-level relation classification. Brief Bioinform 2022; 23:6640006. [PMID: 35817396 PMCID: PMC9294423 DOI: 10.1093/bib/bbac272] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 02/18/2022] [Revised: 05/01/2022] [Accepted: 06/10/2022] [Indexed: 11/14/2022]
Abstract
In this study, we introduce an interpretable graph-based deep learning prediction model, AttentionSiteDTI, which utilizes protein binding sites along with a self-attention mechanism to address the problem of drug-target interaction prediction. Our proposed model is inspired by sentence classification models in the field of Natural Language Processing, where the drug-target complex is treated as a sentence with relational meaning between its biochemical entities a.k.a. protein pockets and drug molecule. AttentionSiteDTI enables interpretability by identifying the protein binding sites that contribute the most toward the drug-target interaction. Results on three benchmark datasets show improved performance compared with the current state-of-the-art models. More significantly, unlike previous studies, our model shows superior performance, when tested on new proteins (i.e. high generalizability). Through multidisciplinary collaboration, we further experimentally evaluate the practical potential of our proposed approach. To achieve this, we first computationally predict the binding interactions between some candidate compounds and a target protein, then experimentally validate the binding interactions for these pairs in the laboratory. The high agreement between the computationally predicted and experimentally observed (measured) drug-target interactions illustrates the potential of our method as an effective pre-screening tool in drug repurposing applications.
Affiliation(s)
- Mehdi Yazdani-Jahromi
  - Industrial Engineering and Management Systems, University of Central Florida, Street, 32816, 4000 Central Florida Blvd. Orlando, USA
- Niloofar Yousefi
  - Industrial Engineering and Management Systems, University of Central Florida, Street, 32816, 4000 Central Florida Blvd. Orlando, USA
- Aida Tayebi
  - Industrial Engineering and Management Systems, University of Central Florida, Street, 32816, 4000 Central Florida Blvd. Orlando, USA
- Elayaraja Kolanthai
  - College of Medicine, Bionix Cluster, University of Central Florida, 4000 Central Florida Blvd. Orlando, 32816, Florida, USA
- Craig J Neal
  - College of Medicine, Bionix Cluster, University of Central Florida, 4000 Central Florida Blvd. Orlando, 32816, Florida, USA
- Sudipta Seal
  - College of Medicine, Bionix Cluster, University of Central Florida, 4000 Central Florida Blvd. Orlando, 32816, Florida, USA
  - Advanced Materials Processing and Analysis Center, Dept. of Materials Science and Engineering, University of Central Florida, 4000 Central Florida Blvd. Orlando, 32816, Florida, USA
- Ozlem Ozmen Garibay
  - Industrial Engineering and Management Systems, University of Central Florida, Street, 32816, 4000 Central Florida Blvd. Orlando, USA
|
4
|
Wang S, Li J, Wang Y, Juan L. A Neighborhood-Based Global Network Model to Predict Drug-Target Interactions. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2017-2025. [PMID: 33687846 DOI: 10.1109/tcbb.2021.3064614] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
The detection of drug-target interactions (DTIs) plays an important role in drug discovery and development, making DTI prediction a pressing problem. Existing computational methods usually utilize drug similarity, target similarity and DTI information to make predictions quickly and at low cost. However, they usually learn features for drugs and targets separately and lack a global view. In this study, we propose a novel neighborhood-based global network model, named NGN, to accurately predict DTIs from a global perspective. We designed a distance constraint on the features of all entities (drugs and targets) in the latent space to keep adjacent entities close together, and defined a global probability matrix to compute the predicted DTI scores on our constructed neighborhood-based global network. Results showed that NGN performed favorably compared with other state-of-the-art methods, surpassing them by 4.2-9.1 percent in AUPR on the largest dataset. Furthermore, several novel high-ranked DTIs were successfully predicted and confirmed by public sources, demonstrating the effectiveness of our method.
|
5
|
Poleksic A. Overcoming Sparseness of Biomedical Networks to Identify Drug Repositioning Candidates. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2377-2384. [PMID: 33591920 DOI: 10.1109/tcbb.2021.3059807] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Modeling complex biological systems is necessary to understand the biochemical interactions behind the pharmacological effects of drugs. Successful in silico drug repurposing relies on exploration of diverse biochemical concepts and their relationships, including adverse drug reactions, drug targets, disease symptoms, as well as disease-associated genes and their pathways, to name a few. We present a computational method for inferring drug-disease associations from complex but incomplete and biased biological networks. Our method employs matrix completion to overcome the sparseness of biomedical data and to enrich the set of relationships between different biomedical entities. We present a strategy for identifying network paths supportive of drug efficacy as well as a computational procedure capable of combining different network patterns to better distinguish treatments from non-treatments. The algorithm is available at http://bioinfo.cs.uni.edu/AEONET.html.
|
6
|
Detecting Drug–Target Interactions with Feature Similarity Fusion and Molecular Graphs. BIOLOGY 2022; 11:biology11070967. [PMID: 36101348 PMCID: PMC9312204 DOI: 10.3390/biology11070967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Revised: 06/12/2022] [Accepted: 06/24/2022] [Indexed: 12/03/2022]
Abstract
Simple Summary: Accurate identification of potential targets for drugs to interact with can accelerate drug development. The identification of drug–target interactions can provide insights into hidden drug efficacy. This paper presents a prediction model based on feature similarity fusion that can identify crucial features of drugs and targets to help predict drug–target interactions.
Abstract: The key to drug discovery is the identification of a target and a corresponding drug compound. Effective identification of drug–target interactions facilitates drug discovery. In this paper, drug similarity and target similarity are considered, and graphical representations are used to extract internal structural information and intermolecular interaction information about drugs and targets. First, drug similarity and target similarity are fused using the similarity network fusion (SNF) method. Then, the graph isomorphism network (GIN) is used to extract features that carry information about the internal structure of drug molecules. For target proteins, feature extraction is carried out using TextCNN to efficiently capture the features of target protein sequences. Three different divisions (CVD, CVP, CVT) are used on the standard dataset, and experiments are carried out separately to validate the performance of the model for drug–target interaction prediction. The experimental results show that our method achieves better results on AUC and AUPR. The docking results also show the superiority of the proposed model in predicting drug–target interactions.
|
7
|
Ruan D, Ji S, Yan C, Zhu J, Zhao X, Yang Y, Gao Y, Zou C, Dai Q. Exploring complex and heterogeneous correlations on hypergraph for the prediction of drug-target interactions. PATTERNS 2021; 2:100390. [PMID: 34950907 PMCID: PMC8672193 DOI: 10.1016/j.patter.2021.100390] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/11/2021] [Revised: 07/23/2021] [Accepted: 10/21/2021] [Indexed: 01/04/2023]
Abstract
The continuous emergence of drug-target interaction data provides an opportunity to construct a biological network for systematically discovering unknown interactions. However, this is challenging due to complex and heterogeneous correlations between drug and target. Here, we describe a heterogeneous hypergraph-based framework for drug-target interaction (HHDTI) predictions by modeling biological networks through a hypergraph, where each vertex represents a drug or a target and a hyperedge indicates existing similar interactions or associations between the connected vertices. The hypergraph is then trained to generate suitably structured embeddings for discovering unknown interactions. Comprehensive experiments performed on four public datasets demonstrate that HHDTI achieves significant and consistently improved predictions compared with state-of-the-art methods. Our analysis indicates that this superior performance is due to the ability to integrate heterogeneous high-order information from the hypergraph learning. These results suggest that HHDTI is a scalable and practical tool for uncovering novel drug-target interactions.
- A hypergraph framework to model high-order correlations in heterogeneous biological networks
- An embedding learning method for drugs and targets using hypergraphs
- High-order correlations between drugs and targets can contribute to DTI predictions
The prediction of drug-target interactions (DTIs) plays a crucial role in drug discovery. In this work, we discover that the high-order correlations in heterogeneous biological networks are essential for DTI predictions. The hypergraph structure is utilized to model the high-order correlations in the biological networks, and embeddings are then generated for the drugs and targets, respectively. Finally, the interaction between them can be predicted according to the similarity of the embeddings. Our proposed method has been evaluated on multiple public datasets, and the improved performance demonstrates that the high-order correlations among drugs and targets contribute significantly to DTI predictions, and that other associations besides DTIs are also useful in this task. Our method can also be used in other scenarios containing complex correlations.
Affiliation(s)
- Ding Ruan
  - School of Automation, Hangzhou Dianzi University, Hangzhou, China
- Shuyi Ji
  - School of Software, KLISS, BNRist, Tsinghua University, Beijing, China
  - Institute for Brain and Cognitive Sciences, Tsinghua University, Beijing, China
- Chenggang Yan
  - School of Automation, Hangzhou Dianzi University, Hangzhou, China
- Junjie Zhu
  - School of Software, KLISS, BNRist, Tsinghua University, Beijing, China
- Xibin Zhao
  - School of Software, KLISS, BNRist, Tsinghua University, Beijing, China
- Yuedong Yang
  - School of Computer Science, Sun Yat-sen University, Guangzhou, China
- Yue Gao
  - School of Software, KLISS, BNRist, Tsinghua University, Beijing, China
  - Institute for Brain and Cognitive Sciences, Tsinghua University, Beijing, China
  - Corresponding author
- Changqing Zou
  - Huawei Vancouver Research Center, Huawei Canada Technologies, Vancouver, Canada
  - Corresponding author
- Qionghai Dai
  - Institute for Brain and Cognitive Sciences, Tsinghua University, Beijing, China
  - Department of Automation, Tsinghua University, Beijing, China
  - Corresponding author
|
8
|
Thomas M, Boardman A, Garcia-Ortegon M, Yang H, de Graaf C, Bender A. Applications of Artificial Intelligence in Drug Design: Opportunities and Challenges. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2021; 2390:1-59. [PMID: 34731463 DOI: 10.1007/978-1-0716-1787-8_1] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 10/19/2022]
Abstract
Artificial intelligence (AI) has undergone rapid development in recent years and has been successfully applied to real-world problems such as drug design. In this chapter, we review recent applications of AI to problems in drug design including virtual screening, computer-aided synthesis planning, and de novo molecule generation, with a focus on the limitations of the application of AI therein and opportunities for improvement. Furthermore, we discuss the broader challenges imposed by AI in translating theoretical practice to real-world drug design, including quantifying prediction uncertainty and explaining model behavior.
Affiliation(s)
- Morgan Thomas
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
- Andrew Boardman
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
- Miguel Garcia-Ortegon
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
  - Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, UK
- Hongbin Yang
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
- Andreas Bender
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
|
9
|
Xin Y, Henan B, Jianmin N, Wenjuan Y, Honggen Z, Xingyu J, Pengfei Y. Coating matching recommendation based on improved fuzzy comprehensive evaluation and collaborative filtering algorithm. Sci Rep 2021; 11:14035. [PMID: 34234246 PMCID: PMC8263793 DOI: 10.1038/s41598-021-93628-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/25/2021] [Accepted: 06/28/2021] [Indexed: 11/09/2022]
Abstract
Coating matching design is an important part of ship coating process design. The selection of a coating match is influenced by factors such as the marine corrosive environment, the anti-corrosion period and working conditions. Coating performance requirements also differ across ship types and coated parts. At present, coating matching design in shipyards depends on the experience of technologists, which is not conducive to the scientific management of the ship painting process or the macro control of ship construction cost. Therefore, this paper proposes a hybrid algorithm of fuzzy comprehensive evaluation and collaborative filtering based on user label improvement (IFCE-CF). Based on the analytic hierarchy process (AHP), the evaluation index system of coating matching is constructed, and the weight calculation process of fuzzy comprehensive evaluation is optimized by introducing the user label weight. A collaborative filtering algorithm based on matrix decomposition is used to recommend coating matches accurately. Historical coating process data from a shipyard between 2010 and 2020 are used to verify the recommendation ability of the method. The results show that, with the proposed coating matching recommendation algorithm, the root mean square error is < 1.02 and the mean absolute error is < 0.75. The prediction accuracy is significantly better than that of other methods, which demonstrates the effectiveness of the method.
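The two error measures reported in this abstract, root mean square error and mean absolute error, can be computed as in the following minimal sketch. The rating values in the usage example are made up for illustration and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Illustrative (made-up) ratings and predictions
    observed = [4.0, 3.0, 5.0, 2.0]
    estimated = [3.5, 3.0, 4.0, 2.5]
    print(rmse(observed, estimated), mae(observed, estimated))
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.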
Affiliation(s)
- Yuan Xin
  - School of Mechanical Engineering, Jiangsu University of Science and Technology, Zhenjiang, 212100, Jiangsu, China
- Bu Henan
  - School of Mechanical Engineering, Jiangsu University of Science and Technology, Zhenjiang, 212100, Jiangsu, China
- Niu Jianmin
  - Shanghai Shipbuilding Technology Research Institute, Shanghai, 200032, China
- Yu Wenjuan
  - Shanghai Shipbuilding Technology Research Institute, Shanghai, 200032, China
- Zhou Honggen
  - School of Mechanical Engineering, Jiangsu University of Science and Technology, Zhenjiang, 212100, Jiangsu, China
- Ji Xingyu
  - School of Mechanical Engineering, Jiangsu University of Science and Technology, Zhenjiang, 212100, Jiangsu, China
- Ye Pengfei
  - School of Mechanical Engineering, Jiangsu University of Science and Technology, Zhenjiang, 212100, Jiangsu, China
|
10
|
Cai T, Lim H, Abbu KA, Qiu Y, Nussinov R, Xie L. MSA-Regularized Protein Sequence Transformer toward Predicting Genome-Wide Chemical-Protein Interactions: Application to GPCRome Deorphanization. J Chem Inf Model 2021; 61:1570-1582. [PMID: 33757283 PMCID: PMC8154251 DOI: 10.1021/acs.jcim.0c01285] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 11/04/2020] [Indexed: 01/14/2023]
Abstract
Small molecules play a critical role in modulating biological systems. Knowledge of chemical-protein interactions helps address fundamental and practical questions in biology and medicine. However, with the rapid emergence of newly sequenced genes, the endogenous or surrogate ligands of a vast number of proteins remain unknown. Homology modeling and machine learning are two major methods for assigning new ligands to a protein but mostly fail when sequence homology between an unannotated protein and those with known functions or structures is low. In this study, we develop a new deep learning framework to predict chemical binding to evolutionarily divergent unannotated proteins, whose ligands cannot be reliably predicted by existing methods. By incorporating evolutionary information into self-supervised learning of unlabeled protein sequences, we develop a novel method, distilled sequence alignment embedding (DISAE), for the protein sequence representation. DISAE can utilize all protein sequences and their multiple sequence alignment (MSA) to capture functional relationships between proteins without the knowledge of their structure and function. Following DISAE pretraining, we devise a module-based fine-tuning strategy for the supervised learning of chemical-protein interactions. In the benchmark studies, DISAE significantly improves the generalizability of machine learning models and outperforms the state-of-the-art methods by a large margin. Comprehensive ablation studies suggest that the use of MSA, sequence distillation, and triplet pretraining critically contributes to the success of DISAE. The interpretability analysis of DISAE suggests that it learns biologically meaningful information. We further use DISAE to assign ligands to human orphan G-protein coupled receptors (GPCRs) and to cluster the human GPCRome by integrating their phylogenetic and ligand relationships. The promising results of DISAE open an avenue for exploring the chemical landscape of entire sequenced genomes.
Affiliation(s)
- Tian Cai
  - Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, New York 10016, United States
- Hansaim Lim
  - Ph.D. Program in Biochemistry, The Graduate Center, The City University of New York, New York, New York 10016, United States
- Kyra Alyssa Abbu
  - Department of Computer Science, Hunter College, The City University of New York, New York, New York 10065, United States
- Yue Qiu
  - Ph.D. Program in Biology, The Graduate Center, The City University of New York, New York, New York 10016, United States
- Ruth Nussinov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Lei Xie
  - Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, New York 10016, United States
  - Ph.D. Program in Biochemistry, The Graduate Center, The City University of New York, New York, New York 10016, United States
  - Department of Computer Science, Hunter College, The City University of New York, New York, New York 10065, United States
  - Ph.D. Program in Biology, The Graduate Center, The City University of New York, New York, New York 10016, United States
  - Helen and Robert Appel Alzheimer’s Disease Research Institute, Feil Family Brain & Mind Research Institute, Weill Cornell Medicine, Cornell University, New York, New York 10021, United States
|
11
|
Sajadi SZ, Zare Chahooki MA, Gharaghani S, Abbasi K. AutoDTI++: deep unsupervised learning for DTI prediction by autoencoders. BMC Bioinformatics 2021; 22:204. [PMID: 33879050 PMCID: PMC8056558 DOI: 10.1186/s12859-021-04127-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/17/2021] [Accepted: 04/09/2021] [Indexed: 12/12/2022]
Abstract
BACKGROUND: Drug-target interaction (DTI) plays a vital role in drug discovery. Identifying drug-target interactions through wet-lab experiments is costly, laborious, and time-consuming. Therefore, computational prediction of drug-target interactions is an essential task in the drug discovery process. Computational methods can also reduce the search space by proposing potential drugs already validated in wet-lab experiments. Recently, deep learning-based methods for drug-target interaction prediction have received more attention. Traditionally, the performance of DTI prediction methods depends heavily on additional information, such as protein sequence and the molecular structure of the drug, as well as on deep supervised learning.
RESULTS: This paper proposes a method based on deep unsupervised learning for drug-target interaction prediction called AutoDTI++. The proposed method includes three steps. The first step is to pre-process the interaction matrix. Since the interaction matrix is sparse, we address its sparsity using drug fingerprints. Then, in the second step, the AutoDTI approach is introduced. In the third step, we post-process the output of the AutoDTI model.
CONCLUSIONS: Experimental results show that the proposed method improves prediction performance when compared with other algorithms on the same reference datasets. Five repetitions of tenfold cross-validation on the gold standard datasets (Nuclear Receptors, GPCRs, Ion channels, and Enzymes) show good performance with high accuracy.
Affiliation(s)
- Sajjad Gharaghani
  - Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Karim Abbasi
  - Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
|
12
|
Abstract
INTRODUCTION Knowledge graphs have proven to be promising systems of information storage and retrieval. Due to the recent explosion of heterogeneous multimodal data sources generated in the biomedical domain, and an industry shift toward a systems biology approach, knowledge graphs have emerged as attractive methods of data storage and hypothesis generation. AREAS COVERED In this review, the author summarizes the applications of knowledge graphs in drug discovery. They evaluate their utility; differentiating between academic exercises in graph theory, and useful tools to derive novel insights, highlighting target identification and drug repurposing as two areas showing particular promise. They provide a case study on COVID-19, summarizing the research that used knowledge graphs to identify repurposable drug candidates. They describe the dangers of degree and literature bias, and discuss mitigation strategies. EXPERT OPINION Whilst knowledge graphs and graph-based machine learning have certainly shown promise, they remain relatively immature technologies. Many popular link prediction algorithms fail to address strong biases in biomedical data, and only highlight biological associations, failing to model causal relationships in complex dynamic biological systems. These problems need to be addressed before knowledge graphs reach their true potential in drug discovery.
Affiliation(s)
- Finlay MacLean
  - Target Identification, BenevolentAI, United Kingdom of Great Britain and Northern Ireland
|
13
|
Isozaki A, Harmon J, Zhou Y, Li S, Nakagawa Y, Hayashi M, Mikami H, Lei C, Goda K. AI on a chip. LAB ON A CHIP 2020; 20:3074-3090. [PMID: 32644061 DOI: 10.1039/d0lc00521e] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Indexed: 05/09/2023]
Abstract
Artificial intelligence (AI) has dramatically changed the landscape of science, industry, defence, and medicine in the last several years. Supported by considerably enhanced computational power and cloud storage, the field of AI has shifted from mostly theoretical studies in the discipline of computer science to diverse real-life applications such as drug design, material discovery, speech recognition, self-driving cars, advertising, finance, medical imaging, and astronomical observation, where AI-produced outcomes have been proven to be comparable or even superior to the performance of human experts. In these applications, what is essentially important for the development of AI is the data needed for machine learning. Despite its prominent importance, the very first process of the AI development, namely data collection and data preparation, is typically the most laborious task and is often a limiting factor of constructing functional AI algorithms. Lab-on-a-chip technology, in particular microfluidics, is a powerful platform for both the construction and implementation of AI in a large-scale, cost-effective, high-throughput, automated, and multiplexed manner, thereby overcoming the above bottleneck. On this platform, high-throughput imaging is a critical tool as it can generate high-content information (e.g., size, shape, structure, composition, interaction) of objects on a large scale. High-throughput imaging can also be paired with sorting and DNA/RNA sequencing to conduct a massive survey of phenotype-genotype relations whose data is too complex to analyze with traditional computational tools, but is analyzable with the power of AI. In addition to its function as a data provider, lab-on-a-chip technology can also be employed to implement the developed AI for accurate identification, characterization, classification, and prediction of objects in mixed, heterogeneous, or unknown samples. 
In this review article, motivated by the excellent synergy between AI and lab-on-a-chip technology, we outline fundamental elements, recent advances, future challenges, and emerging opportunities of AI with lab-on-a-chip technology or "AI on a chip" for short.
Affiliation(s)
- Akihiro Isozaki
  - Department of Chemistry, University of Tokyo, Tokyo 113-0033, Japan
  - Kanagawa Institute of Industrial Science and Technology, Kanagawa 213-0012, Japan
- Jeffrey Harmon
  - Department of Chemistry, University of Tokyo, Tokyo 113-0033, Japan
- Yuqi Zhou
  - Department of Chemistry, University of Tokyo, Tokyo 113-0033, Japan
- Shuai Li
  - Department of Chemistry, University of Tokyo, Tokyo 113-0033, Japan
  - The Cambridge Centre for Data-Driven Discovery, Cambridge University, Cambridge CB3 0WA, UK
- Yuta Nakagawa
  - Department of Chemistry, University of Tokyo, Tokyo 113-0033, Japan
- Mika Hayashi
  - Department of Chemistry, University of Tokyo, Tokyo 113-0033, Japan
- Hideharu Mikami
  - Department of Chemistry, University of Tokyo, Tokyo 113-0033, Japan
- Cheng Lei
  - Department of Chemistry, University of Tokyo, Tokyo 113-0033, Japan
  - Institute of Technological Sciences, Wuhan University, Hubei 430072, China
- Keisuke Goda
  - Department of Chemistry, University of Tokyo, Tokyo 113-0033, Japan
  - Institute of Technological Sciences, Wuhan University, Hubei 430072, China
  - Department of Bioengineering, University of California, Los Angeles, California 90095, USA
|
14
|
Zhao T, Hu Y, Valsdottir LR, Zang T, Peng J. Identifying drug-target interactions based on graph convolutional network and deep neural network. Brief Bioinform 2020; 22:2141-2150. [PMID: 32367110 DOI: 10.1093/bib/bbaa044] [Citation(s) in RCA: 124] [Impact Index Per Article: 31.0] [Received: 10/31/2019] [Revised: 03/05/2020] [Accepted: 03/06/2020] [Indexed: 12/21/2022]
Abstract
Identification of new drug-target interactions (DTIs) is an important but time-consuming and costly step in drug discovery. In recent years, to mitigate these drawbacks, researchers have sought to identify DTIs using computational approaches. However, most existing methods construct drug networks and target networks separately, and then predict novel DTIs based on known associations between the drugs and targets without accounting for associations between drug-protein pairs (DPPs). To incorporate the associations between DPPs into DTI modeling, we built a DPP network based on multiple drugs and proteins in which DPPs are the nodes and the associations between DPPs are the edges of the network. We then propose a novel learning-based framework, 'graph convolutional network (GCN)-DTI', for DTI identification. The model first uses a graph convolutional network to learn the features for each DPP. Second, using the feature representation as an input, it uses a deep neural network to predict the final label. The results of our analysis show that the proposed framework outperforms some state-of-the-art approaches by a large margin.
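The feature-learning step on the DPP network can be illustrated with a single graph convolution. The sketch below assumes the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W) and is not the authors' implementation; the helper names and the tiny two-node graph are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def normalized_adjacency(adj):
    """Add self-loops, then scale symmetrically by node degree.

    Assumes adj has a zero diagonal (no self-loops yet).
    """
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, features, weights):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    propagated = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, x) for x in row] for row in propagated]


if __name__ == "__main__":
    # Two connected nodes (e.g. two drug-protein pairs), one feature each
    adjacency = [[0.0, 1.0], [1.0, 0.0]]
    node_features = [[1.0], [3.0]]
    layer_weights = [[1.0]]
    print(gcn_layer(adjacency, node_features, layer_weights))
```

In a full model the weights would be learned and the resulting node features passed to a downstream classifier, as the abstract describes for the deep neural network stage.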
Affiliation(s)
- Tianyi Zhao
- Department of Computer Science at Harbin Institute of Technology. He currently works as a bioinformatician in Beth Israel Deaconess Medical Center
| | - Yang Hu
- Department of Life Science at Harbin Institute of Technology. His expertise is bioinformatics
| | - Linda R Valsdottir
- MS in Biology and works as a scientific writer at the Smith Center for Outcomes Research in Cardiology at Beth Israel Deaconess Medical Center in Boston, MA. Her work is focused on helping researchers communicate their findings in an effort to translate novel analytical approaches and clinical expertise into improved outcomes for patients
| | - Tianyi Zang
- School of Computer Science and Technology at Harbin Institute of Technology (HIT), China. Before joining HIT in 2009, he was a research fellow at the Department of Computer Science at University of Oxford, UK. His current research is concerned with biomedical bigdata computing and algorithms, deep-learning algorithms for network data, intelligent recommendation algorithms, and modeling and analysis methods for complex systems
| | - Jiajie Peng
- School of Computer Science at Northwestern Polytechnical University. His expertise is computational biology and machine learning. Availability and implementation: https://github.com/zty2009/GCN-DNN/
| |
|
15
|
Hao M, Bryant SH, Wang Y. Open-source chemogenomic data-driven algorithms for predicting drug-target interactions. Brief Bioinform 2020; 20:1465-1474. [PMID: 29420684 DOI: 10.1093/bib/bby010] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 09/26/2017] [Revised: 01/18/2018] [Indexed: 12/25/2022] Open
Abstract
While novel technologies such as high-throughput screening have advanced together with significant investment by pharmaceutical companies during the past decades, the success rate for drug development has not yet improved, prompting researchers to look for new strategies of drug discovery. Drug repositioning is a potential approach to solve this dilemma. However, experimental identification and validation of potential drug targets encoded by the human genome is both costly and time-consuming. Therefore, effective computational approaches have been proposed to facilitate drug repositioning, and these have proved to be successful in drug discovery. Undoubtedly, the availability of openly accessible data from basic chemical biology research and the success of human genome sequencing are crucial for developing effective in silico drug repositioning methods that allow the identification of potential targets for existing drugs. In this work, we review several chemogenomic data-driven computational algorithms with publicly accessible source code for predicting drug-target interactions (DTIs). We organize these algorithms by model properties and model evolutionary relationships. We re-implemented five representative algorithms in the R programming language, and compared these algorithms by means of mean percentile ranking, a new recall-based evaluation metric in the DTI prediction research field. We anticipate that this review will be objective and helpful to researchers who would like to further improve existing algorithms or need to choose appropriate algorithms to infer potential DTIs in their projects. The source codes for DTI predictions are available at: https://github.com/minghao2016/chemogenomicAlg4DTIpred.
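As a rough illustration of the mean percentile ranking metric named in this abstract, the sketch below follows the formulation commonly used for implicit-feedback recommenders: each held-out true target is assigned the percentile of its position in the drug's ranked candidate list (0 is best, 1 is worst), and the percentiles are averaged. This is an assumption; the review's exact definition may differ, and the function name, drug and target labels, and scores are hypothetical.

```python
from typing import Dict, List, Set


def mean_percentile_ranking(
    scores: Dict[str, Dict[str, float]],
    held_out: Dict[str, Set[str]],
) -> float:
    """Average percentile rank of held-out true targets (0 = top, 1 = bottom)."""
    percentiles: List[float] = []
    for drug, true_targets in held_out.items():
        # Rank this drug's candidate targets from highest to lowest score.
        ranked = sorted(scores[drug], key=scores[drug].get, reverse=True)
        n = len(ranked)
        for position, target in enumerate(ranked):
            if target in true_targets:
                # Position 0 maps to percentile 0.0, the last position to 1.0.
                percentiles.append(position / (n - 1) if n > 1 else 0.0)
    if not percentiles:
        raise ValueError("no held-out interactions found among the candidates")
    return sum(percentiles) / len(percentiles)


# Hypothetical toy scores for two drugs over four candidate targets.
toy_scores = {
    "drugA": {"T1": 0.9, "T2": 0.1, "T3": 0.4, "T4": 0.2},
    "drugB": {"T1": 0.3, "T2": 0.8, "T3": 0.7, "T4": 0.05},
}
toy_held_out = {"drugA": {"T1"}, "drugB": {"T3"}}
mpr = mean_percentile_ranking(toy_scores, toy_held_out)
```

Under this definition lower values are better, and a predictor that ranks candidates at random would be expected to score about 0.5.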
|
16
|
Mohamed SK, Nounu A, Nováček V. Biological applications of knowledge graph embedding models. Brief Bioinform 2020; 22:1679-1693. [PMID: 32065227 DOI: 10.1093/bib/bbaa012] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 09/18/2019] [Revised: 01/10/2020] [Accepted: 01/21/2020] [Indexed: 01/04/2023] Open
Abstract
Complex biological systems are traditionally modelled as graphs of interconnected biological entities. These graphs, i.e. biological knowledge graphs, are then processed using graph exploratory approaches to perform different types of analytical and predictive tasks. Despite the high predictive accuracy of these approaches, they have limited scalability due to their dependency on time-consuming path exploratory procedures. In recent years, owing to the rapid advances of computational technologies, new approaches for modelling graphs and mining them with high accuracy and scalability have emerged. These approaches, i.e. knowledge graph embedding (KGE) models, operate by learning low-rank vector representations of graph nodes and edges that preserve the graph's inherent structure. These approaches were used to analyse knowledge graphs from different domains where they showed superior performance and accuracy compared to previous graph exploratory approaches. In this work, we study this class of models in the context of biological knowledge graphs and their different applications. We then show how KGE models can be a natural fit for representing complex biological knowledge modelled as graphs. We also discuss their predictive and analytical capabilities in different biology applications. In this regard, we present two example case studies that demonstrate the capabilities of KGE models: prediction of drug-target interactions and polypharmacy side effects. Finally, we analyse different practical considerations for KGEs, and we discuss possible opportunities and challenges related to adopting them for modelling biological systems.
Affiliation(s)
| | - Aayah Nounu
- Insight Centre for Data Analytics, NUI Galway, Galway, Ireland
| | - Vít Nováček
- MRC Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, UK
| |
|
17
|
Luo H, Li M, Yang M, Wu FX, Li Y, Wang J. Biomedical data and computational models for drug repositioning: a comprehensive review. Brief Bioinform 2020; 22:1604-1619. [PMID: 32043521 DOI: 10.1093/bib/bbz176] [Citation(s) in RCA: 83] [Impact Index Per Article: 20.8] [Received: 10/23/2019] [Revised: 12/07/2019] [Accepted: 12/26/2019] [Indexed: 12/16/2022] Open
Abstract
Drug repositioning can drastically decrease the cost and duration of traditional drug research and development while avoiding the occurrence of unforeseen adverse events. With the rapid advancement of high-throughput technologies and the explosion of various biological data and medical data, computational drug repositioning methods have been appealing and powerful techniques to systematically identify potential drug-target interactions and drug-disease interactions. In this review, we first summarize the available biomedical data and public databases related to drugs, diseases and targets. Then, we discuss existing drug repositioning approaches and group them based on their underlying computational models consisting of classical machine learning, network propagation, matrix factorization and completion, and deep learning based models. We also comprehensively analyze common standard data sets and evaluation metrics used in drug repositioning, and give a brief comparison of various prediction methods on the gold standard data sets. Finally, we conclude our review with a brief discussion on challenges in computational drug repositioning, which includes the problem of reducing the noise and incompleteness of biomedical data, the ensemble of various computational drug repositioning methods, the importance of designing reliable negative samples selection methods, new techniques dealing with the data sparseness problem, the construction of large-scale and comprehensive benchmark data sets and the analysis and explanation of the underlying mechanisms of predicted interactions.
Affiliation(s)
- Huimin Luo
- School of Computer Science and Engineering at Central South University
| | - Min Li
- School of Computer Science and Engineering at Central South University
| | - Mengyun Yang
- School of Computer Science and Engineering at Central South University
| | - Fang-Xiang Wu
- College of Engineering and the Department of Computer Science at University of Saskatchewan, Saskatoon, Canada
| | - Yaohang Li
- Department of Computer Science at Old Dominion University, Norfolk, USA
| | - Jianxin Wang
- School of Computer Science and Engineering at Central South University
| |
|
18
|
Wang H, Wang J, Dong C, Lian Y, Liu D, Yan Z. A Novel Approach for Drug-Target Interactions Prediction Based on Multimodal Deep Autoencoder. Front Pharmacol 2020; 10:1592. [PMID: 32047432 PMCID: PMC6997437 DOI: 10.3389/fphar.2019.01592] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/16/2019] [Accepted: 12/09/2019] [Indexed: 01/09/2023] Open
Abstract
Drug targets are biomacromolecules or biomolecular structures that bind to specific drugs and produce therapeutic effects. Therefore, the prediction of drug-target interactions (DTIs) is important for disease therapy. Incorporating multiple similarity measures for drugs and targets is essential for improving the accuracy of DTI prediction. However, existing studies with multiple similarity measures ignored the global structure information of similarity measures and required manual extraction of features of drug-target pairs, ignoring the non-linear relationships among features. In this paper, we proposed a novel approach, MDADTI, for DTI prediction based on a multimodal deep autoencoder (MDA). MDADTI applied the random walk with restart method and positive pointwise mutual information to calculate the topological similarity matrices of drugs and targets, capturing the global structure information of similarity measures. Then, MDADTI applied a multimodal deep autoencoder to fuse multiple topological similarity matrices of drugs and targets, automatically learned the low-dimensional features of drugs and targets, and applied a deep neural network to predict DTIs. The results of 5 repeats of 10-fold cross-validation under three different cross-validation settings indicated that MDADTI is superior to the other four baseline methods. In addition, we validated the predictions of MDADTI in six drug-target interaction reference databases, and the results showed that MDADTI can effectively identify unknown DTIs.
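The first stage described in this abstract (random walk with restart followed by positive pointwise mutual information) can be sketched roughly as below. This is a generic textbook version of both techniques in plain Python, not the MDADTI implementation; the restart probability, the number of iterations, the function names and the toy similarity matrix are all assumptions made for illustration.

```python
import math
from typing import List

Matrix = List[List[float]]


def row_normalize(m: Matrix) -> Matrix:
    """Scale each row to sum to 1 so it can serve as transition probabilities."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


def random_walk_with_restart(sim: Matrix, restart: float = 0.5, steps: int = 50) -> Matrix:
    """Row i is the visiting distribution of a walker that keeps restarting at node i."""
    n = len(sim)
    transition = row_normalize(sim)
    result = []
    for start in range(n):
        p0 = [1.0 if j == start else 0.0 for j in range(n)]
        p = p0[:]
        for _ in range(steps):
            # One walk step, then mix in the restart distribution.
            walked = [sum(p[k] * transition[k][j] for k in range(n)) for j in range(n)]
            p = [(1.0 - restart) * walked[j] + restart * p0[j] for j in range(n)]
        result.append(p)
    return result


def ppmi(m: Matrix) -> Matrix:
    """Positive pointwise mutual information: max(0, log(m_ij * total / (row_i * col_j)))."""
    total = sum(sum(row) for row in m)
    row_sums = [sum(row) for row in m]
    col_sums = [sum(m[i][j] for i in range(len(m))) for j in range(len(m[0]))]
    out = []
    for i, row in enumerate(m):
        new_row = []
        for j, v in enumerate(row):
            if v > 0 and row_sums[i] > 0 and col_sums[j] > 0:
                new_row.append(max(0.0, math.log(v * total / (row_sums[i] * col_sums[j]))))
            else:
                new_row.append(0.0)
        out.append(new_row)
    return out


# Hypothetical 3-drug similarity matrix (symmetric, zero diagonal).
toy_similarity = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.3],
    [0.1, 0.3, 0.0],
]
diffused = random_walk_with_restart(toy_similarity)
topological = ppmi(diffused)
```

The diffusion step spreads similarity along multi-step paths, which is one way to read the abstract's "global structure information"; PPMI then keeps only the co-occurrence that exceeds what the row and column totals would predict.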
Affiliation(s)
- Huiqing Wang
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
| | - Jingjing Wang
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
| | - Chunlin Dong
- Dryland Agriculture Research Center, Shanxi Academy of Agricultural Sciences, Taiyuan, China
| | - Yuanyuan Lian
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
| | - Dan Liu
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
| | - Zhiliang Yan
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
| |
|
19
|
Bagherian M, Sabeti E, Wang K, Sartor MA, Nikolovska-Coleska Z, Najarian K. Machine learning approaches and databases for prediction of drug-target interaction: a survey paper. Brief Bioinform 2020; 22:247-269. [PMID: 31950972 PMCID: PMC7820849 DOI: 10.1093/bib/bbz157] [Citation(s) in RCA: 172] [Impact Index Per Article: 43.0] [Received: 09/04/2019] [Revised: 11/01/2019] [Accepted: 11/07/2019] [Indexed: 12/12/2022] Open
Abstract
The task of predicting the interactions between drugs and targets plays a key role in the process of drug discovery. There is a need to develop novel and efficient prediction approaches in order to avoid determining drug–target interactions (DTIs) through costly, laborious and not-always-deterministic experiments alone. These approaches should be capable of identifying the potential DTIs in a timely manner. In this article, we describe the data required for the task of DTI prediction followed by a comprehensive catalog consisting of machine learning methods and databases, which have been proposed and utilized to predict DTIs. The advantages and disadvantages of each set of methods are also briefly discussed. Lastly, the challenges one may face in prediction of DTI using machine learning approaches are highlighted and we conclude by shedding some light on important future research directions.
Affiliation(s)
- Maryam Bagherian
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Elyas Sabeti
- Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Kai Wang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Maureen A Sartor
- Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
| | | | - Kayvan Najarian
- Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, Ann Arbor, MI, 48109, USA
| |
|
20
|
Poleksic A, Xie L. Database of adverse events associated with drugs and drug combinations. Sci Rep 2019; 9:20025. [PMID: 31882773 PMCID: PMC6934730 DOI: 10.1038/s41598-019-56525-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/09/2019] [Accepted: 12/13/2019] [Indexed: 12/26/2022] Open
Abstract
Due to the aging world population and increasing trend in clinical practice to treat patients with multiple drugs, adverse events (AEs) are becoming a major challenge in drug discovery and public health. In particular, identifying AEs caused by drug combinations remains a challenging task. Clinical trials typically focus on individual drugs rather than drug combinations and animal models are unreliable. An added difficulty is the combinatorial explosion in the number of possible combinations that can be made using the increasingly large set of FDA approved chemicals. We present a statistical and computational technique for identifying AEs caused by two-drug combinations. Taking advantage of the large and increasing data deposited in FDA’s postmarketing reports, we demonstrate that the task of predicting AEs for 2-drug combinations is amenable to the Likelihood Ratio Test (LRT). Our pAERS database constructed with LRT contains almost 77 thousand associations between pairs of drugs and corresponding AEs caused solely by drug-drug interactions (DDIs). The DDIs stored in pAERS complement the existing data sets. Due to our stringent statistical test, we expect many of the associations in pAERS to be unrecorded or poorly documented in the literature.
Affiliation(s)
- Aleksandar Poleksic
- Department of Computer Science, University of Northern Iowa, Cedar Falls, Iowa, 50614, USA.
| | - Lei Xie
- Department of Computer Science, Hunter College, The City University of New York, New York, New York, 10065, USA; Ph.D. Program in Computer Science, Biochemistry and Biology, The Graduate Center, The City University of New York, New York, New York, 10065, USA.
| |
|
21
|
Ayed M, Lim H, Xie L. Biological representation of chemicals using latent target interaction profile. BMC Bioinformatics 2019; 20:674. [PMID: 31861982 PMCID: PMC6924142 DOI: 10.1186/s12859-019-3241-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/18/2022] Open
Abstract
Background: Computational prediction of a phenotypic response upon the chemical perturbation on a biological system plays an important role in drug discovery, and many other applications. Chemical fingerprints are a widely used feature to build machine learning models. However, the fingerprints that are derived from chemical structures ignore the biological context, thus, they suffer from several problems such as the activity cliff and curse of dimensionality. Fundamentally, the chemical modulation of biological activities is a multi-scale process. It is the genome-wide chemical-target interactions that modulate chemical phenotypic responses. Thus, the genome-scale chemical-target interaction profile will more directly correlate with in vitro and in vivo activities than the chemical structure. Nevertheless, the scope of direct application of the chemical-target interaction profile is limited due to the severe incompleteness, biasness, and noisiness of bioassay data. Results: To address the aforementioned problems, we developed a novel chemical representation method: Latent Target Interaction Profile (LTIP). LTIP embeds chemicals into a low dimensional continuous latent space that represents genome-scale chemical-target interactions. Subsequently LTIP can be used as a feature to build machine learning models. Using the drug sensitivity of cancer cell lines as a benchmark, we have shown that the LTIP robustly outperforms chemical fingerprints regardless of machine learning algorithms. Moreover, the LTIP is complementary with the chemical fingerprints. It is possible for us to combine LTIP with other fingerprints to further improve the performance of bioactivity prediction. Conclusions: Our results demonstrate the potential of LTIP in particular and multi-scale modeling in general in predictive modeling of chemical modulation of biological activities.
Affiliation(s)
- Mohamed Ayed
- Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, NY, USA
| | - Hansaim Lim
- Ph.D. Program in Biochemistry, The Graduate Center, The City University of New York, New York, NY, USA
| | - Lei Xie
- Department of Computer Science, Hunter College, & The Graduate Center, The City University of New York, New York, NY, USA.
| |
|
22
|
Rifaioglu AS, Atas H, Martin MJ, Cetin-Atalay R, Atalay V, Doğan T. Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Brief Bioinform 2019; 20:1878-1912. [PMID: 30084866 PMCID: PMC6917215 DOI: 10.1093/bib/bby061] [Citation(s) in RCA: 237] [Impact Index Per Article: 47.4] [Received: 01/25/2018] [Revised: 05/25/2018] [Indexed: 01/16/2023] Open
Abstract
The identification of interactions between drugs/compounds and their targets is crucial for the development of new drugs. In vitro screening experiments (i.e. bioassays) are frequently used for this purpose; however, experimental approaches are insufficient to explore novel drug-target interactions, mainly because of feasibility problems, as they are labour intensive, costly and time consuming. A computational field known as 'virtual screening' (VS) has emerged in the past decades to aid experimental drug discovery studies by statistically estimating unknown bio-interactions between compounds and biological targets. These methods use the physico-chemical and structural properties of compounds and/or target proteins along with the experimentally verified bio-interaction information to generate predictive models. Lately, sophisticated machine learning techniques have been applied in VS to elevate the predictive performance. The objective of this study is to examine and discuss the recent applications of machine learning techniques in VS, including deep learning, which became highly popular after giving rise to epochal developments in the fields of computer vision and natural language processing. The past 3 years have witnessed an unprecedented amount of research studies considering the application of deep learning in biomedicine, including computational drug discovery. In this review, we first describe the main instruments of VS methods, including compound and protein features (i.e. representations and descriptors), frequently used libraries and toolkits for VS, bioactivity databases and gold-standard data sets for system training and benchmarking. We subsequently review recent VS studies with a strong emphasis on deep learning applications. Finally, we discuss the present state of the field, including the current challenges and suggest future directions. We believe that this survey will provide insight to the researchers working in the field of computational drug discovery in terms of comprehending and developing novel bio-prediction methods.
Affiliation(s)
- Ahmet Sureyya Rifaioglu
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
- Department of Computer Engineering, İskenderun Technical University, Hatay, Turkey
| | - Heval Atas
- Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Maria Jesus Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK
| | - Rengul Cetin-Atalay
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
| | - Volkan Atalay
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
| | - Tunca Doğan
- Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey and European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK
| |
|
23
|
Poleksic A, Xie L. Predicting serious rare adverse reactions of novel chemicals. Bioinformatics 2019; 34:2835-2842. [PMID: 29617731 PMCID: PMC6084596 DOI: 10.1093/bioinformatics/bty193] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/28/2017] [Accepted: 03/28/2018] [Indexed: 02/02/2023] Open
Abstract
Motivation: Adverse drug reactions (ADRs) are one of the main causes of death and a major financial burden on the world’s economy. Due to the limitations of the animal model, computational prediction of serious and rare ADRs is invaluable. However, current state-of-the-art computational methods do not yield significantly better predictions of rare ADRs than random guessing. Results: We present a novel method, based on the theory of ‘compressed sensing’ (CS), which can accurately predict serious side-effects of candidate and market drugs. Not only is our method able to infer new chemical-ADR associations using existing noisy, biased and incomplete databases, but our data also demonstrate that the accuracy of CS in predicting a serious ADR for a candidate drug increases with increasing knowledge of other ADRs associated with the drug. In practice, this means that as the candidate drug moves up the different stages of clinical trials, the prediction accuracy of our method will increase accordingly. Availability and implementation: The program is available at https://github.com/poleksic/side-effects. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Aleksandar Poleksic
- Department of Computer Science, University of Northern Iowa, Cedar Falls, IA, USA
| | - Lei Xie
- Department of Computer Science, Hunter College, The Graduate Center, The City University of New York, New York, NY, USA; Ph.D. Program in Computer Science, Biochemistry and Biology, The Graduate Center, The City University of New York, New York, NY, USA
| |
|
24
|
Martínez MJ, Razuc M, Ponzoni I. MoDeSuS: A Machine Learning Tool for Selection of Molecular Descriptors in QSAR Studies Applied to Molecular Informatics. BIOMED RESEARCH INTERNATIONAL 2019; 2019:2905203. [PMID: 30906770 PMCID: PMC6398071 DOI: 10.1155/2019/2905203] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/11/2018] [Revised: 01/10/2019] [Accepted: 01/19/2019] [Indexed: 01/15/2023]
Abstract
The selection of the most relevant molecular descriptors to describe a target variable in the context of QSAR (Quantitative Structure-Activity Relationship) modelling is a challenging combinatorial optimization problem. In this paper, a novel software tool for addressing this task in the context of regression and classification modelling is presented. The methodology implemented by the tool is organized into two phases. The first phase uses a multiobjective evolutionary technique to perform the selection of subsets of descriptors. The second phase performs an external validation of the chosen descriptor subsets in order to improve reliability. The tool's functionalities have been illustrated through a case study for the estimation of the ready biodegradation property as an example of classification QSAR modelling. The results obtained show the usefulness and potential of this novel software tool that aims to reduce the time and costs of development in the drug discovery process.
Affiliation(s)
- María Jimena Martínez
- Instituto de Ciencias e Ingeniería de la Computación (UNS-CONICET), Departamento de Ciencias e Ingeniería de la Computación, Universidad Nacional del Sur (UNS), CP 8000, Bahía Blanca, Argentina
| | - Marina Razuc
- Instituto de Ciencias e Ingeniería de la Computación (UNS-CONICET), Departamento de Ciencias e Ingeniería de la Computación, Universidad Nacional del Sur (UNS), CP 8000, Bahía Blanca, Argentina
- Comisión de Investigaciones Científicas de la Provincia de Buenos Aires (CIC), Calle 526 between 10 and 11, CP 1900, La Plata, Argentina
| | - Ignacio Ponzoni
- Instituto de Ciencias e Ingeniería de la Computación (UNS-CONICET), Departamento de Ciencias e Ingeniería de la Computación, Universidad Nacional del Sur (UNS), CP 8000, Bahía Blanca, Argentina
| |
|
25
|
Abstract
Systems pharmacology aims to understand drug actions at multiple scales, from atomic details of drug-target interactions to emergent properties of biological networks, and to rationally design drugs targeting an interacting network instead of a single gene. Multifaceted data-driven studies, including machine learning-based predictions, play a key role in systems pharmacology. In such works, the integration of multiple omics data is the key initial step, followed by optimization and prediction. Here, we describe the overall procedures for drug-target association prediction using REMAP, a large-scale off-target prediction tool. The method introduced here can be applied to other relation inference problems in systems pharmacology.
Affiliation(s)
- Hansaim Lim
- The Ph.D. Program in Biochemistry, The Graduate Center, The City University of New York, New York, NY, USA
| | - Lei Xie
- The Ph.D. Program in Biochemistry, The Graduate Center, The City University of New York, New York, NY, USA.
- Department of Computer Science, Hunter College, The City University of New York, New York, NY, USA.
| |
|
26
|
Olayan RS, Ashoor H, Bajic VB. DDR: efficient computational method to predict drug-target interactions using graph mining and machine learning approaches. Bioinformatics 2018; 34:1164-1173. [PMID: 29186331 PMCID: PMC5998943 DOI: 10.1093/bioinformatics/btx731] [Citation(s) in RCA: 107] [Impact Index Per Article: 17.8] [Received: 08/03/2017] [Accepted: 11/23/2017] [Indexed: 02/06/2023] Open
Abstract
Motivation: Computationally finding drug–target interactions (DTIs) is a convenient strategy to identify new DTIs at low cost with reasonable accuracy. However, current DTI prediction methods suffer from a high false positive prediction rate. Results: We developed DDR, a novel method that improves the DTI prediction accuracy. DDR is based on the use of a heterogeneous graph that contains known DTIs with multiple similarities between drugs and multiple similarities between target proteins. DDR applies a non-linear similarity fusion method to combine different similarities. Before fusion, DDR performs a pre-processing step where a subset of similarities is selected in a heuristic process to obtain an optimized combination of similarities. Then, DDR applies a random forest model using different graph-based features extracted from the DTI heterogeneous graph. Using 5 repeats of 10-fold cross-validation, three testing setups, and the weighted average of area under the precision-recall curve (AUPR) scores, we show that DDR significantly reduces the AUPR score error relative to the next best state-of-the-art method for predicting DTIs by 31% when the drugs are new, by 23% when targets are new and by 34% when the drugs and the targets are known but not all DTIs between them are known. Using independent sources of evidence, we verify as correct 22 out of the top 25 DDR novel predictions. This suggests that DDR can be used as an efficient method to identify correct DTIs. Availability and implementation: The data and code are provided at https://bitbucket.org/RSO24/ddr/. Supplementary information: Supplementary data are available at Bioinformatics online.
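To make the "reduces the AUPR score error by 31%" phrasing concrete, the short sketch below assumes that the error is defined as 1 minus AUPR and that the reduction is measured relative to the baseline's error. That reading is an assumption, not something the abstract states, and the AUPR values in the example are hypothetical rather than taken from the paper.

```python
def relative_error_reduction(aupr_baseline: float, aupr_new: float) -> float:
    """Fraction by which the error (1 - AUPR) shrinks when moving from baseline to new."""
    err_baseline = 1.0 - aupr_baseline
    err_new = 1.0 - aupr_new
    if err_baseline <= 0.0:
        raise ValueError("baseline already has zero error; reduction is undefined")
    return (err_baseline - err_new) / err_baseline


# Hypothetical scores: a baseline AUPR of 0.80 and an improved AUPR of 0.86.
# The error falls from 0.20 to 0.14, a relative reduction of about 0.30.
example = relative_error_reduction(0.80, 0.86)
```

Read this way, a 31% error reduction can correspond to a fairly small absolute AUPR gain when the baseline is already strong, which is worth keeping in mind when comparing the three testing setups.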
Affiliation(s)
- Rawan S Olayan
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, Saudi Arabia
| | - Haitham Ashoor
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut 06032, USA
| | - Vladimir B Bajic
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, Saudi Arabia
| |
|
27
|
Wang C, Kurgan L. Review and comparative assessment of similarity-based methods for prediction of drug–protein interactions in the druggable human proteome. Brief Bioinform 2018; 20:2066-2087. [DOI: 10.1093/bib/bby069] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 05/07/2018] [Revised: 06/26/2018] [Accepted: 07/10/2018] [Indexed: 12/18/2022] Open
Abstract
Drug–protein interactions (DPIs) underlie the desired therapeutic actions and the adverse side effects of a significant majority of drugs. Computational prediction of DPIs facilitates research in drug discovery, characterization and repurposing. Similarity-based methods that do not require knowledge of protein structures are particularly suitable for druggable genome-wide predictions of DPIs. We review 35 high-impact similarity-based predictors that were published in the past decade. We group them based on three types of similarities and their combinations that they use. We discuss and compare key aspects of these methods including source databases, internal databases and their predictive models. Using our novel benchmark database, we perform comparative empirical analysis of predictive performance of seven types of representative predictors that utilize each type of similarity individually and all possible combinations of similarities. We assess predictive quality at the database-wide DPI level and we are the first to also include evaluation over individual drugs. Our comprehensive analysis shows that predictors that use more similarity types outperform methods that employ fewer similarities, and that the model combining all three types of similarities secures area under the receiver operating characteristic curve of 0.93. We offer a comprehensive analysis of sensitivity of predictive performance to intrinsic and extrinsic characteristics of the considered predictors. We find that predictive performance is sensitive to low levels of similarities between sequences of the drug targets and several extrinsic properties of the input drug structures, drug profiles and drug targets. The benchmark database and a webserver for the seven predictors are freely available at http://biomine.cs.vcu.edu/servers/CONNECTOR/.
Affiliation(s)
- Chen Wang, Computer Science Department, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan, Computer Science Department, Virginia Commonwealth University, Richmond, VA 23284, USA
|
28
|
Olayan RS, Ashoor H, Bajic VB. DDR: efficient computational method to predict drug-target interactions using graph mining and machine learning approaches. Bioinformatics 2018; 34:3779. [PMID: 29917050 PMCID: PMC6198857 DOI: 10.1093/bioinformatics/bty417] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/01/2023] Open
|
29
|
Lim H, Poleksic A, Xie L. Exploring Landscape of Drug-Target-Pathway-Side Effect Associations. AMIA Joint Summits on Translational Science Proceedings 2018; 2017:132-141. [PMID: 29888057 PMCID: PMC5961812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Abstract
Side effects are the second leading cause of drug attrition and the fourth leading cause of death in the US. Thus, accurate prediction of side effects and understanding of their mechanisms of action will significantly impact drug discovery and clinical practice. Here, we show that REMAP, a neighborhood-regularized weighted and imputed one-class collaborative filtering algorithm, is effective in predicting drug-side effect associations from a drug-side effect association network, and significantly outperforms the state-of-the-art multi-target learning algorithm for predicting rare side effects. We also apply FASCINATE, an extension of REMAP for multi-layered networks, to infer associations among side effects and drug targets from drug-target-side effect networks. Then, using random permutation analysis and gene overrepresentation tests, we infer statistically significant side effect-pathway associations. The predicted drug-side effect associations and side effect-causing pathways are consistent with clinical evidence. We expect more novel drug-side effect associations and side effect-causing pathways to be identified when applying REMAP and FASCINATE to large-scale chemical-gene-side effect networks.
Affiliation(s)
- Hansaim Lim, PhD program in Biochemistry, the City University of New York, New York, NY, United States
- Aleksandar Poleksic, Department of Computer Science, University of Northern Iowa, Cedar Falls, IA, United States
- Lei Xie, Department of Computer Science, Hunter College, the City University of New York, New York, NY, United States
|
30
|
Recent developments and emerging trends of mass spectrometry for herbal ingredients analysis. Trends Analyt Chem 2017. [DOI: 10.1016/j.trac.2017.07.007] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Indexed: 11/22/2022]
|