1
|
Yadav S, Vora DS, Sundar D, Dhanjal JK. TCR-ESM: Employing protein language embeddings to predict TCR-peptide-MHC binding. Comput Struct Biotechnol J 2024; 23:165-173. [PMID: 38146434 PMCID: PMC10749252 DOI: 10.1016/j.csbj.2023.11.037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2023] [Revised: 11/19/2023] [Accepted: 11/20/2023] [Indexed: 12/27/2023] Open
Abstract
Cognate target identification for T-cell receptors (TCRs) is a significant barrier in T-cell therapy development, which may be overcome by accurately predicting TCR interaction with peptide-bound major histocompatibility complex (pMHC). In this study, we have employed peptide embeddings learned from a large protein language model- Evolutionary Scale Modeling (ESM), to predict TCR-pMHC binding. The TCR-ESM model presented outperforms existing predictors. The complementarity-determining region 3 (CDR3) of the hypervariable TCR is located at the center of the paratope and plays a crucial role in peptide recognition. TCR-ESM trained on paired TCR data with both CDR3α and CDR3β chain information performs significantly better than those trained on data with only CDR3β, suggesting that both TCR chains contribute to specificity, the relative importance however depends on the specific peptide-MHC targeted. The study illuminates the importance of MHC information in TCR-peptide binding which remained inconclusive so far and was thought dependent on the dataset characteristics. TCR-ESM outperforms existing approaches on external datasets, suggesting generalizability. Overall, the potential of deep learning for predicting TCR-pMHC interactions and improving the understanding of factors driving TCR specificity are highlighted. The prediction model is available at http://tcresm.dhanjal-lab.iiitd.edu.in/ as an online tool.
Collapse
Affiliation(s)
- Shashank Yadav
- Department of Biomedical Engineering, University of Arizona, Tucson 85721, AZ, USA
| | - Dhvani Sandip Vora
- Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, New Delhi 110016, India
- Department of Computational Biology, Indraprastha Institute of Information Technology, Delhi, New Delhi 110020, India
| | - Durai Sundar
- Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, New Delhi 110016, India
| | - Jaspreet Kaur Dhanjal
- Department of Computational Biology, Indraprastha Institute of Information Technology, Delhi, New Delhi 110020, India
| |
Collapse
|
2
|
Lavecchia A. Advancing drug discovery with deep attention neural networks. Drug Discov Today 2024; 29:104067. [PMID: 38925473 DOI: 10.1016/j.drudis.2024.104067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2024] [Revised: 06/10/2024] [Accepted: 06/19/2024] [Indexed: 06/28/2024]
Abstract
In the dynamic field of drug discovery, deep attention neural networks are revolutionizing our approach to complex data. This review explores the attention mechanism and its extended architectures, including graph attention networks (GATs), transformers, bidirectional encoder representations from transformers (BERT), generative pre-trained transformers (GPTs) and bidirectional and auto-regressive transformers (BART). Delving into their core principles and multifaceted applications, we uncover their pivotal roles in catalyzing de novo drug design, predicting intricate molecular properties and deciphering elusive drug-target interactions. Despite challenges, these attention-based architectures hold unparalleled promise to drive transformative breakthroughs and accelerate progress in pharmaceutical research.
Collapse
Affiliation(s)
- Antonio Lavecchia
- Drug Discovery Laboratory, Department of Pharmacy, University of Napoli Federico II, I-80131 Naples, Italy.
| |
Collapse
|
3
|
Zabihian A, Asghari J, Hooshmand M, Gharaghani S. A comparative analysis of computational drug repurposing approaches: proposing a novel tensor-matrix-tensor factorization method. Mol Divers 2024:10.1007/s11030-024-10851-7. [PMID: 38683487 DOI: 10.1007/s11030-024-10851-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Efficient drug discovery relies on drug repurposing, an important and open research field. This work presents a novel factorization method and a practical comparison of different approaches for drug repurposing. First, we propose a novel tensor-matrix-tensor (TMT) formulation as a new data array method with a gradient-based factorization procedure. Additionally, this paper examines and contrasts four computational drug repurposing approaches-factorization-based methods, machine learning methods, deep learning methods, and graph neural networks-to fulfill the second purpose. We test the strategies on two datasets and assess each approach's performance, drawbacks, problems, and benefits based on results. The results demonstrate that deep learning techniques work better than other strategies and that their results might be more reliable. Ultimately, graph neural methods need to be in an inductive manner to have a reliable prediction.
Collapse
Affiliation(s)
- Arash Zabihian
- Department of Bioinformatics, Kish International Campus, University of Tehran, Kish, Iran
| | - Javad Asghari
- Department of Computer Science and Information Technology, Institute of Advanced Studies in Basic Sciences, Zanjan, Iran
| | - Mohsen Hooshmand
- Department of Computer Science and Information Technology, Institute of Advanced Studies in Basic Sciences, Zanjan, Iran.
| | - Sajjad Gharaghani
- Laboratory of Bioinformatics and Drug Design, University of Tehran, Tehran, Iran
| |
Collapse
|
4
|
Gorantla R, Kubincová A, Weiße AY, Mey ASJS. From Proteins to Ligands: Decoding Deep Learning Methods for Binding Affinity Prediction. J Chem Inf Model 2024; 64:2496-2507. [PMID: 37983381 PMCID: PMC11005465 DOI: 10.1021/acs.jcim.3c01208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2023] [Revised: 10/26/2023] [Accepted: 10/27/2023] [Indexed: 11/22/2023]
Abstract
Accurate in silico prediction of protein-ligand binding affinity is important in the early stages of drug discovery. Deep learning-based methods exist but have yet to overtake more conventional methods such as giga-docking largely due to their lack of generalizability. To improve generalizability, we need to understand what these models learn from input protein and ligand data. We systematically investigated a sequence-based deep learning framework to assess the impact of protein and ligand encodings on predicting binding affinities for commonly used kinase data sets. The role of proteins is studied using convolutional neural network-based encodings obtained from sequences and graph neural network-based encodings enriched with structural information from contact maps. Ligand-based encodings are generated from graph-neural networks. We test different ligand perturbations by randomizing node and edge properties. For proteins, we make use of 3 different protein contact generation methods (AlphaFold2, Pconsc4, and ESM-1b) and compare these with a random control. Our investigation shows that protein encodings do not substantially impact the binding predictions, with no statistically significant difference in binding affinity for KIBA in the investigated metrics (concordance index, Pearson's R Spearman's Rank, and RMSE). Significant differences are seen for ligand encodings with random ligands and random ligand node properties, suggesting a much bigger reliance on ligand data for the learning tasks. Using different ways to combine protein and ligand encodings did not show a significant change in performance.
Collapse
Affiliation(s)
- Rohan Gorantla
- School
of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, U.K.
- EaStCHEM
School of Chemistry, University of Edinburgh, Edinburgh, EH9 3FJ, U.K.
| | | | - Andrea Y. Weiße
- School
of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, U.K.
- School
of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3FF, U.K.
| | | |
Collapse
|
5
|
Qi X, Zhao Y, Qi Z, Hou S, Chen J. Machine Learning Empowering Drug Discovery: Applications, Opportunities and Challenges. Molecules 2024; 29:903. [PMID: 38398653 PMCID: PMC10892089 DOI: 10.3390/molecules29040903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2024] [Revised: 02/08/2024] [Accepted: 02/14/2024] [Indexed: 02/25/2024] Open
Abstract
Drug discovery plays a critical role in advancing human health by developing new medications and treatments to combat diseases. How to accelerate the pace and reduce the costs of new drug discovery has long been a key concern for the pharmaceutical industry. Fortunately, by leveraging advanced algorithms, computational power and biological big data, artificial intelligence (AI) technology, especially machine learning (ML), holds the promise of making the hunt for new drugs more efficient. Recently, the Transformer-based models that have achieved revolutionary breakthroughs in natural language processing have sparked a new era of their applications in drug discovery. Herein, we introduce the latest applications of ML in drug discovery, highlight the potential of advanced Transformer-based ML models, and discuss the future prospects and challenges in the field.
Collapse
Affiliation(s)
- Xin Qi
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China; (Y.Z.); (S.H.); (J.C.)
| | - Yuanchun Zhao
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China; (Y.Z.); (S.H.); (J.C.)
| | - Zhuang Qi
- School of Software, Shandong University, Jinan 250101, China;
| | - Siyu Hou
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China; (Y.Z.); (S.H.); (J.C.)
| | - Jiajia Chen
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China; (Y.Z.); (S.H.); (J.C.)
| |
Collapse
|
6
|
Liu J, Yang M, Yu Y, Xu H, Li K, Zhou X. Large language models in bioinformatics: applications and perspectives. ARXIV 2024:arXiv:2401.04155v1. [PMID: 38259343 PMCID: PMC10802675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Subscribe] [Scholar Register] [Indexed: 01/24/2024]
Abstract
Large language models (LLMs) are a class of artificial intelligence models based on deep learning, which have great performance in various tasks, especially in natural language processing (NLP). Large language models typically consist of artificial neural networks with numerous parameters, trained on large amounts of unlabeled input using self-supervised or semi-supervised learning. However, their potential for solving bioinformatics problems may even exceed their proficiency in modeling human language. In this review, we will present a summary of the prominent large language models used in natural language processing, such as BERT and GPT, and focus on exploring the applications of large language models at different omics levels in bioinformatics, mainly including applications of large language models in genomics, transcriptomics, proteomics, drug discovery and single cell analysis. Finally, this review summarizes the potential and prospects of large language models in solving bioinformatic problems.
Collapse
Affiliation(s)
- Jiajia Liu
- Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, 77030, USA
| | - Mengyuan Yang
- School of Life Sciences, Zhengzhou University, Zhengzhou, Henan 450001, China
| | - Yankai Yu
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China
| | - Haixia Xu
- The Center of Gerontology and Geriatrics, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| | - Kang Li
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| | - Xiaobo Zhou
- Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, 77030, USA
- McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| |
Collapse
|
7
|
Le NQK. Leveraging transformers-based language models in proteome bioinformatics. Proteomics 2023; 23:e2300011. [PMID: 37381841 DOI: 10.1002/pmic.202300011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2023] [Revised: 06/13/2023] [Accepted: 06/13/2023] [Indexed: 06/30/2023]
Abstract
In recent years, the rapid growth of biological data has increased interest in using bioinformatics to analyze and interpret this data. Proteomics, which studies the structure, function, and interactions of proteins, is a crucial area of bioinformatics. Using natural language processing (NLP) techniques in proteomics is an emerging field that combines machine learning and text mining to analyze biological data. Recently, transformer-based NLP models have gained significant attention for their ability to process variable-length input sequences in parallel, using self-attention mechanisms to capture long-range dependencies. In this review paper, we discuss the recent advancements in transformer-based NLP models in proteome bioinformatics and examine their advantages, limitations, and potential applications to improve the accuracy and efficiency of various tasks. Additionally, we highlight the challenges and future directions of using these models in proteome bioinformatics research. Overall, this review provides valuable insights into the potential of transformer-based NLP models to revolutionize proteome bioinformatics.
Collapse
Affiliation(s)
- Nguyen Quoc Khanh Le
- Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- AIBioMed Research Group, Taipei Medical University, Taipei, Taiwan
- Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei, Taiwan
- Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan
| |
Collapse
|
8
|
Zhang Y, Liu C, Liu M, Liu T, Lin H, Huang CB, Ning L. Attention is all you need: utilizing attention in AI-enabled drug discovery. Brief Bioinform 2023; 25:bbad467. [PMID: 38189543 PMCID: PMC10772984 DOI: 10.1093/bib/bbad467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2023] [Revised: 11/03/2023] [Accepted: 11/25/2023] [Indexed: 01/09/2024] Open
Abstract
Recently, attention mechanism and derived models have gained significant traction in drug development due to their outstanding performance and interpretability in handling complex data structures. This review offers an in-depth exploration of the principles underlying attention-based models and their advantages in drug discovery. We further elaborate on their applications in various aspects of drug development, from molecular screening and target binding to property prediction and molecule generation. Finally, we discuss the current challenges faced in the application of attention mechanisms and Artificial Intelligence technologies, including data quality, model interpretability and computational resource constraints, along with future directions for research. Given the accelerating pace of technological advancement, we believe that attention-based models will have an increasingly prominent role in future drug discovery. We anticipate that these models will usher in revolutionary breakthroughs in the pharmaceutical domain, significantly accelerating the pace of drug development.
Collapse
Affiliation(s)
- Yang Zhang
- Innovative Institute of Chinese Medicine and Pharmacy, Academy for Interdiscipline, Chengdu University of Traditional Chinese Medicine, Chengdu, China
| | - Caiqi Liu
- Department of Gastrointestinal Medical Oncology, Harbin Medical University Cancer Hospital, No.150 Haping Road, Nangang District, Harbin, Heilongjiang 150081, China
- Key Laboratory of Molecular Oncology of Heilongjiang Province, No.150 Haping Road, Nangang District, Harbin, Heilongjiang 150081, China
| | - Mujiexin Liu
- Chongqing Key Laboratory of Sichuan-Chongqing Co-construction for Diagnosis and Treatment of Infectious Diseases Integrated Traditional Chinese and Western Medicine, College of Medical Technology, Chengdu University of Traditional Chinese Medicine, Chengdu, China
| | - Tianyuan Liu
- Graduate School of Science and Technology, University of Tsukuba, Tsukuba, Japan
| | - Hao Lin
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Cheng-Bing Huang
- School of Computer Science and Technology, Aba Teachers University, Aba, China
| | - Lin Ning
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
- School of Healthcare Technology, Chengdu Neusoft University, Chengdu 611844, China
| |
Collapse
|
9
|
Ding J, Qiao Y, Zhang L. Plant disease prescription recommendation based on electronic medical records and sentence embedding retrieval. PLANT METHODS 2023; 19:91. [PMID: 37633904 PMCID: PMC10463767 DOI: 10.1186/s13007-023-01070-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/23/2023] [Accepted: 08/11/2023] [Indexed: 08/28/2023]
Abstract
BACKGROUND In the era of Agri 4.0 and the popularity of Plantwise systems, the availability of Plant Electronic Medical Records has provided opportunities to extract valuable disease information and treatment knowledge. However, developing an effective prescription recommendation method based on these records presents unique challenges, such as inadequate labeling data, lack of structural and linguistic specifications, incorporation of new prescriptions, and consideration of multiple factors in practical situations. RESULTS This study proposes a plant disease prescription recommendation method called PRSER, which is based on sentence embedding retrieval. The semantic matching model is created using a pre-trained language model and a sentence embedding method with contrast learning ideas, and the constructed prescription reference database is retrieved for optimal prescription recommendations. A multi-vegetable disease dataset and a multi-fruit disease dataset are constructed to compare three pre-trained language models, four pooling types, and two loss functions. The PRSER model achieves the best semantic matching performance by combining MacBERT, CoSENT, and CLS pooling, resulting in a Pearson coefficient of 86.34% and a Spearman coefficient of 77.67%. The prescription recommendation capability of the model is also verified. PRSER performs well in closed-set testing with Top-1/Top-3/Top-5 accuracy of 88.20%/96.07%/97.70%; and slightly worse in open-set testing with Top-1/Top-3/Top-5 accuracy of 82.04%/91.50%/94.90%. Finally, a plant disease prescription recommendation system for mobile terminals is constructed and its generalization ability with incomplete inputs is verified. When only symptom information is available without environment and plant information, our model shows slightly lower accuracy with Top-1/Top-3/Top-5 accuracy of 75.24%/88.35%/91.99% in closed-set testing and Top-1/Top-3/Top-5 accuracy of 75.08%/87.54%/89.84% in open-set testing. CONCLUSIONS The experiments validate the effectiveness and generalization ability of the proposed approach for recommending plant disease prescriptions. This research has significant potential to facilitate the implementation of artificial intelligence in plant disease treatment, addressing the needs of farmers and advancing scientific plant disease management.
Collapse
Affiliation(s)
- Junqi Ding
- China Agricultural University, Beijing, 100083, China
| | - Yan Qiao
- Beijing Plant Protection Station, Beijing, 100029, China
| | - Lingxian Zhang
- China Agricultural University, Beijing, 100083, China.
- Key Laboratory of Agricultural Informationization Standardization, Ministry of Agriculture and Rural Affairs, Beijing, China.
- College of Information and Electrical Engineering, China Agricultural University, 209# No.17 Qinghua Donglu, Haidian District, Beijing, 100083, China.
| |
Collapse
|
10
|
Zabihian A, Sayyad FZ, Hashemi SM, Shami Tanha R, Hooshmand M, Gharaghani S. DEDTI versus IEDTI: efficient and predictive models of drug-target interactions. Sci Rep 2023; 13:9238. [PMID: 37286613 DOI: 10.1038/s41598-023-36438-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2023] [Accepted: 06/03/2023] [Indexed: 06/09/2023] Open
Abstract
Drug repurposing is an active area of research that aims to decrease the cost and time of drug development. Most of those efforts are primarily concerned with the prediction of drug-target interactions. Many evaluation models, from matrix factorization to more cutting-edge deep neural networks, have come to the scene to identify such relations. Some predictive models are devoted to the prediction's quality, and others are devoted to the efficiency of the predictive models, e.g., embedding generation. In this work, we propose new representations of drugs and targets useful for more prediction and analysis. Using these representations, we propose two inductive, deep network models of IEDTI and DEDTI for drug-target interaction prediction. Both of them use the accumulation of new representations. The IEDTI takes advantage of triplet and maps the input accumulated similarity features into meaningful embedding corresponding vectors. Then, it applies a deep predictive model to each drug-target pair to evaluate their interaction. The DEDTI directly uses the accumulated similarity feature vectors of drugs and targets and applies a predictive model on each pair to identify their interactions. We have done a comprehensive simulation on the DTINet dataset as well as gold standard datasets, and the results show that DEDTI outperforms IEDTI and the state-of-the-art models. In addition, we conduct a docking study on new predicted interactions between two drug-target pairs, and the results confirm acceptable drug-target binding affinity between both predicted pairs.
Collapse
Affiliation(s)
- Arash Zabihian
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Department of Bioinformatics, Kish International Campus, University of Tehran, Kish, Iran
| | - Faeze Zakaryapour Sayyad
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, Iran
| | - Seyyed Morteza Hashemi
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, Iran
| | - Reza Shami Tanha
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, Iran
| | - Mohsen Hooshmand
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, Iran.
| | - Sajjad Gharaghani
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran.
| |
Collapse
|
11
|
Chatterjee A, Walters R, Shafi Z, Ahmed OS, Sebek M, Gysi D, Yu R, Eliassi-Rad T, Barabási AL, Menichetti G. Improving the generalizability of protein-ligand binding predictions with AI-Bind. Nat Commun 2023; 14:1989. [PMID: 37031187 PMCID: PMC10082765 DOI: 10.1038/s41467-023-37572-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2022] [Accepted: 03/23/2023] [Indexed: 04/10/2023] Open
Abstract
Identifying novel drug-target interactions is a critical and rate-limiting step in drug discovery. While deep learning models have been proposed to accelerate the identification process, here we show that state-of-the-art models fail to generalize to novel (i.e., never-before-seen) structures. We unveil the mechanisms responsible for this shortcoming, demonstrating how models rely on shortcuts that leverage the topology of the protein-ligand bipartite network, rather than learning the node features. Here we introduce AI-Bind, a pipeline that combines network-based sampling strategies with unsupervised pre-training to improve binding predictions for novel proteins and ligands. We validate AI-Bind predictions via docking simulations and comparison with recent experimental evidence, and step up the process of interpreting machine learning prediction of protein-ligand binding by identifying potential active binding sites on the amino acid sequence. AI-Bind is a high-throughput approach to identify drug-target combinations with the potential of becoming a powerful tool in drug discovery.
Collapse
Affiliation(s)
- Ayan Chatterjee
- Network Science Institute, Northeastern University, Boston, MA, USA
| | - Robin Walters
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
| | - Zohair Shafi
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
| | - Omair Shafi Ahmed
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
| | - Michael Sebek
- Network Science Institute, Northeastern University, Boston, MA, USA
- Department of Physics, Northeastern University, Boston, MA, USA
| | - Deisy Gysi
- Network Science Institute, Northeastern University, Boston, MA, USA
- Department of Physics, Northeastern University, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Rose Yu
- Department of Computer Science and Engineering, University of California, San Diego, CA, USA
| | - Tina Eliassi-Rad
- Network Science Institute, Northeastern University, Boston, MA, USA
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Santa Fe Institute, Santa Fe, NM, USA
- The Institute for Experiential AI, Northeastern University, Boston, MA, USA
| | - Albert-László Barabási
- Network Science Institute, Northeastern University, Boston, MA, USA
- Department of Physics, Northeastern University, Boston, MA, USA
- Department of Network and Data Science, Central European University, Budapest, Hungary
| | - Giulia Menichetti
- Network Science Institute, Northeastern University, Boston, MA, USA.
- Department of Physics, Northeastern University, Boston, MA, USA.
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
| |
Collapse
|
12
|
Zhu Z, Yao Z, Qi G, Mazur N, Yang P, Cong B. Associative learning mechanism for drug‐target interaction prediction. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY 2023. [DOI: 10.1049/cit2.12194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/05/2023] Open
Affiliation(s)
- Zhiqin Zhu
- College of Automation Chongqing University of Posts and Telecommunications Chongqing China
| | - Zheng Yao
- College of Automation Chongqing University of Posts and Telecommunications Chongqing China
| | - Guanqiu Qi
- Computer Information Systems Department State University of New York at Buffalo State Buffalo New York USA
| | - Neal Mazur
- Computer Information Systems Department State University of New York at Buffalo State Buffalo New York USA
| | - Pan Yang
- Department of Cardiovascular Surgery Chongqing General Hospital University of Chinese Academy of Sciences Chongqing China
- Emergency Department The Second Affiliated Hospital of Chongqing Medical University Chongqing China
| | - Baisen Cong
- Data Scientist Diagnostics Digital DH (Shanghai) Diagnostics Co., Ltd. Danaher Company Shanghai China
| |
Collapse
|
13
|
Chandler M, Jain S, Halman J, Hong E, Dobrovolskaia MA, Zakharov AV, Afonin KA. Artificial Immune Cell, AI-cell, a New Tool to Predict Interferon Production by Peripheral Blood Monocytes in Response to Nucleic Acid Nanoparticles. SMALL (WEINHEIM AN DER BERGSTRASSE, GERMANY) 2022; 18:e2204941. [PMID: 36216772 PMCID: PMC9671856 DOI: 10.1002/smll.202204941] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/11/2022] [Revised: 09/15/2022] [Indexed: 06/16/2023]
Abstract
Nucleic acid nanoparticles, or NANPs, rationally designed to communicate with the human immune system, can offer innovative therapeutic strategies to overcome the limitations of traditional nucleic acid therapies. Each set of NANPs is unique in their architectural parameters and physicochemical properties, which together with the type of delivery vehicles determine the kind and the magnitude of their immune response. Currently, there are no predictive tools that would reliably guide the design of NANPs to the desired immunological outcome, a step crucial for the success of personalized therapies. Through a systematic approach investigating physicochemical and immunological profiles of a comprehensive panel of various NANPs, the research team developes and experimentally validates a computational model based on the transformer architecture able to predict the immune activities of NANPs. It is anticipated that the freely accessible computational tool that is called an "artificial immune cell," or AI-cell, will aid in addressing the current critical public health challenges related to safety criteria of nucleic acid therapies in a timely manner and promote the development of novel biomedical tools.
Collapse
Affiliation(s)
- Morgan Chandler
- Department of Chemistry, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Sankalp Jain
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD 20850, USA
| | - Justin Halman
- Department of Chemistry, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Enping Hong
- Nanotechnology Characterization Lab, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
| | - Marina A. Dobrovolskaia
- Nanotechnology Characterization Lab, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
| | - Alexey V. Zakharov
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD 20850, USA
| | - Kirill A. Afonin
- Department of Chemistry, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| |
Collapse
|
14
|
Kalakoti Y, Yadav S, Sundar D. Deep Neural Network-Assisted Drug Recommendation Systems for Identifying Potential Drug-Target Interactions. ACS OMEGA 2022; 7:12138-12146. [PMID: 35449922 PMCID: PMC9016825 DOI: 10.1021/acsomega.2c00424] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/21/2022] [Accepted: 03/18/2022] [Indexed: 06/14/2023]
Abstract
In silico methods to identify novel drug-target interactions (DTIs) have gained significant importance over conventional techniques owing to their labor-intensive and low-throughput nature. Here, we present a machine learning-based multiclass classification workflow that segregates interactions between active, inactive, and intermediate drug-target pairs. Drug molecules, protein sequences, and molecular descriptors were transformed into machine-interpretable embeddings to extract critical features from standard datasets. Tools such as CHEMBL web resource, iFeature, and an in-house developed deep neural network-assisted drug recommendation (dNNDR)-featx were employed for data retrieval and processing. The models were trained with large-scale DTI datasets, which reported an improvement in performance over baseline methods. External validation results showed that models based on att-biLSTM and gCNN could help predict novel DTIs. When tested with a completely different dataset, the proposed models significantly outperformed competing methods. The validity of novel interactions predicted by dNNDR was backed by experimental and computational evidence in the literature. The proposed methodology could elucidate critical features that govern the relationship between a drug and its target.
Collapse
Affiliation(s)
- Yogesh Kalakoti
- DAILAB,
Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110 016, India
| | - Shashank Yadav
- DAILAB,
Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110 016, India
| | - Durai Sundar
- DAILAB,
Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110 016, India
- School
of Artificial Intelligence, Indian Institute
of Technology (IIT) Delhi, New Delhi 110 016, India
| |
Collapse
|