1
Mao J, Wang J, Zeb A, Cho KH, Jin H, Kim J, Lee O, Wang Y, No KT. Transformer-Based Molecular Generative Model for Antiviral Drug Design. J Chem Inf Model 2024; 64:2733-2745. [PMID: 37366644 PMCID: PMC11005037 DOI: 10.1021/acs.jcim.3c00536] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/07/2023] [Indexed: 06/28/2023]
Abstract
The Simplified Molecular Input Line Entry System (SMILES) is oriented to the atomic-level representation of molecules and is neither easy for humans to read nor easy to edit. IUPAC nomenclature, by contrast, is the closest to natural language and lends itself well to human-oriented reading and molecular editing, so we can manipulate IUPAC names to generate new molecules and then produce the corresponding programming-friendly SMILES forms. In addition, antiviral drug design, especially analogue-based drug design, is better suited to editing and designing directly at the functional-group level of IUPAC than at the atomic level of SMILES, since designing analogues involves altering only the R group, which is closer to the knowledge-based molecular design of a chemist. Herein, we present a novel data-driven self-supervised pretraining generative model called "TransAntivirus" that makes select-and-replace edits and converts organic molecules toward the desired properties for the design of antiviral candidate analogues. The results indicated that TransAntivirus is significantly superior to the control models in terms of novelty, validity, uniqueness, and diversity. TransAntivirus showed excellent performance in the design and optimization of nucleoside and non-nucleoside analogues, as assessed by chemical space analysis and property prediction analysis. Furthermore, to validate the applicability of TransAntivirus in the design of antiviral drugs, we conducted two case studies on the design of nucleoside analogues and non-nucleoside analogues and screened four candidate lead compounds against coronavirus disease (COVID-19). Finally, we recommend this framework for accelerating antiviral drug discovery.
Affiliation(s)
- Jiashun Mao
  - The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
- Jianmin Wang
  - The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
- Amir Zeb
  - Faculty of Natural and Basic Sciences, University of Turbat, Balochistan 92600, Pakistan
- Kwang-Hwi Cho
  - School of Systems Biomedical Science, Soongsil University, Seoul 06978, Republic of Korea
- Haiyan Jin
  - The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
- Jongwan Kim
  - Department of Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
  - Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea
- Onju Lee
  - The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
- Yunyun Wang
  - School of Pharmacy and Jiangsu Province Key Laboratory for Inflammation and Molecular Drug Target, Nantong University, Nantong 226001, Jiangsu, P. R. China
- Kyoung Tai No
  - The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
2
Wang L, Yu Z, Wang S, Guo Z, Sun Q, Lai L. Discovery of novel SARS-CoV-2 3CL protease covalent inhibitors using deep learning-based screen. Eur J Med Chem 2022; 244:114803. [PMID: 36209629 PMCID: PMC9528019 DOI: 10.1016/j.ejmech.2022.114803] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/05/2022] [Revised: 09/20/2022] [Accepted: 09/26/2022] [Indexed: 11/28/2022]
Abstract
SARS-CoV-2 3CL protease is one of the key targets for drug development against COVID-19. Most known SARS-CoV-2 3CL protease inhibitors act by covalently binding to the active site cysteine. Yet computational screens against this enzyme have mainly focused on non-covalent inhibitor discovery. Here, we developed a deep learning-based stepwise strategy for selective covalent inhibitor screening. We used a deep learning framework that integrated a directed message passing neural network with a feed-forward neural network to construct two different classifiers, one for covalent and one for non-covalent inhibition activity prediction. These two classifiers were trained on the covalent and non-covalent 3CL protease inhibitor datasets, respectively, and achieved high prediction accuracy. We then successively applied the covalent inhibitor model and the non-covalent inhibitor model to screen a chemical library containing compounds with cysteine-targeting covalent warheads. We experimentally tested the inhibition activity of 32 top-ranking compounds, and 12 of them were active, among which 6 showed IC50 values less than 12 μM and the strongest one inhibited SARS-CoV-2 3CL protease with an IC50 of 1.4 μM. Further investigation demonstrated that 5 of the 6 active compounds showed typical covalent inhibition behavior with time-dependent activity. These new covalent inhibitors provide novel scaffolds for developing highly active SARS-CoV-2 3CL covalent inhibitors.
Affiliation(s)
- Liying Wang
  - BNLMS, Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, PR China
- Zhongtian Yu
  - BNLMS, Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, PR China
- Shiwei Wang
  - BNLMS, Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, PR China
- Zheng Guo
  - BNLMS, Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, PR China
- Qi Sun
  - BNLMS, Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, PR China
  - Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, 100871, PR China
  - Corresponding author
- Luhua Lai
  - BNLMS, Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, PR China
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, PR China
  - Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, 100871, PR China
  - Corresponding author
3
Hu F, Wang D, Huang H, Hu Y, Yin P. Bridging the Gap between Target-Based and Cell-Based Drug Discovery with a Graph Generative Multitask Model. J Chem Inf Model 2022; 62:6046-6056. [PMID: 36401569 DOI: 10.1021/acs.jcim.2c01180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
The development of new drugs is crucial for protecting humans from disease. In the past several decades, target-based screening has been one of the most popular methods for developing new drugs. This method efficiently screens potential inhibitors of a target protein in vitro, but it frequently fails in vivo due to insufficient activity of the selected drugs. There is a need for accurate computational methods to bridge this gap. Here, we present a novel graph multi-task deep learning model to identify compounds with both target inhibitory and cell active (MATIC) properties. On a carefully curated SARS-CoV-2 data set, the proposed MATIC model shows advantages compared with the traditional method in screening effective compounds in vivo. Following this, we investigated the interpretability of the model and discovered that the learned features for target inhibition (in vitro) or cell active (in vivo) tasks are different with molecular property correlations and atom functional attention. Based on these findings, we utilized a Monte Carlo-based reinforcement learning generative model to generate novel multiproperty compounds with both in vitro and in vivo efficacy, thus bridging the gap between target-based and cell-based drug discovery. The tool is freely accessible at https://github.com/SIAT-code/MATIC.
Affiliation(s)
- Fan Hu
  - Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Dongqi Wang
  - Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Huazhen Huang
  - Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Yishen Hu
  - Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Peng Yin
  - Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
4
Kladnik J, Dolinar A, Kljun J, Perea D, Grau-Expósito J, Genescà M, Novinec M, Buzon MJ, Turel I. Zinc pyrithione is a potent inhibitor of PLPro and cathepsin L enzymes with ex vivo inhibition of SARS-CoV-2 entry and replication. J Enzyme Inhib Med Chem 2022; 37:2158-2168. [PMID: 35943189 PMCID: PMC9367663 DOI: 10.1080/14756366.2022.2108417] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 12/23/2022]
Abstract
Zinc pyrithione (1a), together with its analogues 1b–h and ruthenium pyrithione complex 2a, were synthesised and evaluated for stability in biologically relevant media and for anti-SARS-CoV-2 activity. Zinc pyrithione showed potent in vitro inhibition of cathepsin L (IC50 = 1.88 ± 0.49 µM) and PLPro (IC50 = 0.50 ± 0.07 µM), enzymes involved in SARS-CoV-2 entry and replication, respectively, as well as antiviral entry and replication properties in an ex vivo system derived from primary human lung tissue. Zinc complexes 1b–h showed comparable in vitro inhibition. In contrast, ruthenium complex 2a and the ligand pyrithione a itself showed poor inhibition in these assays, indicating the importance of the choice of metal core and the structure of the metal complex for antiviral activity. Safe, effective, and preferably oral at-home therapeutics for COVID-19 are needed, and zinc pyrithione, which is also commercially available, could therefore be considered as a potential therapeutic agent against SARS-CoV-2.
Affiliation(s)
- Jerneja Kladnik
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
- Ana Dolinar
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
- Jakob Kljun
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
- David Perea
  - Infectious Diseases Department, Vall d'Hebron Research Institute (VHIR), Hospital Universitari Vall d'Hebron, Universitat Autònoma de Barcelona, VHIR Task Force COVID-19, Barcelona, Spain
- Judith Grau-Expósito
  - Infectious Diseases Department, Vall d'Hebron Research Institute (VHIR), Hospital Universitari Vall d'Hebron, Universitat Autònoma de Barcelona, VHIR Task Force COVID-19, Barcelona, Spain
- Meritxell Genescà
  - Infectious Diseases Department, Vall d'Hebron Research Institute (VHIR), Hospital Universitari Vall d'Hebron, Universitat Autònoma de Barcelona, VHIR Task Force COVID-19, Barcelona, Spain
- Marko Novinec
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
- Maria J Buzon
  - Infectious Diseases Department, Vall d'Hebron Research Institute (VHIR), Hospital Universitari Vall d'Hebron, Universitat Autònoma de Barcelona, VHIR Task Force COVID-19, Barcelona, Spain
- Iztok Turel
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
5
Saldivar-Espinoza B, Macip G, Garcia-Segura P, Mestres-Truyol J, Puigbò P, Cereto-Massagué A, Pujadas G, Garcia-Vallve S. Prediction of Recurrent Mutations in SARS-CoV-2 Using Artificial Neural Networks. Int J Mol Sci 2022; 23:ijms232314683. [PMID: 36499005 PMCID: PMC9736107 DOI: 10.3390/ijms232314683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Revised: 11/18/2022] [Accepted: 11/22/2022] [Indexed: 11/26/2022]
Abstract
Predicting SARS-CoV-2 mutations is difficult, but predicting recurrent mutations driven by the host, such as those caused by host deaminases, is feasible. We used machine learning to predict which positions from the SARS-CoV-2 genome will hold a recurrent mutation and which mutations will be the most recurrent. We used data from April 2021 that we separated into three sets: a training set, a validation set, and an independent test set. For the test set, we obtained a specificity value of 0.69, a sensitivity value of 0.79, and an Area Under the Curve (AUC) of 0.8, showing that the prediction of recurrent SARS-CoV-2 mutations is feasible. Subsequently, we compared our predictions with updated data from January 2022, showing that some of the false positives in our prediction model become true positives later on. The most important variables detected by the model's Shapley Additive exPlanation (SHAP) are the nucleotide that mutates and RNA reactivity. This is consistent with the SARS-CoV-2 mutational bias pattern and the preference of some host deaminases for specific sequences and RNA secondary structures. We extend our investigation by analyzing the mutations from the variants of concern Alpha, Beta, Delta, Gamma, and Omicron. Finally, we analyzed amino acid changes by looking at the predicted recurrent mutations in the M-pro and spike proteins.
Affiliation(s)
- Bryan Saldivar-Espinoza
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Guillem Macip
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Pol Garcia-Segura
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Júlia Mestres-Truyol
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Pere Puigbò
  - Department of Biology, University of Turku, 20500 Turku, Finland
  - Department of Biochemistry and Biotechnology, Rovira i Virgili University, 43007 Tarragona, Spain
  - Nutrition and Health Unit, Eurecat Technology Centre of Catalonia, 43204 Reus, Spain
- Adrià Cereto-Massagué
  - EURECAT Centre Tecnològic de Catalunya, Centre for Omic Sciences (COS), Joint Unit Universitat Rovira i Virgili-EURECAT, Unique Scientific and Technical Infrastructures (ICTS), 43204 Reus, Spain
- Gerard Pujadas
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Santiago Garcia-Vallve
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
  - Corresponding author
6
Sun Y, Jiao Y, Shi C, Zhang Y. Deep learning-based molecular dynamics simulation for structure-based drug design against SARS-CoV-2. Comput Struct Biotechnol J 2022; 20:5014-5027. [PMID: 36091720 PMCID: PMC9448712 DOI: 10.1016/j.csbj.2022.09.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/14/2022] [Revised: 08/03/2022] [Accepted: 09/03/2022] [Indexed: 11/26/2022]
Abstract
Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2), has led to a global pandemic. Deep learning (DL) technology and molecular dynamics (MD) simulation are two mainstream computational approaches for investigating the geometric, chemical, and structural features of proteins and guiding the relevant drug design. Despite the large number of research papers on drug design for SARS-CoV-2 using DL architectures, it remains unclear how the binding energy of the protein-protein/ligand complex evolves dynamically, which is also vital for drug development. In addition, traditional deep neural networks usually have obvious deficiencies in predicting interaction sites as the protein conformation changes. In this review, we introduce the latest progress in DL and DL-based MD simulation approaches in structure-based drug design (SBDD) for SARS-CoV-2, which could address the problems of protein structure and binding prediction, drug virtual screening, molecular docking, and complex evolution. Furthermore, the current challenges and future directions of DL-based MD simulation for SBDD are also discussed.
Affiliation(s)
- Yao Sun
  - School of Science, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China
- Yanqi Jiao
  - School of Science, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China
- Chengcheng Shi
  - State Key Lab of Urban Water Resource and Environment, School of Science, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China
- Yang Zhang
  - School of Science, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China
7
Hu F, Jiang J, Yin P. Prediction of Potential Commercially Available Inhibitors against SARS-CoV-2 by Multi-Task Deep Learning Model. Biomolecules 2022; 12:biom12081156. [PMID: 36009050 PMCID: PMC9405964 DOI: 10.3390/biom12081156] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/20/2022] [Revised: 08/16/2022] [Accepted: 08/18/2022] [Indexed: 11/16/2022]
Abstract
The outbreak of COVID-19 caused millions of deaths worldwide, and the number of total infections is still rising. It is necessary to identify potentially effective drugs that can be used to prevent the development of severe symptoms, or even death, in those infected. Fortunately, many efforts have been made and several effective drugs have been identified. The rapidly increasing amount of data is of great help for training an effective and specific deep learning model. In this study, we propose a multi-task deep learning model for screening commercially available and effective inhibitors against SARS-CoV-2. First, we pretrained a model on several heterogeneous protein-ligand interaction datasets. The model achieved competitive results on some benchmark datasets. Next, a coronavirus-specific dataset was collected and used to fine-tune the model. Then, the fine-tuned model was used to select commercially available drugs against SARS-CoV-2 protein targets. Overall, twenty compounds were listed as potential inhibitors. We further explored the model's interpretability and showed the binding sites it predicted to be important. Based on this prediction, molecular docking was also performed to visualize the binding modes of the selected inhibitors.
8
Hu F, Jiang J, Yin P. Prediction of Potential Commercially Available Inhibitors against SARS-CoV-2 by Multi-Task Deep Learning Model. Biomolecules 2022. [PMID: 36009050 DOI: 10.48550/arxiv.2003.00728] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/13/2023]
Affiliation(s)
- Fan Hu
  - Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Jiaxin Jiang
  - Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Peng Yin
  - Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China