1
Gómez-Sacristán P, Simeon S, Tran-Nguyen VK, Patil S, Ballester PJ. Inactive-enriched machine-learning models exploiting patent data improve structure-based virtual screening for PDL1 dimerizers. J Adv Res 2025; 67:185-196. [PMID: 38280715 DOI: 10.1016/j.jare.2024.01.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 12/01/2023] [Accepted: 01/21/2024] [Indexed: 01/29/2024]
Abstract
INTRODUCTION Small-molecule Programmed Cell Death Protein 1/Programmed Death-Ligand 1 (PD1/PDL1) inhibition via PDL1 dimerization has the potential to lead to inexpensive drugs with better cancer patient outcomes and milder side effects. However, this therapeutic approach has proven challenging, with only one PDL1 dimerizer reaching early clinical trials so far. There is hence a need for fast and accurate methods to develop alternative PDL1 dimerizers. OBJECTIVES We aim to show that structure-based virtual screening (SBVS) based on PDL1-specific machine-learning (ML) scoring functions (SFs) is a powerful drug design tool for detecting PD1/PDL1 inhibitors via PDL1 dimerization. METHODS By incorporating the latest MLSF advances, we generated and evaluated PDL1-specific MLSFs (classifiers and inactive-enriched regressors) on two demanding test sets. RESULTS 60 PDL1-specific MLSFs (30 classifiers and 30 regressors) were generated. Our large-scale analysis provides highly predictive PDL1-specific MLSFs that benefitted from training with large volumes of docked inactives and from enabling inactive-enriched regression. CONCLUSION PDL1-specific MLSFs strongly outperformed generic SFs of various types on this target and are released here without restrictions.
Affiliation(s)
- Saw Simeon
- Centre de Recherche en Cancérologie de Marseille, Marseille 13009, France
- Sachin Patil
- NanoBio Laboratory, Widener University, Chester, PA 19013, USA
- Pedro J Ballester
- Department of Bioengineering, Imperial College London, London SW7 2AZ, UK.
2
Yang Z, Zhong W, Lv Q, Dong T, Chen G, Chen CYC. Interaction-Based Inductive Bias in Graph Neural Networks: Enhancing Protein-Ligand Binding Affinity Predictions From 3D Structures. IEEE Transactions on Pattern Analysis and Machine Intelligence 2024; 46:8191-8208. [PMID: 38739515 DOI: 10.1109/tpami.2024.3400515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Inductive bias in machine learning (ML) is the set of assumptions describing how a model makes predictions. Different ML-based methods for protein-ligand binding affinity (PLA) prediction have different inductive biases, leading to different levels of generalization capability and interpretability. Intuitively, the inductive bias of an ML-based model for PLA prediction should fit in with biological mechanisms relevant for binding to achieve good predictions with meaningful reasons. To this end, we propose an interaction-based inductive bias to restrict neural networks to functions relevant for binding with two assumptions: 1) A protein-ligand complex can be naturally expressed as a heterogeneous graph with covalent and non-covalent interactions; 2) The predicted PLA is the sum of pairwise atom-atom affinities determined by non-covalent interactions. The interaction-based inductive bias is embodied by an explainable heterogeneous interaction graph neural network (EHIGN) for explicitly modeling pairwise atom-atom interactions to predict PLA from 3D structures. Extensive experiments demonstrate that EHIGN achieves better generalization capability than other state-of-the-art ML-based baselines in PLA prediction and structure-based virtual screening. More importantly, comprehensive analyses of distance-affinity, pose-affinity, and substructure-affinity relations suggest that the interaction-based inductive bias can guide the model to learn atomic interactions that are consistent with physical reality. As a case study to demonstrate practical usefulness, our method is tested for predicting the efficacy of Nirmatrelvir against SARS-CoV-2 variants. EHIGN successfully recognizes the changes in the efficacy of Nirmatrelvir for different SARS-CoV-2 variants with meaningful reasons.
3
Hemant Kumar S, Venkatachalapathy M, Sistla R, Poongavanam V. Advances in molecular glues: exploring chemical space and design principles for targeted protein degradation. Drug Discov Today 2024; 29:104205. [PMID: 39393773 DOI: 10.1016/j.drudis.2024.104205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2024] [Revised: 09/18/2024] [Accepted: 10/04/2024] [Indexed: 10/13/2024]
Abstract
The discovery of the E3 ligase cereblon (CRBN) as the target of thalidomide and its analogs revolutionized the field of targeted protein degradation (TPD). This ubiquitin-mediated degradation pathway was first harnessed by bivalent degraders. Recently, the emergence of low-molecular-weight molecular glue degraders (MGDs) has expanded the TPD landscape, because MGDs operate via the same mechanism while offering attractive physicochemical properties that are consistent with small-molecule therapeutics. This review delves into the discovery and advancement of MGDs, with case studies on cyclin K and the zinc finger protein IKZF2, highlighting the design principles, biological assays and therapeutic applications. Additionally, it examines the chemical space of molecular glues and outlines the collaborative efforts that are fueling innovation in this field.
Affiliation(s)
- S Hemant Kumar
- thinkMolecular Technologies Pvt. Ltd, Haralur, Bangalore, KA 560102, India
- Ramesh Sistla
- thinkMolecular Technologies Pvt. Ltd, Haralur, Bangalore, KA 560102, India.
4
Ghislat G, Hernandez-Hernandez S, Piyawajanusorn C, Ballester PJ. Data-centric challenges with the application and adoption of artificial intelligence for drug discovery. Expert Opin Drug Discov 2024; 19:1297-1307. [PMID: 39316009 DOI: 10.1080/17460441.2024.2403639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Accepted: 09/09/2024] [Indexed: 09/25/2024]
Abstract
INTRODUCTION Artificial intelligence (AI) is exhibiting tremendous potential to reduce the massive costs and long timescales of drug discovery. There are, however, important challenges currently limiting the impact and scope of AI models. AREAS COVERED In this perspective, the authors discuss a range of data issues (bias, inconsistency, skewness, irrelevance, small size, high dimensionality), how they challenge AI models, and which issue-specific mitigations have been effective. Next, they point out the challenges faced by uncertainty quantification techniques aimed at enhancing and trusting the predictions from these AI models. They also discuss how conceptual errors, unrealistic benchmarks and performance misestimation can confound the evaluation of models and thus their development. Lastly, the authors explain how human bias, whether from AI experts or drug discovery experts, constitutes another challenge that can be alleviated by gaining more prospective experience. EXPERT OPINION AI models are often developed to excel on retrospective benchmarks unlikely to anticipate their prospective performance. As a result, only a few of these models are ever reported to have prospective value (e.g. by discovering potent and innovative drug leads for a therapeutic target). The authors have discussed what can go wrong in practice with AI for drug discovery. The authors hope that this will help inform the decisions of editors, funders, investors, and researchers working in this area.
Affiliation(s)
- Ghita Ghislat
- Department of Life Sciences, Imperial College London, London, UK
5
Fernández JF, Martinez Heredia L, Caracciolo F, Esses D, Suarez R, Siless G, Perez C, Isabel Rodríguez-Franco M, Fernández LR, Palermo JA, Lavecchia M. Target Fisher: A Consensus Structure-Based Target Prediction Tool, and its Application in the Discovery of Selective MAO-B Inhibitors. Chemistry 2024:e202401838. [PMID: 39447068 DOI: 10.1002/chem.202401838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Revised: 10/14/2024] [Accepted: 10/23/2024] [Indexed: 10/26/2024]
Abstract
In this work we introduce Target Fisher, a consensus structure-based target prediction tool that integrates molecular docking and machine learning with the aim to aid in the identification of potential biological targets and the optimization of the use of bioassays. Target Fisher uses per-residue energy decomposition profiles extracted from docking poses as fingerprints to train target-specific machine learning models. It provides predictions for a curated set of 37 protein targets, covering a diverse range of biological entities, and offers a user-friendly interface accessible via a web server (https://gqc.quimica.unlp.edu.ar/targetfisher/). In this sense, Target Fisher is a valuable tool to aid organic and medicinal chemistry groups in target identification, drug discovery and drug repurposing. As a case study, we demonstrate the efficacy of Target Fisher by screening a small library of assorted natural products for targets relevant to neurodegenerative diseases, which resulted in the identification and experimental validation of selective inhibitors of monoamine oxidase B (MAO-B).
Affiliation(s)
- Julián F Fernández
- Departamento de Quimica Organica, Facultad de Ciencias Exactas, Universidad de Buenos Aires, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Unidad de Microanálisis y Métodos Físicos en Química Orgánica (UMYMFOR), CONICET, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Leandro Martinez Heredia
- CEQUINOR (UNLP-CONICET, CCT-La Plata, associated with CIC), Departamento de Química, Facultad de Ciencias Exactas, Universidad Nacional de La Plata, Blvd. 120 1465, La Plata, Argentina
- Fernando Caracciolo
- Departamento de Quimica Organica, Facultad de Ciencias Exactas, Universidad de Buenos Aires, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Unidad de Microanálisis y Métodos Físicos en Química Orgánica (UMYMFOR), CONICET, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Daniel Esses
- Departamento de Quimica Organica, Facultad de Ciencias Exactas, Universidad de Buenos Aires, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Unidad de Microanálisis y Métodos Físicos en Química Orgánica (UMYMFOR), CONICET, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Rodrigo Suarez
- Departamento de Quimica Organica, Facultad de Ciencias Exactas, Universidad de Buenos Aires, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Unidad de Microanálisis y Métodos Físicos en Química Orgánica (UMYMFOR), CONICET, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Gaston Siless
- Departamento de Quimica Organica, Facultad de Ciencias Exactas, Universidad de Buenos Aires, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Unidad de Microanálisis y Métodos Físicos en Química Orgánica (UMYMFOR), CONICET, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Concepcion Perez
- Instituto de Química Medica, CSIC, Calle Juan de la Cierva, 3, Madrid, 28006, España
- Lucía R Fernández
- Departamento de Quimica Organica, Facultad de Ciencias Exactas, Universidad de Buenos Aires, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Unidad de Microanálisis y Métodos Físicos en Química Orgánica (UMYMFOR), CONICET, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Jorge A Palermo
- Departamento de Quimica Organica, Facultad de Ciencias Exactas, Universidad de Buenos Aires, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Unidad de Microanálisis y Métodos Físicos en Química Orgánica (UMYMFOR), CONICET, Intendente Guiraldes 2160, Buenos Aires, Argentina
- Martín Lavecchia
- CEQUINOR (UNLP-CONICET, CCT-La Plata, associated with CIC), Departamento de Química, Facultad de Ciencias Exactas, Universidad Nacional de La Plata, Blvd. 120 1465, La Plata, Argentina
6
Caba K, Tran-Nguyen VK, Rahman T, Ballester PJ. Comprehensive machine learning boosts structure-based virtual screening for PARP1 inhibitors. J Cheminform 2024; 16:40. [PMID: 38582911 PMCID: PMC10999096 DOI: 10.1186/s13321-024-00832-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 03/23/2024] [Indexed: 04/08/2024]
Abstract
Poly ADP-ribose polymerase 1 (PARP1) is an attractive therapeutic target for cancer treatment. Machine-learning scoring functions constitute a promising approach to discovering novel PARP1 inhibitors. Cutting-edge PARP1-specific machine-learning scoring functions were investigated using semi-synthetic training data from docking activity-labelled molecules: known PARP1 inhibitors, hard-to-discriminate decoys property-matched to them with generative graph neural networks and confirmed inactives. We further made test sets harder by including only molecules dissimilar to those in the training set. Comprehensive analysis of these datasets using five supervised learning algorithms, and protein-ligand fingerprints extracted from docking poses and ligand only features revealed one highly predictive scoring function. This is the PARP1-specific support vector machine-based regressor, when employing PLEC fingerprints, which achieved a high Normalized Enrichment Factor at the top 1% on the hardest test set (NEF1% = 0.588, median of 10 repetitions), and was more predictive than any other investigated scoring function, especially the classical scoring function employed as baseline.
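As an illustration of the metric quoted in this abstract, the following is a minimal sketch of a Normalized Enrichment Factor computation. It assumes a commonly used definition (the enrichment factor at the top fraction divided by its maximum attainable value, which reduces to hits in the top fraction over the smaller of the top-fraction size and the number of actives); the abstract itself does not spell out the formula. The function name and the toy data are hypothetical.

```python
import math


def normalized_enrichment_factor(scores, labels, fraction=0.01):
    """Assumed definition: actives found in the top `fraction` of the ranked
    library, divided by the maximum number that could have been found there."""
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active is required")
    # Size of the top-ranked slice (at least one molecule).
    n_top = max(1, math.ceil(fraction * len(scores)))
    # Rank molecules from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_top])
    return hits / min(n_top, n_actives)


# Hypothetical toy library: 200 molecules, 4 actives (label 1), higher score = better.
toy_scores = [200 - i for i in range(200)]
toy_labels = [0] * 200
for active_index in (0, 2, 50, 120):
    toy_labels[active_index] = 1

# The top 1% holds 2 molecules, one of which is active.
print(normalized_enrichment_factor(toy_scores, toy_labels))
```

Under this assumed definition, a value of 1 means the top fraction contains as many actives as it possibly could, and 0 means it contains none.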
Affiliation(s)
- Klaudia Caba
- Department of Bioengineering, Imperial College London, London, SW7 2AZ, UK
- Viet-Khoa Tran-Nguyen
- Unité de Biologie Fonctionnelle et Adaptative (BFA), UFR Sciences du Vivant, Université Paris Cité, 75013, Paris, France
- Taufiq Rahman
- Department of Pharmacology, University of Cambridge, Cambridge, CB2 1PD, UK
- Pedro J Ballester
- Department of Bioengineering, Imperial College London, London, SW7 2AZ, UK.
7
Libouban PY, Aci-Sèche S, Gómez-Tamayo JC, Tresadern G, Bonnet P. The Impact of Data on Structure-Based Binding Affinity Predictions Using Deep Neural Networks. Int J Mol Sci 2023; 24:16120. [PMID: 38003312 PMCID: PMC10671244 DOI: 10.3390/ijms242216120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 10/30/2023] [Accepted: 11/01/2023] [Indexed: 11/26/2023]
Abstract
Artificial intelligence (AI) has gained significant traction in the field of drug discovery, with deep learning (DL) algorithms playing a crucial role in predicting protein-ligand binding affinities. Despite advancements in neural network architectures, system representation, and training techniques, the performance of DL affinity prediction has reached a plateau, prompting the question of whether it is truly solved or if the current performance is overly optimistic and reliant on biased, easily predictable data. Like other DL-related problems, this issue seems to stem from the training and test sets used when building the models. In this work, we investigate the impact of several parameters related to the input data on the performance of neural network affinity prediction models. Notably, we identify the size of the binding pocket as a critical factor influencing the performance of our statistical models; furthermore, it is more important to train a model with as much data as possible than to restrict the training to only high-quality datasets. Finally, we also confirm the bias in the typically used current test sets. Therefore, several types of evaluation and benchmarking are required to understand models' decision-making processes and accurately compare the performance of models.
Affiliation(s)
- Pierre-Yves Libouban
- Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d’Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France
- Samia Aci-Sèche
- Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d’Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France
- Jose Carlos Gómez-Tamayo
- Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., B-2340 Beerse, Belgium
- Gary Tresadern
- Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., B-2340 Beerse, Belgium
- Pascal Bonnet
- Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d’Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France
8
Silva-Júnior EFD. "You've got the Body I've got the Brains" - Could the current AI-based tools replace the human ingenuity for designing new drug candidates? Bioorg Med Chem 2023; 94:117475. [PMID: 37741120 DOI: 10.1016/j.bmc.2023.117475] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/18/2023] [Revised: 08/12/2023] [Accepted: 09/12/2023] [Indexed: 09/25/2023]
Abstract
The emergence of artificial intelligence (AI) tools has transformed the landscape of drug discovery, providing unprecedented speed, efficiency, and cost-effectiveness in the search for new therapeutics. From target identification to drug formulation and delivery, AI-driven algorithms have revolutionized various aspects of medicinal chemistry, significantly accelerating the drug design process. Despite the transformative power of AI, this perspective article emphasizes the limitations of AI tools in drug discovery, which still requires the inventive skills of medicinal chemists. The article also highlights the need for a harmonious integration of AI-based tools and human expertise in drug discovery. Such a synergistic approach promises to lead to groundbreaking therapies that address unmet medical needs and benefit humankind. As the world evolves technologically, the question remains: When will AI tools effectively design and develop drugs? The answer may lie in the seamless collaboration between AI and human researchers, unlocking transformative therapies that combat diseases effectively.
Affiliation(s)
- Edeildo Ferreira da Silva-Júnior
- Institute of Chemistry and Biotechnology, Federal University of Alagoas, Lourival Melo Mota Avenue, AC. Simões Campus, 57072-970 Alagoas, Maceió, Brazil
9
Cavasotto CN, Di Filippo JI. The Impact of Supervised Learning Methods in Ultralarge High-Throughput Docking. J Chem Inf Model 2023; 63:2267-2280. [PMID: 37036491 DOI: 10.1021/acs.jcim.2c01471] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 04/11/2023]
Abstract
Structure-based virtual screening methods are, nowadays, one of the key pillars of computational drug discovery. In recent years, a series of studies have reported docking-based virtual screening campaigns of large databases ranging from hundreds to thousands of millions of compounds, further identifying novel hits after experimental validation. As these large-scale efforts are not generally accessible, machine learning-based protocols have emerged to accelerate the identification of virtual hits within an ultralarge chemical space, reaching impressive reductions in computational time. Herein, we illustrate the motivation and the problem behind the screening of large databases, providing an overview of key concepts and essential applications of machine learning-accelerated protocols, specifically concerning supervised learning methods. We also discuss where the field stands with these novel developments, highlighting possible insights for future studies.
Affiliation(s)
- Claudio N Cavasotto
- Computational Drug Design and Biomedical Informatics Laboratory, Instituto de Investigaciones en Medicina Traslacional (IIMT), CONICET-Universidad Austral, Av. Juan Domingo Perón 1500, B1629AHJ Pilar, Argentina
- Facultad de Ciencias Biomédicas, and Facultad de Ingeniería, Universidad Austral, Av. Juan Domingo Perón 1500, B1629AHJ Pilar, Argentina
- Austral Institute for Applied Artificial Intelligence, Universidad Austral, Av. Juan Domingo Perón 1500, B1629AHJ Pilar, Argentina
- Juan I Di Filippo
- Computational Drug Design and Biomedical Informatics Laboratory, Instituto de Investigaciones en Medicina Traslacional (IIMT), CONICET-Universidad Austral, Av. Juan Domingo Perón 1500, B1629AHJ Pilar, Argentina
- Facultad de Ciencias Biomédicas, and Facultad de Ingeniería, Universidad Austral, Av. Juan Domingo Perón 1500, B1629AHJ Pilar, Argentina
- Austral Institute for Applied Artificial Intelligence, Universidad Austral, Av. Juan Domingo Perón 1500, B1629AHJ Pilar, Argentina
10
Tran-Nguyen VK, Ballester PJ. Beware of Simple Methods for Structure-Based Virtual Screening: The Critical Importance of Broader Comparisons. J Chem Inf Model 2023; 63:1401-1405. [PMID: 36848585 PMCID: PMC10015451 DOI: 10.1021/acs.jcim.3c00218] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 03/01/2023]
Abstract
We discuss how data unbiasing and simple methods such as protein-ligand Interaction FingerPrint (IFP) can overestimate virtual screening performance. We also show that IFP is strongly outperformed by target-specific machine-learning scoring functions, which were not considered in a recent report concluding that simple methods were better than machine-learning scoring functions at virtual screening.
Affiliation(s)
- Pedro J Ballester
- Department of Bioengineering, Imperial College London, London SW7 2AZ, U.K
11
Hernández-Hernández S, Ballester PJ. On the Best Way to Cluster NCI-60 Molecules. Biomolecules 2023; 13:biom13030498. [PMID: 36979433 PMCID: PMC10046274 DOI: 10.3390/biom13030498] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/16/2023] [Revised: 03/02/2023] [Accepted: 03/06/2023] [Indexed: 03/30/2023]
Abstract
Machine learning-based models have been widely used in the early drug-design pipeline. To validate these models, cross-validation strategies have been employed, including those using clustering of molecules in terms of their chemical structures. However, the poor clustering of compounds will compromise such validation, especially on test molecules dissimilar to those in the training set. This study aims at finding the best way to cluster the molecules screened by the National Cancer Institute (NCI)-60 project by comparing hierarchical, Taylor-Butina, and uniform manifold approximation and projection (UMAP) clustering methods. The best-performing algorithm can then be used to generate clusters for model validation strategies. This study also aims at measuring the impact of removing outlier molecules prior to the clustering step. Clustering results are evaluated using three well-known clustering quality metrics. In addition, we compute an average similarity matrix to assess the quality of each cluster. The results show variation in clustering quality from method to method. The clusters obtained by the hierarchical and Taylor-Butina methods are more computationally expensive to use in cross-validation strategies, and both cluster the molecules poorly. In contrast, the UMAP method provides the best quality, and therefore we recommend it to analyze this highly valuable dataset.
Affiliation(s)
- Saiveth Hernández-Hernández
- Cancer Research Center of Marseille (INSERM U1068, Institut Paoli-Calmettes, Aix-Marseille Université UM105, CNRS UMR7258), 13009 Marseille, France
- Pedro J Ballester
- Department of Bioengineering, Imperial College London, London SW7 2AZ, UK
12
Potlitz F, Link A, Schulig L. Advances in the discovery of new chemotypes through ultra-large library docking. Expert Opin Drug Discov 2023; 18:303-313. [PMID: 36714919 DOI: 10.1080/17460441.2023.2171984] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 01/31/2023]
Abstract
INTRODUCTION The size and complexity of virtual screening libraries in drug discovery have skyrocketed in recent years, reaching up to multiple billions of accessible compounds. However, virtual screening of such ultra-large libraries poses several challenges associated with preparing the libraries, sampling, and pre-selection of suitable compounds. The utilization of artificial intelligence (AI)-assisted screening approaches, such as deep learning, poses a promising countermeasure to deal with this rapidly expanding chemical space. For example, various AI-driven methods were recently successfully used to identify novel small molecule inhibitors of the SARS-CoV-2 main protease (Mpro). AREAS COVERED This review focuses on presenting various kinds of virtual screening methods suitable for dealing with ultra-large libraries. Challenges associated with these computational methodologies are discussed, and recent advances are highlighted in the example of the discovery of novel Mpro inhibitors targeting the SARS-CoV-2 virus. EXPERT OPINION With the rapid expansion of the virtual chemical space, the methodologies for docking and screening such quantities of molecules need to keep pace. Employment of AI-driven screening compounds has already been shown to be effective in a range from a few thousand to multiple billion compounds, furthered by de novo generation of drug-like molecules without human interference.
Affiliation(s)
- Felix Potlitz
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, University of Greifswald, Germany
- Andreas Link
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, University of Greifswald, Germany
- Lukas Schulig
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, University of Greifswald, Germany
13
Wang Z, Zheng L, Wang S, Lin M, Wang Z, Kong AWK, Mu Y, Wei Y, Li W. A fully differentiable ligand pose optimization framework guided by deep learning and a traditional scoring function. Brief Bioinform 2023; 24:6887112. [PMID: 36502369 DOI: 10.1093/bib/bbac520] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 08/09/2022] [Revised: 10/17/2022] [Accepted: 10/31/2022] [Indexed: 12/14/2022]
Abstract
Recently reported machine learning- and deep learning-based scoring functions (SFs) have shown exciting performance in predicting protein-ligand binding affinities, with promising application prospects. However, differentiating between highly similar ligand conformations, including the native binding pose (the global energy minimum state), remains challenging, although doing so could greatly enhance docking. In this work, we propose a fully differentiable, end-to-end framework for ligand pose optimization based on a hybrid SF called DeepRMSD+Vina, which combines a multi-layer perceptron (DeepRMSD) with the traditional AutoDock Vina SF. DeepRMSD+Vina, which combines (1) the root mean square deviation (RMSD) of the docking pose with respect to the native pose and (2) the AutoDock Vina score, is fully differentiable and is thus capable of optimizing the ligand binding pose toward the lowest-energy conformation. Evaluated on the CASF-2016 docking power dataset, DeepRMSD+Vina reaches a success rate of 94.4%, which outperforms most SFs reported to date. We evaluated the ligand conformation optimization framework in practical molecular docking scenarios (redocking and cross-docking tasks), revealing the high potential of this framework in drug design and discovery. Structural analysis shows that this framework is able to identify key physical interactions in protein-ligand binding, such as hydrogen bonding. Our work provides a paradigm for optimizing ligand conformations based on deep learning algorithms. The DeepRMSD+Vina model and the optimization framework are available in the GitHub repository https://github.com/zchwang/DeepRMSD-Vina_Optimization.
Affiliation(s)
- Zechen Wang
- School of Physics, Shandong University, Jinan, Shandong 250100, China
- Liangzhen Zheng
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China; Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
- Sheng Wang
- Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
- Mingzhi Lin
- Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
- Zhihao Wang
- School of Physics, Shandong University, Jinan, Shandong 250100, China
- Adams Wai-Kin Kong
- Rolls-Royce Corporate Lab, Nanyang Technological University, Singapore 637551, Singapore
- Yuguang Mu
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
- Yanjie Wei
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
- Weifeng Li
- School of Physics, Shandong University, Jinan, Shandong 250100, China
14
Wang L, Shi SH, Li H, Zeng XX, Liu SY, Liu ZQ, Deng YF, Lu AP, Hou TJ, Cao DS. Reducing false positive rate of docking-based virtual screening by active learning. Brief Bioinform 2023; 24:6987822. [PMID: 36642412 DOI: 10.1093/bib/bbac626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Revised: 12/10/2022] [Accepted: 12/20/2022] [Indexed: 01/17/2023]
Abstract
Machine learning-based scoring functions (MLSFs) have become a very favorable alternative to classical scoring functions because of their potential superior screening performance. However, the information of negative data used to construct MLSFs was rarely reported in the literature, and meanwhile the putative inactive molecules recorded in existing databases usually have obvious bias from active molecules. Here we proposed an easy-to-use method named AMLSF that combines active learning using negative molecular selection strategies with MLSF, which can iteratively improve the quality of inactive sets and thus reduce the false positive rate of virtual screening. We chose energy auxiliary terms learning as the MLSF and validated our method on eight targets in the diverse subset of DUD-E. For each target, we screened the IterBioScreen database by AMLSF and compared the screening results with those of the four control models. The results illustrate that the number of active molecules in the top 1000 molecules identified by AMLSF was significantly higher than those identified by the control models. In addition, the free energy calculation results for the top 10 molecules screened out by the AMLSF, null model and control models based on DUD-E also proved that more active molecules can be identified, and the false positive rate can be reduced by AMLSF.
Affiliation(s)
- Lei Wang
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
| | - Shao-Hua Shi
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
| | - Hui Li
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
| | - Xiang-Xiang Zeng
- Department of Computer Science, Hunan University, Changsha 410082, Hunan, China
| | - Su-You Liu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
| | - Zhao-Qian Liu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
| | - Ya-Feng Deng
- CarbonSilicon AI Technology Co., Ltd, Hangzhou, Zhejiang 310018, China
| | - Ai-Ping Lu
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
| | - Ting-Jun Hou
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Dong-Sheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China; Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
| |
|
15
|
Veríssimo GC, Serafim MSM, Kronenberger T, Ferreira RS, Honorio KM, Maltarollo VG. Designing drugs when there is low data availability: one-shot learning and other approaches to face the issues of a long-term concern. Expert Opin Drug Discov 2022; 17:929-947. [PMID: 35983695 DOI: 10.1080/17460441.2022.2114451] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/04/2022]
Abstract
INTRODUCTION Modern drug discovery generally is accessed by useful information from previous large databases or uncovering novel data. The lack of biological and/or chemical data tends to slow the development of scientific research and innovation. Here, approaches that may help provide solutions to generate or obtain enough relevant data or improve/accelerate existing methods within the last five years were reviewed. AREAS COVERED One-shot learning (OSL) approaches, structural modeling, molecular docking, scoring function space (SFS), molecular dynamics (MD), and quantum mechanics (QM) may be used to amplify the amount of available data to drug design and discovery campaigns, presenting methods, their perspectives, and discussions to be employed in the near future. EXPERT OPINION Recent works have successfully used these techniques to solve a range of issues in the face of data scarcity, including complex problems such as the challenging scenario of drug design aimed at intrinsically disordered proteins and the evaluation of potential adverse effects in a clinical scenario. These examples show that it is possible to improve and kickstart research from scarce available data to design and discover new potential drugs.
Affiliation(s)
- Gabriel C Veríssimo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | - Mateus Sá M Serafim
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | - Thales Kronenberger
- Department of Medical Oncology and Pneumology, Internal Medicine VIII, University Hospital of Tübingen, Tübingen, Germany; School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, Kuopio, Finland
| | - Rafaela S Ferreira
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | - Kathia M Honorio
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), São Paulo, Brazil; Centro de Ciências Naturais e Humanas, Universidade Federal do ABC (UFABC), Santo André, Brazil
| | - Vinícius G Maltarollo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| |
|
16
|
McGibbon M, Money-Kyrle S, Blay V, Houston DR. SCORCH: Improving structure-based virtual screening with machine learning classifiers, data augmentation, and uncertainty estimation. J Adv Res 2022; 46:135-147. [PMID: 35901959 PMCID: PMC10105235 DOI: 10.1016/j.jare.2022.07.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/05/2022] [Revised: 07/08/2022] [Accepted: 07/09/2022] [Indexed: 11/17/2022] Open
Abstract
INTRODUCTION The discovery of a new drug is a costly and lengthy endeavour. The computational prediction of which small molecules can bind to a protein target can accelerate this process if the predictions are fast and accurate enough. Recent machine-learning scoring functions re-evaluate the output of molecular docking to achieve more accurate predictions. However, previous scoring functions were trained on crystalised protein-ligand complexes and datasets of decoys. The limited availability of crystal structures and biases in the decoy datasets can lower the performance of scoring functions. OBJECTIVES To address key limitations of previous scoring functions and thus improve the predictive performance of structure-based virtual screening. METHODS A novel machine-learning scoring function was created, named SCORCH (Scoring COnsensus for RMSD-based Classification of Hits). To develop SCORCH, training data is augmented by considering multiple ligand poses and labelling poses based on their RMSD from the native pose. Decoy bias is addressed by generating property-matched decoys for each ligand and using the same methodology for preparing and docking decoys and ligands. A consensus of 3 different machine learning approaches is also used to improve performance. RESULTS We find that multi-pose augmentation in SCORCH improves its docking power and screening power on independent benchmark datasets. SCORCH outperforms an equivalent scoring function trained on single poses, with a 1% enrichment factor (EF) of 13.78 vs. 10.86 on 18 DEKOIS 2.0 targets and a mean native pose rank of 5.9 vs 30.4 on CSAR 2014. Additionally, SCORCH outperforms widely used scoring functions in virtual screening and pose prediction on independent benchmark datasets. CONCLUSION By rationally addressing key limitations of previous scoring functions, SCORCH improves the performance of virtual screening. 
SCORCH also provides an estimate of its uncertainty, which can help reduce the cost and time required for drug discovery.
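The 1% enrichment factor (EF) quoted in the abstract above is a common screening metric. The sketch below is not taken from SCORCH; it assumes the usual definition (hit rate in the top-ranked fraction divided by the hit rate of the whole library) and that higher scores mean "more likely active". The function name and the toy data are hypothetical.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at the given fraction of the ranked library.

    scores: predicted scores, higher meaning 'more likely active' (assumption).
    labels: 1 for a known active, 0 for an inactive or decoy.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equally long")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active is needed")
    # Size of the top slice; never smaller than one compound.
    n_top = max(1, round(n_total * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])
    # Hit rate in the top slice divided by the hit rate of the whole library.
    return (hits_in_top / n_top) / (n_actives / n_total)


# Toy library (made up): 1000 compounds, 10 actives, 5 of them ranked in the top 10.
scores = [1000 - i for i in range(1000)]
labels = [0] * 1000
for idx in (0, 1, 2, 3, 4, 500, 501, 502, 503, 504):
    labels[idx] = 1
print(enrichment_factor(scores, labels, fraction=0.01))
```

In this toy library the top 1% holds 5 of the 10 actives, so the hit rate there is 50 times that of the whole library. With 1% of the library selected, the largest attainable EF is 100.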
Affiliation(s)
- Miles McGibbon
- Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, Scotland EH9 3BF, UK
| | - Sam Money-Kyrle
- Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, Scotland EH9 3BF, UK
| | - Vincent Blay
- Department of Microbiology and Environmental Toxicology, University of California at Santa Cruz, Santa Cruz, CA 95064, USA; Institute for Integrative Systems Biology (I2SysBio), Universitat de València and Spanish Research Council (CSIC), 46980 Valencia, Spain
| | - Douglas R Houston
- Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, Scotland EH9 3BF, UK
| |
|
17
|
Karasev DA, Sobolev BN, Lagunin AA, Filimonov DA, Poroikov VV. The method predicting interaction between protein targets and small-molecular ligands with the wide applicability domain. Comput Biol Chem 2022; 98:107674. [DOI: 10.1016/j.compbiolchem.2022.107674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2021] [Revised: 03/24/2022] [Accepted: 03/28/2022] [Indexed: 11/03/2022]
|
18
|
Choudhury C, Arul Murugan N, Deva Priyakumar U. Structure-based drug repurposing: traditional and advanced AI/ML-aided methods. Drug Discov Today 2022; 27:1847-1861. [PMID: 35301148 PMCID: PMC8920090 DOI: 10.1016/j.drudis.2022.03.006] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 10/04/2021] [Revised: 02/16/2022] [Accepted: 03/10/2022] [Indexed: 02/08/2023]
Abstract
The current global health emergency in the form of the Coronavirus 2019 (COVID-19) pandemic has highlighted the need for fast, accurate, and efficient drug discovery pipelines. Traditional drug discovery projects relying on in vitro high-throughput screening (HTS) involve large investments and sophisticated experimental set-ups, affordable only to big biopharmaceutical companies. In this scenario, application of efficient state-of-the-art computational methods and modern artificial intelligence (AI)-based algorithms for rapid screening of repurposable chemical space [approved drugs and natural products (NPs) with proven pharmacokinetic profiles] to identify the initial leads is a powerful option to save resources and time. Structure-based drug repurposing is a popular in silico repurposing approach. In this review, we discuss traditional and modern AI-based computational methods and tools applied at various stages for structure-based drug discovery (SBDD) pipelines. Additionally, we highlight the role of generative models in generating molecules with scaffolds from repurposable chemical space. Teaser: This review highlights the importance of repurposable chemical space, and the contributions of conventional in silico approaches and modern machine-learning algorithms for rapid structure-based drug repurposing.
Affiliation(s)
- Chinmayee Choudhury
- Department of Experimental Medicine and Biotechnology, Postgraduate Institute of Medical Education and Research, Sector-12, Chandigarh 160012, India
| | - N Arul Murugan
- Department of Computer Science, School of Electrical Engineering and Computer Sciences, KTH Royal Institute of Technology, S-100 44, Stockholm, Sweden; Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi 110020, India.
| | - U Deva Priyakumar
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
| |
|
19
|
Andrianov GV, Ong WJG, Serebriiskii I, Karanicolas J. Efficient Hit-to-Lead Searching of Kinase Inhibitor Chemical Space via Computational Fragment Merging. J Chem Inf Model 2021; 61:5967-5987. [PMID: 34762402 PMCID: PMC8865965 DOI: 10.1021/acs.jcim.1c00630] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/30/2022]
Abstract
In early-stage drug discovery, the hit-to-lead optimization (or "hit expansion") stage entails starting from a newly identified active compound and improving its potency or other properties. Traditionally, this process relies on synthesizing and evaluating a series of analogues to build up structure-activity relationships. Here, we describe a computational strategy focused on kinase inhibitors, intended to expedite the process of identifying analogues with improved potency. Our protocol begins from an inhibitor of the target kinase and generalizes the synthetic route used to access it. By searching for commercially available replacements for the individual building blocks used to make the parent inhibitor, we compile an enumerated library of compounds that can be accessed using the same chemical transformations; these huge libraries can exceed many millions─or billions─of compounds. Because the resulting libraries are much too large for explicit virtual screening, we instead consider alternate approaches to identify the top-scoring compounds. We find that contributions from individual substituents are well described by a pairwise additivity approximation, provided that the corresponding fragments position their shared core in precisely the same way relative to the binding site. This key insight allows us to determine which fragments are suitable for merging into single new compounds and which are not. Further, the use of pairwise approximation allows interaction energies to be assigned to each compound in the library without the need for any further structure-based modeling: interaction energies instead can be reliably estimated from the energies of the component fragments, and the reduced computational requirements allow for flexible energy minimizations that allow the kinase to respond to each substitution. 
We demonstrate this protocol using libraries built from six representative kinase inhibitors drawn from the literature, which target five different kinases: CDK9, CHK1, CDK2, EGFR T790M, and ACK1. In each example, the enumerated library includes additional analogues reported by the original study to have activity, and these analogues are successfully prioritized within the library. We envision that the insights from this work can facilitate the rapid assembly and screening of increasingly large libraries for focused hit-to-lead optimization. To enable adoption of these methods and to encourage further analyses, we disseminate the computational tools needed to deploy this protocol.
Affiliation(s)
- Grigorii V. Andrianov
- Program in Molecular Therapeutics, Fox Chase Cancer Center, Philadelphia, PA 19111-2497; Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia, 420008
| | - Wern Juin Gabriel Ong
- Program in Molecular Therapeutics, Fox Chase Cancer Center, Philadelphia, PA 19111-2497; Bowdoin College, Brunswick, ME 04011
| | - Ilya Serebriiskii
- Program in Molecular Therapeutics, Fox Chase Cancer Center, Philadelphia, PA 19111-2497; Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia, 420008
| | - John Karanicolas
- Program in Molecular Therapeutics, Fox Chase Cancer Center, Philadelphia, PA 19111-2497; to whom correspondence should be addressed: 215-728-7067
| |
|
20
|
Nguyen TB, Pires DEV, Ascher DB. CSM-carbohydrate: protein-carbohydrate binding affinity prediction and docking scoring function. Brief Bioinform 2021; 23:6457169. [PMID: 34882232 DOI: 10.1093/bib/bbab512] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/23/2021] [Revised: 11/06/2021] [Accepted: 11/08/2021] [Indexed: 12/29/2022] Open
Abstract
Protein-carbohydrate interactions are crucial for many cellular processes but can be challenging to biologically characterise. To improve our understanding and ability to model these molecular interactions, we used a carefully curated set of 370 protein-carbohydrate complexes with experimental structural and biophysical data in order to train and validate a new tool, cutoff scanning matrix (CSM)-carbohydrate, using machine learning algorithms to accurately predict their binding affinity and rank docking poses as a scoring function. Information on both protein and carbohydrate complementarity, in terms of shape and chemistry, was captured using graph-based structural signatures. Across both training and independent test sets, we achieved comparable Pearson's correlations of 0.72 under cross-validation [root mean square error (RMSE) of 1.58 kcal/mol] and 0.67 on the independent test (RMSE of 1.72 kcal/mol), providing confidence in the generalisability and robustness of the final model. Similar performance was obtained across mono-, di- and oligosaccharides, further highlighting the applicability of this approach to the study of larger complexes. We show CSM-carbohydrate significantly outperformed previous approaches and have implemented our method and make all data freely available through both a user-friendly web interface and application programming interface, to facilitate programmatic access at http://biosig.unimelb.edu.au/csm_carbohydrate/. We believe CSM-carbohydrate will be an invaluable tool for helping assess docking poses and the effects of mutations on protein-carbohydrate affinity, unravelling important aspects that drive binding recognition.
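The abstract above reports model quality as Pearson's correlation and RMSE between predicted and measured binding affinities. As a hedged illustration of how these two figures are typically computed (not the authors' code), the sketch below uses only the Python standard library; the function names and the affinity values are made up for the example.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_m)


def rmse(predicted, measured):
    """Root mean square error, in the same unit as the inputs (e.g. kcal/mol)."""
    n = len(predicted)
    if n != len(measured) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)


# Hypothetical affinities in kcal/mol, for illustration only.
predicted = [-5.1, -6.3, -4.2, -7.0]
measured = [-5.5, -6.0, -4.8, -7.4]
print(pearson_r(predicted, measured), rmse(predicted, measured))
```

The two metrics answer different questions: correlation measures whether the ranking trend is right, while RMSE measures the typical size of the error, which is why abstracts of this kind usually quote both.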
Affiliation(s)
- Thanh Binh Nguyen
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia; Department of Biochemistry, University of Cambridge, Cambridge, UK
| |
|
21
|
Ricci-Lopez J, Aguila SA, Gilson MK, Brizuela CA. Improving Structure-Based Virtual Screening with Ensemble Docking and Machine Learning. J Chem Inf Model 2021; 61:5362-5376. [PMID: 34652141 DOI: 10.1021/acs.jcim.1c00511] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 11/29/2022]
Abstract
One of the main challenges of structure-based virtual screening (SBVS) is the incorporation of the receptor's flexibility, as its explicit representation in every docking run implies a high computational cost. Therefore, a common alternative to include the receptor's flexibility is the approach known as ensemble docking. Ensemble docking consists of using a set of receptor conformations and performing the docking assays over each of them. However, there is still no agreement on how to combine the ensemble docking results to obtain the final ligand ranking. A common choice is to use consensus strategies to aggregate the ensemble docking scores, but these strategies exhibit slight improvement regarding the single-structure approach. Here, we claim that using machine learning (ML) methodologies over the ensemble docking results could improve the predictive power of SBVS. To test this hypothesis, four proteins were selected as study cases: CDK2, FXa, EGFR, and HSP90. Protein conformational ensembles were built from crystallographic structures, whereas the evaluated compound library comprised up to three benchmarking data sets (DUD, DEKOIS 2.0, and CSAR-2012) and cocrystallized molecules. Ensemble docking results were processed through 30 repetitions of 4-fold cross-validation to train and validate two ML classifiers: logistic regression and gradient boosting trees. Our results indicate that the ML classifiers significantly outperform traditional consensus strategies and even the best performance case achieved with single-structure docking. We provide statistical evidence that supports the effectiveness of ML to improve the ensemble docking performance.
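The abstract above contrasts consensus aggregation of ensemble docking scores with ML classifiers trained on the same scores. It does not say which consensus rules were used, so the sketch below assumes two common ones (best score and mean score across receptor conformations) and assumes that lower docking scores are better. The function name, ligand identifiers, and score values are hypothetical.

```python
def consensus_ranking(ensemble_scores, strategy="min"):
    """Rank ligands from ensemble docking scores.

    ensemble_scores: dict mapping a ligand id to a list of docking scores,
        one per receptor conformation; lower is assumed to be better.
    strategy: "min" keeps the best score, "mean" averages over conformations.
    Returns ligand ids ordered from best to worst.
    """
    if strategy == "min":
        aggregate = min
    elif strategy == "mean":
        def aggregate(values):
            return sum(values) / len(values)
    else:
        raise ValueError("strategy must be 'min' or 'mean'")
    combined = {ligand: aggregate(values) for ligand, values in ensemble_scores.items()}
    # Sort ascending because more negative docking scores are treated as better.
    return sorted(combined, key=combined.get)


# Made-up scores for three ligands docked against three conformations.
toy_scores = {
    "lig_a": [-9.0, -5.0, -5.0],
    "lig_b": [-7.0, -7.5, -7.2],
    "lig_c": [-4.0, -4.5, -3.9],
}
print(consensus_ranking(toy_scores, "min"))   # ['lig_a', 'lig_b', 'lig_c']
print(consensus_ranking(toy_scores, "mean"))  # ['lig_b', 'lig_a', 'lig_c']
```

The two rules already disagree on this toy input, which illustrates the lack of agreement the abstract mentions. In an ML setting, each ligand's list of per-conformation scores would instead serve as the feature vector of a classifier.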
Affiliation(s)
- Joel Ricci-Lopez
- Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California C.P. 22860, Mexico; Centro de Nanociencias y Nanotecnología, Universidad Nacional Autónoma de México (UNAM), Ensenada, Baja California C.P. 22860, Mexico
| | - Sergio A Aguila
- Centro de Nanociencias y Nanotecnología, Universidad Nacional Autónoma de México (UNAM), Ensenada, Baja California C.P. 22860, Mexico
| | - Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, California 92093, United States
| | - Carlos A Brizuela
- Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California C.P. 22860, Mexico
| |
|
22
|
Scardino V, Bollini M, Cavasotto CN. Combination of pose and rank consensus in docking-based virtual screening: the best of both worlds. RSC Adv 2021; 11:35383-35391. [PMID: 35424265 PMCID: PMC8965822 DOI: 10.1039/d1ra05785e] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/30/2021] [Accepted: 10/26/2021] [Indexed: 11/24/2022] Open
Abstract
The use of high-throughput docking (HTD) in the drug discovery pipeline is today widely established. In spite of methodological improvements in docking accuracy (pose prediction), scoring power, ranking power, and screening power in HTD remain challenging. In fact, pose prediction is of critical importance in view of the pose-dependent scoring process, since incorrect poses will necessarily decrease the ranking power of scoring functions. The combination of results from different docking programs (consensus scoring) has been shown to improve the performance of HTD. Moreover, it has been also shown that a pose consensus approach might also result in database enrichment. We present a new methodology named Pose/Ranking Consensus (PRC) that combines both pose and ranking consensus approaches, to overcome the limitations of each stand-alone strategy. This approach has been developed using four docking programs (ICM, rDock, AutoDock 4, and PLANTS; the first one is commercial, the other three are free). We undertook a thorough analysis for the best way of combining pose and rank strategies, and applied the PRC to a wide range of 34 targets sampling different protein families and binding site properties. Our approach exhibits an improved systematic performance in terms of enrichment factor and hit rate with respect to either pose consensus or consensus ranking alone strategies at a lower computational cost, while always ensuring the recovery of a suitable number of ligands. An analysis using four free docking programs (replacing ICM by AutoDock Vina) displayed comparable results. The new methodology named Pose/Ranking Consensus (PRC) combines both pose and ranking consensus strategies. It displays an enhanced performance in terms of enrichment factor and hit rate, ensuring the recovery of a suitable number of ligands.
Affiliation(s)
- Valeria Scardino
- Meton AI, Inc. Wilmington DE 19801 USA; Austral Institute for Applied Artificial Intelligence, Universidad Austral Pilar Buenos Aires Argentina
| | - Mariela Bollini
- Centro de Investigaciones en BioNanociencias (CIBION), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) Ciudad de Buenos Aires Argentina
| | - Claudio N Cavasotto
- Austral Institute for Applied Artificial Intelligence, Universidad Austral Pilar Buenos Aires Argentina; Computational Drug Design and Biomedical Informatics Laboratory, Instituto de Investigaciones en Medicina Traslacional (IIMT), Universidad Austral-CONICET Pilar Buenos Aires Argentina; Facultad de Ciencias Biomédicas, and Facultad de Ingeniería, Universidad Austral Pilar Buenos Aires Argentina
| |
|
23
|
Shen C, Hu X, Gao J, Zhang X, Zhong H, Wang Z, Xu L, Kang Y, Cao D, Hou T. The impact of cross-docked poses on performance of machine learning classifier for protein-ligand binding pose prediction. J Cheminform 2021; 13:81. [PMID: 34656169 PMCID: PMC8520186 DOI: 10.1186/s13321-021-00560-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/29/2021] [Accepted: 10/05/2021] [Indexed: 02/06/2023] Open
Abstract
Structure-based drug design depends on the detailed knowledge of the three-dimensional (3D) structures of protein-ligand binding complexes, but accurate prediction of ligand-binding poses is still a major challenge for molecular docking due to deficiency of scoring functions (SFs) and ignorance of protein flexibility upon ligand binding. In this study, based on a cross-docking dataset dedicatedly constructed from the PDBbind database, we developed several XGBoost-trained classifiers to discriminate the near-native binding poses from decoys, and systematically assessed their performance with/without the involvement of the cross-docked poses in the training/test sets. The calculation results illustrate that using Extended Connectivity Interaction Features (ECIF), Vina energy terms and docking pose ranks as the features can achieve the best performance, according to the validation through the random splitting or refined-core splitting and the testing on the re-docked or cross-docked poses. Besides, it is found that, despite the significant decrease of the performance for the threefold clustered cross-validation, the inclusion of the Vina energy terms can effectively ensure the lower limit of the performance of the models and thus improve their generalization capability. Furthermore, our calculation results also highlight the importance of the incorporation of the cross-docked poses into the training of the SFs with wide application domain and high robustness for binding pose prediction. The source code and the newly-developed cross-docking datasets are freely available at https://github.com/sc8668/ml_pose_prediction and https://zenodo.org/record/5525936, respectively, under an open-source license. We believe that our study may provide valuable guidance for the development and assessment of new machine learning-based SFs (MLSFs) for the predictions of protein-ligand binding poses.
Affiliation(s)
- Chao Shen
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, People's Republic of China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, 310058, People's Republic of China
| | - Xueping Hu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, People's Republic of China
| | - Junbo Gao
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, People's Republic of China
| | - Xujun Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, People's Republic of China
| | - Haiyang Zhong
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, People's Republic of China
| | - Zhe Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, People's Republic of China
| | - Lei Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, 213001, China
| | - Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, People's Republic of China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan, 410013, People's Republic of China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, People's Republic of China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, 310058, People's Republic of China
| |
|
24
|
Jia Y, Cai S, Muhoza B, Qi B, Li Y. Advance in dietary polyphenols as dipeptidyl peptidase-IV inhibitors to alleviate type 2 diabetes mellitus: aspects from structure-activity relationship and characterization methods. Crit Rev Food Sci Nutr 2021:1-16. [PMID: 34652225 DOI: 10.1080/10408398.2021.1989659] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/29/2022]
Abstract
Dietary polyphenols with great antidiabetic effects are the most abundant components in edible products. Dietary polyphenols have attracted attention as dipeptidyl peptidase-IV (DPP-IV) inhibitors and indirectly improve insulin secretion. The DPP-IV inhibitory activities of dietary polyphenols depend on their structural diversity. Screening methods that can be used to rapidly and accurately identify potential polyphenol DPP-IV inhibitors are urgently needed. This review focuses on the relationship between the structures of dietary polyphenols and their DPP-IV inhibitory effects. Different characterization methods used for polyphenols as DPP-IV inhibitors have been summarized and compared. We conclude that the position and number of hydroxyl groups, methoxy groups, glycosylated groups, and the extent of conjugation influence the efficiency of inhibition of DPP-IV. Various combinations of methods, such as in-vitro enzymatic inhibition, ex-vivo/in-vivo enzymatic inhibition, cell-based in situ, and in-silico virtual screening, are used to evaluate the DPP-IV inhibitory effects of dietary polyphenols. Further investigations of polyphenol DPP-IV inhibitors will improve the bioaccessibility and bioavailability of these bioactive compounds. Exploration of (i) dietary polyphenols derived from multiple targets, that can prevent diabetes, and (ii) actual binding interactions via multispectral analysis, to understand the binding interactions in the complexes, is required.
Affiliation(s)
- Yijia Jia
- College of Food Science, Northeast Agricultural University, Harbin, China
| | - Shengbao Cai
- Faculty of Agriculture and Food, Yunnan Institute of Food Safety, Kunming University of Science and Technology, Kunming, Yunnan Province, China
| | - Bertrand Muhoza
- College of Food Science, Northeast Agricultural University, Harbin, China
| | - Baokun Qi
- College of Food Science, Northeast Agricultural University, Harbin, China; Heilongjiang Green Food Science Research Institute, Harbin, China; National Research Center of Soybean Engineering and Technology, Harbin, China
| | - Yang Li
- College of Food Science, Northeast Agricultural University, Harbin, China; Heilongjiang Green Food Science Research Institute, Harbin, China; National Research Center of Soybean Engineering and Technology, Harbin, China
| |
|
25
|
Mak KK, Balijepalli MK, Pichika MR. Success stories of AI in drug discovery - where do things stand? Expert Opin Drug Discov 2021; 17:79-92. [PMID: 34553659 DOI: 10.1080/17460441.2022.1985108] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/20/2023]
Abstract
INTRODUCTION Artificial intelligence (AI) in drug discovery and development (DDD) has gained more traction in the past few years. Many scientific reviews have already been made available in this area. Thus, in this review, the authors have focused on the success stories of AI-driven drug candidates and the scientometric analysis of the literature in this field. AREA COVERED The authors explore the literature to compile the success stories of AI-driven drug candidates that are currently being assessed in clinical trials or have investigational new drug (IND) status. The authors also provide the reader with their expert perspectives for future developments and their opinions on the field. EXPERT OPINION Partnerships between AI companies and the pharma industry are booming. The early signs of the impact of AI on DDD are encouraging, and the pharma industry is hoping for breakthroughs. AI can be a promising technology to unveil the greatest successes, but it has yet to be proven as AI is still at the embryonic stage.
Affiliation(s)
- Kit-Kay Mak
  - School of Postgraduate Studies and Research, International Medical University, Bukit Jalil, Malaysia
  - Department of Pharmaceutical Chemistry, School of Pharmacy, International Medical University, Bukit Jalil, Malaysia
  - Centre for Bioactive Molecules and Drug Delivery, Institute for Research, Development, and Innovation (Irdi), International Medical University, Bukit Jalil, Malaysia
- Mallikarjuna Rao Pichika
  - Department of Pharmaceutical Chemistry, School of Pharmacy, International Medical University, Bukit Jalil, Malaysia
  - Centre for Bioactive Molecules and Drug Delivery, Institute for Research, Development, and Innovation (Irdi), International Medical University, Bukit Jalil, Malaysia
|
26
|
Kingdon ADH, Alderwick LJ. Structure-based in silico approaches for drug discovery against Mycobacterium tuberculosis. Comput Struct Biotechnol J 2021; 19:3708-3719. [PMID: 34285773 PMCID: PMC8258792 DOI: 10.1016/j.csbj.2021.06.034] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/23/2021] [Revised: 06/22/2021] [Accepted: 06/22/2021] [Indexed: 12/12/2022] Open
Abstract
Mycobacterium tuberculosis is the causative agent of TB and was estimated to cause 1.4 million deaths in 2019, alongside 10 million new infections. Drug resistance is a growing issue, with multi-drug resistant infections representing 3.3% of all new infections; hence novel antimycobacterial drugs are urgently required to combat this growing health emergency. Alongside this, increased knowledge of gene essentiality in the pathogenic organism and larger compound databases can aid in the discovery of new drug compounds. The number of protein structures, X-ray based and modelled, is increasing and now accounts for more than 80% of all predicted M. tuberculosis proteins, allowing novel targets to be investigated. This review will focus on structure-based in silico approaches for drug discovery, covering a range of complexities and computational demands, with associated antimycobacterial examples. This includes molecular docking, molecular dynamics simulations, ensemble docking and free energy calculations. Applications of machine learning to each of these approaches will be discussed. Experimental validation of computational hits is an essential component, which is unfortunately missing from many current studies. The future outlook for these approaches will also be discussed.
Key Words
- CV, collective variable
- Docking
- Drug discovery
- In silico
- LIE, Linear Interaction Energy
- MD, Molecular Dynamic
- MDR, multi-drug resistant
- MMPB(GB)SA, Molecular Mechanics with Poisson Boltzmann (or generalised Born) and Surface Area solvation
- Machine learning
- Mt, Mycobacterium tuberculosis
- Mycobacterium tuberculosis
- PTC, peptidyl transferase centre
- RMSD, root-mean square-deviation
- Tuberculosis, TB
- cMD, Classical Molecular Dynamic
- cryo-EM, cryogenic electron microscopy
- ns, nanosecond
Affiliation(s)
- Alexander D H Kingdon
  - Institute of Microbiology and Infection, School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
- Luke J Alderwick
  - Institute of Microbiology and Infection, School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
|
27
|
Yang L, Yang G, Chen X, Yang Q, Yao X, Bing Z, Niu Y, Huang L, Yang L. Deep Scoring Neural Network Replacing the Scoring Function Components to Improve the Performance of Structure-Based Molecular Docking. ACS Chem Neurosci 2021; 12:2133-2142. [PMID: 34081851 DOI: 10.1021/acschemneuro.1c00110] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/14/2022] Open
Abstract
Accurate prediction of protein-ligand interactions can greatly promote drug development. Recently, a number of deep-learning-based methods have been proposed to predict protein-ligand binding affinities. However, these methods independently extract the feature representations of proteins and ligands but ignore the relative spatial positions and interaction pairs between them. Here, we propose a virtual screening method based on deep learning, called Deep Scoring, which directly extracts the relative position information and atomic attribute information of proteins and ligands from the docking poses. Furthermore, we use two ResNets to extract the features of ligand atoms and protein residues, respectively, and generate an atom-residue interaction matrix to learn the underlying principles of the interactions between proteins and ligands. This is then followed by a dual attention network (DAN) to generate the attention for two related entities (i.e., proteins and ligands) and to weigh the contributions of each atom and residue to binding affinity prediction. As a result, Deep Scoring outperforms other structure-based deep learning methods in terms of screening performance (area under the receiver operating characteristic curve (AUC) of 0.901 for an unbiased DUD-E version), pose prediction (AUC of 0.935 for the PDBbind test set), and generalization ability (AUC of 0.803 for the CHEMBL data set). Finally, Deep Scoring was used to select novel ERK2 inhibitors, and two compounds (D264-0698 and D483-1785) with potential inhibitory activity on ERK2 were obtained in biological experiments.
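The atom-residue interaction matrix and the attention weighting described in this abstract can be illustrated with a minimal NumPy sketch. This is not the authors' Deep Scoring implementation: the ResNet feature extractors are replaced by random feature arrays, the function `dual_attention_pool` is a hypothetical name, and using a dot product for the interaction matrix and the maximum interaction per atom or residue as the attention logit are assumptions made only for illustration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - np.max(x))
    return z / z.sum()

def dual_attention_pool(atom_feats, res_feats):
    """Pool atom and residue features with weights taken from their interaction matrix.

    Hypothetical helper for illustration; not the published architecture.
    """
    # Atom-residue interaction matrix: one score per (atom, residue) pair.
    interaction = atom_feats @ res_feats.T           # shape (n_atoms, n_res)
    # Assumed attention rule: each entity is weighted by its strongest interaction.
    atom_attn = softmax(interaction.max(axis=1))     # shape (n_atoms,)
    res_attn = softmax(interaction.max(axis=0))      # shape (n_res,)
    # Attention-weighted sums give one summary vector per entity.
    ligand_vec = atom_attn @ atom_feats              # shape (d,)
    protein_vec = res_attn @ res_feats               # shape (d,)
    pooled = np.concatenate([ligand_vec, protein_vec])
    return interaction, atom_attn, res_attn, pooled

# Stand-ins for learned features: 5 ligand atoms and 12 pocket residues, 8 features each.
rng = np.random.default_rng(0)
atom_feats = rng.normal(size=(5, 8))
res_feats = rng.normal(size=(12, 8))
interaction, atom_attn, res_attn, pooled = dual_attention_pool(atom_feats, res_feats)
```

In a trained model, the pooled vector would feed a prediction head, and the attention weights indicate which atoms and residues contribute most to the predicted score.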
Affiliation(s)
- Lijuan Yang
  - Institute of Modern Physics, Chinese Academy of Science, Lanzhou 730000, China
  - School of Physics and Technology, Lanzhou University, Lanzhou 730000, China
  - School of Physics, University of Chinese Academy of Science, Beijing 100049, China
  - Advanced Energy Science and Technology Guangdong Laboratory, Huizhou 516000, China
- Guanghui Yang
  - Institute of Modern Physics, Chinese Academy of Science, Lanzhou 730000, China
  - Advanced Energy Science and Technology Guangdong Laboratory, Huizhou 516000, China
- Xiaolong Chen
  - Institute of Modern Physics, Chinese Academy of Science, Lanzhou 730000, China
  - Advanced Energy Science and Technology Guangdong Laboratory, Huizhou 516000, China
- Qiong Yang
  - Institute of Modern Physics, Chinese Academy of Science, Lanzhou 730000, China
  - Advanced Energy Science and Technology Guangdong Laboratory, Huizhou 516000, China
- Xiaojun Yao
  - College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, China
- Zhitong Bing
  - Institute of Modern Physics, Chinese Academy of Science, Lanzhou 730000, China
  - Advanced Energy Science and Technology Guangdong Laboratory, Huizhou 516000, China
- Yuzhen Niu
  - Shandong Provincial Research Center for Bioinformatic Engineering and Technique, School of Life Sciences, Shandong University of Technology, Zibo 255049, China
- Liang Huang
  - School of Physics and Technology, Lanzhou University, Lanzhou 730000, China
- Lei Yang
  - Institute of Modern Physics, Chinese Academy of Science, Lanzhou 730000, China
  - Advanced Energy Science and Technology Guangdong Laboratory, Huizhou 516000, China
|
28
|
Nigam A, Pollice R, Hurley MFD, Hickman RJ, Aldeghi M, Yoshikawa N, Chithrananda S, Voelz VA, Aspuru-Guzik A. Assigning confidence to molecular property prediction. Expert Opin Drug Discov 2021; 16:1009-1023. [DOI: 10.1080/17460441.2021.1925247] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/11/2022]
Affiliation(s)
- AkshatKumar Nigam
  - Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Canada
  - Department of Computer Science, University of Toronto, Toronto, Canada
- Robert Pollice
  - Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Canada
  - Department of Computer Science, University of Toronto, Toronto, Canada
- Riley J. Hickman
  - Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Canada
  - Department of Computer Science, University of Toronto, Toronto, Canada
- Matteo Aldeghi
  - Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Canada
  - Department of Computer Science, University of Toronto, Toronto, Canada
  - Vector Institute for Artificial Intelligence, University Ave Suite 710, Toronto, Canada
- Naruki Yoshikawa
  - Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Canada
  - Department of Computer Science, University of Toronto, Toronto, Canada
- Alán Aspuru-Guzik
  - Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Canada
  - Department of Computer Science, University of Toronto, Toronto, Canada
  - Vector Institute for Artificial Intelligence, University Ave Suite 710, Toronto, Canada
  - Canadian Institute for Advanced Research (CIFAR), University Ave, Toronto, Canada
|
29
|
Qin T, Zhu Z, Wang XS, Xia J, Wu S. Computational representations of protein-ligand interfaces for structure-based virtual screening. Expert Opin Drug Discov 2021; 16:1175-1192. [PMID: 34011222 DOI: 10.1080/17460441.2021.1929921] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022]
Abstract
Introduction: Structure-based virtual screening (SBVS) is an essential strategy for hit identification. SBVS primarily uses molecular docking, which exploits the protein-ligand binding mode and associated affinity score for compound ranking. Previous studies have shown that computational representation of protein-ligand interfaces and the later establishment of machine learning models are efficacious in improving the accuracy of SBVS. Areas covered: The authors review the computational methods for representing protein-ligand interfaces, which include the traditional ones that use deliberately designed fingerprints and descriptors and the more recent methods that automatically extract features with deep learning. The effects of these methods on the performance of machine learning models are briefly discussed. Additionally, case studies that applied various computational representations to machine learning are cited with remarks. Expert opinion: It has become a trend to extract binding features automatically by deep learning, which uses a completely end-to-end representation. However, there is still plenty of scope for improvement. The interpretability of deep-learning models, the organization of data management, the quantity and quality of available data, and the optimization of hyperparameters could impact the accuracy of feature extraction. In addition, other important structural factors such as water molecules and protein flexibility should be considered.
Affiliation(s)
- Tong Qin
  - State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Zihao Zhu
  - State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Xiang Simon Wang
  - Artificial Intelligence and Drug Discovery Core Laboratory for District of Columbia Center for AIDS Research (DC CFAR), Department of Pharmaceutical Sciences, College of Pharmacy, Howard University, U.S.A
- Jie Xia
  - State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Song Wu
  - State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
|
30
|
Singh N, Villoutreix BO. Resources and computational strategies to advance small molecule SARS-CoV-2 discovery: Lessons from the pandemic and preparing for future health crises. Comput Struct Biotechnol J 2021; 19:2537-2548. [PMID: 33936562 PMCID: PMC8074526 DOI: 10.1016/j.csbj.2021.04.059] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/04/2021] [Revised: 04/22/2021] [Accepted: 04/24/2021] [Indexed: 12/11/2022] Open
Abstract
There is an urgent need to identify new therapies that prevent SARS-CoV-2 infection and improve the outcome of COVID-19 patients. This pandemic has thus spurred intensive research in most scientific areas and in a short period of time, several vaccines have been developed. But, while the race to find vaccines for COVID-19 has dominated the headlines, other types of therapeutic agents are being developed. In this mini-review, we report several databases and online tools that could assist the discovery of anti-SARS-CoV-2 small chemical compounds and peptides. We then give examples of studies that combined in silico and in vitro screening, either for drug repositioning purposes or to search for novel bioactive compounds. Finally, we question the overall lack of discussion and plan observed in academic research in many countries during this crisis and suggest that there is room for improvement.
Affiliation(s)
- Natesh Singh
  - Université de Paris, Inserm UMR 1141 NeuroDiderot, Robert-Debré Hospital, 75019 Paris, France
- Bruno O. Villoutreix
  - Université de Paris, Inserm UMR 1141 NeuroDiderot, Robert-Debré Hospital, 75019 Paris, France
|
31
|
Jiménez-Luna J, Grisoni F, Weskamp N, Schneider G. Artificial intelligence in drug discovery: recent advances and future perspectives. Expert Opin Drug Discov 2021; 16:949-959. [PMID: 33779453 DOI: 10.1080/17460441.2021.1909567] [Citation(s) in RCA: 115] [Impact Index Per Article: 28.8] [Indexed: 02/06/2023]
Abstract
Introduction: Artificial intelligence (AI) has inspired computer-aided drug discovery. The widespread adoption of machine learning, in particular deep learning, in multiple scientific disciplines, and the advances in computing hardware and software, among other factors, continue to fuel this development. Much of the initial skepticism regarding applications of AI in pharmaceutical discovery has started to vanish, consequently benefitting medicinal chemistry. Areas covered: The current status of AI in chemoinformatics is reviewed. The topics discussed herein include quantitative structure-activity/property relationship and structure-based modeling, de novo molecular design, and chemical synthesis prediction. Advantages and limitations of current deep learning applications are highlighted, together with a perspective on next-generation AI for drug discovery. Expert opinion: Deep learning-based approaches have only begun to address some fundamental problems in drug discovery. Certain methodological advances, such as message-passing models, spatial-symmetry-preserving networks, hybrid de novo design, and other innovative machine learning paradigms, will likely become commonplace and help address some of the most challenging questions. Open data sharing and model development will play a central role in the advancement of drug discovery with AI.
Affiliation(s)
- José Jiménez-Luna
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Francesca Grisoni
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Nils Weskamp
  - Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany
- Gisbert Schneider
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
|