1
Kumar R, Sharma A, Alexiou A, Ashraf GM. Artificial Intelligence in De novo Drug Design: Are We Still There? Curr Top Med Chem 2022; 22:2483-2492. [PMID: 36263480 DOI: 10.2174/1568026623666221017143244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Revised: 09/06/2022] [Accepted: 09/15/2022] [Indexed: 01/20/2023]
Abstract
BACKGROUND: The artificial intelligence (AI)-assisted design of drug candidates with novel structures and desired properties has received significant attention in the recent past, as have related areas of forward prediction that aim to discover chemical matter worth synthesizing and investigating experimentally.
OBJECTIVES: The purpose behind developing AI-driven models is to explore the broader chemical space and suggest new drug candidate scaffolds with promising therapeutic value. Moreover, it is anticipated that such AI-based models may not only significantly reduce cost and time but also decrease the attrition rate of drug candidates that fail to reach the desirable endpoints at the final stages of drug development. In an attempt to develop AI-based models for de novo drug design, various research groups have proposed numerous methods that apply machine learning and deep learning algorithms to chemical datasets. However, there are many challenges in obtaining accurate predictions, and real breakthroughs in de novo drug design are still scarce.
METHODS: In this review, we explore the recent trends in developing AI-based models for de novo drug design to assess the current status, challenges, and opportunities in the field.
CONCLUSION: The consistently improved AI algorithms and the abundance of curated training chemical data indicate that AI-based de novo drug design should perform better than the current models. Improvements in performance are warranted to obtain better outcomes in the form of potential drug candidates that can perform well under in vivo conditions, especially in the case of more complex diseases.
Affiliation(s)
- Rajnish Kumar
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh Lucknow Campus, Uttar Pradesh, India
- Anju Sharma
  - Department of Applied Science, Indian Institute of Information Technology, Allahabad, Uttar Pradesh, India
- Athanasios Alexiou
  - Novel Global Community Educational Foundation, Hebersham, 2770 NSW, Australia
  - AFNP Med Austria, 1010 Wien, Austria
- Ghulam Md Ashraf
  - Pre-Clinical Research Unit (PCRU), King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
2
Saldívar-González FI, Medina-Franco JL. Approaches for enhancing the analysis of chemical space for drug discovery. Expert Opin Drug Discov 2022; 17:789-798. [PMID: 35640229 DOI: 10.1080/17460441.2022.2084608] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/22/2022]
Abstract
INTRODUCTION: Chemical space is a powerful, general, and practical conceptual framework in drug discovery and other areas of chemistry that addresses the diversity of molecules and has various applications. Moreover, chemical space is a cornerstone of chemoinformatics as a scientific discipline. In response to the growth in the number of chemical compounds in databases, generators of chemical structures, and tools to calculate molecular descriptors, novel approaches to generate visual representations of chemical space in low dimensions are emerging and evolving. Such approaches include a wide range of commercial and free applications, software, and open-source methods.
AREAS COVERED: The current state of chemical space in drug design and discovery is reviewed. The topics discussed herein include advances for efficient navigation in chemical space, the use of this concept in assessing the diversity of different data sets, exploring structure-property/activity relationships for one or multiple endpoints, and compound library design. Recent advances in methodologies for generating visual representations of chemical space are highlighted, with an emphasis on open-source methods.
EXPERT OPINION: Quantitative and qualitative generation and analysis of chemical space require novel approaches for handling the increasing number of molecules and their information available in chemical databases (including emerging ultra-large libraries). In addition, it is of utmost importance to note that chemical space is a conceptual framework that goes beyond visual representation in low dimensions. However, the graphical representation of chemical space has several practical applications in drug discovery and beyond.
Affiliation(s)
- Fernanda I Saldívar-González
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
- José L Medina-Franco
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
3
Nakamura T, Sakaue S, Fujii K, Harabuchi Y, Maeda S, Iwata S. Selecting molecules with diverse structures and properties by maximizing submodular functions of descriptors learned with graph neural networks. Sci Rep 2022; 12:1124. [PMID: 35064170 PMCID: PMC8782878 DOI: 10.1038/s41598-022-04967-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/27/2021] [Accepted: 01/04/2022] [Indexed: 12/25/2022] Open
Abstract
Selecting diverse molecules from unexplored areas of chemical space is one of the most important tasks for discovering novel molecules and reactions. This paper proposes a new approach for selecting a subset of diverse molecules from a given molecular list by using two existing techniques studied in machine learning and mathematical optimization: graph neural networks (GNNs) for learning vector representation of molecules and a diverse-selection framework called submodular function maximization. Our method, called SubMo-GNN, first trains a GNN with property prediction tasks, and then the trained GNN transforms molecular graphs into molecular vectors, which capture both properties and structures of molecules. Finally, to obtain a subset of diverse molecules, we define a submodular function, which quantifies the diversity of molecular vectors, and find a subset of molecular vectors with a large submodular function value. This can be done efficiently by using the greedy algorithm, and the diversity of selected molecules measured by the submodular function value is mathematically guaranteed to be at least 63% of that of an optimal selection. We also introduce a new evaluation criterion to measure the diversity of selected molecules based on molecular properties. Computational experiments confirm that our SubMo-GNN successfully selects diverse molecules from the QM9 dataset regarding the property-based criterion, while performing comparably to existing methods regarding standard structure-based criteria. We also demonstrate that SubMo-GNN with a GNN trained on the QM9 dataset can select diverse molecules even from other MoleculeNet datasets whose domains are different from the QM9 dataset. The proposed method enables researchers to obtain diverse sets of molecules for discovering new molecules and novel chemical reactions, and the proposed diversity criterion is useful for discussing the diversity of molecular libraries from a new property-based perspective.
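The greedy selection step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' SubMo-GNN implementation: it swaps in a facility-location objective with an RBF similarity (both chosen here for simplicity, and not taken from the paper), and it uses hand-written 2-D vectors in place of GNN-derived molecular vectors. The function names are hypothetical. For any monotone submodular function under a cardinality constraint, greedy selection is known to reach at least 1 − 1/e (about 63%) of the optimal value, which is the guarantee the abstract refers to.

```python
from math import exp


def rbf_similarity(u, v, gamma=1.0):
    """Similarity in (0, 1]; equals 1 when the two vectors coincide."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return exp(-gamma * squared_distance)


def facility_location(vectors, selected):
    """Monotone submodular score: how well `selected` covers every vector.

    Illustrative stand-in objective (an assumption of this sketch).
    """
    if not selected:
        return 0.0
    return sum(
        max(rbf_similarity(v, vectors[j]) for j in selected) for v in vectors
    )


def greedy_select(vectors, k):
    """Pick up to k indices, each time adding the largest marginal gain."""
    selected = []
    remaining = set(range(len(vectors)))
    for _ in range(min(k, len(vectors))):
        base = facility_location(vectors, selected)
        best = max(
            sorted(remaining),
            key=lambda j: facility_location(vectors, selected + [j]) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Two tight clusters of toy "molecular vectors" (made-up data).
    toy_vectors = [
        (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    ]
    print(greedy_select(toy_vectors, 2))
```

With two well-separated clusters, the second pick lands in the cluster the first pick did not cover, because adding a near-duplicate yields almost no marginal gain. Recomputing the objective for every candidate keeps the sketch short; a practical implementation would cache the current coverage values.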
Affiliation(s)
- Tomohiro Nakamura
  - Department of Mathematical Informatics, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113-8656, Japan
  - JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
- Shinsaku Sakaue
  - Department of Mathematical Informatics, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113-8656, Japan
  - JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
- Kaito Fujii
  - National Institute of Informatics, Hitotsubashi 2-1-2, Chiyoda-ku, Tokyo, 101-8430, Japan
  - JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
- Yu Harabuchi
  - Department of Chemistry, Faculty of Science, Hokkaido University, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido, 001-0021, Japan
  - JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
- Satoshi Maeda
  - Department of Chemistry, Faculty of Science, Hokkaido University, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido, 001-0021, Japan
  - National Institute for Materials Science (NIMS), Research and Services Division of Materials Data and Integrated System (MaDIS), Tsukuba, Ibaraki, 305-0044, Japan
  - JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
- Satoru Iwata
  - Department of Mathematical Informatics, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113-8656, Japan
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido, 001-0021, Japan
  - JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
4
Nuñez JR, Mcgrady M, Yesiltepe Y, Renslow RS, Metz TO. Chespa: Streamlining Expansive Chemical Space Evaluation of Molecular Sets. J Chem Inf Model 2020; 60:6251-6257. [PMID: 33283505 DOI: 10.1021/acs.jcim.0c00899] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
Thousands of chemical properties can be calculated for small molecules, which can be used to place the molecules within the context of a broader "chemical space." These definitions vary based on compounds of interest and the goals for the given chemical space definition. Here, we introduce a customizable Python module, chespa, built to easily assess different chemical space definitions through clustering of compounds in these spaces and visualizing trends of these clusters. To demonstrate this, chespa currently streamlines prediction of various molecular descriptors (predicted chemical properties, molecular substructures, AI-based chemical space, and chemical class ontology) in order to test six different chemical space definitions. Furthermore, we investigated how these varying definitions trend with mass spectrometry (MS)-based observability, that is, the ability of a molecule to be observed with MS (e.g., as a function of the molecule ionizability), using an example data set from the U.S. EPA's nontargeted analysis collaborative trial, where blinded samples had been analyzed previously, providing 1398 data points. Improved understanding of observability would offer many advantages in small-molecule identification, such as (i) a priori selection of experimental conditions based on suspected sample composition, (ii) the ability to reduce the number of candidate structures during compound identification by removing those less likely to ionize, and, in turn, (iii) a reduced false discovery rate and increased confidence in identifications. Factors controlling observability are not fully understood, making prediction of this property nontrivial and a prime candidate for chemical space analysis. Chespa is available at github.com/pnnl/chespa.
Affiliation(s)
- Jamie R Nuñez
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
  - The Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, Pullman, Washington 99164, United States
- Monee Mcgrady
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Yasemin Yesiltepe
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
  - The Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, Pullman, Washington 99164, United States
- Ryan S Renslow
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
  - The Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, Pullman, Washington 99164, United States
- Thomas O Metz
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
5
Inoue T, Tanaka K, Kotera M, Funatsu K. Improvement of the Structure Generator DAECS with Respect to Structural Diversity. Mol Inform 2020; 40:e2000225. [PMID: 33237627 DOI: 10.1002/minf.202000225] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/26/2020] [Accepted: 10/29/2020] [Indexed: 11/11/2022]
Abstract
The development of novel organic compounds with desired properties is time consuming and costly. Thus, the quantitative structure-property relationship (QSPR) model is used widely for efficiently discovering compounds with the desired properties. Novel structures can be generated from a variety of input structures in silico by structure generators. We previously developed the structure generator DAECS to yield highly active drug-like structures. However, the structural diversity of the structures generated by DAECS was still small for practical applications such as drug discovery. In this paper, we present structure modification rules and the algorithm to output more diverse structures through the DAECS workflow. Two new types of structural modification rules, bond contraction and ring mergence, were added. The new algorithm, which restricts the search area and subsequently clusters structures on a two-dimensional map generated by generative topographic mapping, was implemented for the repetitive selection of seed structures. A case study was conducted to evaluate our method using ligand structures for the histamine H1 receptor. The results showed improved structural diversity compared with the previous method.
Affiliation(s)
- Takahiro Inoue
  - Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
- Kenichi Tanaka
  - Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
- Masaaki Kotera
  - Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
- Kimito Funatsu
  - Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
6
Vázquez J, López M, Gibert E, Herrero E, Luque FJ. Merging Ligand-Based and Structure-Based Methods in Drug Discovery: An Overview of Combined Virtual Screening Approaches. Molecules 2020; 25:E4723. [PMID: 33076254 PMCID: PMC7587536 DOI: 10.3390/molecules25204723] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Received: 09/07/2020] [Revised: 10/06/2020] [Accepted: 10/11/2020] [Indexed: 12/20/2022] Open
Abstract
Virtual screening (VS) is an outstanding cornerstone in the drug discovery pipeline. A variety of computational approaches, which are generally classified as ligand-based (LB) and structure-based (SB) techniques, exploit key structural and physicochemical properties of ligands and targets to enable the screening of virtual libraries in the search for active compounds. Though LB and SB methods have found widespread application in the discovery of novel drug-like candidates, their complementary natures have stimulated continued efforts toward the development of hybrid strategies that combine LB and SB techniques, integrating them in a holistic computational framework that exploits the available information on both ligand and target to enhance the success of drug discovery projects. In this review, we analyze the main strategies and concepts that have emerged in recent years for defining hybrid LB + SB computational schemes in VS studies. Particular attention is focused on the combination of molecular similarity and docking, illustrated with selected applications taken from the literature.
Affiliation(s)
- Javier Vázquez
  - Pharmacelera, Plaça Pau Vila, 1, Sector C 2a, Edificio Palau de Mar, 08039 Barcelona, Spain
  - Department of Nutrition, Food Science and Gastronomy, Faculty of Pharmacy and Food Sciences, Institute of Biomedicine (IBUB), and Institute of Theoretical and Computational Chemistry (IQTC-UB), University of Barcelona, Av. Prat de la Riba 171, E-08921 Santa Coloma de Gramanet, Spain
- Manel López
  - AB Science, Parc Scientifique de Luminy, Zone Luminy Enterprise, Case 922, 163 Av. de Luminy, 13288 Marseille, France
- Enric Gibert
  - Pharmacelera, Plaça Pau Vila, 1, Sector C 2a, Edificio Palau de Mar, 08039 Barcelona, Spain
- Enric Herrero
  - Pharmacelera, Plaça Pau Vila, 1, Sector C 2a, Edificio Palau de Mar, 08039 Barcelona, Spain
- F. Javier Luque
  - Department of Nutrition, Food Science and Gastronomy, Faculty of Pharmacy and Food Sciences, Institute of Biomedicine (IBUB), and Institute of Theoretical and Computational Chemistry (IQTC-UB), University of Barcelona, Av. Prat de la Riba 171, E-08921 Santa Coloma de Gramanet, Spain
7
Shibayama S, Marcou G, Horvath D, Baskin II, Funatsu K, Varnek A. Application of the mol2vec Technology to Large-size Data Visualization and Analysis. Mol Inform 2020; 39:e1900170. [PMID: 32090493 DOI: 10.1002/minf.201900170] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/05/2019] [Accepted: 02/11/2020] [Indexed: 11/09/2022]
Abstract
Generative Topographic Mapping (GTM) is a dimensionality reduction method that is widely used for both data visualization and structure-activity modeling. Large dimensionality of the initial data space may require significant computational resources and slow down the GTM construction. Therefore, it may be meaningful to reduce the number of descriptors used for encoding molecular structures. Principal Component Analysis (PCA), a standard preprocessing tool, suffers from information loss upon dimensionality reduction. As an alternative, we propose to use the substructure vector embedding provided by the mol2vec technique. In addition to reducing the data dimensionality, this technology also accounts for the proximity of substructures in molecular graphs. In this study, the dimensionality of large descriptor spaces of ISIDA fragment descriptors or Morgan fingerprints was reduced using either the PCA or the mol2vec method. The latter significantly speeds up GTM training without compromising its predictive power in bioactivity classification tasks.
Affiliation(s)
- Shojiro Shibayama
  - Department of Chemical System Engineering, School of Engineering, University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo, Japan
  - Department, Institution Laboratoire de Chemoinformatique, UMR7140 University of Strasbourg-CNRS, 4, rue Blaise Pascal, 67000, Strasbourg, France
- Gilles Marcou
  - Department, Institution Laboratoire de Chemoinformatique, UMR7140 University of Strasbourg-CNRS, 4, rue Blaise Pascal, 67000, Strasbourg, France
- Dragos Horvath
  - Department, Institution Laboratoire de Chemoinformatique, UMR7140 University of Strasbourg-CNRS, 4, rue Blaise Pascal, 67000, Strasbourg, France
- Igor I Baskin
  - Faculty of Physics, Lomonosov Moscow State University, Leninskie Gory, 119991, Moscow, Russian Federation
- Kimito Funatsu
  - Department of Chemical System Engineering, School of Engineering, University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo, Japan
- Alexandre Varnek
  - Department, Institution Laboratoire de Chemoinformatique, UMR7140 University of Strasbourg-CNRS, 4, rue Blaise Pascal, 67000, Strasbourg, France
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, 001-0021, Sapporo, Japan
8
Gantzer P, Creton B, Nieto-Draghi C. Inverse-QSPR for de novo Design: A Review. Mol Inform 2019; 39:e1900087. [PMID: 31682079 DOI: 10.1002/minf.201900087] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/17/2019] [Accepted: 11/04/2019] [Indexed: 11/09/2022]
Abstract
The use of computer tools to solve chemistry-related problems has given rise to a large and increasing number of publications in recent decades. This new field of science is now well recognized and labelled chemoinformatics. Among all chemoinformatics techniques, the use of statistics-based approaches for property prediction has been the subject of numerous studies reflecting both new developments and many applications. The predictive models so obtained, which relate a property to molecular features (descriptors), are gathered under the acronym QSPR, for Quantitative Structure Property Relationships. Apart from the obvious use of such models to predict property values for new compounds, their use to virtually synthesize new molecules (de novo design) is currently a subject of high interest. Inverse-QSPR (i-QSPR) methods have hence been developed to accelerate the discovery of new materials that meet a set of specifications. In this manuscript, we review existing i-QSPR methodologies published in the open literature so as to highlight the developments, applications, improvements and limitations of each.
Affiliation(s)
- Philippe Gantzer
  - IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
- Benoit Creton
  - IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
- Carlos Nieto-Draghi
  - IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
9
Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem Rev 2019; 119:10520-10594. [PMID: 31294972 DOI: 10.1021/acs.chemrev.8b00728] [Citation(s) in RCA: 351] [Impact Index Per Article: 70.2] [Indexed: 02/05/2023]
Abstract
Artificial intelligence (AI), and, in particular, deep learning as a subcategory of AI, provides opportunities for the discovery and development of innovative drugs. Various machine learning approaches have recently (re)emerged, some of which may be considered instances of domain-specific AI that have been successfully employed for drug discovery and design. This review provides a comprehensive portrayal of these machine learning techniques and of their applications in medicinal chemistry. After introducing the basic principles, alongside some application notes, of the various machine learning algorithms, the current state of the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects. Finally, several challenges and limitations of the current methods are summarized, with a view to potential future directions for AI-assisted drug discovery and design.
Affiliation(s)
- Xin Yang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Yifei Wang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Ryan Byrne
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Gisbert Schneider
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Shengyong Yang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
10
Chu Y, He X. MoleGear: A Java-Based Platform for Evolutionary De Novo Molecular Design. Molecules 2019; 24:E1444. [PMID: 30979097 PMCID: PMC6479339 DOI: 10.3390/molecules24071444] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/25/2019] [Revised: 04/03/2019] [Accepted: 04/10/2019] [Indexed: 11/17/2022] Open
Abstract
A Java-based platform, MoleGear, is developed for de novo molecular design based on the Chemistry Development Kit (CDK) and other Java packages. MoleGear uses an evolutionary algorithm (EA) to explore chemical space, and a suite of fragment-based operators of growing, crossover, and mutation for assembling novel molecules that can be scored by prediction of binding free energy or a weighted-sum multi-objective fitness function. The EA can be conducted in parallel over multiple nodes to support large-scale molecular optimizations. Some complementary utilities such as fragment library design, chemical space analysis, and a graphical user interface are also integrated into MoleGear. Candidate inhibitors of the human immunodeficiency virus 1 (HIV-1) protease were designed with MoleGear, which validates its potential capability for de novo molecular design.
Affiliation(s)
- Yunhan Chu
  - Department of Chemical Engineering, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
- Xuezhong He
  - Department of Chemical Engineering, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
11
Hoffer L, Muller C, Roche P, Morelli X. Chemistry-driven Hit-to-lead Optimization Guided by Structure-based Approaches. Mol Inform 2018; 37:e1800059. [PMID: 30051601 DOI: 10.1002/minf.201800059] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 05/07/2018] [Accepted: 06/24/2018] [Indexed: 12/17/2022]
Abstract
For several decades, hit identification for drug discovery has been facilitated by developments in both fragment-based and high-throughput screening technologies. However, a major bottleneck in drug discovery projects continues to be the optimization of primary hits from screening campaigns in order to derive lead compounds. Computational chemistry or molecular modeling can play an important role during this hit-to-lead (H2L) stage by both suggesting putative optimizations and decreasing the number of compounds to be experimentally synthesized and evaluated. However, it is also crucial to consider the feasibility of organically synthesizing these virtually designed compounds. Furthermore, the generated molecules should have reasonable physicochemical properties and be medicinally relevant. This review focuses on chemistry-driven and structure-based computational methods that can be used to tackle the difficult problem of H2L optimization, with emphasis being placed on the strategy developed in our laboratory.
Affiliation(s)
- Laurent Hoffer
  - CNRS, Inserm, Institut Paoli-Calmettes, Aix-Marseille Univ, CRCM, Marseille, France
- Philippe Roche
  - CNRS, Inserm, Institut Paoli-Calmettes, Aix-Marseille Univ, CRCM, Marseille, France
- Xavier Morelli
  - CNRS, Inserm, Institut Paoli-Calmettes, Aix-Marseille Univ, CRCM, Marseille, France
  - Institut Paoli-Calmettes, IPC Drug Discovery, Marseille, France
12
Qiu T, Wu D, Qiu J, Cao Z. Finding the molecular scaffold of nuclear receptor inhibitors through high-throughput screening based on proteochemometric modelling. J Cheminform 2018; 10:21. [PMID: 29651663 PMCID: PMC5897275 DOI: 10.1186/s13321-018-0275-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/27/2017] [Accepted: 04/02/2018] [Indexed: 02/10/2023] Open
Abstract
Nuclear receptors (NRs) are a class of proteins that are responsible for sensing steroid and thyroid hormones and certain other molecules. NRs can therefore regulate the expression of specific genes and are associated with various diseases, which makes them essential drug targets. Approaches that can predict the inhibitory ability of compounds against different NR targets should be particularly helpful for drug development. In this study, proteochemometric (PCM) modelling was introduced to analyze the bioactivity between chemical compounds and NR targets. Results illustrated the ability of our PCM model to perform high-throughput NR-inhibitor screening when evaluated on both internal (AUC > 0.870) and external (AUC > 0.746) validation sets. Moreover, in silico predicted bioactive compounds were clustered according to structural similarity, and a series of representative molecular scaffolds were derived for five major NR targets. Through scaffold analysis, the essential bioactive scaffolds of different NR targets can be detected and compared. Generally, the methods and molecular scaffolds proposed in this article can not only help the screening of potential therapeutic NR inhibitors but also guide future NR-related drug discovery.
Affiliation(s)
- Tianyi Qiu
  - School of Life Sciences and Technology, Shanghai 10th People's Hospital, Tongji University, No. 1239 SiPing Road, Shanghai, China
  - The Institute of Biomedical Sciences, Fudan University, No. 138 Medical College Road, Shanghai, China
- Dingfeng Wu
  - School of Life Sciences and Technology, Shanghai 10th People's Hospital, Tongji University, No. 1239 SiPing Road, Shanghai, China
- Jingxuan Qiu
  - School of Life Sciences and Technology, Shanghai 10th People's Hospital, Tongji University, No. 1239 SiPing Road, Shanghai, China
  - School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, No. 516 JunGong Road, Shanghai, China
- Zhiwei Cao
  - School of Life Sciences and Technology, Shanghai 10th People's Hospital, Tongji University, No. 1239 SiPing Road, Shanghai, China
13
Segler MHS, Kogej T, Tyrchan C, Waller MP. Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks. ACS Cent Sci 2018; 4:120-131. [PMID: 29392184 PMCID: PMC5785775 DOI: 10.1021/acscentsci.7b00512] [Citation(s) in RCA: 668] [Impact Index Per Article: 111.3] [Received: 10/24/2017] [Indexed: 05/20/2023]
Abstract
In de novo drug design, computational strategies are used to generate novel molecules with good affinity to the desired biological target. In this work, we show that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing. We demonstrate that the properties of the generated molecules correlate very well with the properties of the molecules used to train the model. In order to enrich libraries with molecules active toward a given biological target, we propose to fine-tune the model with small sets of molecules, which are known to be active against that target. Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test molecules that medicinal chemists designed, whereas against Plasmodium falciparum (Malaria), it reproduced 28% of 1240 test molecules. When coupled with a scoring function, our model can perform the complete de novo drug design cycle to generate large sets of novel molecules for drug discovery.
Affiliation(s)
- Marwin H. S. Segler
- Institute of Organic Chemistry & Center for Multiscale Theory and Computation, Westfälische Wilhelms-Universität Münster, 48149 Münster, Germany
- Thierry Kogej
- Hit Discovery, Discovery Sciences, AstraZeneca R&D, Gothenburg, Sweden
- Christian Tyrchan
- Department of Medicinal Chemistry, IMED RIA, AstraZeneca R&D, Gothenburg, Sweden
- Mark P. Waller
- Department of Physics & International Centre for Quantum and Molecular Structures, Shanghai University, Shanghai, China
|
14
|
Suryanarayanan V, Panwar U, Chandra I, Singh SK. De Novo Design of Ligands Using Computational Methods. Methods Mol Biol 2018; 1762:71-86. [PMID: 29594768 DOI: 10.1007/978-1-4939-7756-7_5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
The de novo design technique is complementary to high-throughput virtual screening and is believed to contribute to the pharmaceutical development of novel drugs with desired properties in a low-cost and time-efficient manner. In this chapter, we outline the basic de novo design concepts based on computational methods with an example.
Affiliation(s)
- Venkatesan Suryanarayanan
- Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
- Umesh Panwar
- Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
- Ishwar Chandra
- Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
- Sanjeev Kumar Singh
- Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
|
15
|
Maeda I, Hasegawa K, Kaneko H, Funatsu K. Novel Method Proposing Chemical Structures with Desirable Profile of Activities Based on Chemical and Protein Spaces. Mol Inform 2017; 36. [PMID: 28857513 DOI: 10.1002/minf.201700075] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/29/2017] [Accepted: 08/09/2017] [Indexed: 11/06/2022]
Abstract
Active molecules among the numerous chemical structures in a chemical database can be searched easily by statistical prediction of compound-protein interactions. However, constructing a simple prediction model against one protein does not aid drug design, because detecting chemical structures that act similarly against multiple proteins is necessary for preventing side effects of the potential drug. To tackle this problem, we propose a new method that visualizes chemical and protein spaces. For simultaneous visualization of both spaces, we employ a counterpropagation neural network (CPNN) and develop a new visualization method named multi-input CPNN (MICPNN). In a case study of the kinase protein family, the MICPNN model accurately predicted the complex relationships between compounds and proteins. The proposed method identified chemical structures with promising activity against kinases. Our proposed method is also applicable to other protein families, such as G-protein coupled receptors, ion channels and transporters.
Affiliation(s)
- Iwao Maeda
- The University of Tokyo, School of Engineering, Department of Chemical System Engineering, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656
- Kiyoshi Hasegawa
- Chugai Pharmaceutical Company, Kamakura Research Laboratories, 200 Kajiwara, Kamakura, Kanagawa, 247-8530, Japan
- Hiromasa Kaneko
- The University of Tokyo, School of Engineering, Department of Chemical System Engineering, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656
- Kimito Funatsu
- The University of Tokyo, School of Engineering, Department of Chemical System Engineering, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656
|
16
|
Miyao T, Funatsu K. Finding Chemical Structures Corresponding to a Set of Coordinates in Chemical Descriptor Space. Mol Inform 2017; 36. [DOI: 10.1002/minf.201700030] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 02/13/2017] [Accepted: 04/04/2017] [Indexed: 11/10/2022]
Affiliation(s)
- Tomoyuki Miyao
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Kimito Funatsu
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
|