1
Zhao M, Yu W, MacKerell AD. Enhancing SILCS-MC via GPU Acceleration and Ligand Conformational Optimization with Genetic and Parallel Tempering Algorithms. J Phys Chem B 2024. [PMID: 39031121 DOI: 10.1021/acs.jpcb.4c03045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/22/2024]
Abstract
In the domain of computer-aided drug design, achieving precise and accurate estimates of ligand-protein binding is paramount in the context of screening extensive drug libraries and performing ligand optimization. A fundamental aspect of the SILCS (site identification by ligand competitive saturation) methodology lies in the generation of comprehensive 3D free-energy functional group affinity maps (FragMaps), encompassing the entirety of the target molecule structure. These FragMaps offer an intricate landscape of functional group affinities across the protein, bilayer, or RNA, acting as the basis for subsequent SILCS-Monte Carlo (MC) simulations wherein ligands are docked to the target molecule. To augment the efficiency and breadth of ligand sampling capabilities, we implemented an improved SILCS-MC methodology. By harnessing the parallel computing capability of GPUs, our approach facilitates concurrent calculations over multiple ligands and binding sites, markedly enhancing the computational efficiency. Moreover, the integration of a genetic algorithm (GA) with MC allows us to employ an evolutionary approach to perform ligand sampling, assuring enhanced convergence characteristics. In addition, the potential utility of parallel tempering (PT) to improve sampling was investigated. Implementation of SILCS-MC on GPU architecture is shown to accelerate the speed of SILCS-MC calculations by over 2 orders of magnitude. Use of GA and PT yields improvements over Markov-chain MC, increasing the precision of the resultant docked orientations and binding free energies, though the extent of improvements is relatively small. Accordingly, significant improvements in speed are obtained through the GPU implementation with minor improvements in the precision of the docking obtained via the tested GA and PT algorithms.
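For readers unfamiliar with the two sampling ideas named in this abstract, the following is a minimal sketch of the textbook Metropolis acceptance rule and the standard parallel tempering (replica exchange) swap rule. It is not the SILCS-MC implementation; the function names, the energy units (kcal/mol), and the example temperatures are assumptions chosen for illustration only.

```python
import math

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption for this sketch.
K_B = 0.0019872041


def metropolis_accept_prob(delta_e: float, temperature: float) -> float:
    """Probability of accepting an MC move that changes the energy by delta_e."""
    if delta_e <= 0.0:
        return 1.0  # downhill (or neutral) moves are always accepted
    return math.exp(-delta_e / (K_B * temperature))


def swap_accept_prob(e_i: float, t_i: float, e_j: float, t_j: float) -> float:
    """Probability of exchanging configurations between replicas i and j.

    Standard replica-exchange criterion:
    min(1, exp((beta_i - beta_j) * (e_i - e_j))).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    exponent = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical energies: the cold replica (300 K) holds the worse pose,
    # so the swap that moves the better pose to 300 K is always accepted.
    print(swap_accept_prob(e_i=-5.0, t_i=300.0, e_j=-8.0, t_j=600.0))
    # The reverse situation is accepted only occasionally.
    print(swap_accept_prob(e_i=-8.0, t_i=300.0, e_j=-5.0, t_j=600.0))
```

The swap rule is what lets high-temperature replicas, which cross barriers easily, hand favorable configurations down to the low-temperature replica whose statistics are of interest.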
Affiliation(s)
- Mingtian Zhao
  - Computer Aided Drug Design Center, Department of Pharmaceutical Sciences, University of Maryland, School of Pharmacy, 20 Penn Street, Baltimore, Maryland 21201, United States
- Wenbo Yu
  - Computer Aided Drug Design Center, Department of Pharmaceutical Sciences, University of Maryland, School of Pharmacy, 20 Penn Street, Baltimore, Maryland 21201, United States
- Alexander D MacKerell
  - Computer Aided Drug Design Center, Department of Pharmaceutical Sciences, University of Maryland, School of Pharmacy, 20 Penn Street, Baltimore, Maryland 21201, United States
2
Xia X, Liu Y, Zheng C, Zhang X, Wu Q, Gao X, Zeng X, Su Y. Evolutionary Multiobjective Molecule Optimization in an Implicit Chemical Space. J Chem Inf Model 2024; 64:5161-5174. [PMID: 38870455 PMCID: PMC11235097 DOI: 10.1021/acs.jcim.4c00031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Revised: 05/08/2024] [Accepted: 05/13/2024] [Indexed: 06/15/2024]
Abstract
Optimization techniques play a pivotal role in advancing drug development, serving as the foundation of numerous generative methods tailored to efficiently design optimized molecules derived from existing lead compounds. However, existing methods often encounter difficulties in generating diverse, novel, and high-property molecules that simultaneously optimize multiple drug properties. To overcome this bottleneck, we propose a multiobjective molecule optimization framework (MOMO). MOMO employs a specially designed Pareto-based multiproperty evaluation strategy at the molecular sequence level to guide the evolutionary search in an implicit chemical space. A comparative analysis of MOMO with five state-of-the-art methods across two benchmark multiproperty molecule optimization tasks reveals that MOMO markedly outperforms them in terms of diversity, novelty, and optimized properties. The practical applicability of MOMO in drug discovery has also been validated on four challenging tasks in the real-world discovery problem. These results suggest that MOMO can provide a useful tool to facilitate molecule optimization problems with multiple properties.
Affiliation(s)
- Xin Xia
  - The Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, Hefei 230601, China
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, 5089 Wangjiang West Road, Hefei 230088, Anhui, China
- Yiping Liu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha 410012, China
- Chunhou Zheng
  - The Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, Hefei 230601, China
- Xingyi Zhang
  - The Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, Hefei 230601, China
- Qingwen Wu
  - The Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, Hefei 230601, China
- Xin Gao
  - Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha 410012, China
- Yansen Su
  - The Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, Hefei 230601, China
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, 5089 Wangjiang West Road, Hefei 230088, Anhui, China
3
Humayun F, Khan F, Khan A, Alshammari A, Ji J, Farhan A, Fawad N, Alam W, Ali A, Wei DQ. De novo generation of dual-target ligands for the treatment of SARS-CoV-2 using deep learning, virtual screening, and molecular dynamic simulations. J Biomol Struct Dyn 2024; 42:3019-3029. [PMID: 37449757 DOI: 10.1080/07391102.2023.2234481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Accepted: 04/30/2023] [Indexed: 07/18/2023]
Abstract
De novo generation of molecules with the necessary features offers a promising opportunity for artificial intelligence, such as deep generative approaches. However, creating novel compounds having biological activities toward two distinct targets continues to be a very challenging task. In this study, we develop a unique computational framework for the de novo synthesis of bioactive compounds directed at two predetermined therapeutic targets. This framework is referred to as the dual-target ligand generative network. Our approach uses a stochastic policy, a sequence-based simplified molecular input line entry system (SMILES) generator, to explore chemical space. The high-level workflow is to gather and prepare the training data for both targets' molecules, build a neural network model and train it to make molecules, create new molecules using generative AI, and then virtually screen the newly validated molecules against the SARS-CoV-2 PLpro and 3CLpro drug targets. Results show that the novel molecules generated have higher binding affinity with both targets than the conventional drug, i.e., remdesivir, being used for the treatment of SARS-CoV-2. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Fahad Humayun
  - Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, PR China
  - State Key Laboratory of Microbial Metabolism and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, PR China
- Fatima Khan
  - National Institute of Health, Islamabad, Pakistan
- Abbas Khan
  - Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, PR China
  - State Key Laboratory of Microbial Metabolism and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, PR China
- Abdulrahman Alshammari
  - Department of Pharmacology and Toxicology, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Jun Ji
  - Henan Provincial Engineering and Technology Center of Health Products for Livestock and Poultry, Henan Provincial Engineering and Technology Center of Animal Disease Diagnosis and Integrated Control, Nanyang Normal University, Nanyang, PR China
- Ali Farhan
  - Department of Chemistry, Chung Yuan Christian University, Taoyuan, Taiwan
- Nasim Fawad
  - Poultry Research Institute, Rawalpindi, Pakistan
- Waheed Alam
  - National Institute of Health, Islamabad, Pakistan
- Arif Ali
  - Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, PR China
  - State Key Laboratory of Microbial Metabolism and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, PR China
- Dong-Qing Wei
  - Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, PR China
  - State Key Laboratory of Microbial Metabolism and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, PR China
  - Centre for Research in Molecular Modeling, Concordia University, Québec, Canada
4
Angelo JS, Guedes IA, Barbosa HJC, Dardenne LE. Multi-and many-objective optimization: present and future in de novo drug design. Front Chem 2023; 11:1288626. [PMID: 38192501 PMCID: PMC10773868 DOI: 10.3389/fchem.2023.1288626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Accepted: 11/27/2023] [Indexed: 01/10/2024] Open
Abstract
De novo drug design (dnDD) aims to create new molecules that satisfy multiple conflicting objectives. Since several desired properties can be considered in the optimization process, dnDD is naturally categorized as a many-objective optimization problem (ManyOOP), where more than three objectives must be simultaneously optimized. However, a large number of objectives typically poses several challenges that affect the choice and the design of optimization methodologies. Herein, we cover the application of multi- and many-objective optimization methods, particularly those based on Evolutionary Computation and Machine Learning techniques, to shed light on their potential application in dnDD. Additionally, we comprehensively analyze how molecular properties used in the optimization process are applied as either objectives or constraints to the problem. Finally, we discuss future research in many-objective optimization for dnDD, highlighting two important possible impacts: i) its integration with the development of multi-target approaches to accelerate the discovery of innovative and more efficacious drug therapies and ii) its role as a catalyst for new developments in more fundamental and general methodological frameworks in the field.
Affiliation(s)
- Laurent E. Dardenne
  - Coordenação de Modelagem Computacional, Laboratório Nacional de Computação Científica, Petrópolis, Brazil
5
Blanchard AE, Bhowmik D, Fox Z, Gounley J, Glaser J, Akpa BS, Irle S. Adaptive language model training for molecular design. J Cheminform 2023; 15:59. [PMID: 37291633 DOI: 10.1186/s13321-023-00719-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/09/2022] [Accepted: 04/03/2023] [Indexed: 06/10/2023] Open
Abstract
The vast size of chemical space necessitates computational approaches to automate and accelerate the design of molecular sequences to guide experimental efforts for drug discovery. Genetic algorithms provide a useful framework to incrementally generate molecules by applying mutations to known chemical structures. Recently, masked language models have been applied to automate the mutation process by leveraging large compound libraries to learn commonly occurring chemical sequences (i.e., using tokenization) and predict rearrangements (i.e., using mask prediction). Here, we consider how language models can be adapted to improve molecule generation for different optimization tasks. We use two different generation strategies for comparison, fixed and adaptive. The fixed strategy uses a pre-trained model to generate mutations; the adaptive strategy trains the language model on each new generation of molecules selected for target properties during optimization. Our results show that the adaptive strategy allows the language model to more closely fit the distribution of molecules in the population. Therefore, for enhanced fitness optimization, we suggest the use of the fixed strategy during an initial phase followed by the use of the adaptive strategy. We demonstrate the impact of adaptive training by searching for molecules that optimize both heuristic metrics, drug-likeness and synthesizability, as well as predicted protein binding affinity from a surrogate model. Our results show that the adaptive strategy provides a significant improvement in fitness optimization compared to the fixed pre-trained model, empowering the application of language models to molecular design tasks.
Affiliation(s)
- Andrew E Blanchard
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Debsindhu Bhowmik
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Zachary Fox
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- John Gounley
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Jens Glaser
  - National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Belinda S Akpa
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
  - Chemical & Biomolecular Engineering, University of Tennessee, Knoxville, TN, 37996, USA
- Stephan Irle
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
6
Fromer JC, Coley CW. Computer-aided multi-objective optimization in small molecule discovery. Patterns (N Y) 2023; 4:100678. [PMID: 36873904 PMCID: PMC9982302 DOI: 10.1016/j.patter.2023.100678] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Indexed: 02/12/2023]
Abstract
Molecular discovery is a multi-objective optimization problem that requires identifying a molecule or set of molecules that balance multiple, often competing, properties. Multi-objective molecular design is commonly addressed by combining properties of interest into a single objective function using scalarization, which imposes assumptions about relative importance and uncovers little about the trade-offs between objectives. In contrast to scalarization, Pareto optimization does not require knowledge of relative importance and reveals the trade-offs between objectives. However, it introduces additional considerations in algorithm design. In this review, we describe pool-based and de novo generative approaches to multi-objective molecular discovery with a focus on Pareto optimization algorithms. We show how pool-based molecular discovery is a relatively direct extension of multi-objective Bayesian optimization and how the plethora of different generative models extend from single-objective to multi-objective optimization in similar ways using non-dominated sorting in the reward function (reinforcement learning) or to select molecules for retraining (distribution learning) or propagation (genetic algorithms). Finally, we discuss some remaining challenges and opportunities in the field, emphasizing the opportunity to adopt Bayesian optimization techniques into multi-objective de novo design.
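The contrast between scalarization and Pareto optimization drawn in this abstract can be illustrated with a small sketch. The candidate scores, the weights, and all function names below are hypothetical and are not taken from the review; both objectives are assumed to be maximized.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points (brute force, O(n^2))."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def scalarize(point: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted-sum scalarization: collapses all objectives into one number."""
    return sum(w * x for w, x in zip(weights, point))


# Hypothetical (potency, solubility) scores for five candidate molecules.
candidates: List[Point] = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5), (0.1, 0.1)]

front = pareto_front(candidates)
best_equal = max(candidates, key=lambda p: scalarize(p, (0.5, 0.5)))
best_potency = max(candidates, key=lambda p: scalarize(p, (0.8, 0.2)))

if __name__ == "__main__":
    print("Pareto front:", front)
    print("Best with equal weights:", best_equal)
    print("Best with potency-heavy weights:", best_potency)
```

Each weight vector returns a single winner and hides the alternatives, whereas the Pareto front keeps every candidate that represents a distinct trade-off without requiring the weights to be chosen in advance.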
Affiliation(s)
- Jenna C Fromer
  - Department of Chemical Engineering, MIT, Cambridge, MA 02139, USA
- Connor W Coley
  - Department of Chemical Engineering, MIT, Cambridge, MA 02139, USA
  - Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA 02139, USA
7
Menon D, Ranganathan R. A Generative Approach to Materials Discovery, Design, and Optimization. ACS Omega 2022; 7:25958-25973. [PMID: 35936396 PMCID: PMC9352221 DOI: 10.1021/acsomega.2c03264] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/25/2022] [Accepted: 07/11/2022] [Indexed: 05/25/2023]
Abstract
Despite its potential to transform society, materials research suffers from a major drawback: its long research timeline. Recently, machine-learning techniques have emerged as a viable solution to this drawback and have shown accuracies comparable to other computational techniques like density functional theory (DFT) at a fraction of the computational time. One particular class of machine-learning models, known as "generative models", is of particular interest owing to its ability to approximate high-dimensional probability distribution functions, which in turn can be used to generate novel data such as molecular structures by sampling these approximated probability distribution functions. This review article aims to provide an in-depth understanding of the underlying mathematical principles of popular generative models such as recurrent neural networks, variational autoencoders, and generative adversarial networks and discuss their state-of-the-art applications in the domains of biomaterials and organic drug-like materials, energy materials, and structural materials. Here, we discuss a broad range of applications of these models spanning from the discovery of drugs that treat cancer to finding the first room-temperature superconductor and from the discovery and optimization of battery and photovoltaic materials to the optimization of high-entropy alloys. We conclude by presenting a brief outlook of the major challenges that lie ahead for the mainstream usage of these models for materials research.
8
Mukaidaisi M, Vu A, Grantham K, Tchagang A, Li Y. Multi-Objective Drug Design Based on Graph-Fragment Molecular Representation and Deep Evolutionary Learning. Front Pharmacol 2022; 13:920747. [PMID: 35860028 PMCID: PMC9291509 DOI: 10.3389/fphar.2022.920747] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/15/2022] [Accepted: 05/26/2022] [Indexed: 11/19/2022] Open
Abstract
Drug discovery is a challenging process with a huge molecular space to be explored and numerous pharmacological properties to be appropriately considered. Among various drug design protocols, fragment-based drug design is an effective way of constraining the search space and better utilizing biologically active compounds. Motivated by fragment-based drug search for a given protein target and the emergence of artificial intelligence (AI) approaches in this field, this work advances the field of in silico drug design by (1) integrating a graph fragmentation-based deep generative model with a deep evolutionary learning process for large-scale multi-objective molecular optimization, and (2) applying protein-ligand binding affinity scores together with other desired physicochemical properties as objectives. Our experiments show that the proposed method can generate novel molecules with improved property values and binding affinities.
Affiliation(s)
- Muhetaer Mukaidaisi
  - Biomedical Data Science Laboratory, Department of Computer Science, Brock University, St. Catharines, ON, Canada
- Andrew Vu
  - Biomedical Data Science Laboratory, Department of Computer Science, Brock University, St. Catharines, ON, Canada
- Karl Grantham
  - Biomedical Data Science Laboratory, Department of Computer Science, Brock University, St. Catharines, ON, Canada
- Alain Tchagang
  - Scientific Data Mining Team, Digital Technologies Research Centre, National Research Council Canada, Ottawa, ON, Canada
- Yifeng Li
  - Biomedical Data Science Laboratory, Department of Computer Science, Brock University, St. Catharines, ON, Canada
  - *Correspondence: Yifeng Li
9
10
Perron Q, Mirguet O, Tajmouati H, Skiredj A, Rojas A, Gohier A, Ducrot P, Bourguignon MP, Sansilvestri-Morel P, Do Huu N, Gellibert F, Gaston-Mathé Y. Deep generative models for ligand-based de novo design applied to multi-parametric optimization. J Comput Chem 2022; 43:692-703. [PMID: 35218219 DOI: 10.1002/jcc.26826] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/29/2021] [Revised: 01/26/2022] [Accepted: 01/27/2022] [Indexed: 11/08/2022]
Abstract
Multi-parameter optimization (MPO) is a major challenge in new chemical entity (NCE) drug discovery. Recently, promising results were reported for deep learning generative models applied to de novo molecular design, but, to our knowledge, until now no report was made of the value of this new technology for addressing MPO in an actual drug discovery project. In this study, we demonstrate the benefit of applying AI technology in a real drug discovery project. We evaluate the potential of a ligand-based de novo design technology using deep learning generative models to accelerate the obtention of lead compounds meeting 11 different biological activity objectives simultaneously. Using the initial dataset of the project, we built QSAR models for all the 11 objectives, with moderate to high performance (precision between 0.67 and 1.0 on an independent test set). Our DL-based AI de novo design algorithm, combined with the QSAR models, generated 150 virtual compounds predicted as active on all objectives. Eleven were synthetized and tested. The AI-designed compounds met 9.5 objectives on average (i.e., 86% success rate) versus 6.4 (i.e., 58% success rate) for the initial molecules measured on all objectives. One of the AI-designed molecules was active on all 11 measured objectives, and two were active on 10 objectives while being in the error margin of the assay for the last one. The AI algorithm designed compounds with functional groups, which, although being rare or absent in the initial dataset, turned out to be highly beneficial for the MPO.
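The success rates quoted in this abstract follow directly from the mean number of objectives met divided by the 11 objectives. A short check, with variable and function names chosen only for illustration:

```python
N_OBJECTIVES = 11  # number of biological activity objectives stated in the abstract


def success_rate(mean_objectives_met: float, n_objectives: int = N_OBJECTIVES) -> float:
    """Percentage of objectives met on average."""
    return 100.0 * mean_objectives_met / n_objectives


ai_rate = success_rate(9.5)       # AI-designed compounds
initial_rate = success_rate(6.4)  # initial project molecules

if __name__ == "__main__":
    print(f"AI-designed: {ai_rate:.1f}%  initial: {initial_rate:.1f}%")
```

Rounded to whole percentages, these reproduce the 86% and 58% figures reported above.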
Affiliation(s)
- Olivier Mirguet
  - Institut De Recherches Servier, Suresnes, France
  - Institut De Recherches Servier, Croissy, France
- Anne Rojas
  - Institut De Recherches Servier, Suresnes, France
  - Institut De Recherches Servier, Croissy, France
- Arnaud Gohier
  - Institut De Recherches Servier, Suresnes, France
  - Institut De Recherches Servier, Croissy, France
- Pierre Ducrot
  - Institut De Recherches Servier, Suresnes, France
  - Institut De Recherches Servier, Croissy, France
- Marie-Pierre Bourguignon
  - Institut De Recherches Servier, Suresnes, France
  - Institut De Recherches Servier, Croissy, France
- Françoise Gellibert
  - Institut De Recherches Servier, Suresnes, France
  - Institut De Recherches Servier, Croissy, France
11
Kerstjens A, De Winter H. LEADD: Lamarckian evolutionary algorithm for de novo drug design. J Cheminform 2022; 14:3. [PMID: 35033209 PMCID: PMC8760751 DOI: 10.1186/s13321-022-00582-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/11/2021] [Accepted: 12/30/2021] [Indexed: 11/10/2022] Open
Abstract
Given an objective function that predicts key properties of a molecule, goal-directed de novo molecular design is a useful tool to identify molecules that maximize or minimize said objective function. Nonetheless, a common drawback of these methods is that they tend to design synthetically unfeasible molecules. In this paper we describe a Lamarckian evolutionary algorithm for de novo drug design (LEADD). LEADD attempts to strike a balance between optimization power, synthetic accessibility of designed molecules and computational efficiency. To increase the likelihood of designing synthetically accessible molecules, LEADD represents molecules as graphs of molecular fragments, and limits the bonds that can be formed between them through knowledge-based pairwise atom type compatibility rules. A reference library of drug-like molecules is used to extract fragments, fragment preferences and compatibility rules. A novel set of genetic operators that enforce these rules in a computationally efficient manner is presented. To sample chemical space more efficiently we also explore a Lamarckian evolutionary mechanism that adapts the reproductive behavior of molecules. LEADD has been compared to both standard virtual screening and a comparable evolutionary algorithm using a standardized benchmark suite and was shown to be able to identify fitter molecules more efficiently. Moreover, the designed molecules are predicted to be easier to synthesize than those designed by other evolutionary algorithms.
Affiliation(s)
- Alan Kerstjens
  - Department of Pharmaceutical Sciences, Faculty of Pharmaceutical, Biomedical and Veterinary Sciences, University of Antwerp, Universiteitsplein 1A, 2610, Wilrijk, Belgium
- Hans De Winter
  - Department of Pharmaceutical Sciences, Faculty of Pharmaceutical, Biomedical and Veterinary Sciences, University of Antwerp, Universiteitsplein 1A, 2610, Wilrijk, Belgium
12
Abstract
INTRODUCTION: The popularity and success of advanced AI methods like deep neural networks has led to novel ways for exploring chemical space. Their opaque nature poses challenges for model evaluation regarding novelty, uniqueness, and distribution of the chemical space covered. However, these methods also promise to be able to explore uncharted chemical space in novel ways that do not rely directly on structural similarity. AREAS COVERED: This review provides an overview of popular deep learning methods for chemical space exploration. Crucial aspects like choice of molecular representation, training for focused chemical space exploration, and criteria for assessing and validating chemical space coverage are discussed. EXPERT OPINION: Deep learning offers great potential for chemical space exploration beyond conventional fragment-based methods. Given the rarity of prospective applications and considering the difficulty in assessing representativeness and comprehensiveness of chemical space covered, developing criteria for assessing and validating generative models is of great significance. Latent space models like variational autoencoders are conceptually appealing for inverse QSAR/QSPR approaches as neighborhood relationships in latent space can be trained to reflect property similarities. Future research in understanding and interpreting generative models might lead to a better understanding of biologically relevant properties of molecules.
Affiliation(s)
- Martin Vogt
  - Department of Life Science Informatics, B-it, Limes Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich Wilhelms-Universität, Bonn, Germany
13
Cincilla G, Masoni S, Blobel J. Individual and collective human intelligence in drug design: evaluating the search strategy. J Cheminform 2021; 13:80. [PMID: 34635158 PMCID: PMC8507178 DOI: 10.1186/s13321-021-00556-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/30/2021] [Accepted: 09/18/2021] [Indexed: 11/10/2022] Open
Abstract
In recent years, individual and collective human intelligence, defined as the knowledge, skills, reasoning and intuition of individuals and groups, have been used in combination with computer algorithms to solve complex scientific problems. Such an approach was successfully used in different research fields such as structural biology, comparative genomics, macromolecular crystallography and RNA design. Herein we describe an attempt to use a similar approach in small-molecule drug discovery, specifically to drive search strategies of de novo drug design. This is assessed with a case study that consists of a series of public experiments in which participants had to explore the huge chemical space in silico to find predefined compounds by designing molecules and analyzing the score associated with them. Such a process may be seen as an instantaneous surrogate of the classical design-make-test cycles carried out by medicinal chemists during the drug discovery hit to lead phase but not hindered by long synthesis and testing times. We present first findings on (1) assessing human intelligence in chemical space exploration, (2) comparing individual and collective human intelligence performance in this task and (3) contrasting some human and artificial intelligence achievements in de novo drug design.
Affiliation(s)
- Giovanni Cincilla
  - Molomics, Barcelona Science Park, c/Baldiri i Reixac 4-12, 08028, Barcelona, Spain
- Simone Masoni
  - Molomics, Barcelona Science Park, c/Baldiri i Reixac 4-12, 08028, Barcelona, Spain
- Jascha Blobel
  - Molomics, Barcelona Science Park, c/Baldiri i Reixac 4-12, 08028, Barcelona, Spain
14
Pereira T, Abbasi M, Oliveira JL, Ribeiro B, Arrais J. Optimizing blood-brain barrier permeation through deep reinforcement learning for de novo drug design. Bioinformatics 2021; 37:i84-i92. [PMID: 34252946 PMCID: PMC8336597 DOI: 10.1093/bioinformatics/btab301] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/30/2022] Open
Abstract
MOTIVATION: The process of placing new drugs into the market is time-consuming, expensive and complex. The application of computational methods for designing molecules with bespoke properties can contribute to saving resources throughout this process. However, the fundamental properties to be optimized are often not considered or conflict with each other. In this work, we propose a novel approach to consider both the biological property and the bioavailability of compounds through a deep reinforcement learning framework for the targeted generation of compounds. We aim to obtain a promising set of compounds that are selective for the adenosine A2A receptor and, simultaneously, have the necessary properties in terms of solubility and permeability across the blood-brain barrier to reach the site of action. The cornerstone of the framework is a recurrent neural network architecture, the Generator. It seeks to learn the building rules of valid molecules to sample new compounds further. Also, two Predictors are trained to estimate the properties of interest of the new molecules. Finally, the fine-tuning of the Generator was performed with reinforcement learning, integrated with multi-objective optimization and exploratory techniques to ensure that the Generator is adequately biased. RESULTS: The biased Generator can generate an interesting set of molecules, with approximately 85% having the two fundamental properties biased as desired. Thus, this approach has transformed a general molecule generator into a model focused on optimizing specific objectives. Furthermore, the molecules' synthesizability and drug-likeness demonstrate the potential applicability of the de novo drug design in medicinal chemistry. AVAILABILITY AND IMPLEMENTATION: All code is publicly available at https://github.com/larngroup/De-Novo-Drug-Design. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tiago Pereira
  - CSUC/DEI, University of Coimbra, Coimbra 3030-290, Portugal
  - IEETA/DETI, University of Aveiro, Aveiro 3810-193, Portugal
- Maryam Abbasi
  - CSUC/DEI, University of Coimbra, Coimbra 3030-290, Portugal
- Joel Arrais
  - CSUC/DEI, University of Coimbra, Coimbra 3030-290, Portugal
|
15
|
Steinmann C, Jensen JH. Using a genetic algorithm to find molecules with good docking scores. PeerJ Physical Chemistry 2021. [DOI: 10.7717/peerj-pchem.18] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/20/2022]
Abstract
A graph-based genetic algorithm (GA) is used to identify molecules (ligands) with high absolute docking scores as estimated by the Glide software package, starting from randomly chosen molecules from the ZINC database, for four different targets: Bacillus subtilis chorismate mutase (CM), human β2-adrenergic G protein-coupled receptor (β2AR), the DDR1 kinase domain (DDR1), and β-cyclodextrin (BCD). By combining functional group filters with a score modifier based on a heuristic synthetic accessibility (SA) score, our approach identifies between ca. 500 and 6,000 structurally diverse molecules with scores better than known binders by screening a total of 400,000 molecules starting from 8,000 randomly selected molecules from the ZINC database. Screening 250,000 molecules from the ZINC database identifies significantly more molecules with better docking scores than known binders, with the exception of CM, where the conventional screening approach only identifies 60 compounds compared to 511 with GA+Filter+SA. In the case of β2AR and DDR1, the GA+Filter+SA approach finds significantly more molecules with docking scores lower than −9.0 and −10.0. The GA+Filter+SA docking methodology is thus effective in generating a large and diverse set of synthetically accessible molecules with very good docking scores for a particular target. An early incarnation of the GA+Filter+SA approach was used to identify potential binders to the COVID-19 main protease, which were submitted to the early stages of the COVID Moonshot project, a crowd-sourced initiative to accelerate the development of a COVID antiviral.
Affiliation(s)
- Casper Steinmann
  - Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
- Jan H. Jensen
  - Department of Chemistry, University of Copenhagen, Copenhagen, Denmark
|
16
|
Pereira T, Abbasi M, Ribeiro B, Arrais JP. Diversity oriented Deep Reinforcement Learning for targeted molecule generation. J Cheminform 2021; 13:21. [PMID: 33750461 PMCID: PMC7944916 DOI: 10.1186/s13321-021-00498-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 11/13/2020] [Accepted: 02/22/2021] [Indexed: 11/10/2022]
Abstract
In this work, we explore the potential of deep learning to streamline the process of identifying new potential drugs through the computational generation of molecules with interesting biological properties. Two deep neural networks compose our targeted generation framework: the Generator, which is trained to learn the building rules of valid molecules using SMILES string notation, and the Predictor, which evaluates the newly generated compounds by predicting their affinity for the desired target. The Generator is then optimized through Reinforcement Learning to produce molecules with bespoke properties. The innovation of this approach is the exploratory strategy applied during the reinforcement training process, which seeks to add novelty to the generated compounds. This training strategy employs two Generators interchangeably to sample new SMILES: the initially trained model, which remains fixed, and a copy of it that is updated during training to uncover the most promising molecules. The evolution of the reward assigned by the Predictor determines how often each one is employed to select the next token of the molecule. This strategy establishes a compromise between the need to acquire more information about the chemical space and the need to sample new molecules with the experience gained so far. To demonstrate the effectiveness of the method, the Generator is trained to design molecules with an optimized partition coefficient and also high inhibitory power against the Adenosine A2A and κ opioid receptors. The results reveal that the model can effectively steer the newly generated molecules in the desired direction. More importantly, it was possible to find promising sets of unique and diverse molecules, which was the main purpose of the newly implemented strategy.
Affiliation(s)
- Tiago Pereira
  - Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
- Maryam Abbasi
  - Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
- Bernardete Ribeiro
  - Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
- Joel P. Arrais
  - Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
|
17
|
Mouchlis VD, Afantitis A, Serra A, Fratello M, Papadiamantis AG, Aidinis V, Lynch I, Greco D, Melagraki G. Advances in de Novo Drug Design: From Conventional to Machine Learning Methods. Int J Mol Sci 2021; 22:1676. [PMID: 33562347 PMCID: PMC7915729 DOI: 10.3390/ijms22041676] [Citation(s) in RCA: 96] [Impact Index Per Article: 32.0] [Received: 12/16/2020] [Revised: 01/31/2021] [Accepted: 01/31/2021] [Indexed: 12/11/2022]
Abstract
De novo drug design is a computational approach that generates novel molecular structures from atomic building blocks with no a priori relationships. Conventional methods include structure-based and ligand-based design, which depend on the properties of the active site of a biological target or its known active binders, respectively. Artificial intelligence, including machine learning, is an emerging field that has positively impacted the drug discovery process. Deep reinforcement learning is a subdivision of machine learning that combines artificial neural networks with reinforcement-learning architectures. This method has successfully been employed to develop novel de novo drug design approaches using a variety of artificial networks including recurrent neural networks, convolutional neural networks, generative adversarial networks, and autoencoders. This review article summarizes advances in de novo drug design, from conventional growth algorithms to advanced machine-learning methodologies and highlights hot topics for further development.
Affiliation(s)
- Antreas Afantitis
  - Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1046, Cyprus
- Angela Serra
  - Faculty of Medicine and Health Technology, Tampere University, 33520 Tampere, Finland
  - BioMEdiTech Institute, Tampere University, 33520 Tampere, Finland
- Michele Fratello
  - Faculty of Medicine and Health Technology, Tampere University, 33520 Tampere, Finland
  - BioMEdiTech Institute, Tampere University, 33520 Tampere, Finland
- Anastasios G. Papadiamantis
  - Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1046, Cyprus
  - School of Geography, Earth and Environmental Sciences, University of Birmingham, Birmingham B15 2TT, UK
- Vassilis Aidinis
  - Institute for Bioinnovation, Biomedical Sciences Research Center Alexander Fleming, Fleming 34, 16672 Athens, Greece
- Iseult Lynch
  - School of Geography, Earth and Environmental Sciences, University of Birmingham, Birmingham B15 2TT, UK
- Dario Greco
  - Faculty of Medicine and Health Technology, Tampere University, 33520 Tampere, Finland
  - BioMEdiTech Institute, Tampere University, 33520 Tampere, Finland
  - Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland
  - Finnish Center for Alternative Methods (FICAM), Tampere University, 33520 Tampere, Finland
- Georgia Melagraki
  - Division of Physical Sciences & Applications, Hellenic Military Academy, 16672 Vari, Greece
|
18
|
Liu X, IJzerman AP, van Westen GJP. Computational Approaches for De Novo Drug Design: Past, Present, and Future. Methods Mol Biol 2021; 2190:139-165. [PMID: 32804364 DOI: 10.1007/978-1-0716-0826-5_6] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/13/2022]
Abstract
Drug discovery is time- and resource-consuming. Computational approaches applied in de novo drug design therefore play an important role in improving the efficiency and decreasing the cost of developing novel drugs. Over several decades, a variety of methods have been proposed and applied in practice. Traditionally, drug design problems have been treated as combinatorial optimization in discrete chemical space, and optimization methods have been used to search for new drug molecules that meet multiple objectives. With the accumulation of data and the development of machine learning methods, computational drug design has gradually shifted to a new paradigm, with particular interest in the potential application of deep learning. In this chapter, we give a brief description of these two kinds of de novo methods, compare their application scopes and discuss their possible future development.
Affiliation(s)
- Xuhan Liu
  - Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
- Adriaan P IJzerman
  - Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
- Gerard J P van Westen
  - Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
|
19
|
Lambrinidis G, Tsantili-Kakoulidou A. Multi-objective optimization methods in novel drug design. Expert Opin Drug Discov 2020; 16:647-658. [PMID: 33353441 DOI: 10.1080/17460441.2021.1867095] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/23/2022]
Abstract
Introduction: In multi-objective drug design, optimization is gaining importance and has become a discipline that attracts its own research. Current strategies are broadly classified into single-objective optimization (SOO) and multi-objective optimization (MOO). Areas covered: Starting with SOO and the ways used to incorporate multiple criteria into it, the present review focuses on MOO techniques, their comparison, advantages, and restrictions. Pareto analysis and the concept of dominance stand at the core of MOO. The Pareto front, Pareto ranking, and the limitations of Pareto-based methods due to high dimensionality and data uncertainty are outlined. Desirability functions and weighted-sum approaches are described as stand-alone techniques that transform the MOO problem into SOO, or in combination with Pareto analysis and evolutionary algorithms. Representative applications in different drug research areas are also discussed. Expert opinion: Despite their limitations, combined MOO techniques, whether complementary to SOO or in conjunction with artificial intelligence, contribute dramatically to efficient drug design, assisting decisions and increasing success probabilities. For multi-target drug design, optimization is supported by network approaches, while the applicability of MOO to other fields such as drug technology or biological complexity opens new perspectives in the interrelated fields of medicinal chemistry and molecular biology.
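The two scalarisation techniques named in this abstract, desirability functions and weighted sums, can be illustrated with a minimal Python sketch. It assumes a Derringer-Suich style "larger is better" desirability and a geometric-mean aggregate, which is one common formulation and not necessarily the one the review discusses; the function names, thresholds and property values are hypothetical.

```python
import math
from typing import Sequence


def desirability_larger_is_better(y: float, low: float, target: float,
                                  shape: float = 1.0) -> float:
    """Map a raw property value onto [0, 1]: 0 at or below `low`, 1 at or above `target`."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** shape


def overall_desirability(ds: Sequence[float]) -> float:
    """Geometric mean of individual desirabilities; any zero makes the total zero."""
    if any(d == 0.0 for d in ds):
        return 0.0
    return math.exp(sum(math.log(d) for d in ds) / len(ds))


def weighted_sum(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted-sum scalarisation of already normalised objective values."""
    return sum(w * v for w, v in zip(weights, values))


if __name__ == "__main__":
    # Hypothetical compound: potency-like value 7.5 (scale 5 to 9), solubility-like value 0.8 (scale 0 to 1)
    d_potency = desirability_larger_is_better(7.5, low=5.0, target=9.0)
    d_solubility = desirability_larger_is_better(0.8, low=0.0, target=1.0)
    print(overall_desirability([d_potency, d_solubility]))
    print(weighted_sum([d_potency, d_solubility], weights=[0.7, 0.3]))
```

The geometric mean penalises a compound that fails any single criterion, whereas the weighted sum lets a strong objective compensate for a weak one; which behaviour is preferable depends on the design problem.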
Affiliation(s)
- George Lambrinidis
  - Division of Pharmaceutical Chemistry, Department of Pharmacy, National and Kapodistrian University of Athens, Panepistimiopolis, Zografou, Athens, Greece
- Anna Tsantili-Kakoulidou
  - Division of Pharmaceutical Chemistry, Department of Pharmacy, National and Kapodistrian University of Athens, Panepistimiopolis, Zografou, Athens, Greece
|
20
|
Yasonik J. Multiobjective de novo drug design with recurrent neural networks and nondominated sorting. J Cheminform 2020; 12:14. [PMID: 33430996 PMCID: PMC7026957 DOI: 10.1186/s13321-020-00419-6] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 07/15/2019] [Accepted: 02/10/2020] [Indexed: 01/28/2023]
Abstract
Research productivity in the pharmaceutical industry has declined significantly in recent decades, with higher costs, longer timelines, and lower success rates of drug candidates in clinical trials. This has prioritized the scalability and multiobjectivity of drug discovery and design. De novo drug design has emerged as a promising approach; molecules are generated from scratch, thus reducing the reliance on trial and error and premade molecular repositories. However, optimizing for molecular traits remains challenging, impeding the implementation of de novo methods. In this work, we propose a de novo approach capable of optimizing multiple traits collectively. A recurrent neural network was used to generate molecules which were then ranked based on multiple properties by a nondominated sorting algorithm. The best of the molecules generated were selected and used to fine-tune the recurrent neural network through transfer learning, creating a cycle that mimics the traditional design–synthesis–test cycle. We demonstrate the efficacy of this approach through a proof of concept, optimizing for constraints on molecular weight, octanol-water partition coefficient, the number of rotatable bonds, hydrogen bond donors, and hydrogen bond acceptors simultaneously. Analysis of the molecules generated after five iterations of the cycle revealed a 14-fold improvement in the quality of generated molecules, along with improvements to the accuracy of the recurrent neural network and the structural diversity of the molecules generated. This cycle notably does not require large amounts of training data nor any handwritten scoring functions. Altogether, this approach uniquely combines scalable generation with multiobjective optimization of molecules.
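The ranking step described in this abstract, nondominated sorting of generated molecules by several properties, can be sketched in a few lines of Python. This is a simple quadratic-time illustration of the general technique, not the paper's implementation; it assumes every objective is expressed as a quantity to minimise (for example, a deviation from a property constraint), and the example scores are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    """Split point indices into successive fronts; front 0 is the Pareto front."""
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        # A point belongs to the current front if no other remaining point dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


if __name__ == "__main__":
    # Hypothetical (objective 1, objective 2) deviations for five generated molecules
    scores = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0), (4.0, 4.0)]
    print(nondominated_sort(scores))  # [[0, 1, 3], [2], [4]]
```

In a generate-rank-fine-tune cycle of the kind the abstract describes, the molecules in the first front (or first few fronts) would be the ones kept for the next round of training.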
|
21
|
Hsu HH, Huang CH, Lin ST. New Data Structure for Computational Molecular Design with Atomic or Fragment Resolution. J Chem Inf Model 2019; 59:3703-3713. [PMID: 31393721 DOI: 10.1021/acs.jcim.9b00478] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
Abstract
A new molecular data structure and molecular structure operation algorithms are proposed for general purpose molecular design. The data structure allows for a variety of molecular operations for creating new molecules. Two types of molecular operations were developed, unimolecular and bimolecular operations. In unimolecular operations, a child molecule can be created from a parent via addition of a functional group, deletion of a fragment, mutation of an atom, etc. In bimolecular operations, children molecules are generated from two parent molecules through combination or crossover (hybridization). These molecular operations are essential for the creation and modification of molecules for the purpose of molecular design. The data structure is capable of representing linear, branched, multifunctional, and multivalent compounds. Algorithms are developed for deriving the molecular data structure of a molecule from its atomic coordinates and vice versa. We show that this new molecular data structure and the developed algorithms, referred to as Molecular Assembling and Representation Suite, allow one to generate a comprehensive library of new molecules via performing every possible molecular structure modification.
Affiliation(s)
- Hsuan-Hao Hsu
  - Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan
- Chen-Hsuan Huang
  - Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan
- Shiang-Tai Lin
  - Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan
|
22
|
Cardoso-Silva J, Papageorgiou LG, Tsoka S. Network-based piecewise linear regression for QSAR modelling. J Comput Aided Mol Des 2019; 33:831-844. [PMID: 31628660 PMCID: PMC6825651 DOI: 10.1007/s10822-019-00228-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 03/25/2019] [Accepted: 09/28/2019] [Indexed: 02/07/2023]
Abstract
Quantitative Structure-Activity Relationship (QSAR) models are critical in various areas of drug discovery, for example in lead optimisation and virtual screening. Recently, the need for models that are not only predictive but also interpretable has been highlighted. In this paper, a new methodology is proposed to build interpretable QSAR models by combining elements of network analysis and piecewise linear regression. The algorithm presented, modSAR, splits data using a two-step procedure. First, compounds associated with a common target are represented as a network in terms of their structural similarity, revealing modules of similar chemical properties. Second, each module is subdivided into subsets (regions), each of which is modelled by an independent linear equation. A comparative analysis of QSAR models across five data sets of protein inhibitors obtained from ChEMBL is reported, and it is shown that modSAR offers predictive accuracy similar to that of popular algorithms such as Random Forest and Support Vector Machine. Moreover, we show that models built by modSAR are interpretable, capable of evaluating the applicability domain of the compounds, and well suited to tasks such as virtual screening and the development of new drug leads.
Affiliation(s)
- Jonathan Cardoso-Silva
  - Department of Informatics, Faculty of Natural and Mathematical Sciences, King's College London, Bush House, 30 Aldwych, London, WC2B 4BG, UK
- Lazaros G Papageorgiou
  - Centre for Process Systems Engineering, Department of Chemical Engineering, University College London, Roberts Building, Torrington Place, London, WC1E 7JE, UK
- Sophia Tsoka
  - Department of Informatics, Faculty of Natural and Mathematical Sciences, King's College London, Bush House, 30 Aldwych, London, WC2B 4BG, UK
|
23
|
Chu Y, He X. MoleGear: A Java-Based Platform for Evolutionary De Novo Molecular Design. Molecules 2019; 24:E1444. [PMID: 30979097 PMCID: PMC6479339 DOI: 10.3390/molecules24071444] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/25/2019] [Revised: 04/03/2019] [Accepted: 04/10/2019] [Indexed: 11/17/2022]
Abstract
A Java-based platform, MoleGear, has been developed for de novo molecular design based on the Chemistry Development Kit (CDK) and other Java packages. MoleGear uses an evolutionary algorithm (EA) to explore chemical space, together with a suite of fragment-based growing, crossover, and mutation operators for assembling novel molecules, which can be scored by predicted binding free energy or by a weighted-sum multi-objective fitness function. The EA can be run in parallel over multiple nodes to support large-scale molecular optimizations. Complementary utilities such as fragment library design, chemical space analysis, and a graphical user interface are also integrated into MoleGear. Candidate inhibitors of the human immunodeficiency virus 1 (HIV-1) protease were designed with MoleGear, demonstrating its potential for de novo molecular design.
Affiliation(s)
- Yunhan Chu
  - Department of Chemical Engineering, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
- Xuezhong He
  - Department of Chemical Engineering, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
|
24
|
Brown N, Fiscato M, Segler MHS, Vaucher AC. GuacaMol: Benchmarking Models for de Novo Molecular Design. J Chem Inf Model 2019; 59:1096-1108. [PMID: 30887799 DOI: 10.1021/acs.jcim.8b00839] [Citation(s) in RCA: 293] [Impact Index Per Article: 58.6] [Indexed: 12/20/2022]
Abstract
De novo design seeks to generate molecules with required property profiles by virtual design-make-test cycles. With the emergence of deep learning and neural generative models in many application areas, models for molecular design based on neural networks have appeared recently and show promising results. However, the new models have not been profiled on consistent tasks, and comparative studies against well-established algorithms have seldom been performed. To standardize the assessment of both classical and neural models for de novo molecular design, we propose an evaluation framework, GuacaMol, based on a suite of standardized benchmarks. The benchmark tasks encompass measuring the fidelity of the models in reproducing the property distribution of the training sets, the ability to generate novel molecules, the exploration and exploitation of chemical space, and a variety of single and multiobjective optimization tasks. The open-source Python benchmarking code and a leaderboard can be found at https://benevolent.ai/guacamol.
Affiliation(s)
- Nathan Brown
  - BenevolentAI, 4-8 Maple Street, W1T 5HD London, U.K.
- Marco Fiscato
  - BenevolentAI, 4-8 Maple Street, W1T 5HD London, U.K.
|
25
|
Lambrinidis G, Tsantili-Kakoulidou A. Challenges with multi-objective QSAR in drug discovery. Expert Opin Drug Discov 2018; 13:851-859. [DOI: 10.1080/17460441.2018.1496079] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 10/28/2022]
Affiliation(s)
- George Lambrinidis
  - Division of Pharmaceutical Chemistry, Department of Pharmacy, National and Kapodistrian University of Athens, Zografou, Athens, Greece
- Anna Tsantili-Kakoulidou
  - Division of Pharmaceutical Chemistry, Department of Pharmacy, National and Kapodistrian University of Athens, Zografou, Athens, Greece
|
26
|
Suryanarayanan V, Panwar U, Chandra I, Singh SK. De Novo Design of Ligands Using Computational Methods. Methods Mol Biol 2018; 1762:71-86. [PMID: 29594768 DOI: 10.1007/978-1-4939-7756-7_5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
The de novo design technique is complementary to high-throughput virtual screening and is believed to contribute to the pharmaceutical development of novel drugs with desired properties in a low-cost and time-efficient manner. In this chapter, we outline the basic concepts of de novo design based on computational methods, with an example.
Affiliation(s)
- Venkatesan Suryanarayanan
  - Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
- Umesh Panwar
  - Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
- Ishwar Chandra
  - Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
- Sanjeev Kumar Singh
  - Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
|
27
|
Allen WJ, Fochtman BC, Balius TE, Rizzo RC. Customizable de novo design strategies for DOCK: Application to HIVgp41 and other therapeutic targets. J Comput Chem 2017; 38:2641-2663. [PMID: 28940386 DOI: 10.1002/jcc.25052] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 05/02/2017] [Accepted: 08/03/2017] [Indexed: 12/12/2022]
Abstract
De novo design can be used to explore vast areas of chemical space in computational lead discovery. As a complement to virtual screening, from-scratch construction of molecules is not limited to compounds in pre-existing vendor catalogs. Here, we present an iterative fragment growth method, integrated into the program DOCK, in which new molecules are built using rules for allowable connections based on known molecules. The method leverages DOCK's advanced scoring and pruning approaches and users can define very specific criteria in terms of properties or features to customize growth toward a particular region of chemical space. The code was validated using three increasingly difficult classes of calculations: (1) Rebuilding known X-ray ligands taken from 663 complexes using only their component parts (focused libraries), (2) construction of new ligands in 57 drug target sites using a library derived from ∼13M drug-like compounds (generic libraries), and (3) application to a challenging protein-protein interface on the viral drug target HIVgp41. The computational testing confirms that the de novo DOCK routines are robust and working as envisioned, and the compelling results highlight the potential utility for designing new molecules against a wide variety of important protein targets. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- William J Allen
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, 11794
- Brian C Fochtman
  - Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, New York, 11794
- Trent E Balius
  - Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, 94158
- Robert C Rizzo
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, 11794
  - Institute of Chemical Biology and Drug Discovery, Stony Brook University, Stony Brook, New York, 11794
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, 11794
|
28
|
Bayesian molecular design with a chemical language model. J Comput Aided Mol Des 2017; 31:379-391. [PMID: 28281211 PMCID: PMC5393296 DOI: 10.1007/s10822-016-0008-z] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Received: 10/25/2016] [Accepted: 12/31/2016] [Indexed: 11/05/2022]
Abstract
The aim of computational molecular design is the identification of promising hypothetical molecules with a predefined set of desired properties. We address the issue of accelerating material discovery with state-of-the-art machine learning techniques. The method involves two different types of prediction: the forward and backward predictions. The objective of the forward prediction is to create a set of machine learning models for various properties of a given molecule. Inverting the trained forward models through Bayes' law, we derive a posterior distribution for the backward prediction, which is conditioned on a desired property requirement. By exploring high-probability regions of the posterior with a sequential Monte Carlo technique, molecules that exhibit the desired properties can be created computationally. One major difficulty in the computational creation of molecules is excluding chemically unfavorable structures. To circumvent this issue, we derive a chemical language model that acquires commonly occurring patterns of chemical fragments through natural language processing of ASCII strings of existing compounds, which follow the SMILES chemical language notation. In the backward prediction, the trained language model is used to refine chemical strings such that the properties of the resulting structures fall within the desired property region while chemically unfavorable structures are successfully removed. The present method is demonstrated through the design of small organic molecules with property requirements on the HOMO-LUMO gap and internal energy. The R package iqspr is available at the CRAN repository.
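The inversion through Bayes' law mentioned in this abstract can be written compactly. The notation below is an assumption made for illustration and is not taken from the abstract: S denotes a molecule (a SMILES string), Y its property vector, and U the desired property region; the likelihood is supplied by the trained forward models and the prior by the chemical language model.

```latex
% Assumed notation: S = molecule (SMILES string), Y = property vector,
% U = desired property region.
% Forward models give the likelihood p(Y \in U \mid S);
% the chemical language model gives the prior p(S).
p(S \mid Y \in U) \;\propto\; p(Y \in U \mid S)\, p(S)
```

Sampling from the left-hand side, for example with the sequential Monte Carlo technique the abstract names, favours strings that both satisfy the property requirement and look like plausible chemistry under the language model.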
|
29
|
Daeyaert F, Deem MW. A Pareto Algorithm for Efficient De Novo Design of Multi-functional Molecules. Mol Inform 2016; 36. [PMID: 28124835 DOI: 10.1002/minf.201600044] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 03/30/2016] [Accepted: 07/06/2016] [Indexed: 12/19/2022]
Abstract
We have introduced a Pareto sorting algorithm into Synopsis, a de novo design program that generates synthesizable molecules with desirable properties. We give a detailed description of the algorithm and illustrate how it works in two different de novo design settings: the design of putative dual and selective FGFR and VEGFR inhibitors, and the successful design of organic structure determining agents (OSDAs) for the synthesis of zeolites. We show that the introduction of Pareto sorting not only enables the simultaneous optimization of multiple properties but also greatly improves the ability of the algorithm to generate molecules with hard-to-meet constraints. This in turn allows us to suggest approaches to address the problem of false positive hits in de novo structure-based drug design by introducing structural and physicochemical constraints in the designed molecules, and by forcing essential interactions between these molecules and their target receptor.
Affiliation(s)
- Frits Daeyaert
  - FD Computing, Stijn Streuvelsstraat 64, 2340, Beerse, Belgium
  - Department of Bioengineering, Rice University, 6100 Main Street, Houston, TX, USA
- Michael W Deem
  - Department of Bioengineering, Rice University, 6100 Main Street, Houston, TX, USA
|
30
|
Miyao T, Kaneko H, Funatsu K. Inverse QSPR/QSAR Analysis for Chemical Structure Generation (from y to x). J Chem Inf Model 2016; 56:286-99. [PMID: 26818135 DOI: 10.1021/acs.jcim.5b00628] [Citation(s) in RCA: 66] [Impact Index Per Article: 8.3] [Indexed: 11/28/2022]
Abstract
Retrieving descriptor information (x information) from a value of an objective variable (y) is a fundamental problem in inverse quantitative structure-property relationship (inverse-QSPR) analysis but challenging because of the complexity of the preimage function. Herewith, we propose using a cluster-wise multiple linear regression (cMLR) model as a QSPR model for inverse-QSPR analysis. x information is acquired as a probability density function by combining cMLR and the prior distribution modeled with a mixture of Gaussians (GMMs). Three case studies were conducted to demonstrate various aspects of the potential of cMLR. It was found that the predictive power of cMLR was superior to that of MLR, especially for data with nonlinearity. Moreover, it turned out that the applicability domain could be considered since the posterior distribution inherits the prior distribution's feature (i.e., training data feature) and represents the possibility of having the desired property. Finally, a series of inverse analyses with the GMMs/cMLR was demonstrated with the aim to generate de novo structures having specific aqueous solubility.
Affiliation(s)
- Tomoyuki Miyao
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Hiromasa Kaneko
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Kimito Funatsu
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
31
Klammer M, Dybowski JN, Hoffmann D, Schaab C. Pareto Optimization Identifies Diverse Set of Phosphorylation Signatures Predicting Response to Treatment with Dasatinib. PLoS One 2015; 10:e0128542. [PMID: 26083411 PMCID: PMC4470654 DOI: 10.1371/journal.pone.0128542] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/05/2015] [Accepted: 04/26/2015] [Indexed: 01/17/2023] [Open access]
Abstract
Multivariate biomarkers that can predict the effectiveness of targeted therapy in individual patients are highly desired. Previous biomarker discovery studies have largely focused on the identification of single biomarker signatures, aimed at maximizing prediction accuracy. Here, we present a different approach that identifies multiple biomarkers by simultaneously optimizing their predictive power, number of features, and proximity to the drug target in a protein-protein interaction network. To this end, we incorporated NSGA-II, a fast and elitist multi-objective optimization algorithm that is based on the principle of Pareto optimality, into the biomarker discovery workflow. The method was applied to quantitative phosphoproteome data of 19 non-small cell lung cancer (NSCLC) cell lines from a previous biomarker study. The algorithm successfully identified a total of 77 candidate biomarker signatures predicting response to treatment with dasatinib. Through filtering and similarity clustering, this set was trimmed to four final biomarker signatures, which then were validated on an independent set of breast cancer cell lines. All four candidates reached the same good prediction accuracy (83%) as the originally published biomarker. Although the newly discovered signatures were diverse in their composition and in their size, the central protein of the originally published signature — integrin β4 (ITGB4) — was also present in all four Pareto signatures, confirming its pivotal role in predicting dasatinib response in NSCLC cell lines. In summary, the method presented here allows for a robust and simultaneous identification of multiple multivariate biomarkers that are optimized for prediction performance, size, and relevance.
Affiliation(s)
- Martin Klammer
- Evotec (München) GmbH, Dept. of Bioinformatics, Am Klopferspitz 19a, 82152 Martinsried, Germany
- J. Nikolaj Dybowski
- Evotec (München) GmbH, Dept. of Bioinformatics, Am Klopferspitz 19a, 82152 Martinsried, Germany
- Daniel Hoffmann
- Center for Medical Biotechnology, University of Duisburg-Essen, Universitätsstrasse 1-4, 45141 Essen, Germany
- Christoph Schaab
- Evotec (München) GmbH, Dept. of Bioinformatics, Am Klopferspitz 19a, 82152 Martinsried, Germany
- Max-Planck Institute for Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
32
Le TC, Winkler DA. A Bright Future for Evolutionary Methods in Drug Design. ChemMedChem 2015; 10:1296-300. [DOI: 10.1002/cmdc.201500161] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 04/12/2015] [Revised: 05/01/2015] [Indexed: 11/12/2022]
33
Firth NC, Atrash B, Brown N, Blagg J. MOARF, an Integrated Workflow for Multiobjective Optimization: Implementation, Synthesis, and Biological Evaluation. J Chem Inf Model 2015; 55:1169-80. [DOI: 10.1021/acs.jcim.5b00073] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Indexed: 11/30/2022]
Affiliation(s)
- Nicholas C. Firth
- Cancer Research UK Cancer Therapeutics Unit, Division of Cancer Therapeutics, The Institute of Cancer Research, London, SM2 5NG, U.K.
- Butrus Atrash
- Cancer Research UK Cancer Therapeutics Unit, Division of Cancer Therapeutics, The Institute of Cancer Research, London, SM2 5NG, U.K.
- Nathan Brown
- Cancer Research UK Cancer Therapeutics Unit, Division of Cancer Therapeutics, The Institute of Cancer Research, London, SM2 5NG, U.K.
- Julian Blagg
- Cancer Research UK Cancer Therapeutics Unit, Division of Cancer Therapeutics, The Institute of Cancer Research, London, SM2 5NG, U.K.
34
Devi RV, Sathya SS, Coumar MS. Evolutionary algorithms for de novo drug design – A survey. Appl Soft Comput 2015. [DOI: 10.1016/j.asoc.2014.09.042] [Citation(s) in RCA: 76] [Impact Index Per Article: 8.4] [Indexed: 11/29/2022]
35
Mishima K, Kaneko H, Funatsu K. Development of a New De Novo Design Algorithm for Exploring Chemical Space. Mol Inform 2014; 33:779-89. [PMID: 27485424 DOI: 10.1002/minf.201400056] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 04/19/2014] [Accepted: 07/29/2014] [Indexed: 01/10/2023]
Abstract
In the first stage of development of new drugs, various lead compounds with high activity are required. To design such compounds, we focus on chemical space defined by structural descriptors. New compounds close to areas where highly active compounds exist will show the same degree of activity. We have developed a new de novo design system to search a target area in chemical space. First, highly active compounds are manually selected as initial seeds. Then, the seeds are entered into our system, and structures slightly different from the seeds are generated and pooled. Next, seeds are selected from the new structure pool based on the distance from target coordinates on the map. To test the algorithm, we used two datasets of ligand binding affinity and showed that the proposed generator could produce diverse virtual compounds that had high activity in docking simulations.
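The seed, generate, pool, and select cycle described in this abstract can be sketched as a simple loop. This is a toy under loud assumptions, not the authors' system: molecules are stood in for by 2D coordinates on a map, "slightly different structures" are random perturbations, the seeds are kept in the pool so the best distance never gets worse, and all names (`perturb`, `search`, `n_children`, `n_keep`) are hypothetical.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]  # stand-in for a molecule's map coordinates


def perturb(p: Point, rng: random.Random, step: float = 0.5) -> Point:
    """Return a slightly different point (placeholder for a structure edit)."""
    return (p[0] + rng.uniform(-step, step), p[1] + rng.uniform(-step, step))


def search(seeds: List[Point], target: Point, n_rounds: int = 20,
           n_children: int = 10, n_keep: int = 3, rng_seed: int = 0) -> List[Point]:
    """Repeatedly generate variants of the seeds, pool them, and reselect
    the pool members closest to the target coordinates as the next seeds."""
    rng = random.Random(rng_seed)
    current = list(seeds)
    for _ in range(n_rounds):
        pool = [perturb(s, rng) for s in current for _ in range(n_children)]
        pool.extend(current)  # assumption: seeds stay eligible for reselection
        pool.sort(key=lambda p: math.dist(p, target))
        current = pool[:n_keep]
    return current


start = [(0.0, 0.0)]
goal = (3.0, 4.0)
result = search(start, goal)
```

In a real de novo design setting the perturbation step would be a chemically valid structure modification and the coordinates would come from a descriptor-based projection, both of which are outside the scope of this sketch.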
Affiliation(s)
- Kazuaki Mishima
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan tel:(+81) 03-5841-7751
- Hiromasa Kaneko
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan tel:(+81) 03-5841-7751
- Kimito Funatsu
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan tel:(+81) 03-5841-7751
36
Abstract
INTRODUCTION: A high-quality drug must achieve a balance of physicochemical and absorption, distribution, metabolism and elimination properties, safety and potency against its therapeutic target(s). Multiparameter optimization (MPO) methods guide the simultaneous optimization of multiple factors to quickly target compounds with the highest chance of downstream success. MPO can be combined with 'de novo design' methods to automatically generate and assess a large number of diverse structures and identify strategies to optimize a compound's overall balance of properties. AREAS COVERED: The article provides a review of MPO methods and recent developments in the methods and opinions in the field. It also provides a description of advances in de novo design that improve the relevance of automatically generated compound structures and integrate MPO. Finally, the article provides discussion of a recent case study of the automatic design of ligands to polypharmacological profiles. EXPERT OPINION: Recent developments have reduced the generation of chemically infeasible structures and improved the quality of compounds generated by de novo design methods. There are concerns about the ability of simple drug-like properties and ligand efficiency indices to effectively guide the detailed optimization of compounds. De novo design methods cannot identify a perfect compound for synthesis, but they can identify high-quality ideas for detailed consideration by an expert scientist.
Affiliation(s)
- Matthew Segall
- Optibrium Ltd, 7221 Cambridge Research Park, Beach Drive, Cambridge, CB25 9TL, UK; +44 1223 815902; +44 1223 815907
37
Multi-objective optimization methods in drug design. DRUG DISCOVERY TODAY. TECHNOLOGIES 2014; 10:e427-35. [PMID: 24050140 DOI: 10.1016/j.ddtec.2013.02.001] [Citation(s) in RCA: 95] [Impact Index Per Article: 9.5] [Indexed: 11/24/2022]
Abstract
Drug discovery is a challenging multi-objective problem where numerous pharmaceutically important objectives need to be adequately satisfied for a solution to be found. The problem is characterized by vast, complex solution spaces further complicated by the presence of conflicting objectives. Multi-objective optimization methods, designed specifically to address such problems, were introduced to the drug discovery field over a decade ago and have steadily gained acceptance ever since. This paper reviews the latest multi-objective methods and applications reported in the literature, specifically in quantitative structure–activity modeling, docking, de novo design and library design. Further, the paper reports on related developments in drug discovery research and advances in the multi-objective optimization field.
38
Beccari AR, Cavazzoni C, Beato C, Costantino G. LiGen: a high performance workflow for chemistry driven de novo design. J Chem Inf Model 2013; 53:1518-27. [PMID: 23617275 DOI: 10.1021/ci400078g] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Indexed: 01/22/2023]
Abstract
Tools for molecular de novo design are actively sought incorporating sets of chemical rules for fast and efficient identification of structurally new chemotypes endowed with a desired set of biological properties. In this paper, we present LiGen, a suite of programs which can be used sequentially or as stand-alone tools for specific purposes. In its standard application, LiGen modules are used to define input constraints, either structure-based, through active site identification, or ligand-based, through pharmacophore definition, to docking and to de novo generation. Alternatively, individual modules can be combined in a user-defined manner to generate project-centric workflows. Specific features of LiGen are the use of a pharmacophore-based docking procedure which allows flexible docking without conformer enumeration and accurate and flexible reactant mapping coupled with reactant tagging through substructure searching. The full description of LiGen functionalities is presented.
Affiliation(s)
- Andrea R Beccari
- Dompé R&D Centre, Dompé SpA, Via Campo di Pile, 67100 L'Aquila, Italy.
39
Affiliation(s)
- Peter Willett
- Information School, University of Sheffield, 211 Portobello Street, Sheffield S1 4DP, United Kingdom.
40
Besnard J, Ruda GF, Setola V, Abecassis K, Rodriguiz RM, Huang XP, Norval S, Sassano MF, Shin AI, Webster LA, Simeons FRC, Stojanovski L, Prat A, Seidah NG, Constam DB, Bickerton GR, Read KD, Wetsel WC, Gilbert IH, Roth BL, Hopkins AL. Automated design of ligands to polypharmacological profiles. Nature 2012; 492:215-20. [PMID: 23235874 PMCID: PMC3653568 DOI: 10.1038/nature11691] [Citation(s) in RCA: 598] [Impact Index Per Article: 49.8] [Received: 04/01/2011] [Accepted: 10/19/2012] [Indexed: 12/22/2022]
Abstract
The clinical efficacy and safety of a drug is determined by its activity profile across many proteins in the proteome. However, designing drugs with a specific multi-target profile is both complex and difficult. Therefore methods to design drugs rationally a priori against profiles of several proteins would have immense value in drug discovery. Here we describe a new approach for the automated design of ligands against profiles of multiple drug targets. The method is demonstrated by the evolution of an approved acetylcholinesterase inhibitor drug into brain-penetrable ligands with either specific polypharmacology or exquisite selectivity profiles for G-protein-coupled receptors. Overall, 800 ligand-target predictions of prospectively designed ligands were tested experimentally, of which 75% were confirmed to be correct. We also demonstrate target engagement in vivo. The approach can be a useful source of drug leads when multi-target profiles are required to achieve either selectivity over other drug targets or a desired polypharmacology.
Affiliation(s)
- Jérémy Besnard
- Division of Biological Chemistry and Drug Discovery, College of Life Sciences, University of Dundee, Dundee DD1 5EH, UK
41
Pirard B. The quest for novel chemical matter and the contribution of computer-aided de novo design. Expert Opin Drug Discov 2012; 6:225-31. [PMID: 22647201 DOI: 10.1517/17460441.2011.554394] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/05/2022]
Abstract
Identifying novel chemical matter is the focus of many drug discovery efforts. Through these efforts, computer-based de novo design of drug-like molecules, which aims to build an entire molecule 'from scratch', has emerged as a valuable approach to identify novel chemical matter. In this paper, the author discusses the recent research efforts that aim to build, in silico, more chemically accessible molecules, sample chemical space more efficiently and rank the proposed molecules. The author reviews de novo design algorithms developed between 2008 and 2010 and the issue of validation, and highlights some recent successful applications of de novo design to drug discovery projects. Although research has addressed the lack of synthetic accessibility of the molecules proposed by the first generation of de novo design tools, the lack of accurate scoring functions remains a major limitation of structure-based de novo design. However, de novo design is a valuable approach to generate either chemical starting points or ideas.
Affiliation(s)
- Bernard Pirard
- Novartis Institute for BioMedical Research, Global Discovery Chemistry, Computer-Aided Drug Discovery, Forum 1, CH-4002 Basel, Switzerland; +41 61 32 45 620
42
Sengupta S, Bandyopadhyay S. De novo design of potential RecA inhibitors using multi objective optimization. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2012; 9:1139-1154. [PMID: 22392725 DOI: 10.1109/tcbb.2012.35] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 05/31/2023]
Abstract
De novo ligand design involves optimization of several ligand properties such as binding affinity, ligand volume, drug likeness, etc. Therefore, optimization of these properties independently and simultaneously seems appropriate. In this paper, the ligand design problem is modeled in a multiobjective framework using Archived MultiObjective Simulated Annealing (AMOSA) as the underlying search algorithm. The multiple objectives considered are the energy components, similarity to a known inhibitor, and a novel drug likeliness measure based on Lipinski's rule of five. RecA protein of Mycobacterium tuberculosis, causative agent of tuberculosis, is taken as the target for the drug design. To gauge the goodness of the results, they are compared to the outputs of LigBuilder, NEWLEAD, and Variable genetic algorithm (VGA). The same problem has also been modeled using a well-established genetic algorithm-based multiobjective optimization technique, Nondominated Sorting Genetic Algorithm-II (NSGA-II), to find the efficacy of AMOSA through comparative analysis. Results demonstrate that while some small molecules designed by the proposed approach are remarkably similar to the known inhibitors of RecA, some new ones are discovered that may be potential candidates for novel lead molecules against tuberculosis.
Affiliation(s)
- Soumi Sengupta
- Machine Intelligence Unit, Indian Statistical Institute, 203 B.T. Road, Kolkata 700108, India.
43
van der Horst E, Marqués-Gallego P, Mulder-Krieger T, van Veldhoven J, Kruisselbrink J, Aleman A, Emmerich MTM, Brussee J, Bender A, IJzerman AP. Multi-Objective Evolutionary Design of Adenosine Receptor Ligands. J Chem Inf Model 2012; 52:1713-21. [DOI: 10.1021/ci2005115] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Affiliation(s)
- Eelke van der Horst
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, P.O. Box 9502, 2300 RA Leiden, The Netherlands
- Patricia Marqués-Gallego
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, P.O. Box 9502, 2300 RA Leiden, The Netherlands
- Thea Mulder-Krieger
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, P.O. Box 9502, 2300 RA Leiden, The Netherlands
- Jacobus van Veldhoven
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, P.O. Box 9502, 2300 RA Leiden, The Netherlands
- Johannes Kruisselbrink
- Leiden Institute of Advanced Computer Science, Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands
- Alexander Aleman
- Leiden Institute of Advanced Computer Science, Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands
- Michael T. M. Emmerich
- Leiden Institute of Advanced Computer Science, Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands
- Johannes Brussee
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, P.O. Box 9502, 2300 RA Leiden, The Netherlands
- Andreas Bender
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, P.O. Box 9502, 2300 RA Leiden, The Netherlands
- Adriaan P. IJzerman
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, P.O. Box 9502, 2300 RA Leiden, The Netherlands
44
Reymond JL, Ruddigkeit L, Blum L, van Deursen R. The enumeration of chemical space. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2012. [DOI: 10.1002/wcms.1104] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.2] [Indexed: 01/28/2023]
45
Sheng C, Zhang W. Fragment Informatics and Computational Fragment-Based Drug Design: An Overview and Update. Med Res Rev 2012; 33:554-98. [DOI: 10.1002/med.21255] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Indexed: 12/18/2022]
Affiliation(s)
- Chunquan Sheng
- Department of Medicinal Chemistry; School of Pharmacy; Second Military Medical University; 325 Guohe Road Shanghai 200433 People's Republic of China
- Wannian Zhang
- Department of Medicinal Chemistry; School of Pharmacy; Second Military Medical University; 325 Guohe Road Shanghai 200433 People's Republic of China
46
Filz OA, Lagunin AA, Filimonov DA, Poroikov VV. In silico fragment-based drug design using a PASS approach. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2012; 23:279-296. [PMID: 22372682 DOI: 10.1080/1062936x.2012.657238] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/31/2023]
Abstract
Fragment-based drug design integrates different methods to create novel ligands using fragment libraries focused on particular biological activities. Experimental approaches to the preparation of fragment libraries have some drawbacks caused by the need for target crystallization (X-ray and nuclear magnetic resonance) and careful immobilization (surface plasmon resonance). Molecular modelling (docking) requires accurate data on protein-ligand interactions, which are difficult to obtain for some proteins. The main drawbacks of QSAR application are associated with the need to collect large homogeneous datasets of chemical structures with experimentally determined self-consistent quantitative values (potency). We propose a ligand-based approach to the selection of fragments with positive contribution to biological activity, developed on the basis of the PASS algorithm. The robustness of the PASS algorithm for heterogeneous datasets has been shown earlier. PASS estimates qualitative (yes/no) prediction of biological activity spectra for over 4000 biological activities and, therefore, provides the basis for the preparation of a fragment library corresponding to multiple criteria. The algorithm for fragment selection has been validated using the fractions of intermolecular interactions calculated for known inhibitors of nine enzymes extracted from the Protein Data Bank database. The statistical significance of differences between fractions of intermolecular interactions corresponds, for several enzymes, to the estimated positive and negative contribution of fragments in enzyme inhibition.
Affiliation(s)
- O A Filz
- Department of Bioinformatics, Biomedical Chemistry Institute of the Russian Medical Sciences Academy, Moscow, Russia.
47
Fjell CD, Hiss JA, Hancock REW, Schneider G. Designing antimicrobial peptides: form follows function. Nat Rev Drug Discov 2011; 11:37-51. [PMID: 22173434 DOI: 10.1038/nrd3591] [Citation(s) in RCA: 1350] [Impact Index Per Article: 103.8] [Indexed: 12/15/2022]
Abstract
Multidrug-resistant bacteria are a severe threat to public health. Conventional antibiotics are becoming increasingly ineffective as a result of resistance, and it is imperative to find new antibacterial strategies. Natural antimicrobials, known as host defence peptides or antimicrobial peptides, defend host organisms against microbes but most have modest direct antibiotic activity. Enhanced variants have been developed using straightforward design and optimization strategies and are being tested clinically. Here, we describe advanced computer-assisted design strategies that address the difficult problem of relating primary sequence to peptide structure, and are delivering more potent, cost-effective, broad-spectrum peptides as potential next-generation antibiotics.
Affiliation(s)
- Christopher D Fjell
- Centre for Microbial Diseases and Immunity Research, University of British Columbia, 2259 Lower Mall, Vancouver, British Columbia V6T 1Z4, Canada
48
Eleftheriou P, Geronikaki A, Hadjipavlou-Litina D, Vicini P, Filz O, Filimonov D, Poroikov V, Chaudhaery SS, Roy KK, Saxena AK. Fragment-based design, docking, synthesis, biological evaluation and structure-activity relationships of 2-benzo/benzisothiazolimino-5-aryliden-4-thiazolidinones as cycloxygenase/lipoxygenase inhibitors. Eur J Med Chem 2011; 47:111-24. [PMID: 22119153 DOI: 10.1016/j.ejmech.2011.10.029] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.8] [Received: 07/13/2011] [Revised: 10/11/2011] [Accepted: 10/13/2011] [Indexed: 11/16/2022]
Abstract
Balanced modulation of several targets is one of the current strategies for the treatment of multi-factorial diseases. Based on the knowledge of inflammation mechanisms, it was inferred that the balanced inhibition of cyclooxygenase-1/cyclooxygenase-2/lipoxygenase might be a promising approach for treatment of such a multifactorial disease state as inflammation. Detection of fragments responsible for interaction with the enzyme's binding site provides the basis for designing new molecules with increased affinity and selectivity. A new chemoinformatics approach was proposed and applied to create a fragment library that was used to design novel inhibitors of cyclooxygenase-1/cyclooxygenase-2/lipoxygenase enzymes. Potential binding sites were elucidated by docking. Synthesis of novel compounds and in vitro/in vivo biological testing confirmed the results of the computational studies. The benzothiazolyl moiety proved to be of great significance for developing more potent inhibitors.
Affiliation(s)
- Phaedra Eleftheriou
- Department of Medical Laboratory Studies, School of Health and Medical Care, Alexander Technological Education Institute of Thessaloniki, Thessaloniki 57400, Greece
49
Lusher SJ, McGuire R, Azevedo R, Boiten JW, van Schaik RC, de Vlieg J. A molecular informatics view on best practice in multi-parameter compound optimization. Drug Discov Today 2011; 16:555-68. [DOI: 10.1016/j.drudis.2011.05.005] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 12/28/2010] [Revised: 02/25/2011] [Accepted: 05/06/2011] [Indexed: 01/30/2023]
50
Nicolotti O, Giangreco I, Introcaso A, Leonetti F, Stefanachi A, Carotti A. Strategies of multi-objective optimization in drug discovery and development. Expert Opin Drug Discov 2011; 6:871-84. [DOI: 10.1517/17460441.2011.588696] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Indexed: 11/05/2022]