1
Wang J, Zhu F. Multi-objective molecular generation via clustered Pareto-based reinforcement learning. Neural Netw 2024; 179:106596. [PMID: 39163823 DOI: 10.1016/j.neunet.2024.106596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 06/16/2024] [Accepted: 08/01/2024] [Indexed: 08/22/2024]
Abstract
De novo molecular design is the process of learning knowledge from existing data to propose new chemical structures that satisfy the desired properties. By using de novo design to generate compounds in a directed manner, better solutions can be obtained in large chemical libraries with less comparison cost. But drug design needs to take multiple factors into consideration. For example, in polypharmacology, molecules that activate or inhibit multiple target proteins produce multiple pharmacological activities and are less susceptible to drug resistance. However, most existing molecular generation methods either focus only on affinity for a single target or fail to effectively balance the relationship between multiple targets, resulting in insufficient validity and desirability of the generated molecules. To address the problems, an approach called clustered Pareto-based reinforcement learning (CPRL) is proposed. In CPRL, a pre-trained model is constructed to grasp existing molecular knowledge in a supervised learning manner. In addition, the clustered Pareto optimization algorithm is presented to find the best solution between different objectives. The algorithm first extracts an update set from the sampled molecules through the designed aggregation-based molecular clustering. Then, the final reward is computed by constructing the Pareto frontier ranking of the molecules from the updated set. To explore the vast chemical space, a reinforcement learning agent is designed in CPRL that can be updated under the guidance of the final reward to balance multiple properties. Furthermore, to increase the internal diversity of the molecules, a fixed-parameter exploration model is used for sampling in conjunction with the agent. The experimental results demonstrate that CPRL is capable of balancing multiple properties of the molecule and has higher desirability and validity, reaching 0.9551 and 0.9923, respectively.
Affiliation(s)
- Jing Wang
  - School of Computer Science and Technology, Soochow University, Suzhou, 215006, China.
- Fei Zhu
  - School of Computer Science and Technology, Soochow University, Suzhou, 215006, China.
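The Pareto frontier ranking mentioned in the abstract of entry 1 (Wang and Zhu) can be illustrated with a small sketch. This is not the CPRL implementation: the clustering step is omitted, all objectives are assumed to be maximized, and the function names and the linear rank-to-reward mapping are hypothetical choices made only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranks(scores: List[Sequence[float]]) -> List[int]:
    """Rank 0 is the non-dominated front, rank 1 the next front, and so on."""
    remaining = set(range(len(scores)))
    ranks = [0] * len(scores)
    front = 0
    while remaining:
        # Points not dominated by any other point that is still unranked.
        current = {
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        }
        for i in current:
            ranks[i] = front
        remaining -= current
        front += 1
    return ranks


def rank_rewards(ranks: List[int]) -> List[float]:
    """Hypothetical reward: 1.0 on the first front, falling linearly to 0.0 on the last."""
    worst = max(ranks)
    if worst == 0:
        return [1.0] * len(ranks)
    return [1.0 - r / worst for r in ranks]


# Each tuple holds two property scores for one sampled molecule (made-up values).
scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.1, 0.1), (0.2, 0.9)]
ranks = pareto_ranks(scores)
rewards = rank_rewards(ranks)
print(ranks, rewards)
```

Molecules on the first front trade one property against the other without being beaten on both, so they all receive the top reward in this sketch.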
2
Luukkonen S, van den Maagdenberg HW, Emmerich MTM, van Westen GJP. Artificial intelligence in multi-objective drug design. Curr Opin Struct Biol 2023; 79:102537. [PMID: 36774727 DOI: 10.1016/j.sbi.2023.102537] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 10/19/2022] [Revised: 12/21/2022] [Accepted: 01/03/2023] [Indexed: 02/12/2023]
Abstract
The factors determining a drug's success are manifold, making de novo drug design an inherently multi-objective optimisation (MOO) problem. With the advent of machine learning and optimisation methods, the field of multi-objective compound design has seen a rapid increase in developments and applications. Population-based metaheuristics and deep reinforcement learning are the most commonly used artificial intelligence methods in the field, but recently conditional learning methods are gaining popularity. The former approaches are coupled with a MOO strategy which is most commonly an aggregation function, but Pareto-based strategies are widespread too. Besides these and conditional learning, various innovative approaches to tackle MOO in drug design have been proposed. Here we provide a brief overview of the field and the latest innovations.
Affiliation(s)
- Sohvi Luukkonen
  - Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, the Netherlands. https://twitter.com/sohvi_luukkonen
- Helle W van den Maagdenberg
  - Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, the Netherlands.
- Michael T M Emmerich
  - Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, Leiden, 2333 CC, the Netherlands.
- Gerard J P van Westen
  - Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, the Netherlands.
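The abstract of entry 2 (Luukkonen et al.) names aggregation functions as the most common MOO strategy. A minimal sketch of one such function, a normalized weighted sum, is given below. The objective names, scores, and weights are made up for illustration, and the review itself covers a wider range of aggregation schemes than this one.

```python
from typing import Dict


def weighted_sum(objectives: Dict[str, float], weights: Dict[str, float]) -> float:
    """Collapse several objective scores into one number using normalized weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * objectives[name] for name in weights) / total


# Hypothetical candidates with scores already scaled to [0, 1].
candidates = {
    "mol_a": {"activity": 0.9, "solubility": 0.2},
    "mol_b": {"activity": 0.6, "solubility": 0.6},
}

# Weighting activity twice as heavily as solubility favors mol_a.
weights = {"activity": 2.0, "solubility": 1.0}
scores = {name: weighted_sum(obj, weights) for name, obj in candidates.items()}
best = max(scores, key=scores.get)

# Equal weights favor the more balanced mol_b instead.
equal = {"activity": 1.0, "solubility": 1.0}
equal_scores = {name: weighted_sum(obj, equal) for name, obj in candidates.items()}
best_equal = max(equal_scores, key=equal_scores.get)
print(best, best_equal)
```

The winner changes with the weights, which is the usual argument for Pareto-based strategies: they avoid fixing the trade-off in advance.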
3
Ucak UV, Ashyrmamatov I, Lee J. Reconstruction of lossless molecular representations from fingerprints. J Cheminform 2023; 15:26. [PMID: 36823647 PMCID: PMC9948316 DOI: 10.1186/s13321-023-00693-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/03/2022] [Accepted: 02/04/2023] [Indexed: 02/25/2023] Open
Abstract
The simplified molecular-input line-entry system (SMILES) is the most prevalent molecular representation used in AI-based chemical applications. However, there are innate limitations associated with the internal structure of SMILES representations. In this context, this study exploits the resolution and robustness of unique molecular representations, i.e., SMILES and SELFIES (SELF-referencIng Embedded strings), reconstructed from a set of structural fingerprints, which are proposed and used herein as vital representational tools for chemical and natural language processing (NLP) applications. This is achieved by restoring the connectivity information lost during fingerprint transformation with high accuracy. Notably, the results reveal that seemingly irreversible molecule-to-fingerprint conversion is feasible. More specifically, four structural fingerprints, extended connectivity, topological torsion, atom pairs, and atomic environments can be used as inputs and outputs of chemical NLP applications. Therefore, this comprehensive study addresses the major limitation of structural fingerprints that precludes their use in NLP models. Our findings will facilitate the development of text- or fingerprint-based chemoinformatic models for generative and translational tasks.
Affiliation(s)
- Umit V. Ucak
  - Research Institute of Pharmaceutical Science, College of Pharmacy, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 08826, Republic of Korea.
- Islambek Ashyrmamatov
  - Department of Chemistry, Kangwon National University, Chuncheon, 24341, Republic of Korea.
- Juyong Lee
  - Research Institute of Pharmaceutical Science, College of Pharmacy, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 08826, Republic of Korea; Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 08826, Republic of Korea.
4
Nemoto K, Kaneko H. De Novo Direct Inverse QSPR/QSAR: Chemical Variational Autoencoder and Gaussian Mixture Regression Models. J Chem Inf Model 2023; 63:794-805. [PMID: 36635071 DOI: 10.1021/acs.jcim.2c01298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/14/2023]
Abstract
Herein, we propose a de novo direct inverse quantitative structure-property relationship/quantitative structure-activity relationship (QSPR/QSAR) analysis method, based on the chemical variational autoencoder (VAE) and Gaussian mixture regression (GMR) models, to generate molecules with the desired target variables of interest for properties and activities (y). A data set of molecules was analyzed, and an encoder was used to transform the simplified molecular input line entry system (SMILES) strings to latent variables (x), while a decoder was used to transform x to SMILES strings. A chemical VAE model was used for analysis and a GMR model (between x and y) was constructed for direct inverse analysis. The target y values were input into the GMR model to directly predict the x values. Following this, the predicted x values were input into the decoder associated with the chemical VAE model and the SMILES string representations (or chemical structures of molecules) were obtained as the output, indicating that the proposed method could be used to selectively obtain the molecules that were characterized by the target y values. We confirmed that the proposed method can be used to generate molecules within the target y ranges even when the conventional chemical VAE model failed to generate the target molecules.
Affiliation(s)
- Kohei Nemoto
  - Department of Applied Chemistry, School of Science and Technology, Meiji University, 1-1-1 Higashi-Mita, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan.
- Hiromasa Kaneko
  - Department of Applied Chemistry, School of Science and Technology, Meiji University, 1-1-1 Higashi-Mita, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan.
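The direct inverse step in entry 4 (Nemoto and Kaneko), where a target y is fed to the GMR model to predict latent variables x, can be sketched as below. This is an illustration under simplifying assumptions, not the authors' code: y is a single scalar, the mixture parameters are written by hand rather than fitted, the `Component` class and `gmr_inverse` function are hypothetical names, and the VAE decoder that would turn x back into a SMILES string is left out.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    """One Gaussian of a joint mixture over (x, y); y is a scalar here."""
    weight: float
    mean_x: List[float]
    mean_y: float
    cov_xy: List[float]  # covariance of each latent dimension with y
    var_y: float


def gmr_inverse(y_target: float, components: List[Component]) -> List[float]:
    """Expected latent vector x given y, as a responsibility-weighted conditional mean."""
    # Unnormalized responsibility of each component for the requested y.
    resp = []
    for c in components:
        density = math.exp(-0.5 * (y_target - c.mean_y) ** 2 / c.var_y)
        density /= math.sqrt(2.0 * math.pi * c.var_y)
        resp.append(c.weight * density)
    total = sum(resp)
    if total == 0.0:
        raise ValueError("y_target is too far from every component")

    dim = len(components[0].mean_x)
    x = [0.0] * dim
    for r, c in zip(resp, components):
        for d in range(dim):
            # Conditional mean of x_d given y within this component.
            cond = c.mean_x[d] + c.cov_xy[d] * (y_target - c.mean_y) / c.var_y
            x[d] += (r / total) * cond
    return x


# Two hand-written components placed symmetrically around y = 2.
mixture = [
    Component(0.5, [0.0, 0.0], 0.0, [0.5, 0.0], 1.0),
    Component(0.5, [4.0, 4.0], 4.0, [0.5, 0.0], 1.0),
]
x_pred = gmr_inverse(2.0, mixture)
print(x_pred)
```

In the workflow the abstract describes, the predicted x would then be passed to the decoder to obtain candidate structures.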
5
Bon M, Bilsland A, Bower J, McAulay K. Fragment-based drug discovery - the importance of high-quality molecule libraries. Mol Oncol 2022; 16:3761-3777. [PMID: 35749608 PMCID: PMC9627785 DOI: 10.1002/1878-0261.13277] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 01/28/2022] [Revised: 05/16/2022] [Accepted: 06/23/2022] [Indexed: 12/24/2022] Open
Abstract
Fragment-based drug discovery (FBDD) is now established as a complementary approach to high-throughput screening (HTS). Contrary to HTS, where large libraries of drug-like molecules are screened, FBDD screens involve smaller and less complex molecules which, despite a low affinity to protein targets, display more 'atom-efficient' binding interactions than larger molecules. Fragment hits can, therefore, serve as a more efficient start point for subsequent optimisation, particularly for hard-to-drug targets. Since the number of possible molecules increases exponentially with molecular size, small fragment libraries allow for a proportionately greater coverage of their respective 'chemical space' compared with larger HTS libraries comprising larger molecules. However, good library design is essential to ensure optimal chemical and pharmacophore diversity, molecular complexity, and physicochemical characteristics. In this review, we describe our views on fragment library design, and on what constitutes a good fragment from a medicinal and computational chemistry perspective. We highlight emerging chemical and computational technologies in FBDD and discuss strategies for optimising fragment hits. The impact of novel FBDD approaches is already being felt, with the recent approval of the covalent KRAS G12C inhibitor sotorasib highlighting the utility of FBDD against targets that were long considered undruggable.
Affiliation(s)
- Marta Bon
  - Cancer Research Horizons, Cancer Research UK Beatson Institute, Glasgow, UK
- Alan Bilsland
  - Cancer Research Horizons, Cancer Research UK Beatson Institute, Glasgow, UK
- Justin Bower
  - Cancer Research Horizons, Cancer Research UK Beatson Institute, Glasgow, UK
- Kirsten McAulay
  - Cancer Research Horizons, Cancer Research UK Beatson Institute, Glasgow, UK
6
Mroz A, Posligua V, Tarzia A, Wolpert EH, Jelfs KE. Into the Unknown: How Computation Can Help Explore Uncharted Material Space. J Am Chem Soc 2022; 144:18730-18743. [PMID: 36206484 PMCID: PMC9585593 DOI: 10.1021/jacs.2c06833] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/29/2022] [Indexed: 11/28/2022]
Abstract
Novel functional materials are urgently needed to help combat the major global challenges facing humanity, such as climate change and resource scarcity. Yet, the traditional experimental materials discovery process is slow and the material space at our disposal is too vast to effectively explore using intuition-guided experimentation alone. Most experimental materials discovery programs necessarily focus on exploring the local space of known materials, so we are not fully exploiting the enormous potential material space, where more novel materials with unique properties may exist. Computation, facilitated by improvements in open-source software and databases, as well as computer hardware, has the potential to significantly accelerate the rational development of materials, but all too often is only used to postrationalize experimental observations. Thus, the true predictive power of computation, where theory leads experimentation, is not fully utilized. Here, we discuss the challenges to successful implementation of computation-driven materials discovery workflows, and then focus on the progress of the field, with a particular emphasis on the challenges to reaching novel materials.
Affiliation(s)
- Austin M. Mroz
  - Department of Chemistry, Molecular Sciences Research Hub, Imperial College London, White City Campus, Wood Lane, London, W12 0BZ, U.K.
- Victor Posligua
  - Department of Chemistry, Molecular Sciences Research Hub, Imperial College London, White City Campus, Wood Lane, London, W12 0BZ, U.K.
- Andrew Tarzia
  - Department of Chemistry, Molecular Sciences Research Hub, Imperial College London, White City Campus, Wood Lane, London, W12 0BZ, U.K.
- Emma H. Wolpert
  - Department of Chemistry, Molecular Sciences Research Hub, Imperial College London, White City Campus, Wood Lane, London, W12 0BZ, U.K.
- Kim E. Jelfs
  - Department of Chemistry, Molecular Sciences Research Hub, Imperial College London, White City Campus, Wood Lane, London, W12 0BZ, U.K.
7
Carbery A, Skyner R, von Delft F, Deane CM. Fragment Libraries Designed to Be Functionally Diverse Recover Protein Binding Information More Efficiently Than Standard Structurally Diverse Libraries. J Med Chem 2022; 65:11404-11413. [PMID: 35960886 PMCID: PMC9421645 DOI: 10.1021/acs.jmedchem.2c01004] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 11/28/2022]
Abstract
Current fragment-based drug design relies on the efficient exploration of chemical space by using structurally diverse libraries of small fragments. However, structurally dissimilar compounds can exploit the same interactions and thus be functionally similar. Using three-dimensional structures of many fragments bound to multiple targets, we examined if a better strategy for selecting fragments for screening libraries exists. We show that structurally diverse fragments can be described as functionally redundant, often making the same interactions. Ranking fragments by the number of novel interactions they made, we show that functionally diverse selections of fragments substantially increase the amount of information recovered for unseen targets compared to the amounts recovered by other methods of selection. Using these results, we design small functionally efficient libraries that can give significantly more information about new protein targets than similarly sized structurally diverse libraries. By covering more functional space, we can generate more diverse sets of drug leads.
Affiliation(s)
- Anna Carbery
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, U.K.; Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, U.K.
- Rachael Skyner
  - Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, U.K.
- Frank von Delft
  - Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, U.K.; Centre for Medicines Discovery, University of Oxford, Oxford OX3 7DQ, U.K.
- Charlotte M Deane
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, U.K.
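Entry 7 (Carbery et al.) describes ranking fragments by the number of novel interactions they make. One way to read that idea is as a greedy selection over interaction sets, sketched below. The selection rule, the function name, and the interaction labels are assumptions made for illustration; the procedure used in the paper may differ in its details.

```python
from typing import Dict, List, Set


def select_functionally_diverse(interactions: Dict[str, Set[str]], k: int) -> List[str]:
    """Greedily pick up to k fragments, each adding the most interactions not yet covered."""
    covered: Set[str] = set()
    chosen: List[str] = []
    candidates = dict(interactions)
    while candidates and len(chosen) < k:
        # Sorting the names makes ties resolve alphabetically.
        best = max(sorted(candidates), key=lambda name: len(candidates[name] - covered))
        if not candidates[best] - covered:
            break  # every remaining fragment is functionally redundant
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen


# Hypothetical fragment-to-interaction map (interaction type and residue).
observed = {
    "frag_a": {"hbond:ASP25", "hydrophobic:LEU10"},
    "frag_b": {"hbond:ASP25"},
    "frag_c": {"pi:PHE8", "hbond:SER3", "hydrophobic:LEU10"},
    "frag_d": {"hbond:GLU9"},
}
library = select_functionally_diverse(observed, k=3)
print(library)
```

Here frag_b is never chosen because its only interaction is already provided by frag_a, which mirrors the abstract's point that distinct fragments can be functionally redundant.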
8
Ton AT, Pandey M, Smith JR, Ban F, Fernandez M, Cherkasov A. Targeting SARS-CoV-2 Papain-Like Protease in the Post-Vaccine Era. Trends Pharmacol Sci 2022; 43:906-919. [PMID: 36114026 PMCID: PMC9399131 DOI: 10.1016/j.tips.2022.08.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 05/17/2022] [Revised: 08/10/2022] [Accepted: 08/19/2022] [Indexed: 11/29/2022]
Abstract
While vaccines remain at the forefront of global healthcare responses, pioneering therapeutics against SARS-CoV-2 are expected to fill the gaps for waning immunity. Rapid development and approval of orally available direct-acting antivirals targeting crucial SARS-CoV-2 proteins marked the beginning of the era of small-molecule drugs for COVID-19. In that regard, the papain-like protease (PLpro) can be considered a major SARS-CoV-2 therapeutic target due to its dual biological role in suppressing host innate immune responses and in ensuring viral replication. Here, we summarize the challenges of targeting PLpro and innovative early-stage PLpro-specific small molecules. We propose that state-of-the-art computer-aided drug design (CADD) methodologies will play a critical role in the discovery of PLpro compounds as a novel class of COVID-19 drugs.
Affiliation(s)
- Anh-Tien Ton
  - Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada
- Mohit Pandey
  - Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada
- Jason R Smith
  - Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada; Department of Chemistry, Simon Fraser University, Burnaby, Canada
- Fuqiang Ban
  - Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada
- Michael Fernandez
  - Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada
- Artem Cherkasov
  - Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada.
9
Bilsland AE, Pugliese A, Bower J. Implementation of an AI-assisted fragment-generator in an open-source platform. RSC Med Chem 2022; 13:1205-1211. [PMID: 36320432 PMCID: PMC9579942 DOI: 10.1039/d2md00152g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Accepted: 07/27/2022] [Indexed: 11/21/2022] Open
Abstract
We recently reported a deep learning model to facilitate fragment library design, which is critical for efficient hit identification. However, our model was implemented in Python. We have now created an implementation in the KNIME graphical pipelining environment which we hope will allow experimentation by users with limited programming knowledge. We report a deep learning model to facilitate fragment library design, which is critical for efficient hit identification, and an implementation in the KNIME graphical workflow environment which should facilitate a more codeless use.
Affiliation(s)
- Alan E. Bilsland
  - Cancer Research Horizons – Therapeutic Innovation, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Glasgow G61 1BD, UK
- Angelo Pugliese
  - BioAscent Discovery, Bo'Ness Road, Newhouse, Lanarkshire ML1 5UH, UK
- Justin Bower
  - Cancer Research Horizons – Therapeutic Innovation, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Glasgow G61 1BD, UK
10
Fragment-to-lead tailored in silico design. DRUG DISCOVERY TODAY. TECHNOLOGIES 2021; 40:44-57. [PMID: 34916022 DOI: 10.1016/j.ddtec.2021.08.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/11/2020] [Revised: 06/25/2021] [Accepted: 08/11/2021] [Indexed: 02/07/2023]
Abstract
Fragment-based drug discovery (FBDD) emerged as a disruptive technology and became established during the last two decades. Its rationality and low entry costs make it appealing, and the numerous examples of approved drugs discovered through FBDD validate the approach. However, FBDD still faces numerous challenges. Perhaps the most important one is the transformation of the initial fragment hits into viable leads. Fragment-to-lead (F2L) optimization is resource-intensive and is therefore limited in the possibilities that can be actively pursued. In silico strategies play an important role in F2L, as they can perform a deeper exploration of chemical space, prioritize molecules with high probabilities of being active and generate non-obvious ideas. Here we provide a critical overview of current in silico strategies in F2L optimization and highlight their remarkable impact. While very effective, most solutions are target- or fragment-specific. We propose that fully integrated in silico strategies, capable of automatically and systematically exploring the fast-growing available chemical space, can have a significant impact on accelerating the release of fragment-originated drugs.