Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Brown N, Fiscato M, Segler MHS, Vaucher AC. GuacaMol: Benchmarking Models for de Novo Molecular Design. J Chem Inf Model 2019;59:1096-1108. [PMID: 30887799 DOI: 10.1021/acs.jcim.8b00839] [Citation(s) in RCA: 309] [Impact Index Per Article: 51.5] [Indexed: 12/20/2022]

Cited by Other Article(s)

Liu Q, He D, Fan M, Wang J, Cui Z, Wang H, Mi Y, Li N, Meng Q, Hou Y. Prediction and Interpretation Microglia Cytotoxicity by Machine Learning. J Chem Inf Model 2024;64:9306-9326. [PMID: 38949724 DOI: 10.1021/acs.jcim.4c00366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]

Abstract

Ameliorating microglia-mediated neuroinflammation is a crucial strategy in developing new drugs for neurodegenerative diseases. Plant compounds are an important screening target for the discovery of drugs for the treatment of neurodegenerative diseases. However, due to the spatial complexity of phytochemicals, it becomes particularly important to evaluate the effectiveness of compounds while excluding cytotoxic substances in the early stages of compound screening. Traditional high-throughput screening methods suffer from high cost and low efficiency. A computational model based on machine learning provides a novel avenue for cytotoxicity determination. In this study, a microglia cytotoxicity classifier was developed using a machine learning approach. First, we proposed a data splitting strategy based on the Murcko generic scaffold of each molecule. Under this split, three machine learning approaches were coupled with three kinds of molecular representation methods to construct microglia cytotoxicity classifiers, which were then compared and assessed by predictive accuracy (ACC), balanced accuracy (BA), F1-score, and Matthews correlation coefficient (MCC). Then, recursive feature elimination integrated with a support vector machine (RFE-SVC) was introduced as a dimension reduction method for high-dimensional molecular fingerprints to further improve model performance. Among all the microglia cytotoxicity classifiers, the SVM coupled with the ECFP4 fingerprint after feature selection (ECFP4-RFE-SVM) obtained the most accurate classification on the test set (ACC of 0.99, BA of 0.99, F1-score of 0.99, MCC of 0.97). Finally, the Shapley additive explanations (SHAP) method was used to interpret the microglia cytotoxicity classifier, and key substructure SMARTS were identified as structural alerts. Experimental results show that ECFP4-RFE-SVM has reliable classification capability for microglia cytotoxicity, and that SHAP can not only provide a rational explanation for microglia cytotoxicity predictions but also offer a guideline for subsequent molecular cytotoxicity modifications.
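
The four scores named in this abstract (ACC, BA, F1-score, MCC) all follow from the binary confusion matrix. The following is a minimal sketch of the standard textbook definitions, not the authors' evaluation code; the function name `classification_metrics` and the label convention (1 = cytotoxic, 0 = non-cytotoxic) are assumptions made for illustration.

```python
import math


def classification_metrics(y_true, y_pred):
    """Return ACC, BA, F1 and MCC for binary labels (1 = positive, 0 = negative).

    Hypothetical helper for illustration; uses the standard definitions.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0

    # Balanced accuracy is the mean of sensitivity and specificity.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    balanced_accuracy = (sensitivity + specificity) / 2

    # F1 is the harmonic mean of precision and recall.
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    # MCC is set to 0 when any marginal count is zero (a common convention).
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0

    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy labels, not data from the study.
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    predicted = [1, 1, 0, 0, 0, 0, 0, 1]
    print(classification_metrics(truth, predicted))
```

On imbalanced screening sets, BA and MCC are usually more informative than plain accuracy, which is presumably why the abstract reports all four.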

Affiliation(s)

Qing Liu College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Dakuo He College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Mengmeng Fan College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Jinpeng Wang College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Zeyu Cui College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Hao Wang College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Yan Mi Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China
Ning Li School of Traditional Chinese Materia Medica, Key Laboratory for TCM Material Basis Study and Innovative Drug Development of Shenyang City, Shenyang Pharmaceutical University, Shenyang 110016, P. R. China
Qingqi Meng Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China
Yue Hou Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China

Li J, Zhang O, Sun K, Wang Y, Guan X, Bagni D, Haghighatlari M, Kearns FL, Parks C, Amaro RE, Head-Gordon T. Mining for Potent Inhibitors through Artificial Intelligence and Physics: A Unified Methodology for Ligand Based and Structure Based Drug Design. J Chem Inf Model 2024;64:9082-9097. [PMID: 38843070 DOI: 10.1021/acs.jcim.4c00634] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/11/2024]

Nie D, Zhao H, Zhang O, Weng G, Zhang H, Jin J, Lin H, Huang Y, Liu L, Li D, Hou T, Kang Y. Durian: A Comprehensive Benchmark for Structure-Based 3D Molecular Generation. J Chem Inf Model 2024. [PMID: 39681323 DOI: 10.1021/acs.jcim.4c02232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2024]

Abstract

Three-dimensional (3D) molecular generation models employ deep neural networks to simultaneously generate both topological representation and molecular conformations. Due to their advantages in utilizing the structural and interaction information on targets, as well as their reduced reliance on existing bioactivity data, these models have attracted widespread attention. However, limited training and testing data sets and the unexpected biases inherent in single evaluation metrics pose a significant challenge in comparing these models in practical settings. In this work, we proposed Durian, an evaluation framework for structure-based 3D molecular generation that incorporates protein-ligand data with experimental affinity and a comprehensive array of physicochemical and geometric metrics. The benchmark tasks encompass assessing the capability of models to reproduce the property distribution of training sets, generate molecules with rational distributions of drug-related properties, and exhibit potential high affinity toward given targets. Binding affinities were evaluated using three independent docking methods (QuickVina2, Surflex and Gnina) with both "Dock" and "Score" modes to reduce false positives arising from conformational searches or scoring functions. Specifically, we applied Durian to six 3D molecular generation methods: LiGAN, Pocket2Mol, DiffSBDD, SBDD, GraphBP, and SurfGen. While most methods demonstrated the ability to generate drug-like small molecules with reasonable physicochemical properties, they exhibited varying degrees of limitations in balancing novelty, structural rationality, and synthetic accessibility, thereby constraining their practical applications in drug discovery. Based on a total of 17 metrics, Durian highlights the importance of multiobjective optimization in 3D molecular generation methods. For instance, SurfGen and SBDD showed relatively comprehensive performance but could benefit from further improvements in molecular conformational rationality. Our evaluation framework is expected to provide meaningful guidance for the selection, optimization, and application of 3D generative models in practical drug design tasks.

Affiliation(s)

Dou Nie Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Huifeng Zhao Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Odin Zhang Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Gaoqi Weng Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Hui Zhang Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Jieyu Jin Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Haitao Lin Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Yufei Huang Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Liwei Liu Huawei Nanjing Research & Development Center, No. 101 Software Avenue, Yuhuatai District, Nanjing, 210012 Jiangsu, China
Dan Li Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Tingjun Hou Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Yu Kang Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China

Flores-Hernandez H, Martinez-Ledesma E. A systematic review of deep learning chemical language models in recent era. J Cheminform 2024;16:129. [PMID: 39558376 PMCID: PMC11571686 DOI: 10.1186/s13321-024-00916-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2024] [Accepted: 10/17/2024] [Indexed: 11/20/2024] [Open Access]

Abstract

Discovering new chemical compounds with specific properties can provide advantages for fields that rely on materials for their development, although this task comes at a high cost in terms of complexity and resources. Since the beginning of the data age, deep learning techniques have revolutionized the process of designing molecules by analyzing and learning from representations of molecular data, greatly reducing the resources and time involved. Various deep learning approaches have been developed to date, using a variety of architectures and strategies, in order to explore the extensive and discontinuous chemical space, providing benefits for generating compounds with specific properties. In this study, we present a systematic review that offers a statistical description and comparison of the strategies utilized to generate molecules through deep learning techniques, utilizing the metrics proposed in Molecular Sets (MOSES) or GuacaMol. The study included 48 articles retrieved from a query-based search of Scopus and Web of Science and 25 articles retrieved from citation search, yielding a total of 72 retrieved articles, of which 62 correspond to chemical language model approaches to molecule generation and the other 10 correspond to molecular graph representations. Transformers, recurrent neural networks (RNNs), generative adversarial networks (GANs), Structured State Space Sequence (S4) models, and variational autoencoders (VAEs) are considered the main deep learning architectures used for molecule generation in the set of retrieved articles. In addition, transfer learning, reinforcement learning, and conditional learning are the most employed techniques for biased model generation and exploration of specific chemical space regions. Finally, this analysis focuses on the central themes of molecular representation, databases, training dataset size, validity-novelty trade-off, and performance of unbiased and biased chemical language models. These themes were selected to conduct a statistical analysis utilizing graphical representation and statistical tests. The resulting analysis reveals the main challenges, advantages, and opportunities in the field of chemical language models over the past four years.

Qiang H, Wang F, Lu W, Xing X, Kim H, Merette SAM, Ayres LB, Oler E, AbuSalim JE, Roichman A, Neinast M, Cordova RA, Lee WD, Herbst E, Gupta V, Neff S, Hiebert-Giesbrecht M, Young A, Gautam V, Tian S, Wang B, Röst H, Greiner R, Chen L, Johnston CW, Foster LJ, Shapiro AM, Wishart DS, Rabinowitz JD, Skinnider MA. Language model-guided anticipation and discovery of unknown metabolites. bioRxiv 2024:2024.11.13.623458. [PMID: 39605668 PMCID: PMC11601323 DOI: 10.1101/2024.11.13.623458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2024]

Xu W. Current Status of Computational Approaches for Small Molecule Drug Discovery. J Med Chem 2024;67:18633-18636. [PMID: 39445455 DOI: 10.1021/acs.jmedchem.4c02462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2024]

Méndez-Lucio O, Nicolaou CA, Earnshaw B. MolE: a foundation model for molecular graphs using disentangled attention. Nat Commun 2024;15:9431. [PMID: 39532853 PMCID: PMC11557931 DOI: 10.1038/s41467-024-53751-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 10/18/2024] [Indexed: 11/16/2024] [Open Access]

Wang J, Zhu F. Multi-objective molecular generation via clustered Pareto-based reinforcement learning. Neural Netw 2024;179:106596. [PMID: 39163823 DOI: 10.1016/j.neunet.2024.106596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 06/16/2024] [Accepted: 08/01/2024] [Indexed: 08/22/2024]

Abstract

De novo molecular design is the process of learning knowledge from existing data to propose new chemical structures that satisfy the desired properties. By using de novo design to generate compounds in a directed manner, better solutions can be obtained in large chemical libraries with less comparison cost. But drug design needs to take multiple factors into consideration. For example, in polypharmacology, molecules that activate or inhibit multiple target proteins produce multiple pharmacological activities and are less susceptible to drug resistance. However, most existing molecular generation methods either focus only on affinity for a single target or fail to effectively balance the relationship between multiple targets, resulting in insufficient validity and desirability of the generated molecules. To address the problems, an approach called clustered Pareto-based reinforcement learning (CPRL) is proposed. In CPRL, a pre-trained model is constructed to grasp existing molecular knowledge in a supervised learning manner. In addition, the clustered Pareto optimization algorithm is presented to find the best solution between different objectives. The algorithm first extracts an update set from the sampled molecules through the designed aggregation-based molecular clustering. Then, the final reward is computed by constructing the Pareto frontier ranking of the molecules from the updated set. To explore the vast chemical space, a reinforcement learning agent is designed in CPRL that can be updated under the guidance of the final reward to balance multiple properties. Furthermore, to increase the internal diversity of the molecules, a fixed-parameter exploration model is used for sampling in conjunction with the agent. The experimental results demonstrate that CPRL is capable of balancing multiple properties of the molecule and has higher desirability and validity, reaching 0.9551 and 0.9923, respectively.
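
The Pareto frontier ranking mentioned in this abstract can be illustrated with plain non-dominated sorting. The sketch below is a generic textbook version and is not the CPRL algorithm: the clustering step and the reward shaping are omitted, and the function names `dominates` and `pareto_ranks` are hypothetical. It assumes every objective is to be maximized.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranks(points):
    """Assign each point the index of its Pareto front (1 = non-dominated front).

    Front 1 holds the points no other point dominates; front 2 holds the
    points that become non-dominated once front 1 is removed, and so on.
    """
    remaining = set(range(len(points)))
    ranks = [0] * len(points)
    rank = 1
    while remaining:
        front = {
            i
            for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks


if __name__ == "__main__":
    # Toy (objective_1, objective_2) scores for five molecules; not data from the paper.
    scores = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.1, 0.1)]
    print(pareto_ranks(scores))
```

A reward could then be derived from the rank, for example by giving molecules on earlier fronts a higher value; how CPRL actually maps ranks to rewards is not stated in the abstract.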

Nakata S, Mori Y, Tanaka S. Navigating Ultralarge Virtual Chemical Spaces with Product-of-Experts Chemical Language Models. J Chem Inf Model 2024;64:7873-7884. [PMID: 39413401 DOI: 10.1021/acs.jcim.4c01214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2024]

Suzuki T, Ma D, Yasuo N, Sekijima M. Mothra: Multiobjective de novo Molecular Generation Using Monte Carlo Tree Search. J Chem Inf Model 2024;64:7291-7302. [PMID: 39317969 PMCID: PMC11481094 DOI: 10.1021/acs.jcim.4c00759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2024]

Alakhdar A, Poczos B, Washburn N. Diffusion Models in De Novo Drug Design. J Chem Inf Model 2024;64:7238-7256. [PMID: 39322943 PMCID: PMC11481093 DOI: 10.1021/acs.jcim.4c01107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Revised: 09/14/2024] [Accepted: 09/16/2024] [Indexed: 09/27/2024]

Cheng AH, Ser CT, Skreta M, Guzmán-Cordero A, Thiede L, Burger A, Aldossary A, Leong SX, Pablo-García S, Strieth-Kalthoff F, Aspuru-Guzik A. Spiers Memorial Lecture: How to do impactful research in artificial intelligence for chemistry and materials science. Faraday Discuss 2024. [PMID: 39400305 DOI: 10.1039/d4fd00153b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2024]

Affiliation(s)

Austin H Cheng Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada. Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
Cher Tian Ser Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada. Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
Marta Skreta Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
Andrés Guzmán-Cordero Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada Tinbergen Institute, University of Amsterdam, Amsterdam, Netherlands
Luca Thiede Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
Andreas Burger Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
Abdulrahman Aldossary Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada.
Shi Xuan Leong Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada. School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, Singapore 63737, Singapore
Sergio Pablo-García Acceleration Consortium, Toronto, Ontario M5G 1X6, Canada
Felix Strieth-Kalthoff School of Mathematics and Natural Sciences, University of Wuppertal, Wuppertal, Germany
Alán Aspuru-Guzik Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada. Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada Acceleration Consortium, Toronto, Ontario M5G 1X6, Canada Department of Chemical Engineering and Applied Chemistry, University of Toronto, Canada Department of Materials Science and Engineering, University of Toronto, Canada Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), Canada

Roucairol M, Georgiou A, Cazenave T, Prischi F, Pardo OE. DrugSynthMC: An Atom-Based Generation of Drug-like Molecules with Monte Carlo Search. J Chem Inf Model 2024;64:7097-7107. [PMID: 39249497 PMCID: PMC11423341 DOI: 10.1021/acs.jcim.4c01451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2024]

Kneiding H, Balcells D. Augmenting genetic algorithms with machine learning for inverse molecular design. Chem Sci 2024:d4sc02934h. [PMID: 39296997 PMCID: PMC11404003 DOI: 10.1039/d4sc02934h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Accepted: 09/09/2024] [Indexed: 09/21/2024] [Open Access]

Bhattacharya D, Cassady HJ, Hickner MA, Reinhart WF. Large Language Models as Molecular Design Engines. J Chem Inf Model 2024. [PMID: 39231030 DOI: 10.1021/acs.jcim.4c01396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/06/2024]

Liu Y, Zhang R, Yuan Y, Ma J, Li T, Yu Z. A Multi-view Molecular Pre-training with Generative Contrastive Learning. Interdiscip Sci 2024;16:741-754. [PMID: 38710957 DOI: 10.1007/s12539-024-00632-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 03/20/2024] [Accepted: 04/06/2024] [Indexed: 05/08/2024]

Lavecchia A. Navigating the frontier of drug-like chemical space with cutting-edge generative AI models. Drug Discov Today 2024;29:104133. [PMID: 39103144 DOI: 10.1016/j.drudis.2024.104133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 07/20/2024] [Accepted: 07/31/2024] [Indexed: 08/07/2024]

Tom G, Schmid SP, Baird SG, Cao Y, Darvish K, Hao H, Lo S, Pablo-García S, Rajaonson EM, Skreta M, Yoshikawa N, Corapi S, Akkoc GD, Strieth-Kalthoff F, Seifrid M, Aspuru-Guzik A. Self-Driving Laboratories for Chemistry and Materials Science. Chem Rev 2024;124:9633-9732. [PMID: 39137296 PMCID: PMC11363023 DOI: 10.1021/acs.chemrev.4c00055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2024]

Affiliation(s)

Gary Tom Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
Stefan P. Schmid Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 1, CH-8093 Zurich, Switzerland
Sterling G. Baird Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
Yang Cao Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
Kourosh Darvish Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
Han Hao Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
Stanley Lo Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
Sergio Pablo-García Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
Ella M. Rajaonson Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
Marta Skreta Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
Naruki Yoshikawa Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
Samantha Corapi Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
Gun Deniz Akkoc Forschungszentrum Jülich GmbH, Helmholtz Institute for Renewable Energy Erlangen-Nürnberg, Cauerstr. 1, 91058 Erlangen, Germany Department of Chemical and Biological Engineering, Friedrich-Alexander Universität Erlangen-Nürnberg, Egerlandstr. 3, 91058 Erlangen, Germany
Felix Strieth-Kalthoff Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada School of Mathematics and Natural Sciences, University of Wuppertal, Gaußstraße 20, 42119 Wuppertal, Germany
Martin Seifrid Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada Department of Materials Science and Engineering, North Carolina State University, Raleigh, North Carolina 27695, United States of America
Alán Aspuru-Guzik Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada Department of Chemical Engineering & Applied Chemistry, University of Toronto, Toronto, Ontario M5S 3E5, Canada Department of Materials Science & Engineering, University of Toronto, Toronto, Ontario M5S 3E4, Canada Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), 661 University Ave, Toronto, Ontario M5G 1M1, Canada

Renz P, Luukkonen S, Klambauer G. Diverse Hits in De Novo Molecule Design: Diversity-Based Comparison of Goal-Directed Generators. J Chem Inf Model 2024;64:5756-5761. [PMID: 39029090 PMCID: PMC11323242 DOI: 10.1021/acs.jcim.4c00519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Revised: 07/10/2024] [Accepted: 07/11/2024] [Indexed: 07/21/2024]

Bou A, Thomas M, Dittert S, Navarro C, Majewski M, Wang Y, Patel S, Tresadern G, Ahmad M, Moens V, Sherman W, Sciabola S, De Fabritiis G. ACEGEN: Reinforcement Learning of Generative Chemical Agents for Drug Discovery. J Chem Inf Model 2024;64:5900-5911. [PMID: 39092857 PMCID: PMC11581341 DOI: 10.1021/acs.jcim.4c00895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Revised: 07/03/2024] [Accepted: 07/19/2024] [Indexed: 08/04/2024]

Hu X, Liu G, Yao Q, Zhao Y, Zhang H. Hamiltonian diversity: effectively measuring molecular diversity by shortest Hamiltonian circuits. J Cheminform 2024;16:94. [PMID: 39113120 PMCID: PMC11308660 DOI: 10.1186/s13321-024-00883-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Accepted: 07/11/2024] [Indexed: 08/10/2024] [Open Access]

Liu Y, Xu C, Yang X, Zhang Y, Chen Y, Liu H. Application progress of deep generative models in de novo drug design. Mol Divers 2024;28:2411-2427. [PMID: 39097862 DOI: 10.1007/s11030-024-10942-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Accepted: 07/16/2024] [Indexed: 08/05/2024]

Saifi I, Bhat BA, Hamdani SS, Bhat UY, Lobato-Tapia CA, Mir MA, Dar TUH, Ganie SA. Artificial intelligence and cheminformatics tools: a contribution to the drug development and chemical science. J Biomol Struct Dyn 2024;42:6523-6541. [PMID: 37434311 DOI: 10.1080/07391102.2023.2234039] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/12/2023] [Accepted: 07/03/2023] [Indexed: 07/13/2023]

Wang Q, Hu X, Wei Z, Lu H, Liu H. Reinforcement learning-driven exploration of peptide space: accelerating generation of drug-like peptides. Brief Bioinform 2024;25:bbae444. [PMID: 39256196 PMCID: PMC11387070 DOI: 10.1093/bib/bbae444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 08/05/2024] [Accepted: 08/27/2024] [Indexed: 09/12/2024] [Open Access]

Chen S, Jung Y. Estimating the synthetic accessibility of molecules with building block and reaction-aware SAScore. J Cheminform 2024;16:83. [PMID: 39044299 PMCID: PMC11267797 DOI: 10.1186/s13321-024-00879-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2024] [Accepted: 07/09/2024] [Indexed: 07/25/2024] [Open Access]

Abstract

Synthetic accessibility prediction is a task to estimate how easily a given molecule might be synthesizable in the laboratory, playing a crucial role in computer-aided molecular design. Although synthesis planning programs can determine synthesis routes, their slow processing times make them impractical for large-scale molecule screening. On the other hand, existing rapid synthesis accessibility estimation methods offer speed but typically lack integration with actual synthesis routes and building block information. In this work, we introduce BR-SAScore, an enhanced version of SAScore that integrates the available building block information (B) and reaction knowledge (R) from synthesis planning programs into the scoring process. In particular, we differentiate fragments inherent in building blocks and fragments to be derived from synthesis (reactions) when scoring synthetic accessibility. Compared to existing methods, our experimental findings demonstrate that BR-SAScore offers more accurate and precise identification of a molecule's synthetic accessibility by the synthesis planning program with a fast calculation time. Moreover, we illustrate how BR-SAScore provides chemically interpretable results, aligning with the capability of the synthesis planning program embedded with the same reaction knowledge and available building blocks. SCIENTIFIC CONTRIBUTION: We introduce BR-SAScore, an extension of SAScore, to estimate the synthetic accessibility of molecules by leveraging known building-block and reactivity information. In our experiments, BR-SAScore shows superior prediction performance on predicting molecule synthetic accessibility compared to previous methods, including SAScore and deep-learning models, while requiring significantly less computation time. In addition, we show that BR-SAScore is able to precisely identify the chemical fragment contributing to the synthetic infeasibility, holding great potential for future molecule synthesizability optimization.

Özçelik R, de Ruiter S, Criscuolo E, Grisoni F. Chemical language modeling with structured state space sequence models. Nat Commun 2024;15:6176. [PMID: 39039051 PMCID: PMC11263548 DOI: 10.1038/s41467-024-50469-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 07/05/2024] [Indexed: 07/24/2024] [Open Access]

Catacutan DB, Alexander J, Arnold A, Stokes JM. Machine learning in preclinical drug discovery. Nat Chem Biol 2024:10.1038/s41589-024-01679-1. [PMID: 39030362 DOI: 10.1038/s41589-024-01679-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 06/13/2024] [Indexed: 07/21/2024]

Xia X, Liu Y, Zheng C, Zhang X, Wu Q, Gao X, Zeng X, Su Y. Evolutionary Multiobjective Molecule Optimization in an Implicit Chemical Space. J Chem Inf Model 2024;64:5161-5174. [PMID: 38870455 PMCID: PMC11235097 DOI: 10.1021/acs.jcim.4c00031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Revised: 05/08/2024] [Accepted: 05/13/2024] [Indexed: 06/15/2024]

Thomas M, Ahmad M, Tresadern G, de Fabritiis G. PromptSMILES: prompting for scaffold decoration and fragment linking in chemical language models. J Cheminform 2024;16:77. [PMID: 38965600 PMCID: PMC11225391 DOI: 10.1186/s13321-024-00866-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2024] [Accepted: 06/04/2024] [Indexed: 07/06/2024] [Open Access]

Nguyen ATN, Nguyen DTN, Koh HY, Toskov J, MacLean W, Xu A, Zhang D, Webb GI, May LT, Halls ML. The application of artificial intelligence to accelerate G protein-coupled receptor drug discovery. Br J Pharmacol 2024;181:2371-2384. [PMID: 37161878 DOI: 10.1111/bph.16140] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/26/2022] [Revised: 04/14/2023] [Accepted: 04/27/2023] [Indexed: 05/11/2023] [Open Access]

Guo J, Schwaller P. Augmented Memory: Sample-Efficient Generative Molecular Design with Reinforcement Learning. JACS Au 2024;4:2160-2172. [PMID: 38938817 PMCID: PMC11200228 DOI: 10.1021/jacsau.4c00066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 03/29/2024] [Accepted: 04/01/2024] [Indexed: 06/29/2024]

Dobberstein N, Maass A, Hamaekers J. Llamol: a dynamic multi-conditional generative transformer for de novo molecular design. J Cheminform 2024;16:73. [PMID: 38907298 PMCID: PMC11193239 DOI: 10.1186/s13321-024-00863-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 05/19/2024] [Indexed: 06/23/2024] [Open Access]

Wu JN, Wang T, Chen Y, Tang LJ, Wu HL, Yu RQ. t-SMILES: a fragment-based molecular representation framework for de novo ligand design. Nat Commun 2024;15:4993. [PMID: 38862578 PMCID: PMC11167009 DOI: 10.1038/s41467-024-49388-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Accepted: 06/04/2024] [Indexed: 06/13/2024] [Open Access]

Gangwal A, Lavecchia A. Unleashing the power of generative AI in drug discovery. Drug Discov Today 2024;29:103992. [PMID: 38663579 DOI: 10.1016/j.drudis.2024.103992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 03/22/2024] [Accepted: 04/18/2024] [Indexed: 05/04/2024]

Alberga D, Lamanna G, Graziano G, Delre P, Lomuscio MC, Corriero N, Ligresti A, Siliqi D, Saviano M, Contino M, Stefanachi A, Mangiatordi GF. DeLA-DrugSelf: Empowering multi-objective de novo design through SELFIES molecular representation. Comput Biol Med 2024;175:108486. [PMID: 38653065 DOI: 10.1016/j.compbiomed.2024.108486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Revised: 04/08/2024] [Accepted: 04/15/2024] [Indexed: 04/25/2024]

Thomas M, O'Boyle NM, Bender A, De Graaf C. MolScore: a scoring, evaluation and benchmarking framework for generative models in de novo drug design. J Cheminform 2024;16:64. [PMID: 38816825 PMCID: PMC11141043 DOI: 10.1186/s13321-024-00861-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 05/15/2024] [Indexed: 06/01/2024] [Open Access]

Lim H. Development of scoring-assisted generative exploration (SAGE) and its application to dual inhibitor design for acetylcholinesterase and monoamine oxidase B. J Cheminform 2024;16:59. [PMID: 38790018 PMCID: PMC11127438 DOI: 10.1186/s13321-024-00845-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 04/26/2024] [Indexed: 05/26/2024] [Open Access]

Abstract

De novo molecular design is the process of searching chemical space for drug-like molecules with desired properties, and deep learning has been recognized as a promising solution. In this study, I developed an effective computational method called Scoring-Assisted Generative Exploration (SAGE) to enhance chemical diversity and property optimization through virtual synthesis simulation, the generation of bridged bicyclic rings, and multiple scoring models for drug-likeness. In six protein targets, SAGE generated molecules with high scores within reasonable numbers of steps by optimizing target specificity without a constraint and even with multiple constraints such as synthetic accessibility, solubility, and metabolic stability. Furthermore, I suggested a top-ranked molecule with SAGE as dual inhibitors of acetylcholinesterase and monoamine oxidase B through multiple desired property optimization. Therefore, SAGE can generate molecules with desired properties by optimizing multiple properties simultaneously, indicating the importance of de novo design strategies in the future of drug discovery and development. SCIENTIFIC CONTRIBUTION: The scientific contribution of this study lies in the development of the Scoring-Assisted Generative Exploration (SAGE) method, a novel computational approach that significantly enhances de novo molecular design. SAGE uniquely integrates virtual synthesis simulation, the generation of complex bridged bicyclic rings, and multiple scoring models to optimize drug-like properties comprehensively. By efficiently generating molecules that meet a broad spectrum of pharmacological criteria (including target specificity, synthetic accessibility, solubility, and metabolic stability) within a reasonable number of steps, SAGE represents a substantial advancement over traditional methods. Additionally, the application of SAGE to discover dual inhibitors for acetylcholinesterase and monoamine oxidase B not only demonstrates its potential to streamline and enhance the drug development process but also highlights its capacity to create more effective and precisely targeted therapies. This study emphasizes the critical and evolving role of de novo design strategies in reshaping the future of drug discovery and development, providing promising avenues for innovative therapeutic discoveries.

Shen A, Yuan M, Ma Y, Du J, Wang M. Complementary multi-modality molecular self-supervised learning via non-overlapping masking for property prediction. Brief Bioinform 2024;25:bbae256. [PMID: 38801702 PMCID: PMC11129775 DOI: 10.1093/bib/bbae256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 04/25/2024] [Accepted: 05/15/2024] [Indexed: 05/29/2024] [Open Access]

Chandraghatgi R, Ji HF, Rosen GL, Sokhansanj BA. Streamlining Computational Fragment-Based Drug Discovery through Evolutionary Optimization Informed by Ligand-Based Virtual Prescreening. J Chem Inf Model 2024;64:3826-3840. [PMID: 38696451 PMCID: PMC11197033 DOI: 10.1021/acs.jcim.4c00234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 04/18/2024] [Accepted: 04/19/2024] [Indexed: 05/04/2024]

Abstract

Recent advances in computational methods provide the promise of dramatically accelerating drug discovery. While mathematical modeling and machine learning have become vital in predicting drug-target interactions and properties, there is untapped potential in computational drug discovery due to the vast and complex chemical space. This paper builds on our recently published computational fragment-based drug discovery (FBDD) method called fragment databases from screened ligand drug discovery (FDSL-DD). FDSL-DD uses in silico screening to identify ligands from a vast library, fragmenting them while attaching specific attributes based on predicted binding affinity and interaction with the target subdomain. In this paper, we further propose a two-stage optimization method that utilizes the information from prescreening to optimize computational ligand synthesis. We hypothesize that using prescreening information for optimization shrinks the search space and focuses on promising regions, thereby improving the optimization for candidate ligands. The first optimization stage assembles these fragments into larger compounds using genetic algorithms, followed by a second stage of iterative refinement to produce compounds with enhanced bioactivity. To demonstrate broad applicability, the methodology is demonstrated on three diverse protein targets found in human solid cancers, bacterial antimicrobial resistance, and the SARS-CoV-2 virus. Combined, the proposed FDSL-DD and a two-stage optimization approach yield high-affinity ligand candidates more efficiently than other state-of-the-art computational FBDD methods. We further show that a multiobjective optimization method accounting for drug-likeness can still produce potential candidate ligands with a high binding affinity. Overall, the results demonstrate that integrating detailed chemical information with a constrained search framework can markedly optimize the initial drug discovery process, offering a more precise and efficient route to developing new therapeutics.

Munson BP, Chen M, Bogosian A, Kreisberg JF, Licon K, Abagyan R, Kuenzi BM, Ideker T. De novo generation of multi-target compounds using deep generative chemistry. Nat Commun 2024;15:3636. [PMID: 38710699 DOI: 10.1038/s41467-024-47120-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 03/18/2024] [Indexed: 05/08/2024] [Open Access]

Mauri A, Bertola M. AlvaBuilder: A Software for De Novo Molecular Design. J Chem Inf Model 2024;64:2136-2142. [PMID: 37399048 PMCID: PMC11005826 DOI: 10.1021/acs.jcim.3c00610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Indexed: 07/04/2023]

Pang C, Qiao J, Zeng X, Zou Q, Wei L. Deep Generative Models in De Novo Drug Molecule Generation. J Chem Inf Model 2024;64:2174-2194. [PMID: 37934070 DOI: 10.1021/acs.jcim.3c01496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2023]

Kneiding H, Nova A, Balcells D. Directional multiobjective optimization of metal complexes at the billion-system scale. Nat Comput Sci 2024;4:263-273. [PMID: 38553635 DOI: 10.1038/s43588-024-00616-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Accepted: 02/29/2024] [Indexed: 04/14/2024]

Vogt M. Chemoinformatic approaches for navigating large chemical spaces. Expert Opin Drug Discov 2024;19:403-414. [PMID: 38300511 DOI: 10.1080/17460441.2024.2313475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 01/30/2024] [Indexed: 02/02/2024]

Wang C, Ong HH, Chiba S, Rajapakse JC. GLDM: hit molecule generation with constrained graph latent diffusion model. Brief Bioinform 2024;25:bbae142. [PMID: 38581415 PMCID: PMC10998532 DOI: 10.1093/bib/bbae142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Revised: 03/08/2024] [Accepted: 03/03/2024] [Indexed: 04/08/2024] [Open Access]

Jones J, Clark RD, Lawless MS, Miller DW, Waldman M. The AI-driven Drug Design (AIDD) platform: an interactive multi-parameter optimization system integrating molecular evolution with physiologically based pharmacokinetic simulations. J Comput Aided Mol Des 2024;38:14. [PMID: 38499823 DOI: 10.1007/s10822-024-00552-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 02/13/2024] [Indexed: 03/20/2024]

Moon SW, Min SK. Gaussian Process Regression-Based Near-Infrared d-Luciferin Analogue Design Using Mutation-Controlled Graph-Based Genetic Algorithm. J Chem Inf Model 2024;64:1522-1532. [PMID: 38365605 DOI: 10.1021/acs.jcim.3c00870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2024]

Buttenschoen M, Morris GM, Deane CM. PoseBusters: AI-based docking methods fail to generate physically valid poses or generalise to novel sequences. Chem Sci 2024;15:3130-3139. [PMID: 38425520 PMCID: PMC10901501 DOI: 10.1039/d3sc04185a] [Citation(s) in RCA: 40] [Impact Index Per Article: 40.0] [Received: 08/10/2023] [Accepted: 11/17/2023] [Indexed: 03/02/2024] [Open Access]

Abstract

The last few years have seen the development of numerous deep learning-based protein-ligand docking methods. They offer huge promise in terms of speed and accuracy. However, despite claims of state-of-the-art performance in terms of crystallographic root-mean-square deviation (RMSD), upon closer inspection, it has become apparent that they often produce physically implausible molecular structures. It is therefore not sufficient to evaluate these methods solely by RMSD to a native binding mode. It is vital, particularly for deep learning-based methods, that they are also evaluated on steric and energetic criteria. We present PoseBusters, a Python package that performs a series of standard quality checks using the well-established cheminformatics toolkit RDKit. The PoseBusters test suite validates chemical and geometric consistency of a ligand including its stereochemistry, and the physical plausibility of intra- and intermolecular measurements such as the planarity of aromatic rings, standard bond lengths, and protein-ligand clashes. Only methods that both pass these checks and predict native-like binding modes should be classed as having "state-of-the-art" performance. We use PoseBusters to compare five deep learning-based docking methods (DeepDock, DiffDock, EquiBind, TankBind, and Uni-Mol) and two well-established standard docking methods (AutoDock Vina and CCDC Gold) with and without an additional post-prediction energy minimisation step using a molecular mechanics force field. We show that both in terms of physical plausibility and the ability to generalise to examples that are distinct from the training data, no deep learning-based method yet outperforms classical docking tools. In addition, we find that molecular mechanics force fields contain docking-relevant physics missing from deep-learning methods. PoseBusters allows practitioners to assess docking and molecular generation methods and may inspire new inductive biases still required to improve deep learning-based methods, which will help drive the development of more accurate and more realistic predictions.

Wang M, Wu Z, Wang J, Weng G, Kang Y, Pan P, Li D, Deng Y, Yao X, Bing Z, Hsieh CY, Hou T. Genetic Algorithm-Based Receptor Ligand: A Genetic Algorithm-Guided Generative Model to Boost the Novelty and Drug-Likeness of Molecules in a Sampling Chemical Space. J Chem Inf Model 2024;64:1213-1228. [PMID: 38302422 DOI: 10.1021/acs.jcim.3c01964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]

Loeffler HH, He J, Tibo A, Janet JP, Voronov A, Mervin LH, Engkvist O. Reinvent 4: Modern AI-driven generative molecule design. J Cheminform 2024;16:20. [PMID: 38383444 PMCID: PMC10882833 DOI: 10.1186/s13321-024-00812-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 11/23/2023] [Accepted: 02/09/2024] [Indexed: 02/23/2024] [Open Access]