51
|
Du BX, Qin Y, Jiang YF, Xu Y, Yiu SM, Yu H, Shi JY. Compound–protein interaction prediction by deep learning: Databases, descriptors and models. Drug Discov Today 2022; 27:1350-1366. [DOI: 10.1016/j.drudis.2022.02.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2021] [Revised: 11/19/2021] [Accepted: 02/28/2022] [Indexed: 11/24/2022]
|
52
|
Beck H, Härter M, Haß B, Schmeck C, Baerfacker L. Small molecules and their impact in drug discovery: A perspective on the occasion of the 125th anniversary of the Bayer Chemical Research Laboratory. Drug Discov Today 2022; 27:1560-1574. [PMID: 35202802 DOI: 10.1016/j.drudis.2022.02.015] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2021] [Revised: 01/13/2022] [Accepted: 02/17/2022] [Indexed: 02/07/2023]
Abstract
The year 2021 marks the 125th anniversary of the Bayer Chemical Research Laboratory in Wuppertal, Germany. A significant number of prominent small-molecule drugs, from aspirin to Xarelto, have emerged from this research site. In this review, we shed light on historic cornerstones of small-molecule drug research, discussing current and future trends in drug discovery as well as providing a personal outlook on the future of drug research with a focus on small molecules.
Collapse
Affiliation(s)
- Hartmut Beck
- Research & Development, Pharmaceuticals, Bayer AG, Wuppertal, Germany.
| | - Michael Härter
- Research & Development, Pharmaceuticals, Bayer AG, Wuppertal, Germany
| | - Bastian Haß
- Digital & Commercial Innovation, Pharmaceuticals, Bayer AG, Berlin, Germany
| | - Carsten Schmeck
- Research & Development, Pharmaceuticals, Bayer AG, Wuppertal, Germany
| | - Lars Baerfacker
- Research & Development, Pharmaceuticals, Bayer AG, Wuppertal, Germany
| |
Collapse
|
53
|
Pavan M, Bassani D, Bolcato G, Bissaro M, Sturles M, Moro S. Computational strategies to identify new drug candidates against neuroinflammation. Curr Med Chem 2022; 29:4756-4775. [PMID: 35135446 DOI: 10.2174/0929867329666220208095122] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2021] [Revised: 12/09/2021] [Accepted: 12/13/2021] [Indexed: 11/22/2022]
Abstract
The even more increasing application of computational approaches in these last decades has deeply modified the process of discovery and commercialization of new therapeutic entities. This is especially true in the field of neuroinflammation, in which both the peculiar anatomical localization and the presence of the blood-brain barrier makeit mandatory to finely tune the candidates' physicochemical properties from the early stages of the discovery pipeline. The aim of this review is therefore to provide a general overview to the readers about the topic of neuroinflammation, together with the most common computational strategies that can be exploited to discover and design small molecules controlling neuroinflammation, especially those based on the knowledge of the three-dimensional structure of the biological targets of therapeutic interest. The techniques used to describe the molecular recognition mechanisms, such as molecular docking and molecular dynamics, will therefore be eviscerated, highlighting their advantages and their limitations. Finally, we report several case studies in which computational methods have been applied in drug discovery on neuroinflammation, focusing on the last decade's research.
Collapse
Affiliation(s)
- Matteo Pavan
- Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences University of Padova, via Marzolo 5, 35131 Padova, Italy
| | - Davide Bassani
- Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences University of Padova, via Marzolo 5, 35131 Padova, Italy
- Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences University of Padova, via Marzolo 5, 35131 Padova, Italy
| | - Giovanni Bolcato
- Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences University of Padova, via Marzolo 5, 35131 Padova, Italy
| | - Maicol Bissaro
- Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences University of Padova, via Marzolo 5, 35131 Padova, Italy
| | - Mattia Sturles
- Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences University of Padova, via Marzolo 5, 35131 Padova, Italy
| | - Stefano Moro
- Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences University of Padova, via Marzolo 5, 35131 Padova, Italy
| |
Collapse
|
54
|
Discovery solubility measurement and assessment of small molecules with drug development in mind. Drug Discov Today 2022; 27:1315-1325. [PMID: 35114363 DOI: 10.1016/j.drudis.2022.01.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2021] [Revised: 01/04/2022] [Accepted: 01/27/2022] [Indexed: 12/24/2022]
Abstract
Solubility is a key physicochemical property for the success of any drug candidate. Although the methods used and their rationales for determining solubility are subject to project needs and stages along the drug discovery-drug development pipeline, an artificial boundary can exist at the discovery-development interface. This boundary results in less effective solubility knowledge sharing and data integration among scientists in both drug discovery and drug development. Herein, we present a refreshed perspective on solubility. Solubility experimentation is not a one-size-fits-all measurement; instead, we stress the importance of constructing a seamless solubility understanding of a molecule as it progresses from a new chemical entity into a drug product.
Collapse
|
55
|
Thomas M, Boardman A, Garcia-Ortegon M, Yang H, de Graaf C, Bender A. Applications of Artificial Intelligence in Drug Design: Opportunities and Challenges. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2021; 2390:1-59. [PMID: 34731463 DOI: 10.1007/978-1-0716-1787-8_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
Artificial intelligence (AI) has undergone rapid development in recent years and has been successfully applied to real-world problems such as drug design. In this chapter, we review recent applications of AI to problems in drug design including virtual screening, computer-aided synthesis planning, and de novo molecule generation, with a focus on the limitations of the application of AI therein and opportunities for improvement. Furthermore, we discuss the broader challenges imposed by AI in translating theoretical practice to real-world drug design; including quantifying prediction uncertainty and explaining model behavior.
Collapse
Affiliation(s)
- Morgan Thomas
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
| | - Andrew Boardman
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
| | - Miguel Garcia-Ortegon
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK.,Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, UK
| | - Hongbin Yang
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
| | | | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK.
| |
Collapse
|
56
|
Vijayan RSK, Kihlberg J, Cross JB, Poongavanam V. Enhancing preclinical drug discovery with artificial intelligence. Drug Discov Today 2021; 27:967-984. [PMID: 34838731 DOI: 10.1016/j.drudis.2021.11.023] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2021] [Revised: 10/15/2021] [Accepted: 11/19/2021] [Indexed: 12/14/2022]
Abstract
Artificial intelligence (AI) is becoming an integral part of drug discovery. It has the potential to deliver across the drug discovery and development value chain, starting from target identification and reaching through clinical development. In this review, we provide an overview of current AI technologies and a glimpse of how AI is reimagining preclinical drug discovery by highlighting examples where AI has made a real impact. Considering the excitement and hyperbole surrounding AI in drug discovery, we aim to present a realistic view by discussing both opportunities and challenges in adopting AI in drug discovery.
Collapse
Affiliation(s)
- R S K Vijayan
- Institute for Applied Cancer Science, MD Anderson Cancer Center, Houston, TX, USA
| | - Jan Kihlberg
- Department of Chemistry-BMC, Uppsala University, Uppsala, Sweden
| | - Jason B Cross
- Institute for Applied Cancer Science, MD Anderson Cancer Center, Houston, TX, USA.
| | | |
Collapse
|
57
|
Joshi RP, Kumar N. Artificial Intelligence for Autonomous Molecular Design: A Perspective. Molecules 2021; 26:6761. [PMID: 34833853 PMCID: PMC8619999 DOI: 10.3390/molecules26226761] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2021] [Revised: 10/23/2021] [Accepted: 10/29/2021] [Indexed: 11/23/2022] Open
Abstract
Domain-aware artificial intelligence has been increasingly adopted in recent years to expedite molecular design in various applications, including drug design and discovery. Recent advances in areas such as physics-informed machine learning and reasoning, software engineering, high-end hardware development, and computing infrastructures are providing opportunities to build scalable and explainable AI molecular discovery systems. This could improve a design hypothesis through feedback analysis, data integration that can provide a basis for the introduction of end-to-end automation for compound discovery and optimization, and enable more intelligent searches of chemical space. Several state-of-the-art ML architectures are predominantly and independently used for predicting the properties of small molecules, their high throughput synthesis, and screening, iteratively identifying and optimizing lead therapeutic candidates. However, such deep learning and ML approaches also raise considerable conceptual, technical, scalability, and end-to-end error quantification challenges, as well as skepticism about the current AI hype to build automated tools. To this end, synergistically and intelligently using these individual components along with robust quantum physics-based molecular representation and data generation tools in a closed-loop holds enormous promise for accelerated therapeutic design to critically analyze the opportunities and challenges for their more widespread application. This article aims to identify the most recent technology and breakthrough achieved by each of the components and discusses how such autonomous AI and ML workflows can be integrated to radically accelerate the protein target or disease model-based probe design that can be iteratively validated experimentally. Taken together, this could significantly reduce the timeline for end-to-end therapeutic discovery and optimization upon the arrival of any novel zoonotic transmission event. Our article serves as a guide for medicinal, computational chemistry and biology, analytical chemistry, and the ML community to practice autonomous molecular design in precision medicine and drug discovery.
Collapse
Affiliation(s)
| | - Neeraj Kumar
- Computational Biology Group, Biological Science Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA;
| |
Collapse
|
58
|
Machine Learning Applied to the Modeling of Pharmacological and ADMET Endpoints. Methods Mol Biol 2021. [PMID: 34731464 DOI: 10.1007/978-1-0716-1787-8_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2023]
Abstract
The well-known concept of quantitative structure-activity relationships (QSAR) has been gaining significant interest in the recent years. Data, descriptors, and algorithms are the main pillars to build useful models that support more efficient drug discovery processes with in silico methods. Significant advances in all three areas are the reason for the regained interest in these models. In this book chapter we review various machine learning (ML) approaches that make use of measured in vitro/in vivo data of many compounds. We put these in context with other digital drug discovery methods and present some application examples.
Collapse
|
59
|
MetaClass, a Comprehensive Classification System for Predicting the Occurrence of Metabolic Reactions Based on the MetaQSAR Database. Molecules 2021; 26:molecules26195857. [PMID: 34641400 PMCID: PMC8512547 DOI: 10.3390/molecules26195857] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2021] [Revised: 09/20/2021] [Accepted: 09/21/2021] [Indexed: 11/25/2022] Open
Abstract
(1) Background: Machine learning algorithms are finding fruitful applications in predicting the ADME profile of new molecules, with a particular focus on metabolism predictions. However, the development of comprehensive metabolism predictors is hampered by the lack of highly accurate metabolic resources. Hence, we recently proposed a manually curated metabolic database (MetaQSAR), the level of accuracy of which is well suited to the development of predictive models. (2) Methods: MetaQSAR was used to extract datasets to predict the metabolic reactions subdivided into major classes, classes and subclasses. The collected datasets comprised a total of 3788 first-generation metabolic reactions. Predictive models were developed by using standard random forest algorithms and sets of physicochemical, stereo-electronic and constitutional descriptors. (3) Results: The developed models showed satisfactory performance, especially for hydrolyses and conjugations, while redox reactions were predicted with greater difficulty, which was reasonable as they depend on many complex features that are not properly encoded by the included descriptors. (4) Conclusions: The generated models allowed a precise comparison of the propensity of each metabolic reaction to be predicted and the factors affecting their predictability were discussed in detail. Overall, the study led to the development of a freely downloadable global predictor, MetaClass, which correctly predicts 80% of the reported reactions, as assessed by an explorative validation analysis on an external dataset, with an overall MCC = 0.44.
Collapse
|
60
|
Development and implementation of an enterprise-wide predictive model for early absorption, distribution, metabolism and excretion properties. Future Med Chem 2021; 13:1639-1654. [PMID: 34528444 DOI: 10.4155/fmc-2021-0138] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open
Abstract
Background: Accurate prediction of absorption, distribution, metabolism and excretion (ADME) properties can facilitate the identification of promising drug candidates. Methodology & Results: The authors present the Janssen generic Target Product Profile (gTPP) model, which predicts 18 early ADME properties, employs a graph convolutional neural network algorithm and was trained on between 1000-10,000 internal data points per predicted parameter. gTPP demonstrated stronger predictive power than pretrained commercial ADME models and automatic model builders. Through a novel logging method, the authors report gTPP usage for more than 200 Janssen drug discovery scientists. Conclusion: The investigators successfully enabled the rapid and systematic implementation of predictive ML tools across a drug discovery pipeline in all therapeutic areas. This experience provides useful guidance for other large-scale AI/ML deployment efforts.
Collapse
|
61
|
Aleksić S, Seeliger D, Brown JB. ADMET Predictability at Boehringer Ingelheim: State-of-the-Art, and Do Bigger Datasets or Algorithms Make a Difference? Mol Inform 2021; 41:e2100113. [PMID: 34473408 DOI: 10.1002/minf.202100113] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2021] [Accepted: 08/21/2021] [Indexed: 11/08/2022]
Abstract
Computational methods assisting drug discovery and development are routine in the pharmaceutical industry. Digital recording of ADMET assays has provided a rich source of data for development of predictive models. Despite the accumulation of data and the public availability of advanced modeling algorithms, the utility of prediction in ADMET research is not clear. Here, we present a critical evaluation of the relationships between data volume, modeling algorithm, chemical representation and grouping, and temporal aspect (time sequence of assays) using an in-house ADMET database. We find no large difference in prediction algorithms nor any systemic and substantial gain from increasingly large datasets. Temporal-based data enlargement led to performance improvement in only in a limited number of assays, and with fractional improvement at best. Assays that are well-, intermediately-, or poorly-suited for ADMET predictions and reasons for such behavior are systematically identified, generating realistic expectations for areas in which computational models can be used to guide decision making in molecular design and development.
Collapse
Affiliation(s)
- Stevan Aleksić
- Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, 88397, Biberach, Germany
| | - Daniel Seeliger
- Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, 88397, Biberach, Germany
| | - J B Brown
- Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, 88397, Biberach, Germany
| |
Collapse
|
62
|
Fialková V, Zhao J, Papadopoulos K, Engkvist O, Bjerrum EJ, Kogej T, Patronov A. LibINVENT: Reaction-based Generative Scaffold Decoration for in Silico Library Design. J Chem Inf Model 2021; 62:2046-2063. [PMID: 34460269 DOI: 10.1021/acs.jcim.1c00469] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Because of the strong relationship between the desired molecular activity and its structural core, the screening of focused, core-sharing chemical libraries is a key step in lead optimization. Despite the plethora of current research focused on in silico methods for molecule generation, to our knowledge, no tool capable of designing such libraries has been proposed. In this work, we present a novel tool for de novo drug design called LibINVENT. It is capable of rapidly proposing chemical libraries of compounds sharing the same core while maximizing a range of desirable properties. To further help the process of designing focused libraries, the user can list specific chemical reactions that can be used for the library creation. LibINVENT is therefore a flexible tool for generating virtual chemical libraries for lead optimization in a broad range of scenarios. Additionally, the shared core ensures that the compounds in the library are similar, possess desirable properties, and can also be synthesized under the same or similar conditions. The LibINVENT code is freely available in our public repository at https://github.com/MolecularAI/Lib-INVENT. The code necessary for data preprocessing is further available at: https://github.com/MolecularAI/Lib-INVENT-dataset.
Collapse
Affiliation(s)
- Vendy Fialková
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden
| | - Jiaxi Zhao
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden.,Department of Pharmaceutical Biosciences, Uppsala University, Uppsala 75237, Sweden
| | - Kostas Papadopoulos
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden
| | - Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden.,Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg 41756, Sweden
| | | | - Thierry Kogej
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden
| | - Atanas Patronov
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden
| |
Collapse
|
63
|
Carracedo-Reboredo P, Liñares-Blanco J, Rodríguez-Fernández N, Cedrón F, Novoa FJ, Carballal A, Maojo V, Pazos A, Fernandez-Lozano C. A review on machine learning approaches and trends in drug discovery. Comput Struct Biotechnol J 2021; 19:4538-4558. [PMID: 34471498 PMCID: PMC8387781 DOI: 10.1016/j.csbj.2021.08.011] [Citation(s) in RCA: 103] [Impact Index Per Article: 34.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2021] [Revised: 08/06/2021] [Accepted: 08/06/2021] [Indexed: 12/30/2022] Open
Abstract
Drug discovery aims at finding new compounds with specific chemical properties for the treatment of diseases. In the last years, the approach used in this search presents an important component in computer science with the skyrocketing of machine learning techniques due to its democratization. With the objectives set by the Precision Medicine initiative and the new challenges generated, it is necessary to establish robust, standard and reproducible computational methodologies to achieve the objectives set. Currently, predictive models based on Machine Learning have gained great importance in the step prior to preclinical studies. This stage manages to drastically reduce costs and research times in the discovery of new drugs. This review article focuses on how these new methodologies are being used in recent years of research. Analyzing the state of the art in this field will give us an idea of where cheminformatics will be developed in the short term, the limitations it presents and the positive results it has achieved. This review will focus mainly on the methods used to model the molecular data, as well as the biological problems addressed and the Machine Learning algorithms used for drug discovery in recent years.
Collapse
Key Words
- ADMET, Absorption, distribution, metabolism, elimination and toxicity
- ADR, Adverse Drug Reaction
- AI, Artificial Intelligence
- ANN, Artificial Neural Networks
- APFP, Atom Pairs 2d FingerPrint
- AUC, Area under the Curve
- BBB, Blood–Brain barrier
- CDK, Chemical Development Kit
- CNN, Convolutional Neural Networks
- CNS, Central Nervous System
- CPI, Compound-protein interaction
- CV, Cross Validation
- Cheminformatics
- DL, Deep Learning
- DNA, Deoxyribonucleic acid
- Deep Learning
- Drug Discovery
- ECFP, Extended Connectivity Fingerprints
- FDA, Food and Drug Administration
- FNN, Fully Connected Neural Networks
- FP, Fringerprints
- FS, Feature Selection
- GCN, Graph Convolutional Networks
- GEO, Gene Expression Omnibus
- GNN, Graph Neural Networks
- GO, Gene Ontology
- KEGG, Kyoto Encyclopedia of Genes and Genomes
- MACCS, Molecular ACCess System
- MCC, Matthews correlation coefficient
- MD, Molecular Descriptors
- MKL, Multiple Kernel Learning
- ML, Machine Learning
- Machine Learning
- Molecular Descriptors
- NB, Naive Bayes
- OOB, Out of Bag
- PCA, Principal Component Analyisis
- QSAR
- QSAR, Quantitative structure–activity relationship
- RF, Random Forest
- RNA, Ribonucleic Acid
- SMILES, simplified molecular-input line-entry system
- SVM, Support Vector Machines
- TCGA, The Cancer Genome Atlas
- WHO, World Health Organization
- t-SNE, t-Distributed Stochastic Neighbor Embedding
Collapse
Affiliation(s)
- Paula Carracedo-Reboredo
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
| | - Jose Liñares-Blanco
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
| | - Nereida Rodríguez-Fernández
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
| | - Francisco Cedrón
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
| | - Francisco J. Novoa
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
| | - Adrian Carballal
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
| | - Victor Maojo
- Biomedical Informatics Group, Artificial Intelligence Department, Polytechnic University of Madrid, Calle de los Ciruelos, Boadilla del Monte, Madrid 28660, Spain
| | - Alejandro Pazos
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
| | - Carlos Fernandez-Lozano
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
| |
Collapse
|
64
|
Xiong G, Wu Z, Yi J, Fu L, Yang Z, Hsieh C, Yin M, Zeng X, Wu C, Lu A, Chen X, Hou T, Cao D. ADMETlab 2.0: an integrated online platform for accurate and comprehensive predictions of ADMET properties. Nucleic Acids Res 2021; 49:W5-W14. [PMID: 33893803 PMCID: PMC8262709 DOI: 10.1093/nar/gkab255] [Citation(s) in RCA: 749] [Impact Index Per Article: 249.7] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2021] [Revised: 03/20/2021] [Accepted: 03/30/2021] [Indexed: 02/06/2023] Open
Abstract
Because undesirable pharmacokinetics and toxicity of candidate compounds are the main reasons for the failure of drug development, it has been widely recognized that absorption, distribution, metabolism, excretion and toxicity (ADMET) should be evaluated as early as possible. In silico ADMET evaluation models have been developed as an additional tool to assist medicinal chemists in the design and optimization of leads. Here, we announced the release of ADMETlab 2.0, a completely redesigned version of the widely used AMDETlab web server for the predictions of pharmacokinetics and toxicity properties of chemicals, of which the supported ADMET-related endpoints are approximately twice the number of the endpoints in the previous version, including 17 physicochemical properties, 13 medicinal chemistry properties, 23 ADME properties, 27 toxicity endpoints and 8 toxicophore rules (751 substructures). A multi-task graph attention framework was employed to develop the robust and accurate models in ADMETlab 2.0. The batch computation module was provided in response to numerous requests from users, and the representation of the results was further optimized. The ADMETlab 2.0 server is freely available, without registration, at https://admetmesh.scbdd.com/.
Collapse
Affiliation(s)
- Guoli Xiong
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
| | - Zhenxing Wu
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Jiacai Yi
- College of Computer, National University of Defense Technology, Changsha 410073, Hunan, China
| | - Li Fu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
| | - Zhijiang Yang
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
| | - Changyu Hsieh
- Tencent Quantum Laboratory, Tencent, Shenzhen 518057, Guangdong, China
| | - Mingzhu Yin
- Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital, Central South University, Changsha 410008, Hunan, China
| | - Xiangxiang Zeng
- Deparment of Computer Science, Hunan University, Changsha 410082, Hunan, China
| | - Chengkun Wu
- College of Computer, National University of Defense Technology, Changsha 410073, Hunan, China
| | - Aiping Lu
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
| | - Xiang Chen
- Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital, Central South University, Changsha 410008, Hunan, China
| | - Tingjun Hou
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China.,Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
| |
Collapse
|
65
|
Multitask machine learning models for predicting lipophilicity (logP) in the SAMPL7 challenge. J Comput Aided Mol Des 2021; 35:901-909. [PMID: 34273053 PMCID: PMC8367913 DOI: 10.1007/s10822-021-00405-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2021] [Accepted: 06/22/2021] [Indexed: 12/22/2022]
Abstract
Accurate prediction of lipophilicity—logP—based on molecular structures is a well-established field. Predictions of logP are often used to drive forward drug discovery projects. Driven by the SAMPL7 challenge, in this manuscript we describe the steps that were taken to construct a novel machine learning model that can predict and generalize well. This model is based on the recently described Directed-Message Passing Neural Networks (D-MPNNs). Further enhancements included: both the inclusion of additional datasets from ChEMBL (RMSE improvement of 0.03), and the addition of helper tasks (RMSE improvement of 0.04). To the best of our knowledge, the concept of adding predictions from other models (Simulations Plus logP and logD@pH7.4, respectively) as helper tasks is novel and could be applied in a broader context. The final model that we constructed and used to participate in the challenge ranked 2/17 ranked submissions with an RMSE of 0.66, and an MAE of 0.48 (submission: Chemprop). On other datasets the model also works well, especially retrospectively applied to the SAMPL6 challenge where it would have ranked number one out of all submissions (RMSE of 0.35). Despite the fact that our model works well, we conclude with suggestions that are expected to improve the model even further.
Collapse
|
66
|
Siedhoff NE, Illig AM, Schwaneberg U, Davari MD. PyPEF-An Integrated Framework for Data-Driven Protein Engineering. J Chem Inf Model 2021; 61:3463-3476. [PMID: 34260225 DOI: 10.1021/acs.jcim.1c00099] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
Data-driven strategies are gaining increased attention in protein engineering due to recent advances in access to large experimental databanks of proteins, next-generation sequencing (NGS), high-throughput screening (HTS) methods, and the development of artificial intelligence algorithms. However, the reliable prediction of beneficial amino acid substitutions, their combination, and the effect on functional properties remain the most significant challenges in protein engineering, which is applied to develop proteins and enzymes for biocatalysis, biomedicine, and life sciences. Here, we present a general-purpose framework (PyPEF: pythonic protein engineering framework) for performing data-driven protein engineering using machine learning methods combined with techniques from signal processing and statistical physics. PyPEF guides the identification and selection of beneficial proteins of a defined sequence space by systematically or randomly exploring the fitness of variants and by sampling random evolution pathways. The performance of PyPEF was evaluated concerning its predictive accuracy and throughput on four public protein and enzyme data sets using common regression models. It was proved that the program could efficiently predict the fitness of protein sequences for different target properties (predictive models with coefficient of determination values ranging from 0.58 to 0.92). By combining machine learning and protein evolution, PyPEF enabled the screening of proteins with various functions, reaching a screening capacity of more than 500,000 protein sequence variants in the timeframe of only a few minutes on a personal computer. PyPEF displayed significant accuracies on four public data sets (different proteins and properties) and underlined the potential of integrating data-driven technologies for covering different philosophies by either predicting the fitness of the variants to the highest accuracy accounting for epistatic effects or capturing the general trend of introduced mutations on the fitness in directed protein evolution campaigns. In essence, PyPEF can provide a powerful solution to current sequence exploration and combinatorial problems faced in protein engineering through exhaustive in silico screening of the sequence space.
Collapse
Affiliation(s)
- Niklas E Siedhoff
- Institute of Biotechnology, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany
| | | | - Ulrich Schwaneberg
- Institute of Biotechnology, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany.,DWI-Leibniz Institute for Interactive Materials, Forckenbeckstraße 50, 52074 Aachen, Germany
| | - Mehdi D Davari
- Institute of Biotechnology, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany
| |
Collapse
|
67
|
Molecular insights on ABL kinase activation using tree-based machine learning models and molecular docking. Mol Divers 2021; 25:1301-1314. [PMID: 34191245 PMCID: PMC8241884 DOI: 10.1007/s11030-021-10261-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2021] [Accepted: 06/18/2021] [Indexed: 12/14/2022]
Abstract
Abelson kinase (c-Abl) is a non-receptor tyrosine kinase involved in several biological processes essential for cell differentiation, migration, proliferation, and survival. This enzyme's activation might be an alternative strategy for treating diseases such as neutropenia induced by chemotherapy, prostate, and breast cancer. Recently, a series of compounds that promote the activation of c-Abl has been identified, opening a promising ground for c-Abl drug development. Structure-based drug design (SBDD) and ligand-based drug design (LBDD) methodologies have significantly impacted recent drug development initiatives. Here, we combined SBDD and LBDD approaches to characterize critical chemical properties and interactions of identified c-Abl's activators. We used molecular docking simulations combined with tree-based machine learning models—decision tree, AdaBoost, and random forest to understand the c-Abl activators' structural features required for binding to myristoyl pocket, and consequently, to promote enzyme and cellular activation. We obtained predictive and robust models with Matthews correlation coefficient values higher than 0.4 for all endpoints and identified characteristics that led to constructing a structure–activity relationship model (SAR).
Collapse
|
68
|
Ghislat G, Rahman T, Ballester PJ. Recent progress on the prospective application of machine learning to structure-based virtual screening. Curr Opin Chem Biol 2021; 65:28-34. [PMID: 34052776 DOI: 10.1016/j.cbpa.2021.04.009] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2021] [Revised: 04/13/2021] [Accepted: 04/23/2021] [Indexed: 12/30/2022]
Abstract
As more bioactivity and protein structure data become available, scoring functions (SFs) using machine learning (ML) to leverage these data sets continue to gain further accuracy and broader applicability. Advances in our understanding of the optimal ways to train and evaluate these ML-based SFs have introduced further improvements. One of these advances is how to select the most suitable decoys (molecules assumed inactive) to train or test an ML-based SF on a given target. We also review the latest applications of ML-based SFs for prospective structure-based virtual screening (SBVS), with a focus on the observed improvement over those using classical SFs. Finally, we provide recommendations for future prospective SBVS studies based on the findings of recent methodological studies.
Collapse
Affiliation(s)
- Ghita Ghislat
- U1104, CNRS UMR7280, Centre D'Immunologie de Marseille-Luminy, Inserm, Marseille, France
| | - Taufiq Rahman
- Department of Pharmacology, University of Cambridge, Cambridge, CB2 1PD, UK
| | - Pedro J Ballester
- Centre de Recherche en Cancérologie de Marseille (CRCM), Inserm, U1068, Marseille, F-13009, France; CNRS, UMR7258, Marseille, F-13009, France; Institut Paoli-Calmettes, Marseille, F-13009, France; Aix-Marseille University, UM 105, F-13284, Marseille, France.
| |
Collapse
|
69
|
Rybenkov VV, Zgurskaya HI, Ganguly C, Leus IV, Zhang Z, Moniruzzaman M. The Whole Is Bigger than the Sum of Its Parts: Drug Transport in the Context of Two Membranes with Active Efflux. Chem Rev 2021; 121:5597-5631. [PMID: 33596653 DOI: 10.1021/acs.chemrev.0c01137] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
Cell envelope plays a dual role in the life of bacteria by simultaneously protecting it from a hostile environment and facilitating access to beneficial molecules. At the heart of this ability lie the restrictive properties of the cellular membrane augmented by efflux transporters, which preclude intracellular penetration of most molecules except with the help of specialized uptake mediators. Recently, kinetic properties of the cell envelope came into focus driven on one hand by the urgent need in new antibiotics and, on the other hand, by experimental and theoretical advances in studies of transmembrane transport. A notable result from these studies is the development of a kinetic formalism that integrates the Michaelis-Menten behavior of individual transporters with transmembrane diffusion and offers a quantitative basis for the analysis of intracellular penetration of bioactive compounds. This review surveys key experimental and computational approaches to the investigation of transport by individual translocators and in whole cells, summarizes key findings from these studies and outlines implications for antibiotic discovery. Special emphasis is placed on Gram-negative bacteria, whose envelope contains two separate membranes. This feature sets these organisms apart from Gram-positive bacteria and eukaryotic cells by providing them with full benefits of the synergy between slow transmembrane diffusion and active efflux.
Collapse
Affiliation(s)
- Valentin V Rybenkov
- Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
| | - Helen I Zgurskaya
- Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
| | - Chhandosee Ganguly
- Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
| | - Inga V Leus
- Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
| | - Zhen Zhang
- Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
| | - Mohammad Moniruzzaman
- Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
| |
Collapse
|
70
|
Serafim MSM, Dos Santos Júnior VS, Gertrudes JC, Maltarollo VG, Honorio KM. Machine learning techniques applied to the drug design and discovery of new antivirals: a brief look over the past decade. Expert Opin Drug Discov 2021; 16:961-975. [PMID: 33957833 DOI: 10.1080/17460441.2021.1918098] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
Introduction: Drug design and discovery of new antivirals will always be extremely important in medicinal chemistry, taking into account known and new viral diseases that are yet to come. Although machine learning (ML) have shown to improve predictions on the biological potential of chemicals and accelerate the discovery of drugs over the past decade, new methods and their combinations have improved their performance and established promising perspectives regarding ML in the search for new antivirals.Areas covered: The authors consider some interesting areas that deal with different ML techniques applied to antivirals. Recent innovative studies on ML and antivirals were selected and analyzed in detail. Also, the authors provide a brief look at the past to the present to detect advances and bottlenecks in the area.Expert opinion: From classical ML techniques, it was possible to boost the searches for antivirals. However, from the emergence of new algorithms and the improvement in old approaches, promising results will be achieved every day, as we have observed in the case of SARS-CoV-2. Recent experience has shown that it is possible to use ML to discover new antiviral candidates from virtual screening and drug repurposing.
Collapse
Affiliation(s)
- Mateus Sá Magalhães Serafim
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | | | - Jadson Castro Gertrudes
- Departamento de Computação, Instituto de Ciências Exatas e Biológicas, Universidade Federal de Ouro Preto (UFOP), Ouro Preto, Brazil
| | - Vinícius Gonçalves Maltarollo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | - Kathia Maria Honorio
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), São Paulo, Brazil.,Centro de Ciências Naturais e Humanas, Universidade Federal do ABC (UFABC), Santo André, Brazil
| |
Collapse
|
71
|
Vatansever S, Schlessinger A, Wacker D, Kaniskan HÜ, Jin J, Zhou M, Zhang B. Artificial intelligence and machine learning-aided drug discovery in central nervous system diseases: State-of-the-arts and future directions. Med Res Rev 2021; 41:1427-1473. [PMID: 33295676 PMCID: PMC8043990 DOI: 10.1002/med.21764] [Citation(s) in RCA: 95] [Impact Index Per Article: 31.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2020] [Revised: 10/30/2020] [Accepted: 11/20/2020] [Indexed: 01/11/2023]
Abstract
Neurological disorders significantly outnumber diseases in other therapeutic areas. However, developing drugs for central nervous system (CNS) disorders remains the most challenging area in drug discovery, accompanied with the long timelines and high attrition rates. With the rapid growth of biomedical data enabled by advanced experimental technologies, artificial intelligence (AI) and machine learning (ML) have emerged as an indispensable tool to draw meaningful insights and improve decision making in drug discovery. Thanks to the advancements in AI and ML algorithms, now the AI/ML-driven solutions have an unprecedented potential to accelerate the process of CNS drug discovery with better success rate. In this review, we comprehensively summarize AI/ML-powered pharmaceutical discovery efforts and their implementations in the CNS area. After introducing the AI/ML models as well as the conceptualization and data preparation, we outline the applications of AI/ML technologies to several key procedures in drug discovery, including target identification, compound screening, hit/lead generation and optimization, drug response and synergy prediction, de novo drug design, and drug repurposing. We review the current state-of-the-art of AI/ML-guided CNS drug discovery, focusing on blood-brain barrier permeability prediction and implementation into therapeutic discovery for neurological diseases. Finally, we discuss the major challenges and limitations of current approaches and possible future directions that may provide resolutions to these difficulties.
Collapse
Affiliation(s)
- Sezen Vatansever
- Department of Genetics and Genomic SciencesIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Mount Sinai Center for Transformative Disease ModelingIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Icahn Institute for Data Science and Genomic TechnologyIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
| | - Avner Schlessinger
- Department of Pharmacological SciencesIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Mount Sinai Center for Therapeutics DiscoveryIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
| | - Daniel Wacker
- Department of Pharmacological SciencesIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Mount Sinai Center for Therapeutics DiscoveryIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Department of NeuroscienceIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
| | - H. Ümit Kaniskan
- Department of Pharmacological SciencesIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Mount Sinai Center for Therapeutics DiscoveryIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Department of Oncological Sciences, Tisch Cancer InstituteIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
| | - Jian Jin
- Department of Pharmacological SciencesIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Mount Sinai Center for Therapeutics DiscoveryIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Department of Oncological Sciences, Tisch Cancer InstituteIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
| | - Ming‐Ming Zhou
- Department of Pharmacological SciencesIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Department of Oncological Sciences, Tisch Cancer InstituteIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
| | - Bin Zhang
- Department of Genetics and Genomic SciencesIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Mount Sinai Center for Transformative Disease ModelingIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Icahn Institute for Data Science and Genomic TechnologyIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
- Department of Pharmacological SciencesIcahn School of Medicine at Mount SinaiNew YorkNew YorkUSA
| |
Collapse
|
72
|
Kimber TB, Chen Y, Volkamer A. Deep Learning in Virtual Screening: Recent Applications and Developments. Int J Mol Sci 2021; 22:4435. [PMID: 33922714 PMCID: PMC8123040 DOI: 10.3390/ijms22094435] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2021] [Revised: 04/13/2021] [Accepted: 04/14/2021] [Indexed: 01/03/2023] Open
Abstract
Drug discovery is a cost and time-intensive process that is often assisted by computational methods, such as virtual screening, to speed up and guide the design of new compounds. For many years, machine learning methods have been successfully applied in the context of computer-aided drug discovery. Recently, thanks to the rise of novel technologies as well as the increasing amount of available chemical and bioactivity data, deep learning has gained a tremendous impact in rational active compound discovery. Herein, recent applications and developments of machine learning, with a focus on deep learning, in virtual screening for active compound design are reviewed. This includes introducing different compound and protein encodings, deep learning techniques as well as frequently used bioactivity and benchmark data sets for model training and testing. Finally, the present state-of-the-art, including the current challenges and emerging problems, are examined and discussed.
Collapse
Affiliation(s)
| | | | - Andrea Volkamer
- In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité-Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany; (T.B.K.); (Y.C.)
| |
Collapse
|
73
|
Jiménez-Luna J, Grisoni F, Weskamp N, Schneider G. Artificial intelligence in drug discovery: recent advances and future perspectives. Expert Opin Drug Discov 2021; 16:949-959. [PMID: 33779453 DOI: 10.1080/17460441.2021.1909567] [Citation(s) in RCA: 83] [Impact Index Per Article: 27.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
Introduction: Artificial intelligence (AI) has inspired computer-aided drug discovery. The widespread adoption of machine learning, in particular deep learning, in multiple scientific disciplines, and the advances in computing hardware and software, among other factors, continue to fuel this development. Much of the initial skepticism regarding applications of AI in pharmaceutical discovery has started to vanish, consequently benefitting medicinal chemistry.Areas covered: The current status of AI in chemoinformatics is reviewed. The topics discussed herein include quantitative structure-activity/property relationship and structure-based modeling, de novo molecular design, and chemical synthesis prediction. Advantages and limitations of current deep learning applications are highlighted, together with a perspective on next-generation AI for drug discovery.Expert opinion: Deep learning-based approaches have only begun to address some fundamental problems in drug discovery. Certain methodological advances, such as message-passing models, spatial-symmetry-preserving networks, hybrid de novo design, and other innovative machine learning paradigms, will likely become commonplace and help address some of the most challenging questions. Open data sharing and model development will play a central role in the advancement of drug discovery with AI.
Collapse
Affiliation(s)
- José Jiménez-Luna
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
| | - Francesca Grisoni
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
| | - Nils Weskamp
- Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an Der Riss, Germany
| | - Gisbert Schneider
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
| |
Collapse
|
74
|
Chakravarty K, Antontsev V, Bundey Y, Varshney J. Driving success in personalized medicine through AI-enabled computational modeling. Drug Discov Today 2021; 26:1459-1465. [PMID: 33609781 DOI: 10.1016/j.drudis.2021.02.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Revised: 01/26/2021] [Accepted: 02/10/2021] [Indexed: 12/29/2022]
Abstract
The development of successful drugs is expensive and time-consuming because of high clinical attrition rates. This is caused partially by the rupture seen in the translatability of the drug from the bench to the clinic in the context of personalized medicine. Artificial intelligence (AI)-driven platforms integrated with mechanistic modeling have become instrumental in accelerating the drug development process by leveraging data ubiquitously across the various phases. AI can counter the deficiencies and ambiguities that arise during the classical drug development process while reducing human intervention and bridging the translational gap in discovering the connections between drugs and diseases.
Collapse
Affiliation(s)
| | - Victor Antontsev
- VeriSIM Life Inc., 1 Sansome St. Suite 3500, San Francisco, CA 94104, USA
| | - Yogesh Bundey
- VeriSIM Life Inc., 1 Sansome St. Suite 3500, San Francisco, CA 94104, USA
| | - Jyotika Varshney
- VeriSIM Life Inc., 1 Sansome St. Suite 3500, San Francisco, CA 94104, USA.
| |
Collapse
|
75
|
Eyke NS, Koscher BA, Jensen KF. Toward Machine Learning-Enhanced High-Throughput Experimentation. TRENDS IN CHEMISTRY 2021. [DOI: 10.1016/j.trechm.2020.12.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
|
76
|
Bender A, Cortes-Ciriano I. Artificial intelligence in drug discovery: what is realistic, what are illusions? Part 2: a discussion of chemical and biological data. Drug Discov Today 2021; 26:1040-1052. [PMID: 33508423 PMCID: PMC8132984 DOI: 10.1016/j.drudis.2020.11.037] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2020] [Revised: 11/07/2020] [Accepted: 11/30/2020] [Indexed: 12/11/2022]
Abstract
'Artificial Intelligence' (AI) has recently had a profound impact on areas such as image and speech recognition, and this progress has already translated into practical applications. However, in the drug discovery field, such advances remains scarce, and one of the reasons is intrinsic to the data used. In this review, we discuss aspects of, and differences in, data from different domains, namely the image, speech, chemical, and biological domains, the amounts of data available, and how relevant they are to drug discovery. Improvements in the future are needed with respect to our understanding of biological systems, and the subsequent generation of practically relevant data in sufficient quantities, to truly advance the field of AI in drug discovery, to enable the discovery of novel chemistry, with novel modes of action, which shows desirable efficacy and safety in the clinic.
Collapse
Affiliation(s)
- Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK; Imaging and Data Analytics, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, Cambridge, UK.
| | - Isidro Cortes-Ciriano
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK.
| |
Collapse
|
77
|
Modeling and Simulation of Process Technology for Nanoparticulate Drug Formulations-A Particle Technology Perspective. Pharmaceutics 2020; 13:pharmaceutics13010022. [PMID: 33374375 PMCID: PMC7823784 DOI: 10.3390/pharmaceutics13010022] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2020] [Revised: 12/09/2020] [Accepted: 12/14/2020] [Indexed: 11/17/2022] Open
Abstract
Crystalline organic nanoparticles and their amorphous equivalents (ONP) have the potential to become a next-generation formulation technology for dissolution-rate limited biopharmaceutical classification system (BCS) class IIa molecules if the following requisites are met: (i) a quantitative understanding of the bioavailability enhancement benefit versus established formulation technologies and a reliable track record of successful case studies are available; (ii) efficient experimentation workflows with a minimum amount of active ingredient and a high degree of digitalization via, e.g., automation and computer-based experimentation planning are implemented; (iii) the scalability of the nanoparticle-based oral delivery formulation technology from the lab to manufacturing is ensured. Modeling and simulation approaches informed by the pharmaceutical material science paradigm can help to meet these requisites, especially if the entire value chain from formulation to oral delivery is covered. Any comprehensive digitalization of drug formulation requires combining pharmaceutical materials science with the adequate formulation and process technologies on the one hand and quantitative pharmacokinetics and drug administration dynamics in the human body on the other hand. Models for the technical realization of the drug production and the distribution of the pharmaceutical compound in the human body are coupled via the central objective, namely bioavailability. The underlying challenges can only be addressed by hierarchical approaches for property and process design. The tools for multiscale modeling of the here-considered particle processes (e.g., by coupled computational fluid dynamics, population balance models, Noyes–Whitney dissolution kinetics) and physiologically based absorption modeling are available. Significant advances are being made in enhancing the bioavailability of hydrophobic compounds by applying innovative solutions. As examples, the predictive modeling of anti-solvent precipitation is presented, and options for the model development of comminution processes are discussed.
Collapse
|
78
|
Schuffenhauer A, Schneider N, Hintermann S, Auld D, Blank J, Cotesta S, Engeloch C, Fechner N, Gaul C, Giovannoni J, Jansen J, Joslin J, Krastel P, Lounkine E, Manchester J, Monovich LG, Pelliccioli AP, Schwarze M, Shultz MD, Stiefl N, Baeschlin DK. Evolution of Novartis' Small Molecule Screening Deck Design. J Med Chem 2020; 63:14425-14447. [PMID: 33140646 DOI: 10.1021/acs.jmedchem.0c01332] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
Abstract
This article summarizes the evolution of the screening deck at the Novartis Institutes for BioMedical Research (NIBR). Historically, the screening deck was an assembly of all available compounds. In 2015, we designed a first deck to facilitate access to diverse subsets with optimized properties. We allocated the compounds as plated subsets on a 2D grid with property based ranking in one dimension and increasing structural redundancy in the other. The learnings from the 2015 screening deck were applied to the design of a next generation in 2019. We found that using traditional leadlikeness criteria (mainly MW, clogP) reduces the hit rates of attractive chemical starting points in subset screening. Consequently, the 2019 deck relies on solubility and permeability to select preferred compounds. The 2019 design also uses NIBR's experimental assay data and inferred biological activity profiles in addition to structural diversity to define redundancy across the compound sets.
Collapse
Affiliation(s)
- Ansgar Schuffenhauer
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Nadine Schneider
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Samuel Hintermann
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Douglas Auld
- Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Jutta Blank
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Simona Cotesta
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Caroline Engeloch
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Nikolas Fechner
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Christoph Gaul
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Jerome Giovannoni
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Johanna Jansen
- Novartis Institutes for BioMedical Research-Emeryville, 5300 Chiron Way, Emeryville, California 94608-2916, United States
| | - John Joslin
- Genomics Institute of the Novartis Foundation, San Diego, California 92121, United States
| | - Philipp Krastel
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Eugen Lounkine
- Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - John Manchester
- Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Lauren G Monovich
- Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Anna Paola Pelliccioli
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Manuel Schwarze
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Michael D Shultz
- Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Nikolaus Stiefl
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| | - Daniel K Baeschlin
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002 Basel, Switzerland
| |
Collapse
|