Reference Citation Analysis

For: Mervin LH, Johansson S, Semenova E, Giblin KA, Engkvist O. Uncertainty quantification in drug design. Drug Discov Today 2020;26:474-489. [PMID: 33253918 DOI: 10.1016/j.drudis.2020.11.027] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 05/21/2020] [Revised: 07/13/2020] [Accepted: 11/23/2020] [Indexed: 01/03/2023]

Cited by Other Article(s)

Xu Y, Liaw A, Sheridan RP, Svetnik V. Development and Evaluation of Conformal Prediction Methods for Quantitative Structure-Activity Relationship. ACS Omega 2024;9:29478-29490. [PMID: 39005801 PMCID: PMC11238240 DOI: 10.1021/acsomega.4c02017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 06/10/2024] [Accepted: 06/12/2024] [Indexed: 07/16/2024]

Mervin L, Voronov A, Kabeshov M, Engkvist O. QSARtuna: An Automated QSAR Modeling Platform for Molecular Property Prediction in Drug Design. J Chem Inf Model 2024. [PMID: 38950185 DOI: 10.1021/acs.jcim.4c00457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/03/2024]

Dutschmann TM, Schlenker V, Baumann K. Chemoinformatic regression methods and their applicability domain. Mol Inform 2024;43:e202400018. [PMID: 38803302 DOI: 10.1002/minf.202400018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 03/24/2024] [Accepted: 03/25/2024] [Indexed: 05/29/2024]

Fan Z, Yu J, Zhang X, Chen Y, Sun S, Zhang Y, Chen M, Xiao F, Wu W, Li X, Zheng M, Luo X, Wang D. Reducing overconfident errors in molecular property classification using Posterior Network. Patterns (New York, N.Y.) 2024;5:100991. [PMID: 39005492 PMCID: PMC11240180 DOI: 10.1016/j.patter.2024.100991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 12/20/2023] [Accepted: 04/15/2024] [Indexed: 07/16/2024]

Affiliation(s)

Zhehuan Fan Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China; University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
Jie Yu Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China; University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
Xiang Zhang School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
Yijie Chen School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
Shihui Sun School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
Yuanyuan Zhang Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China; University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
Mingan Chen Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China; School of Physical Science and Technology, ShanghaiTech University, Shanghai 201210, China; Lingang Laboratory, Shanghai 200031, China
Fu Xiao School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
Wenyong Wu Lingang Laboratory, Shanghai 200031, China
Xutong Li Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China; University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
Mingyue Zheng Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China; University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China; School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
Xiaomin Luo Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China; University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China; School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
Dingyan Wang Lingang Laboratory, Shanghai 200031, China

Kudryavtseva V, Sukhorukov GB. Features of Anisotropic Drug Delivery Systems. Advanced Materials (Deerfield Beach, Fla.) 2024;36:e2307675. [PMID: 38158786 DOI: 10.1002/adma.202307675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 12/17/2023] [Indexed: 01/03/2024]

Rayka M, Mirzaei M, Mohammad Latifi A. An ensemble-based approach to estimate confidence of predicted protein-ligand binding affinity values. Mol Inform 2024;43:e202300292. [PMID: 38358080 DOI: 10.1002/minf.202300292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 01/22/2024] [Accepted: 02/02/2024] [Indexed: 02/16/2024]

Sandström H, Rissanen M, Rousu J, Rinke P. Data-Driven Compound Identification in Atmospheric Mass Spectrometry. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2024;11:e2306235. [PMID: 38095508 PMCID: PMC10885664 DOI: 10.1002/advs.202306235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/04/2023] [Indexed: 02/24/2024]

Wang R, Liu Z, Gong J, Zhou Q, Guan X, Ge G. An Uncertainty-Guided Deep Learning Method Facilitates Rapid Screening of CYP3A4 Inhibitors. J Chem Inf Model 2023;63:7699-7710. [PMID: 38055780 DOI: 10.1021/acs.jcim.3c01241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]

Abstract

Cytochrome P450 3A4 (CYP3A4), a prominent member of the P450 enzyme superfamily, plays a crucial role in metabolizing various xenobiotics, including over 50% of clinically significant drugs. Evaluating CYP3A4 inhibition before drug approval is essential to avoiding potentially harmful pharmacokinetic drug-drug interactions (DDIs) and adverse drug reactions (ADRs). Despite the development of several CYP inhibitor prediction models, the primary approach for screening CYP inhibitors still relies on experimental methods. This might stem from the limitations of existing models, which only provide deterministic classification outcomes instead of precise inhibition intensity (e.g., IC50) and often suffer from inadequate prediction reliability. To address this challenge, we propose an uncertainty-guided regression model to accurately predict the IC50 values of anti-CYP3A4 activities. First, a comprehensive data set of CYP3A4 inhibitors was compiled, consisting of 27,045 compounds with classification labels, including 4395 compounds with explicit IC50 values. Second, by integrating the predictions of the classification model trained on a larger data set and introducing an evidential uncertainty method to rank prediction confidence, we obtained a high-precision and reliable regression model. Finally, we use the evidential uncertainty values as a trustworthy indicator to perform a virtual screening of an in-house compound set. The in vitro experiment results revealed that this new indicator significantly improved the hit ratio and reduced false positives among the top-ranked compounds. Specifically, among the top 20 compounds ranked with uncertainty, 15 compounds were identified as novel CYP3A4 inhibitors, and three of them exhibited activities less than 1 μM. In summary, our findings highlight the effectiveness of incorporating uncertainty in compound screening, providing a promising strategy for drug discovery and development.

Ma C, Wolfinger R. A prediction model for blood-brain barrier penetrating peptides based on masked peptide transformers with dynamic routing. Brief Bioinform 2023;24:bbad399. [PMID: 37985456 DOI: 10.1093/bib/bbad399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 09/26/2023] [Accepted: 10/17/2023] [Indexed: 11/22/2023] Open

Fan YJ, Allen JE, McLoughlin KS, Shi D, Bennion BJ, Zhang X, Lightstone FC. Evaluating point-prediction uncertainties in neural networks for protein-ligand binding prediction. Artificial Intelligence Chemistry 2023;1:100004. [PMID: 37583465 PMCID: PMC10426331 DOI: 10.1016/j.aichem.2023.100004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2023]

Janet JP, Mervin L, Engkvist O. Artificial intelligence in molecular de novo design: Integration with experiment. Curr Opin Struct Biol 2023;80:102575. [PMID: 36966692 DOI: 10.1016/j.sbi.2023.102575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 02/09/2023] [Accepted: 02/18/2023] [Indexed: 06/04/2023]

Dutschmann TM, Kinzel L, Ter Laak A, Baumann K. Large-scale evaluation of k-fold cross-validation ensembles for uncertainty estimation. J Cheminform 2023;15:49. [PMID: 37118768 PMCID: PMC10142532 DOI: 10.1186/s13321-023-00709-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/03/2022] [Accepted: 03/10/2023] [Indexed: 04/30/2023] Open

Luukkonen S, van den Maagdenberg HW, Emmerich MTM, van Westen GJP. Artificial intelligence in multi-objective drug design. Curr Opin Struct Biol 2023;79:102537. [PMID: 36774727 DOI: 10.1016/j.sbi.2023.102537] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 10/19/2022] [Revised: 12/21/2022] [Accepted: 01/03/2023] [Indexed: 02/12/2023]

Obrezanova O. Artificial intelligence for compound pharmacokinetics prediction. Curr Opin Struct Biol 2023;79:102546. [PMID: 36804676 DOI: 10.1016/j.sbi.2023.102546] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/15/2022] [Revised: 01/04/2023] [Accepted: 01/13/2023] [Indexed: 02/17/2023]

New avenues in artificial-intelligence-assisted drug discovery. Drug Discov Today 2023;28:103516. [PMID: 36736583 DOI: 10.1016/j.drudis.2023.103516] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 08/07/2022] [Revised: 12/08/2022] [Accepted: 01/26/2023] [Indexed: 02/05/2023]

Stoyanova R, Katzberger PM, Komissarov L, Khadhraoui A, Sach-Peltason L, Groebke Zbinden K, Schindler T, Manevski N. Computational Predictions of Nonclinical Pharmacokinetics at the Drug Design Stage. J Chem Inf Model 2023;63:442-458. [PMID: 36595708 DOI: 10.1021/acs.jcim.2c01134] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 01/05/2023]

Abstract

Although computational predictions of pharmacokinetics (PK) are desirable at the drug design stage, existing approaches are often limited by prediction accuracy and human interpretability. Using a discovery data set of mouse and rat PK studies at Roche (9,685 unique compounds), we performed a proof-of-concept study to predict key PK properties from chemical structure alone, including plasma clearance (CLp), volume of distribution at steady-state (Vss), and oral bioavailability (F). Ten machine learning (ML) models were evaluated, including Single-Task, Multitask, and transfer learning approaches (i.e., pretraining with in vitro data). In addition to prediction accuracy, we emphasized human interpretability of outcomes, especially the quantification of uncertainty, applicability domains, and explanations of predictions in terms of molecular features. Results show that intravenous (IV) PK properties (CLp and Vss) can be predicted with good precision (average absolute fold error, AAFE of 1.96-2.84 depending on data split) and low bias (average fold error, AFE of 0.98-1.36), with AutoGluon, Gaussian Process Regressor (GP), and ChemProp displaying the best performance. Driven by higher complexity of oral PK studies, predictions of F were more challenging, with the best AAFE values of 2.35-2.60 and higher overprediction bias (AFE of 1.45-1.62). Multi-Task approaches and pretraining of ChemProp neural networks with in vitro data showed similar precision to Single-Task models but helped reduce the bias and increase correlations between observations and predictions. A combination of GP-computed prediction variance, molecular clustering, and dimensionality-reduction provided valuable quantitative insights into prediction uncertainty and applicability domains. SHAPley Additive exPlanations (SHAPs) highlighted molecular features contributing to prediction outcomes of Vss, providing explanations that could aid drug design. 
Combined results show that computational predictions of PK are feasible at the drug design stage, with several ML technologies converging to successfully leverage historical PK data sets. Further studies are needed to unlock the full potential of this approach, especially with respect to data set sizes and quality, transfer learning between in vitro and in vivo data sets, model-independent quantification of uncertainty, and explainability of predictions.
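The two fold-error metrics quoted in this abstract can be illustrated with a minimal sketch. It assumes the definitions commonly used in pharmacokinetics, AFE = 10^mean(log10(predicted/observed)) and AAFE = 10^mean(|log10(predicted/observed)|); the abstract itself does not spell out the formulas, and the function name `fold_error_metrics` is hypothetical.

```python
import math


def fold_error_metrics(observed, predicted):
    """Return (AFE, AAFE) for paired, strictly positive PK values.

    Assumed definitions (not stated in the abstract):
      AFE  = 10 ** mean(log10(pred / obs))    -> bias; > 1 means overprediction
      AAFE = 10 ** mean(|log10(pred / obs)|)  -> precision; 1 is a perfect fit
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal length")
    if any(v <= 0 for v in list(observed) + list(predicted)):
        raise ValueError("fold errors require strictly positive values")

    # Log10 fold error of each prediction relative to its observation.
    log_fold = [math.log10(p / o) for o, p in zip(observed, predicted)]
    n = len(log_fold)

    afe = 10 ** (sum(log_fold) / n)
    aafe = 10 ** (sum(abs(x) for x in log_fold) / n)
    return afe, aafe
```

Under these definitions, one twofold overprediction and one twofold underprediction cancel in the AFE (no net bias) but not in the AAFE, which is why the abstract reports both.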

Wang D, Wu Z, Shen C, Bao L, Luo H, Wang Z, Yao H, Kong DX, Luo C, Hou T. Learning with uncertainty to accelerate the discovery of histone lysine-specific demethylase 1A (KDM1A/LSD1) inhibitors. Brief Bioinform 2023;24:6961473. [PMID: 36573494 DOI: 10.1093/bib/bbac592] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/05/2022] [Revised: 12/01/2022] [Accepted: 12/02/2022] [Indexed: 12/28/2022] Open

Affiliation(s)

Dong Wang Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058 Zhejiang, China
Zhenxing Wu Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058 Zhejiang, China
Chao Shen Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058 Zhejiang, China; CarbonSilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, China
Lingjie Bao Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058 Zhejiang, China
Hao Luo Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058 Zhejiang, China
Zhe Wang Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058 Zhejiang, China
Hucheng Yao State Key Laboratory of Agricultural Microbiology, Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
De-Xin Kong State Key Laboratory of Agricultural Microbiology, Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
Cheng Luo The Center for Chemical Biology, Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203 China
Tingjun Hou Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058 Zhejiang, China

Rodríguez-Pérez R, Trunzer M, Schneider N, Faller B, Gerebtzoff G. Multispecies Machine Learning Predictions of In Vitro Intrinsic Clearance with Uncertainty Quantification Analyses. Mol Pharm 2023;20:383-394. [PMID: 36437712 DOI: 10.1021/acs.molpharmaceut.2c00680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

Korolev V, Nevolin I, Protsenko P. A universal similarity based approach for predictive uncertainty quantification in materials science. Sci Rep 2022;12:14931. [PMID: 36056050 PMCID: PMC9440040 DOI: 10.1038/s41598-022-19205-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Accepted: 08/25/2022] [Indexed: 11/08/2022] Open

Chuah J, Kruger U, Wang G, Yan P, Hahn J. Framework for Testing Robustness of Machine Learning-Based Classifiers. J Pers Med 2022;12:1314. [PMID: 36013263 PMCID: PMC9409965 DOI: 10.3390/jpm12081314] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/06/2022] [Revised: 08/05/2022] [Accepted: 08/12/2022] [Indexed: 02/07/2023] Open

Yu J, Wang D, Zheng M. Uncertainty quantification: Can we trust artificial intelligence in drug discovery? iScience 2022;25:104814. [PMID: 35996575 PMCID: PMC9391523 DOI: 10.1016/j.isci.2022.104814] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/25/2022] Open

Obrezanova O, Martinsson A, Whitehead T, Mahmoud S, Bender A, Miljković F, Grabowski P, Irwin B, Oprisiu I, Conduit G, Segall M, Smith GF, Williamson B, Winiwarter S, Greene N. Prediction of In Vivo Pharmacokinetic Parameters and Time-Exposure Curves in Rats Using Machine Learning from the Chemical Structure. Mol Pharm 2022;19:1488-1504. [PMID: 35412314 DOI: 10.1021/acs.molpharmaceut.2c00027] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 11/30/2022]

Affiliation(s)

Olga Obrezanova Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, U.K
Anton Martinsson Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg SE-43183, Sweden
Tom Whitehead Intellegens Ltd., Eagle Labs, Cambridge CB4 3AZ, U.K
Samar Mahmoud Optibrium Ltd., Cambridge Innovation Park, Cambridge CB25 9PB, U.K
Andreas Bender Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, U.K.; Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Cambridge CB2 1EW, U.K
Filip Miljković Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg SE-43183, Sweden
Piotr Grabowski Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, U.K
Ben Irwin Optibrium Ltd., Cambridge Innovation Park, Cambridge CB25 9PB, U.K
Ioana Oprisiu Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg SE-43183, Sweden
Gareth Conduit Intellegens Ltd., Eagle Labs, Cambridge CB4 3AZ, U.K
Matthew Segall Optibrium Ltd., Cambridge Innovation Park, Cambridge CB25 9PB, U.K
Graham F Smith Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, U.K
Beth Williamson Drug Metabolism and Pharmacokinetics, Research and Early Development, Oncology R&D, AstraZeneca, Cambridge CB10 1XL, U.K
Susanne Winiwarter Drug Metabolism and Pharmacokinetics, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), Biopharmaceutical R&D, AstraZeneca, Gothenburg SE-43183, Sweden
Nigel Greene Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Waltham, Massachusetts 02451, United States

Martinelli DD. Generative machine learning for de novo drug discovery: A systematic review. Comput Biol Med 2022;145:105403. [PMID: 35339849 DOI: 10.1016/j.compbiomed.2022.105403] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/14/2022] [Revised: 03/10/2022] [Accepted: 03/11/2022] [Indexed: 02/08/2023]

Abstract

Recent research on artificial intelligence indicates that machine learning algorithms can auto-generate novel drug-like molecules. Generative models have revolutionized de novo drug discovery, rendering the explorative process more efficient. Several model frameworks and input formats have been proposed to enhance the performance of intelligent algorithms in generative molecular design. In this systematic literature review of experimental articles and reviews over the last five years, machine learning models, challenges associated with computational molecule design along with proposed solutions, and molecular encoding methods are discussed. A query-based search of the PubMed, ScienceDirect, Springer, Wiley Online Library, arXiv, MDPI, bioRxiv, and IEEE Xplore databases yielded 87 studies. Twelve additional studies were identified via citation searching. Of the articles in which machine learning was implemented, six prominent algorithms were identified: long short-term memory recurrent neural networks (LSTM-RNNs), variational autoencoders (VAEs), generative adversarial networks (GANs), adversarial autoencoders (AAEs), evolutionary algorithms, and gated recurrent unit (GRU-RNNs). Furthermore, eight central challenges were designated: homogeneity of generated molecular libraries, deficient synthesizability, limited assay data, model interpretability, incapacity for multi-property optimization, incomparability, restricted molecule size, and uncertainty in model evaluation. Molecules were encoded either as strings, which were occasionally augmented using randomization, as 2D graphs, or as 3D graphs. Statistical analysis and visualization are performed to illustrate how approaches to machine learning in de novo drug design have evolved over the past five years. Finally, future opportunities and reservations are discussed.

Bilodeau C, Jin W, Jaakkola T, Barzilay R, Jensen KF. Generative models for molecular discovery: Recent advances and challenges. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1608] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/15/2022]

Miljković F, Rodríguez-Pérez R, Bajorath J. Impact of Artificial Intelligence on Compound Discovery, Design, and Synthesis. ACS Omega 2021;6:33293-33299. [PMID: 34926881 PMCID: PMC8674916 DOI: 10.1021/acsomega.1c05512] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/04/2021] [Accepted: 11/18/2021] [Indexed: 05/17/2023]

Thomas M, Boardman A, Garcia-Ortegon M, Yang H, de Graaf C, Bender A. Applications of Artificial Intelligence in Drug Design: Opportunities and Challenges. Methods in Molecular Biology (Clifton, N.J.) 2021;2390:1-59. [PMID: 34731463 DOI: 10.1007/978-1-0716-1787-8_1] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 10/19/2022]

Norinder U, Spjuth O, Svensson F. Synergy conformal prediction applied to large-scale bioactivity datasets and in federated learning. J Cheminform 2021;13:77. [PMID: 34600569 PMCID: PMC8487527 DOI: 10.1186/s13321-021-00555-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2021] [Accepted: 09/15/2021] [Indexed: 12/05/2022] Open

Wang D, Yu J, Chen L, Li X, Jiang H, Chen K, Zheng M, Luo X. A hybrid framework for improving uncertainty quantification in deep learning-based QSAR regression modeling. J Cheminform 2021;13:69. [PMID: 34544485 PMCID: PMC8454160 DOI: 10.1186/s13321-021-00551-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/16/2021] [Accepted: 09/05/2021] [Indexed: 11/24/2022] Open

Affiliation(s)

Dingyan Wang Shanghai Key Laboratory of Forensic Medicine, Academy of Forensic Science, Shanghai, 200063, China; University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China; Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
Jie Yu University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China; Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
Lifan Chen University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China; Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
Xutong Li University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China; Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
Hualiang Jiang University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China; Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
Kaixian Chen University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China; Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
Mingyue Zheng University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China; Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
Xiaomin Luo Shanghai Key Laboratory of Forensic Medicine, Academy of Forensic Science, Shanghai, 200063, China; University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China; Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China

Mervin LH, Trapotsi MA, Afzal AM, Barrett IP, Bender A, Engkvist O. Probabilistic Random Forest improves bioactivity predictions close to the classification threshold by taking into account experimental uncertainty. J Cheminform 2021;13:62. [PMID: 34412708 PMCID: PMC8375213 DOI: 10.1186/s13321-021-00539-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2021] [Accepted: 07/30/2021] [Indexed: 11/24/2022] Open

Abstract

Measurements of protein–ligand interactions have reproducibility limits due to experimental errors. Any model based on such assays will consequently have such unavoidable errors influencing its performance, which should ideally be factored into modelling and output predictions, such as the actual standard deviation of experimental measurements (σ) or the associated comparability of activity values between the aggregated heterogeneous activity units (i.e., K_i versus IC₅₀ values) during dataset assimilation. However, experimental errors are usually a neglected aspect of model generation. In order to improve upon the current state-of-the-art, we herein present a novel approach toward predicting protein–ligand interactions using a Probabilistic Random Forest (PRF) classifier. The PRF algorithm was applied toward in silico protein target prediction across ~550 tasks from ChEMBL and PubChem. Predictions were evaluated by taking into account various scenarios of experimental standard deviations in both training and test sets, and performance was assessed using fivefold stratified shuffled splits for validation. The largest benefit in incorporating the experimental deviation in PRF was observed for data points close to the binary threshold boundary, when such information was not considered in any way in the original RF algorithm. For example, in cases when σ ranged between 0.4–0.6 log units and when ideal probability estimates were between 0.4–0.6, the PRF outperformed RF with a median absolute error margin of ~17%. In comparison, the baseline RF outperformed PRF for cases with high confidence to belong to the active class (far from the binary decision threshold), although the RF models gave errors smaller than the experimental uncertainty, which could indicate that they were overtrained and/or over-confident.
Finally, the PRF models trained with putative inactives decreased the performance compared to PRF models without putative inactives and this could be because putative inactives were not assigned an experimental pXC₅₀ value, and therefore they were considered inactives with a low uncertainty (which in practice might not be true). In conclusion, PRF can be useful for target prediction models in particular for data where class boundaries overlap with the measurement uncertainty, and where a substantial part of the training data is located close to the classification threshold.
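The idea of folding experimental standard deviation into a classification label can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes a measured pXC₅₀ is normally distributed around its true value with standard deviation σ, so that the probability of being active is the normal cumulative distribution function evaluated at the distance from the activity threshold. The abstract does not give this formula, and the function name `probability_active` and its parameters are hypothetical.

```python
import math


def probability_active(pxc50, threshold, sigma):
    """Probability that a compound is truly active, given a noisy measurement.

    Assumption (not from the abstract): measurement error is Gaussian with
    standard deviation `sigma` (in log units), so
        P(active) = Phi((pxc50 - threshold) / sigma)
    where Phi is the standard normal cumulative distribution function.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (pxc50 - threshold) / sigma
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

With a threshold of 6.0 and σ = 0.5, a measurement exactly at the threshold maps to a probability of 0.5, while one a full log unit above it maps to roughly 0.98. Such soft labels carry the most information near the decision boundary, which is consistent with the abstract's observation that the benefit of PRF is concentrated there.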
