51
Basciu A, Callea L, Motta S, Bonvin AM, Bonati L, Vargiu AV. No dance, no partner! A tale of receptor flexibility in docking and virtual screening. VIRTUAL SCREENING AND DRUG DOCKING 2022. [DOI: 10.1016/bs.armc.2022.08.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
52
Diaz-Flores E, Meyer T, Giorkallos A. Evolution of Artificial Intelligence-Powered Technologies in Biomedical Research and Healthcare. ADVANCES IN BIOCHEMICAL ENGINEERING/BIOTECHNOLOGY 2022; 182:23-60. [DOI: 10.1007/10_2021_189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
53
Krishna Deepak RNV, Verma RK, Hartono YD, Yew WS, Fan H. Recent Advances in Structure, Function, and Pharmacology of Class A Lipid GPCRs: Opportunities and Challenges for Drug Discovery. Pharmaceuticals (Basel) 2021; 15:12. [PMID: 35056070 PMCID: PMC8779880 DOI: 10.3390/ph15010012] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/30/2021] [Revised: 12/17/2021] [Accepted: 12/17/2021] [Indexed: 01/01/2023]
Abstract
Great progress has been made over the past decade in understanding the structural, functional, and pharmacological diversity of lipid GPCRs. From the first determination of the crystal structure of bovine rhodopsin in 2000, much progress has been made in the field of GPCR structural biology. The extraordinary progress in structural biology and pharmacology of GPCRs, coupled with rapid advances in computational approaches to study receptor dynamics and receptor-ligand interactions, has broadened our comprehension of the structural and functional facets of the receptor family members and has helped usher in a modern age of structure-based drug design and development. First, we provide a primer on lipid mediators and lipid GPCRs and their role in physiology and diseases as well as their value as drug targets. Second, we summarize the current advancements in the understanding of structural features of lipid GPCRs, such as the structural variation of their extracellular domains, diversity of their orthosteric and allosteric ligand binding sites, and molecular mechanisms of ligand binding. Third, we close by collating the emerging paradigms and opportunities in targeting lipid GPCRs, including a brief discussion on current strategies, challenges, and the future outlook.
Affiliation(s)
- R. N. V. Krishna Deepak
- Bioinformatics Institute, A*STAR, 30 Biopolis Street, Matrix #07-01, Singapore 138671, Singapore
- Ravi Kumar Verma
- Bioinformatics Institute, A*STAR, 30 Biopolis Street, Matrix #07-01, Singapore 138671, Singapore
- Yossa Dwi Hartono
- Bioinformatics Institute, A*STAR, 30 Biopolis Street, Matrix #07-01, Singapore 138671, Singapore
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, 14 Medical Drive, Singapore 117599, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, 8 Medical Drive, Singapore 117597, Singapore
- Wen Shan Yew
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, 14 Medical Drive, Singapore 117599, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, 8 Medical Drive, Singapore 117597, Singapore
- Hao Fan
- Bioinformatics Institute, A*STAR, 30 Biopolis Street, Matrix #07-01, Singapore 138671, Singapore
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, 14 Medical Drive, Singapore 117599, Singapore
54
Dong L, Qu X, Zhao Y, Wang B. Prediction of Binding Free Energy of Protein-Ligand Complexes with a Hybrid Molecular Mechanics/Generalized Born Surface Area and Machine Learning Method. ACS OMEGA 2021; 6:32938-32947. [PMID: 34901645 PMCID: PMC8655939 DOI: 10.1021/acsomega.1c04996] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/09/2021] [Accepted: 11/10/2021] [Indexed: 06/14/2023]
Abstract
Accurate prediction of protein-ligand binding free energies is important in enzyme engineering and drug discovery. The molecular mechanics/generalized Born surface area (MM/GBSA) approach is widely used to estimate ligand-binding affinities, but its performance heavily relies on the accuracy of its energy components. A hybrid strategy combining MM/GBSA and machine learning (ML) has been developed to predict the binding free energies of protein-ligand systems. Based on the MM/GBSA energy terms and several features associated with protein-ligand interactions, our ML-based scoring function, GXLE, shows much better performance than MM/GBSA without entropy. In particular, the good transferability of the GXLE model is highlighted by its good performance in ranking power for prediction of the binding affinity of different ligands for either the docked structures or crystal structures. The GXLE scoring function and its code are freely available and can be used to correct the binding free energies computed by MM/GBSA.
Affiliation(s)
- Lina Dong
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
- Xiaoyang Qu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
- Yuan Zhao
- The Key Laboratory of Natural Medicine and Immuno-Engineering, Henan University, Kaifeng 475004, P. R. China
- Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
55
Abstract
Stochastic computing is an emerging scientific field pushed by the need for developing high-performance artificial intelligence systems in hardware to quickly solve complex data processing problems. This is the case of virtual screening, a computational task aimed at searching across huge molecular databases for new drug leads. In this work, we show a classification framework in which molecules are described by an energy-based vector. This vector is then processed by an ultra-fast artificial neural network implemented through FPGA by using stochastic computing techniques. Compared to other previously published virtual screening methods, this proposal provides similar or higher accuracy, while it improves processing speed by about two or three orders of magnitude.
56
Wang Y, Wu S, Duan Y, Huang Y. A point cloud-based deep learning strategy for protein-ligand binding affinity prediction. Brief Bioinform 2021; 23:6440132. [PMID: 34849569 DOI: 10.1093/bib/bbab474] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 07/13/2021] [Revised: 09/21/2021] [Accepted: 10/15/2021] [Indexed: 01/14/2023]
Abstract
There is great interest to develop artificial intelligence-based protein-ligand binding affinity models due to their immense applications in drug discovery. In this paper, PointNet and PointTransformer, two pointwise multi-layer perceptrons have been applied for protein-ligand binding affinity prediction for the first time. Three-dimensional point clouds could be rapidly generated from PDBbind-2016 with 3772 and 11 327 individual point clouds derived from the refined or/and general sets, respectively. These point clouds (the refined or the extended set) were used to train PointNet or PointTransformer, resulting in protein-ligand binding affinity prediction models with Pearson correlation coefficients R = 0.795 or 0.833 from the extended data set, respectively, based on the CASF-2016 benchmark test. The analysis of parameters suggests that the two deep learning models were capable to learn many interactions between proteins and their ligands, and some key atoms for the interactions could be visualized. The protein-ligand interaction features learned by PointTransformer could be further adapted for the XGBoost-based machine learning algorithm, resulting in prediction models with an average Rp of 0.827, which is on par with state-of-the-art machine learning models. These results suggest that the point clouds derived from PDBbind data sets are useful to evaluate the performance of 3D point clouds-centered deep learning algorithms, which could learn atomic features of protein-ligand interactions from natural evolution or medicinal chemistry and thus have wide applications in chemistry and biology.
Affiliation(s)
- Yeji Wang
- Xiangya International Academy of Translational Medicine, Central South University, Changsha, Hunan 410013, China
- Shuo Wu
- Xiangya International Academy of Translational Medicine, Central South University, Changsha, Hunan 410013, China
- Yanwen Duan
- Xiangya International Academy of Translational Medicine, Central South University, Changsha, Hunan 410013, China; Hunan Engineering Research Center of Combinatorial Biosynthesis and Natural Product Drug Discovery, Changsha, Hunan 410011, China; National Engineering Research Center of Combinatorial Biosynthesis for Drug Discovery, Changsha, Hunan 410011, China
- Yong Huang
- Xiangya International Academy of Translational Medicine, Central South University, Changsha, Hunan 410013, China; National Engineering Research Center of Combinatorial Biosynthesis for Drug Discovery, Changsha, Hunan 410011, China
57
Vijayan RSK, Kihlberg J, Cross JB, Poongavanam V. Enhancing preclinical drug discovery with artificial intelligence. Drug Discov Today 2021; 27:967-984. [PMID: 34838731 DOI: 10.1016/j.drudis.2021.11.023] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 07/14/2021] [Revised: 10/15/2021] [Accepted: 11/19/2021] [Indexed: 12/14/2022]
Abstract
Artificial intelligence (AI) is becoming an integral part of drug discovery. It has the potential to deliver across the drug discovery and development value chain, starting from target identification and reaching through clinical development. In this review, we provide an overview of current AI technologies and a glimpse of how AI is reimagining preclinical drug discovery by highlighting examples where AI has made a real impact. Considering the excitement and hyperbole surrounding AI in drug discovery, we aim to present a realistic view by discussing both opportunities and challenges in adopting AI in drug discovery.
Affiliation(s)
- R S K Vijayan
- Institute for Applied Cancer Science, MD Anderson Cancer Center, Houston, TX, USA
- Jan Kihlberg
- Department of Chemistry-BMC, Uppsala University, Uppsala, Sweden
- Jason B Cross
- Institute for Applied Cancer Science, MD Anderson Cancer Center, Houston, TX, USA
58
Ricci-Lopez J, Aguila SA, Gilson MK, Brizuela CA. Improving Structure-Based Virtual Screening with Ensemble Docking and Machine Learning. J Chem Inf Model 2021; 61:5362-5376. [PMID: 34652141 DOI: 10.1021/acs.jcim.1c00511] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 11/29/2022]
Abstract
One of the main challenges of structure-based virtual screening (SBVS) is the incorporation of the receptor's flexibility, as its explicit representation in every docking run implies a high computational cost. Therefore, a common alternative to include the receptor's flexibility is the approach known as ensemble docking. Ensemble docking consists of using a set of receptor conformations and performing the docking assays over each of them. However, there is still no agreement on how to combine the ensemble docking results to obtain the final ligand ranking. A common choice is to use consensus strategies to aggregate the ensemble docking scores, but these strategies exhibit slight improvement regarding the single-structure approach. Here, we claim that using machine learning (ML) methodologies over the ensemble docking results could improve the predictive power of SBVS. To test this hypothesis, four proteins were selected as study cases: CDK2, FXa, EGFR, and HSP90. Protein conformational ensembles were built from crystallographic structures, whereas the evaluated compound library comprised up to three benchmarking data sets (DUD, DEKOIS 2.0, and CSAR-2012) and cocrystallized molecules. Ensemble docking results were processed through 30 repetitions of 4-fold cross-validation to train and validate two ML classifiers: logistic regression and gradient boosting trees. Our results indicate that the ML classifiers significantly outperform traditional consensus strategies and even the best performance case achieved with single-structure docking. We provide statistical evidence that supports the effectiveness of ML to improve the ensemble docking performance.
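As a rough illustration of the consensus step this abstract refers to, the sketch below collapses each ligand's per-conformation docking scores into one value and ranks the ligands. The abstract does not say which consensus rules were compared; the "best score" and "mean score" rules here, the ligand names, and the score values are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical docking scores (kcal/mol, lower = better): one list per ligand,
# one entry per receptor conformation in the ensemble.
scores = {
    "lig_A": [-9.1, -7.4, -8.0],
    "lig_B": [-6.2, -6.5, -6.1],
    "lig_C": [-9.5, -6.0, -5.0],
}

def consensus(per_conf_scores, how):
    """Collapse each ligand's per-conformation scores into a single value."""
    aggregate = {"best": min, "mean": mean}[how]
    return {lig: aggregate(vals) for lig, vals in per_conf_scores.items()}

def rank(agg_scores):
    """Order ligands from most to least favourable aggregated score."""
    return sorted(agg_scores, key=agg_scores.get)

best_rank = rank(consensus(scores, "best"))
mean_rank = rank(consensus(scores, "mean"))

# The ML alternative described in the abstract would instead keep the full
# score vector of each ligand as a feature row for a classifier.
feature_rows = [scores[lig] for lig in sorted(scores)]
```

With these made-up numbers the two rules disagree on the top ligand, which is the kind of ambiguity the abstract attributes to consensus strategies.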
Affiliation(s)
- Joel Ricci-Lopez
- Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California C.P. 22860, Mexico; Centro de Nanociencias y Nanotecnología, Universidad Nacional Autónoma de México (UNAM), Ensenada, Baja California C.P. 22860, Mexico
- Sergio A Aguila
- Centro de Nanociencias y Nanotecnología, Universidad Nacional Autónoma de México (UNAM), Ensenada, Baja California C.P. 22860, Mexico
- Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, California 92093, United States
- Carlos A Brizuela
- Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California C.P. 22860, Mexico
59
Meng J, Zhang L, Wang L, Li S, Xie D, Zhang Y, Liu H. TSSF-hERG: A machine-learning-based hERG potassium channel-specific scoring function for chemical cardiotoxicity prediction. Toxicology 2021; 464:153018. [PMID: 34757159 DOI: 10.1016/j.tox.2021.153018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/08/2021] [Revised: 10/15/2021] [Accepted: 10/26/2021] [Indexed: 11/27/2022]
Abstract
The human ether-à-go-go-related gene (hERG) encodes the Kv11.1 voltage-gated potassium ion (K+) channel that conducts the rapidly activating delayed rectifier current (IKr) in cardiomyocytes to regulate the repolarization process. Some drugs, as blockers of hERG potassium channels, cannot be marketed due to prolonged QT intervals, as well known as cardiotoxicity. Predetermining the binding affinity values between drugs and hERG through in silico methods can greatly reduce the time and cost required for experimental verification. In this study, we collected 9,215 compounds with AutoDock Vina's docking structures as training set, and collected compounds from four references as test sets. A series of models for predicting the binding affinities of hERG blockers were built based on five machine learning algorithms and combinations of interaction features and ligand features. The model built by support vector regression (SVR) using the combination of all features achieved the best performance on both tenfold cross-validation and external verification, which was selected and named as TSSF-hERG (target-specific scoring function for hERG). TSSF-hERG is more accurate than the classic scoring function of AutoDock Vina and the machine-learning-based generic scoring function RF-Score, with a Pearson's correlation coefficient (Rp) of 0.765, a Spearman's rank correlation coefficient (Rs) of 0.757, a root-mean-square error (RMSE) of 0.585 in a tenfold cross-validation study. All results demonstrated that TSSF-hERG would be useful for improving the power of binding affinity prediction between hERG and compounds, which can be further used for prediction or virtual screening of the hERG-related cardiotoxicity of drug candidates.
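The three evaluation metrics quoted in this abstract (Rp, Rs, RMSE) can be computed as in the following minimal sketch. It uses only the standard library; the affinity values are invented for illustration and are not taken from the paper.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient (Rp) between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation (Rs): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def rmse(x, y):
    """Root-mean-square error between measured and predicted values."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical measured vs. predicted binding affinities (e.g. pIC50 units).
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.9, 7.4, 7.8]

rp = pearson(measured, predicted)
rs = spearman(measured, predicted)
err = rmse(measured, predicted)
```

Rp measures linear agreement, Rs only the agreement of the orderings, and RMSE the absolute size of the errors, so a model can rank compounds perfectly (Rs = 1) while still having a non-zero RMSE, as in this toy data.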
Affiliation(s)
- Jinhui Meng
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Li Zhang
- School of Life Science, Liaoning University, Shenyang, 110036, China; Technology Innovation Center for Computer Simulating and Information Processing of Bio-macromolecules of Liaoning Province, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, 110036, China
- Lianxin Wang
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Shimeng Li
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Di Xie
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Yuxi Zhang
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Hongsheng Liu
- Technology Innovation Center for Computer Simulating and Information Processing of Bio-macromolecules of Liaoning Province, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, 110036, China; School of Pharmacy, Liaoning University, Shenyang, 110036, China
60
Li H, Lu G, Sze KH, Su X, Chan WY, Leung KS. Machine-learning scoring functions trained on complexes dissimilar to the test set already outperform classical counterparts on a blind benchmark. Brief Bioinform 2021; 22:bbab225. [PMID: 34169324 PMCID: PMC8575004 DOI: 10.1093/bib/bbab225] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/03/2021] [Revised: 04/27/2021] [Accepted: 05/23/2021] [Indexed: 11/12/2022]
Abstract
The superior performance of machine-learning scoring functions for docking has caused a series of debates on whether it is due to learning knowledge from training data that are similar in some sense to the test data. With a systematically revised methodology and a blind benchmark realistically mimicking the process of prospective prediction of binding affinity, we have evaluated three broadly used classical scoring functions and five machine-learning counterparts calibrated with both random forest and extreme gradient boosting using both solo and hybrid features, showing for the first time that machine-learning scoring functions trained exclusively on a proportion of as low as 8% complexes dissimilar to the test set already outperform classical scoring functions, a percentage that is far lower than what has been recently reported on all the three CASF benchmarks. The performance of machine-learning scoring functions is underestimated due to the absence of similar samples in some artificially created training sets that discard the full spectrum of complexes to be found in a prospective environment. Given the inevitability of any degree of similarity contained in a large dataset, the criteria for scoring function selection depend on which one can make the best use of all available materials. Software code and data are provided at https://github.com/cusdulab/MLSF for interested readers to rapidly rebuild the scoring functions and reproduce our results, even to make extended analyses on their own benchmarks.
Affiliation(s)
- Gang Lu
- School of Biomedical Sciences, Chinese University of Hong Kong, Hong Kong
- Kam-Heung Sze
- Bioinformatics Unit, Hong Kong Medical Technology Institute, Hong Kong
- Xianwei Su
- Chinese University of Hong Kong, Hong Kong
- Wai-Yee Chan
- CUHK-SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences, Chinese University of Hong Kong, Hong Kong
- Kwong-Sak Leung
- Computer Science and Engineering in the Chinese University of Hong Kong, Hong Kong
61
Di Filippo JI, Cavasotto CN. Guided structure-based ligand identification and design via artificial intelligence modeling. Expert Opin Drug Discov 2021; 17:71-78. [PMID: 34544293 DOI: 10.1080/17460441.2021.1979514] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
Abstract
Introduction: The implementation of Artificial Intelligence (AI) methodologies to drug discovery (DD) are on the rise. Several applications have been developed for structure-based DD, where AI methods provide an alternative framework for the identification of ligands for validated therapeutic targets, as well as the de novo design of ligands through generative models. Areas covered: Herein, the authors review the contributions between the 2019 to present period regarding the application of AI methods to structure-based virtual screening (SBVS), which encompasses mainly molecular docking applications (binding pose prediction and binary classification for ligand or hit identification), as well as de novo drug design driven by machine learning (ML) generative models, and the validation of AI models in structure-based screening. Studies are reviewed in terms of their main objective, used databases, implemented methodology, input and output, and key results. Expert opinion: More profound analyses regarding the validity and applicability of AI methods in DD have begun to appear. In the near future, we expect to see more structure-based generative models (which are scarce in comparison to ligand-based generative models), the implementation of standard guidelines for validating the generated structures, and more analyses regarding the validation of AI methods in structure-based DD.
Affiliation(s)
- Juan I Di Filippo
- Computational Drug Design and Biomedical Informatics Laboratory, Instituto de Investigaciones en Medicina Traslacional (IIMT), CONICET-Universidad Austral, Pilar, Buenos Aires, Argentina; Facultad de Ciencias Biomédicas, and Facultad de Ingeniería, Universidad Austral, Pilar, Buenos Aires, Argentina; Austral Institute for Applied Artificial Intelligence, Universidad Austral, Pilar, Buenos Aires, Argentina
- Claudio N Cavasotto
- Computational Drug Design and Biomedical Informatics Laboratory, Instituto de Investigaciones en Medicina Traslacional (IIMT), CONICET-Universidad Austral, Pilar, Buenos Aires, Argentina; Facultad de Ciencias Biomédicas, and Facultad de Ingeniería, Universidad Austral, Pilar, Buenos Aires, Argentina; Austral Institute for Applied Artificial Intelligence, Universidad Austral, Pilar, Buenos Aires, Argentina
62
Xiong G, Shen C, Yang Z, Jiang D, Liu S, Lu A, Chen X, Hou T, Cao D. Featurization strategies for protein–ligand interactions and their applications in scoring function development. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2021. [DOI: 10.1002/wcms.1567] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/18/2022]
Affiliation(s)
- Guoli Xiong
- Xiangya School of Pharmaceutical Sciences Central South University Changsha China
- Chao Shen
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences Zhejiang University Hangzhou China
- Ziyi Yang
- Xiangya School of Pharmaceutical Sciences Central South University Changsha China
- Dejun Jiang
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences Zhejiang University Hangzhou China
- College of Computer Science and Technology Zhejiang University Hangzhou China
- Shao Liu
- Department of Pharmacy Xiangya Hospital, Central South University Changsha China
- Aiping Lu
- Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine Hong Kong Baptist University Hong Kong SAR China
- Xiang Chen
- Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis Xiangya Hospital, Central South University Changsha China
- Tingjun Hou
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences Zhejiang University Hangzhou China
- Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences Central South University Changsha China
- Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine Hong Kong Baptist University Hong Kong SAR China
63
Qin T, Zhu Z, Wang XS, Xia J, Wu S. Computational representations of protein-ligand interfaces for structure-based virtual screening. Expert Opin Drug Discov 2021; 16:1175-1192. [PMID: 34011222 DOI: 10.1080/17460441.2021.1929921] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022]
Abstract
Introduction: Structure-based virtual screening (SBVS) is an essential strategy for hit identification. SBVS primarily uses molecular docking, which exploits the protein-ligand binding mode and associated affinity score for compound ranking. Previous studies have shown that computational representation of protein-ligand interfaces and the later establishment of machine learning models are efficacious in improving the accuracy of SBVS. Areas covered: The authors review the computational methods for representing protein-ligand interfaces, which include the traditional ones that use deliberately designed fingerprints and descriptors and the more recent methods that automatically extract features with deep learning. The effects of these methods on the performance of machine learning models are briefly discussed. Additionally, case studies that applied various computational representations to machine learning are cited with remarks. Expert opinion: It has become a trend to extract binding features automatically by deep learning, which uses a completely end-to-end representation. However, there is still plenty of scope for improvement. The interpretability of deep-learning models, the organization of data management, the quantity and quality of available data, and the optimization of hyperparameters could impact the accuracy of feature extraction. In addition, other important structural factors such as water molecules and protein flexibility should be considered.
Affiliation(s)
- Tong Qin
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Zihao Zhu
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Xiang Simon Wang
- Artificial Intelligence and Drug Discovery Core Laboratory for District of Columbia Center for AIDS Research (DC CFAR), Department of Pharmaceutical Sciences, College of Pharmacy, Howard University, U.S.A
- Jie Xia
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Song Wu
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
64
Perez JJ, Perez RA, Perez A. Computational Modeling as a Tool to Investigate PPI: From Drug Design to Tissue Engineering. Front Mol Biosci 2021; 8:681617. [PMID: 34095231 PMCID: PMC8173110 DOI: 10.3389/fmolb.2021.681617] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/17/2021] [Accepted: 05/05/2021] [Indexed: 12/13/2022]
Abstract
Protein-protein interactions (PPIs) mediate a large number of important regulatory pathways. Their modulation represents an important strategy for discovering novel therapeutic agents. However, the features of PPI binding surfaces make the use of structure-based drug discovery methods very challenging. Among the diverse approaches used in the literature to tackle the problem, linear peptides have demonstrated to be a suitable methodology to discover PPI disruptors. Unfortunately, the poor pharmacokinetic properties of linear peptides prevent their direct use as drugs. However, they can be used as models to design enzyme resistant analogs including, cyclic peptides, peptide surrogates or peptidomimetics. Small molecules have a narrower set of targets they can bind to, but the screening technology based on virtual docking is robust and well tested, adding to the computational tools used to disrupt PPI. We review computational approaches used to understand and modulate PPI and highlight applications in a few case studies involved in physiological processes such as cell growth, apoptosis and intercellular communication.
Affiliation(s)
- Juan J Perez
- Department of Chemical Engineering, Universitat Politecnica de Catalunya, Barcelona, Spain
- Roman A Perez
- Bioengineering Institute of Technology, Universitat Internacional de Catalunya, Sant Cugat, Spain
- Alberto Perez
- The Quantum Theory Project, Department of Chemistry, University of Florida, Gainesville, FL, United States
65
Santana K, do Nascimento LD, Lima e Lima A, Damasceno V, Nahum C, Braga RC, Lameira J. Applications of Virtual Screening in Bioprospecting: Facts, Shifts, and Perspectives to Explore the Chemo-Structural Diversity of Natural Products. Front Chem 2021; 9:662688. [PMID: 33996755 PMCID: PMC8117418 DOI: 10.3389/fchem.2021.662688] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 02/01/2021] [Accepted: 02/25/2021] [Indexed: 12/22/2022]
Abstract
Natural products are continually explored in the development of new bioactive compounds with industrial applications, attracting the attention of scientific research efforts due to their pharmacophore-like structures, pharmacokinetic properties, and unique chemical space. The systematic search for natural sources to obtain valuable molecules to develop products with commercial value and industrial purposes remains the most challenging task in bioprospecting. Virtual screening strategies have innovated the discovery of novel bioactive molecules assessing in silico large compound libraries, favoring the analysis of their chemical space, pharmacodynamics, and their pharmacokinetic properties, thus leading to the reduction of financial efforts, infrastructure, and time involved in the process of discovering new chemical entities. Herein, we discuss the computational approaches and methods developed to explore the chemo-structural diversity of natural products, focusing on the main paradigms involved in the discovery and screening of bioactive compounds from natural sources, placing particular emphasis on artificial intelligence, cheminformatics methods, and big data analyses.
Affiliation(s)
- Kauê Santana
- Instituto de Biodiversidade, Universidade Federal do Oeste do Pará, Santarém, Brazil
- Anderson Lima e Lima
- Instituto de Ciências Exatas e Naturais, Universidade Federal do Pará, Belém, Brazil
- Vinícius Damasceno
- Instituto de Ciências Exatas e Naturais, Universidade Federal do Pará, Belém, Brazil
- Claudio Nahum
- Instituto de Ciências Exatas e Naturais, Universidade Federal do Pará, Belém, Brazil
- Jerônimo Lameira
- Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém, Brazil
66
Kimber TB, Chen Y, Volkamer A. Deep Learning in Virtual Screening: Recent Applications and Developments. Int J Mol Sci 2021; 22:4435. [PMID: 33922714 PMCID: PMC8123040 DOI: 10.3390/ijms22094435] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Received: 03/18/2021] [Revised: 04/13/2021] [Accepted: 04/14/2021] [Indexed: 01/03/2023]
Abstract
Drug discovery is a cost and time-intensive process that is often assisted by computational methods, such as virtual screening, to speed up and guide the design of new compounds. For many years, machine learning methods have been successfully applied in the context of computer-aided drug discovery. Recently, thanks to the rise of novel technologies as well as the increasing amount of available chemical and bioactivity data, deep learning has gained a tremendous impact in rational active compound discovery. Herein, recent applications and developments of machine learning, with a focus on deep learning, in virtual screening for active compound design are reviewed. This includes introducing different compound and protein encodings, deep learning techniques as well as frequently used bioactivity and benchmark data sets for model training and testing. Finally, the present state-of-the-art, including the current challenges and emerging problems, are examined and discussed.
Affiliation(s)
- Andrea Volkamer
- In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité-Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany; (T.B.K.); (Y.C.)
67
Andrianov AM, Nikolaev GI, Shuldov NA, Bosko IP, Anischenko AI, Tuzikov AV. Application of deep learning and molecular modeling to identify small drug-like compounds as potential HIV-1 entry inhibitors. J Biomol Struct Dyn 2021; 40:7555-7573. [PMID: 33855929 DOI: 10.1080/07391102.2021.1905559] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/21/2022]
Abstract
A generative adversarial autoencoder for the rational design of potential HIV-1 entry inhibitors able to block CD4-binding site of the viral envelope protein gp120 was developed. To do this, the following studies were carried out: (i) an autoencoder architecture was constructed; (ii) a virtual compound library of potential anti-HIV-1 agents for training the neural network was formed by the concept of click chemistry allowing one to generate a large number of drug candidates by their assembly from small modular units; (iii) molecular docking of all compounds from this library with gp120 was made and calculations of the values of binding free energy were performed; (iv) molecular fingerprints of chemical compounds from the training dataset were generated; (v) training of the developed autoencoder was implemented followed by the validation of this neural network using more than 21 million molecules from the ZINC15 database. As a result, three small drug-like compounds that exhibited the high-affinity binding to gp120 were identified. According to the data from molecular docking, machine learning, quantum chemical calculations, and molecular dynamics simulations, these compounds show the low values of binding free energy in the complexes with gp120 similar to those calculated using the same computational protocols for the HIV-1 entry inhibitors NBD-11021 and NBD-14010, highly potent and broad anti-HIV-1 agents presenting a new generation of the viral CD4 antagonists. The identified CD4-mimetic candidates are suggested to present good scaffolds for the design of novel antiviral drugs inhibiting the early stages of HIV-1 infection. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Alexander M Andrianov
- Institute of Bioorganic Chemistry, National Academy of Sciences of Belarus, Minsk, Republic of Belarus
- Grigory I Nikolaev
- United Institute of Informatics Problems, National Academy of Sciences of Belarus, Minsk, Republic of Belarus
- Nikita A Shuldov
- Faculty of Applied Mathematics & Computer Science, Belarusian State University, Minsk, Republic of Belarus
- Ivan P Bosko
- United Institute of Informatics Problems, National Academy of Sciences of Belarus, Minsk, Republic of Belarus
- Arseny I Anischenko
- Faculty of Applied Mathematics & Computer Science, Belarusian State University, Minsk, Republic of Belarus
- Alexander V Tuzikov
- United Institute of Informatics Problems, National Academy of Sciences of Belarus, Minsk, Republic of Belarus
68
Jiménez-Luna J, Grisoni F, Weskamp N, Schneider G. Artificial intelligence in drug discovery: recent advances and future perspectives. Expert Opin Drug Discov 2021; 16:949-959. [PMID: 33779453 DOI: 10.1080/17460441.2021.1909567] [Citation(s) in RCA: 115] [Impact Index Per Article: 28.8] [Indexed: 02/06/2023]
Abstract
Introduction: Artificial intelligence (AI) has inspired computer-aided drug discovery. The widespread adoption of machine learning, in particular deep learning, in multiple scientific disciplines, and the advances in computing hardware and software, among other factors, continue to fuel this development. Much of the initial skepticism regarding applications of AI in pharmaceutical discovery has started to vanish, consequently benefitting medicinal chemistry. Areas covered: The current status of AI in chemoinformatics is reviewed. The topics discussed herein include quantitative structure-activity/property relationship and structure-based modeling, de novo molecular design, and chemical synthesis prediction. Advantages and limitations of current deep learning applications are highlighted, together with a perspective on next-generation AI for drug discovery. Expert opinion: Deep learning-based approaches have only begun to address some fundamental problems in drug discovery. Certain methodological advances, such as message-passing models, spatial-symmetry-preserving networks, hybrid de novo design, and other innovative machine learning paradigms, will likely become commonplace and help address some of the most challenging questions. Open data sharing and model development will play a central role in the advancement of drug discovery with AI.
Affiliation(s)
- José Jiménez-Luna
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Francesca Grisoni
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Nils Weskamp
- Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an Der Riss, Germany
- Gisbert Schneider
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
69
Stein RM, Yang Y, Balius TE, O'Meara MJ, Lyu J, Young J, Tang K, Shoichet BK, Irwin JJ. Property-Unmatched Decoys in Docking Benchmarks. J Chem Inf Model 2021; 61:699-714. [PMID: 33494610 PMCID: PMC7913603 DOI: 10.1021/acs.jcim.0c00598] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Indexed: 12/12/2022]
Abstract
Enrichment of ligands versus property-matched decoys is widely used to test and optimize docking library screens. However, the unconstrained optimization of enrichment alone can mislead, leading to false confidence in prospective performance. This can arise by over-optimizing for enrichment against property-matched decoys, without considering the full spectrum of molecules to be found in a true large library screen. Adding decoys representing charge extrema helps mitigate over-optimizing for electrostatic interactions. Adding decoys that represent the overall characteristics of the library to be docked allows one to sample molecules not represented by ligands and property-matched decoys but that one will encounter in a prospective screen. An optimized version of the DUD-E set (DUDE-Z), as well as Extrema and sets representing broad features of the library (Goldilocks), is developed here. We also explore the variability that one can encounter in enrichment calculations and how that can temper one's confidence in small enrichment differences. The new tools and new decoy sets are freely available at http://tldr.docking.org and http://dudez.docking.org.
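Illustrative note (not part of the cited abstract, and not the authors' DUDE-Z tooling): the idea of ligand enrichment over decoys can be sketched with a simple enrichment factor. The function name, the example scores, and the convention that lower docking scores rank better are all assumptions made for this sketch; the paper itself may rely on other enrichment metrics.

```python
from typing import Sequence


def enrichment_factor(
    scores: Sequence[float],
    is_ligand: Sequence[bool],
    top_fraction: float = 0.1,
) -> float:
    """Ligand rate in the top-ranked fraction divided by the ligand rate overall."""
    if len(scores) != len(is_ligand) or len(scores) == 0:
        raise ValueError("scores and is_ligand must be non-empty and equal in length")
    n_ligands = sum(is_ligand)
    if n_ligands == 0:
        raise ValueError("at least one known ligand is required")

    # Assumption: lower (more negative) docking scores indicate better poses.
    ranked = sorted(zip(scores, is_ligand), key=lambda pair: pair[0])

    # Size of the top-ranked slice; always keep at least one molecule.
    n_top = max(1, round(len(ranked) * top_fraction))
    hits_in_top = sum(flag for _, flag in ranked[:n_top])

    return (hits_in_top / n_top) / (n_ligands / len(ranked))


if __name__ == "__main__":
    # Hypothetical docking scores for three ligands and seven decoys.
    scores = [-12.0, -11.5, -10.0, -9.0, -8.5, -8.0, -7.5, -7.0, -6.0, -5.0]
    labels = [True, True, False, True, False, False, False, False, False, False]
    print(enrichment_factor(scores, labels, top_fraction=0.2))
```

An enrichment factor above 1 means ligands are over-represented near the top of the ranking; as the abstract cautions, small differences in such numbers should be read with care.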
Affiliation(s)
- Reed M Stein
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158, United States
- Ying Yang
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158, United States
- Trent E Balius
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., P.O. Box B, Frederick, Maryland 21702, United States
- Matt J O'Meara
- Department of Computational Medicine and Bioinformatics, University of Michigan, Palmer Commons, 100 Washtenaw Ave. #2017, Ann Arbor, Michigan 48109, United States
- Jiankun Lyu
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158, United States
- Jennifer Young
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158, United States
- Khanh Tang
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158, United States
- Brian K Shoichet
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158, United States
- John J Irwin
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158, United States
70
Kwon Y, Shin WH, Ko J, Lee J. AK-Score: Accurate Protein-Ligand Binding Affinity Prediction Using an Ensemble of 3D-Convolutional Neural Networks. Int J Mol Sci 2020; 21:E8424. [PMID: 33182567 PMCID: PMC7697539 DOI: 10.3390/ijms21228424] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 09/27/2020] [Revised: 10/24/2020] [Accepted: 11/07/2020] [Indexed: 02/04/2023]
Abstract
Accurate prediction of the binding affinity of a protein-ligand complex is essential for efficient and successful rational drug design. Therefore, many binding affinity prediction methods have been developed. In recent years, since deep learning technology has become powerful, it is also implemented to predict affinity. In this work, a new neural network model that predicts the binding affinity of a protein-ligand complex structure is developed. Our model predicts the binding affinity of a complex using the ensemble of multiple independently trained networks that consist of multiple channels of 3-D convolutional neural network layers. Our model was trained using the 3772 protein-ligand complexes from the refined set of the PDBbind-2016 database and tested using the core set of 285 complexes. The benchmark results show that the Pearson correlation coefficient between the predicted binding affinities by our model and the experimental data is 0.827, which is higher than the state-of-the-art binding affinity prediction scoring functions. Additionally, our method ranks the relative binding affinities of possible multiple binders of a protein quite accurately, comparable to the other scoring functions. Last, we measured which structural information is critical for predicting binding affinity and found that the complementarity between the protein and ligand is most important.
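Illustrative note (not part of the cited abstract, and not the AK-Score code): the evaluation described above, combining several independently trained models and comparing the result with experiment by the Pearson correlation coefficient, can be sketched as follows. Combining models by a plain mean is an assumption of this sketch, and the function names and numbers are hypothetical.

```python
import math
from typing import List, Sequence


def ensemble_mean(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Average per-complex predictions across models (one inner sequence per model)."""
    n_models = len(predictions)
    if n_models == 0:
        raise ValueError("at least one model's predictions are required")
    # zip(*predictions) groups the predictions made for the same complex.
    return [sum(per_complex) / n_models for per_complex in zip(*predictions)]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical affinities predicted by three models for four complexes.
    model_outputs = [
        [6.1, 7.4, 5.0, 8.2],
        [5.9, 7.0, 5.4, 8.0],
        [6.3, 7.2, 4.9, 8.4],
    ]
    experimental = [6.0, 7.5, 5.2, 8.1]  # hypothetical measured values
    print(pearson_r(ensemble_mean(model_outputs), experimental))
```

Averaging tends to damp the errors of individual models, which is one common motivation for ensembles; whether the cited work combines its networks in exactly this way is not stated in the abstract.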
Affiliation(s)
- Yongbeom Kwon
- Department of Chemistry, Kangwon National University, Gangwon-do, Chuncheon 24341, Korea
- Woong-Hee Shin
- Department of Chemical Science Education, Sunchon National University, Jeollanam-do, Suncheon 57922, Korea
- Junsu Ko
- Arontier, 241 Gangnam-daero, Seocho-gu, Seoul 06735, Korea
- Juyong Lee
- Department of Chemistry, Kangwon National University, Gangwon-do, Chuncheon 24341, Korea
71
Yang ZY, Yang ZJ, Lu AP, Hou TJ, Cao DS. Scopy: an integrated negative design python library for desirable HTS/VS database design. Brief Bioinform 2020; 22:5901981. [PMID: 32892221 DOI: 10.1093/bib/bbaa194] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/11/2020] [Revised: 07/27/2020] [Accepted: 07/28/2020] [Indexed: 12/12/2022]
Abstract
BACKGROUND: High-throughput screening (HTS) and virtual screening (VS) have been widely used to identify potential hits from large chemical libraries. However, the frequent occurrence of 'noisy compounds' in the screened libraries, such as compounds with poor drug-likeness, poor selectivity or potential toxicity, has greatly weakened the enrichment capability of HTS and VS campaigns. Therefore, the development of comprehensive and credible tools to detect noisy compounds from chemical libraries is urgently needed in early stages of drug discovery. RESULTS: In this study, we developed a freely available integrated python library for negative design, called Scopy, which supports the functions of data preparation, calculation of descriptors, scaffolds and screening filters, and data visualization. The current version of Scopy can calculate 39 basic molecular properties, 3 comprehensive molecular evaluation scores, 2 types of molecular scaffolds, 6 types of substructure descriptors and 2 types of fingerprints. A number of important screening rules are also provided by Scopy, including 15 drug-likeness rules (13 drug-likeness rules and 2 building block rules), 8 frequent hitter rules (four assay interference substructure filters and four promiscuous compound substructure filters), and 11 toxicophore filters (five human-related toxicity substructure filters, three environment-related toxicity substructure filters and three comprehensive toxicity substructure filters). Moreover, this library supports four different visualization functions to help users to gain a better understanding of the screened data, including basic feature radar chart, feature-feature-related scatter diagram, functional group marker gram and cloud gram. CONCLUSION: Scopy provides a comprehensive Python package to filter out compounds with undesirable properties or substructures, which will benefit the design of high-quality chemical libraries for drug design and discovery. It is freely available at https://github.com/kotori-y/Scopy.
Affiliation(s)
- Zi-Yi Yang
- Xiangya School of Pharmaceutical Sciences, Central South University (Changsha)
- Zhi-Jiang Yang
- Xiangya School of Pharmaceutical Sciences, Central South University
- Ai-Ping Lu
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong
- Ting-Jun Hou
- College of Pharmaceutical Sciences, Zhejiang University, China
- Dong-Sheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, China