51
Patil SP, Fattakhova E, Hofer J, Oravic M, Bender A, Brearey J, Parker D, Radnoff M, Smith Z. Machine-Learning Guided Discovery of Bioactive Inhibitors of PD1-PDL1 Interaction. Pharmaceuticals (Basel) 2022; 15:ph15050613. [PMID: 35631439 PMCID: PMC9145945 DOI: 10.3390/ph15050613] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/18/2022] [Revised: 05/06/2022] [Accepted: 05/13/2022] [Indexed: 02/06/2023] Open
Abstract
The selective activation of the innate immune system through blockade of immune checkpoint PD1-PDL1 interaction has proven effective against a variety of cancers. In contrast to six antibody therapies approved and several under clinical investigation, the development of small-molecule PD1-PDL1 inhibitors is still in its infancy with no such drugs approved yet. Nevertheless, a promising series of small molecules inducing PDL1 dimerization has revealed important spatio-chemical features required for effective PD1-PDL1 inhibition through PDL1 sequestration. In the present study, we utilized these features for developing machine-learning (ML) classifiers by fitting Random Forest models to six 2D fingerprint descriptors. A focused database of ~16 K bioactive molecules, including approved and experimental drugs, was screened using these ML models, leading to classification of 361 molecules as potentially active. These ML hits were subjected to molecular docking studies to further shortlist them based on their binding interactions within the PDL1 dimer pocket. The top 20 molecules with favorable interactions were experimentally tested using HTRF human PD1-PDL1 binding assays, leading to the identification of two active molecules, CRT5 and P053, with the IC50 values of 22.35 and 33.65 µM, respectively. Owing to their bioactive nature, our newly discovered molecules may prove suitable for further medicinal chemistry optimization, leading to more potent and selective PD1-PDL1 inhibitors. Finally, our ML models and the integrated screening protocol may prove useful for screening larger libraries for novel PD1-PDL1 inhibitors.
Affiliation(s)
- Sachin P. Patil
  - NanoBio Lab, School of Engineering, Widener University, Chester, PA 19013, USA
  - Department of Chemical Engineering, Widener University, Chester, PA 19013, USA; (E.F.); (A.B.); (J.B.); (D.P.); (M.R.); (Z.S.)
  - Correspondence: ; Tel.: +1-610-499-4492
- Elena Fattakhova
  - Department of Chemical Engineering, Widener University, Chester, PA 19013, USA; (E.F.); (A.B.); (J.B.); (D.P.); (M.R.); (Z.S.)
- Jeremy Hofer
  - Department of Computer Science, Widener University, Chester, PA 19013, USA;
- Michael Oravic
  - Department of Biomedical Engineering, Widener University, Chester, PA 19013, USA;
- Autumn Bender
  - Department of Chemical Engineering, Widener University, Chester, PA 19013, USA; (E.F.); (A.B.); (J.B.); (D.P.); (M.R.); (Z.S.)
- Jason Brearey
  - Department of Chemical Engineering, Widener University, Chester, PA 19013, USA; (E.F.); (A.B.); (J.B.); (D.P.); (M.R.); (Z.S.)
- Daniel Parker
  - Department of Chemical Engineering, Widener University, Chester, PA 19013, USA; (E.F.); (A.B.); (J.B.); (D.P.); (M.R.); (Z.S.)
- Madison Radnoff
  - Department of Chemical Engineering, Widener University, Chester, PA 19013, USA; (E.F.); (A.B.); (J.B.); (D.P.); (M.R.); (Z.S.)
- Zackary Smith
  - Department of Chemical Engineering, Widener University, Chester, PA 19013, USA; (E.F.); (A.B.); (J.B.); (D.P.); (M.R.); (Z.S.)
52
Polyakov VR, Alexandrov V, Maderna A, Bajjuri K, Li X, Zhou S. Indexing Ultrafast Shape-Based Descriptors in MongoDB to Identify TLR4 Pathway Agonists. J Chem Inf Model 2022; 62:2446-2455. [PMID: 35522137 DOI: 10.1021/acs.jcim.2c00156] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/15/2023]
Abstract
A method is presented for an ultrafast shape-based search workflow for the screening of large compound collections, i.e., those of vendors. The three-dimensional shape of a molecule dictates its biological activity by enabling the molecule to fit into binding pockets of proteins. Quite often, distinctly different chemical compounds that have similar shapes can bind in a similar way. OpenEye pioneered an algorithm for comparing shapes of molecules by overlaying them in a computer and measuring differences between a query molecule and a target molecule. Overlaying shapes is a computationally intensive process and represents a bottleneck in searching for similar molecules. More recent publications describe alternative methods of overlaying molecules, which are accomplished by comparing shape-based descriptors. These methods were implemented in the Open Drug Discovery Toolkit (ODDT) package. We utilized a combination of open-source software packages like ODDT and RDkit to implement a workflow for ultrafast conformer generation and matching that does not require storing precomputed conformers on the file system or in memory. Moreover, the generated descriptors could be optionally stored in MongoDB for performing searches in the future. To speed up the search, we created a set of indexes from the transformed shape-based descriptors. We are in the process of calculating descriptors for multiple vendors, including Enamine's "REAL" collection of 1.2 billion compounds. Currently, the shape similarity search on more than 70 million compounds takes less than 8 s! We exemplified our methodology with the screen of compounds that can act as putative TLR4 agonists. The search was based on a literature-known small-molecule TLR4 agonist series. In due course, we identified compounds with novel structural motifs that were active in mouse and human TLR4 reporter cell lines.
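To make the idea of comparing shape-based descriptors instead of overlaying molecules concrete, here is a minimal sketch in the style of Ultrafast Shape Recognition (USR), one well-known descriptor of this kind. It is an illustration in plain Python, not the authors' ODDT/MongoDB workflow: the function names are hypothetical, conformers are assumed to be given as lists of (x, y, z) atom coordinates, and the use of the cube root of the third central moment as the third statistic is an assumed convention (implementations differ).

```python
import math


def _moments(values):
    """Return mean, standard deviation, and cube root of the third central moment."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    cbrt = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, math.sqrt(var), cbrt]


def usr_descriptor(coords):
    """Build a 12-number shape descriptor from a list of (x, y, z) atom positions."""
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    d_ctd = [math.dist(c, centroid) for c in coords]
    closest = coords[d_ctd.index(min(d_ctd))]    # atom closest to the centroid
    farthest = coords[d_ctd.index(max(d_ctd))]   # atom farthest from the centroid
    d_far = [math.dist(c, farthest) for c in coords]
    far_from_far = coords[d_far.index(max(d_far))]  # atom farthest from that atom
    descriptor = []
    for ref in (centroid, closest, farthest, far_from_far):
        descriptor.extend(_moments([math.dist(c, ref) for c in coords]))
    return descriptor


def usr_similarity(a, b):
    """Similarity in (0, 1]; 1.0 means identical descriptors."""
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mean_abs_diff)
```

Because each conformer is reduced to a short fixed-length vector built only from interatomic distances, the comparison needs no alignment step, and the vectors are the kind of value that can be stored and indexed in a database for later searches.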
Affiliation(s)
- Valery R Polyakov
  - Sutro Biopharma, 111 Oyster Point Blvd, South San Francisco, California 94080, United States
- Vadim Alexandrov
  - Liquid Algo LLC, 85 Thistle Ln, Hopewell Junction, New York 12533, United States
- Andreas Maderna
  - Sutro Biopharma, 111 Oyster Point Blvd, South San Francisco, California 94080, United States
- Krishna Bajjuri
  - Sutro Biopharma, 111 Oyster Point Blvd, South San Francisco, California 94080, United States
- Xiaofan Li
  - Sutro Biopharma, 111 Oyster Point Blvd, South San Francisco, California 94080, United States
- Sihong Zhou
  - Sutro Biopharma, 111 Oyster Point Blvd, South San Francisco, California 94080, United States
53
Venkatraman V, Colligan TH, Lesica GT, Olson DR, Gaiser J, Copeland CJ, Wheeler TJ, Roy A. Drugsniffer: An Open Source Workflow for Virtually Screening Billions of Molecules for Binding Affinity to Protein Targets. Front Pharmacol 2022; 13:874746. [PMID: 35559261 PMCID: PMC9086895 DOI: 10.3389/fphar.2022.874746] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/12/2022] [Accepted: 04/04/2022] [Indexed: 11/13/2022] Open
Abstract
The SARS-CoV2 pandemic has highlighted the importance of efficient and effective methods for identification of therapeutic drugs, and in particular has laid bare the need for methods that allow exploration of the full diversity of synthesizable small molecules. While classical high-throughput screening methods may consider up to millions of molecules, virtual screening methods hold the promise of enabling appraisal of billions of candidate molecules, thus expanding the search space while concurrently reducing costs and speeding discovery. Here, we describe a new screening pipeline, called drugsniffer, that is capable of rapidly exploring drug candidates from a library of billions of molecules, and is designed to support distributed computation on cluster and cloud resources. As an example of performance, our pipeline required ∼40,000 total compute hours to screen for potential drugs targeting three SARS-CoV2 proteins among a library of ∼3.7 billion candidate molecules.
Affiliation(s)
- Vishwesh Venkatraman
  - Department of Chemistry, Norwegian University of Science and Technology, Trondheim, Norway
- Thomas H. Colligan
  - Department of Computer Science, University of Montana, Missoula, MT, United States
- George T. Lesica
  - Department of Computer Science, University of Montana, Missoula, MT, United States
- Daniel R. Olson
  - Department of Computer Science, University of Montana, Missoula, MT, United States
- Jeremiah Gaiser
  - Department of Computer Science, University of Montana, Missoula, MT, United States
- Conner J. Copeland
  - Department of Computer Science, University of Montana, Missoula, MT, United States
- Travis J. Wheeler
  - Department of Computer Science, University of Montana, Missoula, MT, United States
- Amitava Roy
  - Department of Computer Science, University of Montana, Missoula, MT, United States
  - Rocky Mountain Laboratories, Bioinformatics and Computational Biosciences Branch, Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Hamilton, MT, United States
54
Galaxy workflows for fragment-based virtual screening: a case study on the SARS-CoV-2 main protease. J Cheminform 2022; 14:22. [PMID: 35414112 PMCID: PMC9003163 DOI: 10.1186/s13321-022-00588-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/14/2021] [Accepted: 02/09/2022] [Indexed: 12/03/2022] Open
Abstract
We present several workflows for protein-ligand docking and free energy calculation for use in the workflow management system Galaxy. The workflows are composed of several widely used open-source tools, including rDock and GROMACS, and can be executed on public infrastructure using either Galaxy’s graphical interface or the command line. We demonstrate the utility of the workflows by running a high-throughput virtual screening of around 50000 compounds against the SARS-CoV-2 main protease, a system which has been the subject of intense study in the last year.
55
Seo D, Ansari R, Lee K, Kieffer J, Kim J. Amplifying the Sensitivity of Polydiacetylene Sensors: The Dummy Molecule Approach. ACS Appl Mater Interfaces 2022; 14:14561-14567. [PMID: 35293721 DOI: 10.1021/acsami.1c25066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
There is an increasing need for fast and accurate assessment of various health conditions, where polydiacetylenes (PDA), having unique stress-sensitive optical properties, have great potential. When the conjugated backbone of PDA is disturbed by steric repulsion between the receptor-target complexes formed at the PDA surface via specific recognition events, the bandgap of PDA increases and produces color change and fluorescent emission as a dual sensory signal. However, this detection mechanism suggests an intrinsic sensitivity limit of PDA platform because unless adjacent receptors are occupied by target molecules no signal is anticipated. A novel approach to improve the sensitivity and limit of detection of PDA sensors has been developed by preoccupying the surface of PDA liposomes with an optimized amount of artificial target molecules named as dummy molecules. The sensitivity and limit of detection (LOD) showed large improvement by the surface-bound dummy molecules. In addition, the dummy strategy was synergically integrated with another sensitivity enhancing method with a different working mechanism in a PDA sensor for Neomycin detection. When optimized, the LOD of the PDA sensor was improved to 7 nM from 80 nM of the control and the signal intensity increased consistently throughout the entire tested concentration range of the target Neomycin. Finally, the general applicability of the dummy strategy to other target molecules was successfully confirmed by implementing the dummy strategy in a PDA sensor for Surfactin detection.
Affiliation(s)
- Deokwon Seo
  - Program in Nanoscience and Technology, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, 08826, Republic of Korea
  - Department of Materials Science and Engineering, University of Michigan, Ann Arbor, Michigan 48109, United States
- Ramin Ansari
  - Department of Chemical Engineering, University of Michigan, Ann Arbor, Michigan 48109, United States
- Kangwon Lee
  - Program in Nanoscience and Technology, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, 08826, Republic of Korea
- John Kieffer
  - Department of Materials Science and Engineering, University of Michigan, Ann Arbor, Michigan 48109, United States
- Jinsang Kim
  - Department of Materials Science and Engineering, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Department of Chemical Engineering, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Macromolecular Science and Engineering, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Biointerfaces Institute, University of Michigan, Ann Arbor, Michigan 48109, United States
56
Ding K, Yin S, Li Z, Jiang S, Yang Y, Zhou W, Zhang Y, Huang B. Observing Noncovalent Interactions in Experimental Electron Density for Macromolecular Systems: A Novel Perspective for Protein–Ligand Interaction Research. J Chem Inf Model 2022; 62:1734-1743. [DOI: 10.1021/acs.jcim.1c01406] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Affiliation(s)
- Kang Ding
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Shiqiu Yin
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Zhongwei Li
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Shiju Jiang
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Yang Yang
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Wenbiao Zhou
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Yingsheng Zhang
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Bo Huang
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
57
Chalcones from Angelica keiskei (ashitaba) inhibit key Zika virus replication proteins. Bioorg Chem 2022; 120:105649. [PMID: 35124513 PMCID: PMC9187613 DOI: 10.1016/j.bioorg.2022.105649] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/15/2021] [Revised: 01/21/2022] [Accepted: 01/25/2022] [Indexed: 12/25/2022]
Abstract
Zika virus (ZIKV) is a dangerous human pathogen and no antiviral drugs have been approved to date. The chalcones are a group of small molecules that are found in a number of different plants, including Angelica keiskei Koidzumi, also known as ashitaba. To examine chalcone anti-ZIKV activity, three chalcones, 4-hydroxyderricin (4HD), xanthoangelol (XA), and xanthoangelol-E (XA-E), were purified from a methanol-ethyl acetate extract from A. keiskei. Molecular and ensemble docking predicted that these chalcones would establish multiple interactions with residues in the catalytic and allosteric sites of ZIKV NS2B-NS3 protease, and in the allosteric site of the NS5 RNA-dependent RNA-polymerase (RdRp). Machine learning models also predicted 4HD, XA and XA-E as potential anti-ZIKV inhibitors. Enzymatic and kinetic assays confirmed chalcone inhibition of the ZIKV NS2B-NS3 protease allosteric site with IC50s from 18 to 50 µM. Activity assays also revealed that XA, but not 4HD or XA-E, inhibited the allosteric site of the RdRp, with an IC50 of 6.9 µM. Finally, we tested these chalcones for their anti-viral activity in vitro with Vero cells. 4HD and XA-E displayed anti-ZIKV activity with EC50 values of 6.6 and 22.0 µM, respectively, while XA displayed relatively weak anti-ZIKV activity with whole cells. With their simple structures and relative ease of modification, the chalcones represent attractive candidates for hit-to-lead optimization in the search of new anti-ZIKV therapeutics.
58
Young J, Garikipati N, Durrant JD. BINANA 2: Characterizing Receptor/Ligand Interactions in Python and JavaScript. J Chem Inf Model 2022; 62:753-760. [PMID: 35129332 PMCID: PMC8889568 DOI: 10.1021/acs.jcim.1c01461] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 02/06/2023]
Abstract
BINding ANAlyzer (BINANA) is an algorithm for identifying and characterizing receptor/ligand interactions and other factors that contribute to binding. We recently updated BINANA to make the algorithm more accessible to a broader audience. We have also ported the Python3 codebase to JavaScript, thus enabling BINANA analysis in the web browser. As proof of principle, we created a web-browser application so students and chemical-biology researchers can quickly visualize receptor/ligand complexes and their unique binding interactions.
Affiliation(s)
- Jade Young
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Neerja Garikipati
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Jacob D Durrant
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
59
Oguike OE, Ugwuishiwu CH, Asogwa CN, Nnadi CO, Obonga WO, Attama AA. Systematic review on the application of machine learning to quantitative structure-activity relationship modeling against Plasmodium falciparum. Mol Divers 2022; 26:3447-3462. [PMID: 35064444 PMCID: PMC8782692 DOI: 10.1007/s11030-022-10380-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2021] [Accepted: 01/07/2022] [Indexed: 11/29/2022]
Abstract
Malaria accounts for over two million deaths globally. To flatten this curve, there is a need to develop new and high potent drugs against Plasmodium falciparum. Some major challenges include the dearth of suitable animal models for anti-P. falciparum assays, resistance to first-line drugs, lack of vaccines and the complex life cycle of Plasmodium. Gladly, newer approaches to antimalarial drug discovery have emerged due to the release of large datasets by pharmaceutical companies. This review provides insights into these new approaches to drug discovery covering different machine learning tools, which enhance the development of new compounds. It provides a systematic review on the use and prospects of machine learning in predicting, classifying and clustering IC50 values of bioactive compounds against P. falciparum. The authors identified many machine learning tools yet to be applied for this purpose. However, Random Forest and Support Vector Machines have been extensively applied though on a limited dataset of compounds.
Affiliation(s)
- Osondu Everestus Oguike
  - Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
  - Department of Computer Science, Faculty of Physical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Chikodili Helen Ugwuishiwu
  - Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
  - Department of Computer Science, Faculty of Physical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Caroline Ngozi Asogwa
  - Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
  - Department of Computer Science, Faculty of Physical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Charles Okeke Nnadi
  - Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
  - Department of Pharmaceutical and Medicinal Chemistry, Faculty of Pharmaceutical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Wilfred Ofem Obonga
  - Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
  - Department of Pharmaceutical and Medicinal Chemistry, Faculty of Pharmaceutical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Anthony Amaechi Attama
  - Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
  - Department of Pharmaceutics, Faculty of Pharmaceutical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
60
Jiang D, Hsieh CY, Wu Z, Kang Y, Wang J, Wang E, Liao B, Shen C, Xu L, Wu J, Cao D, Hou T. InteractionGraphNet: A Novel and Efficient Deep Graph Representation Learning Framework for Accurate Protein-Ligand Interaction Predictions. J Med Chem 2021; 64:18209-18232. [PMID: 34878785 DOI: 10.1021/acs.jmedchem.1c01830] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Indexed: 12/21/2022]
Abstract
Accurate quantification of protein-ligand interactions remains a key challenge to structure-based drug design. However, traditional machine learning (ML)-based methods based on handcrafted descriptors, one-dimensional protein sequences, and/or two-dimensional graph representations limit their capability to learn the generalized molecular interactions in 3D space. Here, we proposed a novel deep graph representation learning framework named InteractionGraphNet (IGN) to learn the protein-ligand interactions from the 3D structures of protein-ligand complexes. In IGN, two independent graph convolution modules were stacked to sequentially learn the intramolecular and intermolecular interactions, and the learned intermolecular interactions can be efficiently used for subsequent tasks. Extensive binding affinity prediction, large-scale structure-based virtual screening, and pose prediction experiments demonstrated that IGN achieved better or competitive performance against other state-of-the-art ML-based baselines and docking programs. More importantly, such state-of-the-art performance was proven from the successful learning of the key features in protein-ligand interactions instead of just memorizing certain biased patterns from data.
Affiliation(s)
- Dejun Jiang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
  - College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China
  - State Key Laboratory of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Chang-Yu Hsieh
  - Tencent Quantum Laboratory, Tencent, Shenzhen 518057, Guangdong, China
- Zhenxing Wu
  - College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China
- Yu Kang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Jike Wang
  - School of Computer Science, Wuhan University, Wuhan 430072, Hubei, China
- Ercheng Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Ben Liao
  - Tencent Quantum Laboratory, Tencent, Shenzhen 518057, Guangdong, China
- Chao Shen
  - College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China
- Lei Xu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Jian Wu
  - College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410004, Hunan, China
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
  - State Key Laboratory of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, China
61
Wang DD, Chan MT, Yan H. Structure-based protein-ligand interaction fingerprints for binding affinity prediction. Comput Struct Biotechnol J 2021; 19:6291-6300. [PMID: 34900139 PMCID: PMC8637032 DOI: 10.1016/j.csbj.2021.11.018] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/30/2021] [Revised: 11/09/2021] [Accepted: 11/13/2021] [Indexed: 11/17/2022] Open
Abstract
Binding affinity prediction (BAP) using protein–ligand complex structures is crucial to computer-aided drug design, but remains a challenging problem. To achieve efficient and accurate BAP, machine-learning scoring functions (SFs) based on a wide range of descriptors have been developed. Among those descriptors, protein–ligand interaction fingerprints (IFPs) are competitive due to their simple representations, elaborate profiles of key interactions and easy collaborations with machine-learning algorithms. In this paper, we have adopted a building-block-based taxonomy to review a broad range of IFP models, and compared representative IFP-based SFs in target-specific and generic scoring tasks. Atom-pair-counts-based and substructure-based IFPs show great potential in these tasks.
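As a rough illustration of what an atom-pair-counts-based interaction fingerprint can look like, the sketch below counts protein-ligand element pairs that lie within a distance cutoff and flattens the counts into a fixed-order vector that a machine-learning model could consume. It is not a reimplementation of any specific IFP from the review: the function names, the (element, coordinates) input format, and the 4.5 Å cutoff are all assumptions made for the example.

```python
import math
from collections import Counter


def atom_pair_counts(protein_atoms, ligand_atoms, cutoff=4.5):
    """Count (protein element, ligand element) pairs within `cutoff` angstroms.

    Each atom is an (element, (x, y, z)) tuple; the cutoff value is an assumption.
    """
    counts = Counter()
    for p_elem, p_xyz in protein_atoms:
        for l_elem, l_xyz in ligand_atoms:
            if math.dist(p_xyz, l_xyz) <= cutoff:
                counts[(p_elem, l_elem)] += 1
    return counts


def to_vector(counts, pair_types):
    """Turn pair counts into a fixed-length feature vector in `pair_types` order."""
    return [counts.get(pair, 0) for pair in pair_types]


# Toy complex: three protein atoms and two ligand atoms.
protein = [("N", (0.0, 0.0, 0.0)), ("O", (10.0, 0.0, 0.0)), ("C", (1.0, 1.0, 0.0))]
ligand = [("O", (0.0, 0.0, 3.0)), ("C", (2.0, 0.0, 0.0))]
fingerprint = to_vector(
    atom_pair_counts(protein, ligand),
    [("N", "O"), ("O", "O"), ("C", "C")],
)
```

Fixing the order of `pair_types` up front is what keeps vectors from different complexes comparable, which is the property a scoring function trained on them relies on.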
Affiliation(s)
- Debby D Wang
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, 516 Jungong Rd, Shanghai 200093, China
- Moon-Tong Chan
  - School of Science and Technology, Hong Kong Metropolitan University, 30 Good Shepherd St, Ho Man Tin, Hong Kong
- Hong Yan
  - Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
62
Nguyen TB, Pires DEV, Ascher DB. CSM-carbohydrate: protein-carbohydrate binding affinity prediction and docking scoring function. Brief Bioinform 2021; 23:6457169. [PMID: 34882232 DOI: 10.1093/bib/bbab512] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/23/2021] [Revised: 11/06/2021] [Accepted: 11/08/2021] [Indexed: 12/29/2022] Open
Abstract
Protein-carbohydrate interactions are crucial for many cellular processes but can be challenging to biologically characterise. To improve our understanding and ability to model these molecular interactions, we used a carefully curated set of 370 protein-carbohydrate complexes with experimental structural and biophysical data in order to train and validate a new tool, cutoff scanning matrix (CSM)-carbohydrate, using machine learning algorithms to accurately predict their binding affinity and rank docking poses as a scoring function. Information on both protein and carbohydrate complementarity, in terms of shape and chemistry, was captured using graph-based structural signatures. Across both training and independent test sets, we achieved comparable Pearson's correlations of 0.72 under cross-validation [root mean square error (RMSE) of 1.58 kcal/mol] and 0.67 on the independent test (RMSE of 1.72 kcal/mol), providing confidence in the generalisability and robustness of the final model. Similar performance was obtained across mono-, di- and oligosaccharides, further highlighting the applicability of this approach to the study of larger complexes. We show CSM-carbohydrate significantly outperformed previous approaches and have implemented our method and make all data freely available through both a user-friendly web interface and application programming interface, to facilitate programmatic access at http://biosig.unimelb.edu.au/csm_carbohydrate/. We believe CSM-carbohydrate will be an invaluable tool for helping assess docking poses and the effects of mutations on protein-carbohydrate affinity, unravelling important aspects that drive binding recognition.
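The two metrics quoted in this abstract, Pearson's correlation and RMSE, answer different questions about predicted versus experimental binding affinities, which is why they are reported together. The sketch below computes both from their standard definitions; the function names and the toy numbers are illustrative only and are not taken from the paper's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def rmse(predicted, observed):
    """Root mean square error, in the same units as the inputs."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Toy affinities in kcal/mol: the predictions track the trend perfectly
# but carry a constant 1.0 kcal/mol offset.
observed = [-5.0, -6.0, -7.0, -8.0]
predicted = [-4.0, -5.0, -6.0, -7.0]
r = pearson_r(predicted, observed)
error = rmse(predicted, observed)
```

In the toy case the correlation is perfect while the RMSE is a full 1.0 kcal/mol, showing that correlation measures ranking and trend whereas RMSE measures absolute error.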
Affiliation(s)
- Thanh Binh Nguyen
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Douglas E V Pires
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- David B Ascher
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
63
Ricci-Lopez J, Aguila SA, Gilson MK, Brizuela CA. Improving Structure-Based Virtual Screening with Ensemble Docking and Machine Learning. J Chem Inf Model 2021; 61:5362-5376. [PMID: 34652141 DOI: 10.1021/acs.jcim.1c00511] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 11/29/2022]
Abstract
One of the main challenges of structure-based virtual screening (SBVS) is the incorporation of the receptor's flexibility, as its explicit representation in every docking run implies a high computational cost. Therefore, a common alternative to include the receptor's flexibility is the approach known as ensemble docking. Ensemble docking consists of using a set of receptor conformations and performing the docking assays over each of them. However, there is still no agreement on how to combine the ensemble docking results to obtain the final ligand ranking. A common choice is to use consensus strategies to aggregate the ensemble docking scores, but these strategies exhibit slight improvement regarding the single-structure approach. Here, we claim that using machine learning (ML) methodologies over the ensemble docking results could improve the predictive power of SBVS. To test this hypothesis, four proteins were selected as study cases: CDK2, FXa, EGFR, and HSP90. Protein conformational ensembles were built from crystallographic structures, whereas the evaluated compound library comprised up to three benchmarking data sets (DUD, DEKOIS 2.0, and CSAR-2012) and cocrystallized molecules. Ensemble docking results were processed through 30 repetitions of 4-fold cross-validation to train and validate two ML classifiers: logistic regression and gradient boosting trees. Our results indicate that the ML classifiers significantly outperform traditional consensus strategies and even the best performance case achieved with single-structure docking. We provide statistical evidence that supports the effectiveness of ML to improve the ensemble docking performance.
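The consensus strategies mentioned in this abstract can be pictured as simple aggregations over the per-conformation docking scores of each ligand. The sketch below shows two common choices, best (minimum) score and mean score, and how they can rank the same ligands differently. It is an assumption-laden illustration rather than the authors' protocol: the function names, the dictionary layout, and the convention that lower scores are better are all chosen for the example.

```python
from statistics import mean


def consensus_scores(score_matrix, strategy="min"):
    """Aggregate each ligand's docking scores across receptor conformations.

    `score_matrix` maps ligand name -> list of scores, one per conformation.
    Lower scores are assumed to be better, as is usual for docking energies.
    """
    aggregate = {"min": min, "mean": mean}[strategy]
    return {ligand: aggregate(scores) for ligand, scores in score_matrix.items()}


def rank_ligands(score_matrix, strategy="min"):
    """Return ligand names ordered from best to worst consensus score."""
    consensus = consensus_scores(score_matrix, strategy)
    return sorted(consensus, key=consensus.get)


# Toy ensemble: three ligands docked against three receptor conformations.
scores = {
    "ligand_A": [-9.0, -5.0, -5.0],  # one very favorable conformation
    "ligand_B": [-7.0, -7.5, -7.0],  # consistently good
    "ligand_C": [-4.0, -6.0, -5.0],
}
by_best = rank_ligands(scores, "min")
by_mean = rank_ligands(scores, "mean")
```

Here the best-score strategy puts ligand_A first while the mean-score strategy prefers ligand_B, a small example of why the choice of aggregation matters and why a learned model over the full score vector is an appealing alternative.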
Affiliation(s)
- Joel Ricci-Lopez
- Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California C.P. 22860, Mexico; Centro de Nanociencias y Nanotecnología, Universidad Nacional Autónoma de México (UNAM), Ensenada, Baja California C.P. 22860, Mexico
- Sergio A Aguila
- Centro de Nanociencias y Nanotecnología, Universidad Nacional Autónoma de México (UNAM), Ensenada, Baja California C.P. 22860, Mexico
- Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, California 92093, United States
- Carlos A Brizuela
- Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California C.P. 22860, Mexico
|
64
|
Asai A, Konno M, Taniguchi M, Vecchione A, Ishii H. Computational healthcare: Present and future perspectives (Review). Exp Ther Med 2021; 22:1351. [PMID: 34659497 PMCID: PMC8515560 DOI: 10.3892/etm.2021.10786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2021] [Accepted: 07/19/2021] [Indexed: 12/05/2022] Open
Abstract
Artificial intelligence (AI) has been developed through repeated new discoveries since around 1960. The use of AI is now becoming widespread within society and our daily lives. AI is also being introduced into healthcare, such as medicine and drug development; however, it is currently biased towards specific domains. The present review traces the history of the development of various AI-based applications in healthcare and compares AI-based healthcare with conventional healthcare to show the future prospects for this type of care. Knowledge of the past and present development of AI-based applications would be useful for the future utilization of novel AI approaches in healthcare.
Affiliation(s)
- Ayumu Asai
- Center of Medical Innovation and Translational Research, Department of Medical Data Science, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan; Artificial Intelligence Research Center, Osaka University, Ibaraki, Osaka 567-0047, Japan; The Institute of Scientific and Industrial Research, Osaka University, Ibaraki, Osaka 567-0047, Japan
- Masamitsu Konno
- Center of Medical Innovation and Translational Research, Department of Medical Data Science, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan
- Masateru Taniguchi
- The Institute of Scientific and Industrial Research, Osaka University, Ibaraki, Osaka 567-0047, Japan
- Andrea Vecchione
- Department of Clinical and Molecular Medicine, University of Rome 'Sapienza', Santo Andrea Hospital, I-1035-00189 Rome, Italy
- Hideshi Ishii
- Center of Medical Innovation and Translational Research, Department of Medical Data Science, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan
|
65
|
Bouysset C, Fiorucci S. ProLIF: a library to encode molecular interactions as fingerprints. J Cheminform 2021; 13:72. [PMID: 34563256 PMCID: PMC8466659 DOI: 10.1186/s13321-021-00548-6] [Citation(s) in RCA: 87] [Impact Index Per Article: 29.0] [Received: 06/15/2021] [Accepted: 08/30/2021] [Indexed: 12/21/2022] Open
Abstract
Interaction fingerprints are vector representations that summarize the three-dimensional nature of interactions in molecular complexes, typically formed between a protein and a ligand. This kind of encoding has found many applications in drug-discovery projects, from structure-based virtual screening to machine learning. Here, we present ProLIF, a Python library designed to generate interaction fingerprints for molecular complexes extracted from molecular dynamics trajectories, experimental structures, and docking simulations. It can handle complexes formed of any combination of ligand, protein, DNA, or RNA molecules. The available interaction types can be fully reparametrized or extended by user-defined ones. Several tutorials that cover typical use-case scenarios are available, and the documentation is accompanied by code snippets showcasing the integration with other data-analysis libraries for a more seamless user experience. The library can be freely installed from our GitHub repository (https://github.com/chemosim-lab/ProLIF).
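To make the idea of an interaction fingerprint concrete, the following sketch encodes two poses as bit vectors and compares them with the Tanimoto coefficient. It deliberately does not use the ProLIF API; the residue labels, interaction types, and helper names are hypothetical and serve only to illustrate the encoding in plain Python.

```python
# Hypothetical (residue, interaction type) pairs observed for two poses.
pose_1 = {("ASP86", "HBond"), ("PHE80", "Hydrophobic"), ("LYS33", "Cationic")}
pose_2 = {("ASP86", "HBond"), ("PHE80", "Hydrophobic"), ("LEU134", "Hydrophobic")}


def to_bitvector(interactions, columns):
    """Encode a set of interactions as 0/1 flags over a fixed column order."""
    return [1 if column in interactions else 0 for column in columns]


def tanimoto(a, b):
    """Tanimoto similarity: shared on-bits divided by on-bits in either vector."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


# A shared, sorted column order makes the two fingerprints comparable.
columns = sorted(pose_1 | pose_2)
fp_1 = to_bitvector(pose_1, columns)
fp_2 = to_bitvector(pose_2, columns)
similarity = tanimoto(fp_1, fp_2)
print(fp_1, fp_2, similarity)  # [1, 0, 1, 1] [1, 1, 0, 1] 0.5
```

Fixing the column order up front is the main design choice here: it lets fingerprints from different poses, frames, or ligands be stacked into one matrix for screening or model training.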
Affiliation(s)
- Cédric Bouysset
- Institut de Chimie de Nice UMR7272, Université Côte d'Azur, CNRS, Nice, France.
- Sébastien Fiorucci
- Institut de Chimie de Nice UMR7272, Université Côte d'Azur, CNRS, Nice, France.
|
66
|
Méndez-Álvarez D, Herrera-Mayorga V, Juárez-Saldivar A, Paz-González AD, Ortiz-Pérez E, Bandyopadhyay D, Pérez-Sánchez H, Rivera G. Ligand-based virtual screening, molecular docking, and molecular dynamics of eugenol analogs as potential acetylcholinesterase inhibitors with biological activity against Spodoptera frugiperda. Mol Divers 2021; 26:2025-2037. [PMID: 34529209 DOI: 10.1007/s11030-021-10312-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/02/2021] [Accepted: 09/02/2021] [Indexed: 11/26/2022]
Abstract
The development of new, more selective, environmental-friendly insecticide alternatives is in high demand for the control of Spodoptera frugiperda (S. frugiperda). The major objective of this work was to search for new potential S. frugiperda acetylcholinesterase (AChE) inhibitors. A ligand-based virtual screening was initially carried out considering six scaffolds derived from eugenol and the ZINC15, PubChem, and MolPort databases. Subsequently, molecular docking analysis of the selected compounds on the active site and a second region (determined by blind molecular docking) of the AChE of S. frugiperda was performed. Molecular dynamics and Molecular Mechanics Poisson-Boltzmann Surface Area analyses were also applied to improve the docking results. Finally, three new eugenol analogs were evaluated in vitro against S. frugiperda larvae. The virtual screening identified 1609 compounds from the chemical libraries. Control compounds were selected from the interaction fingerprint by molecular docking. Only three new eugenol analogs (1, 3, and 4) were stable at 50 ns by molecular dynamics. Compounds 1 and 4 had the best biological activity by diet (LC50 = 0.042 mg/mL) and by topical route (LC50 = 0.027 mg/mL), respectively. At least three new eugenol derivatives possessed good-to-excellent insecticidal activity against S. frugiperda.
Affiliation(s)
- Domingo Méndez-Álvarez
- Laboratorio de Biotecnología Farmacéutica, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, 88710, Reynosa, Tamaulipas, México
- Verónica Herrera-Mayorga
- Departamento de Ingeniería Bioquímica, Unidad Académica Multidisciplinaria Mante, Universidad Autónoma de Tamaulipas, 89840, Mante, Tamaulipas, México
- Alfredo Juárez-Saldivar
- Laboratorio de Biotecnología Farmacéutica, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, 88710, Reynosa, Tamaulipas, México
- Alma D Paz-González
- Laboratorio de Biotecnología Farmacéutica, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, 88710, Reynosa, Tamaulipas, México
- Eyra Ortiz-Pérez
- Laboratorio de Biotecnología Farmacéutica, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, 88710, Reynosa, Tamaulipas, México
- Debasish Bandyopadhyay
- Department of Chemistry and SEEMS, University of Texas Rio Grande Valley, Edinburg, TX, 78539, USA
- Horacio Pérez-Sánchez
- Structural Bioinformatics and High-Performance Computing Research Group (BIO-HPC), Computer Engineering Department, Universidad Católica San Antonio De Murcia (UCAM), 30107, Murcia, Spain
- Gildardo Rivera
- Laboratorio de Biotecnología Farmacéutica, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, 88710, Reynosa, Tamaulipas, México.
|
67
|
Xiong G, Shen C, Yang Z, Jiang D, Liu S, Lu A, Chen X, Hou T, Cao D. Featurization strategies for protein–ligand interactions and their applications in scoring function development. WIREs Computational Molecular Science 2021. [DOI: 10.1002/wcms.1567] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/18/2022]
Affiliation(s)
- Guoli Xiong
- Xiangya School of Pharmaceutical Sciences Central South University Changsha China
- Chao Shen
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences Zhejiang University Hangzhou China
- Ziyi Yang
- Xiangya School of Pharmaceutical Sciences Central South University Changsha China
- Dejun Jiang
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences Zhejiang University Hangzhou China
- College of Computer Science and Technology Zhejiang University Hangzhou China
- Shao Liu
- Department of Pharmacy Xiangya Hospital, Central South University Changsha China
- Aiping Lu
- Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine Hong Kong Baptist University Hong Kong SAR China
- Xiang Chen
- Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis Xiangya Hospital, Central South University Changsha China
- Tingjun Hou
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences Zhejiang University Hangzhou China
- Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences Central South University Changsha China
- Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine Hong Kong Baptist University Hong Kong SAR China
|
68
|
Coelho FS, Oliveira MM, Vieira DP, Torres PHM, Moreira ICF, Martins-Duarte ES, Gonçalves IC, Cabanelas A, Pascutti PG, Fragoso SP, Lopes AH. A novel receptor for platelet-activating factor and lysophosphatidylcholine in Trypanosoma cruzi. Mol Microbiol 2021; 116:890-908. [PMID: 34184334 DOI: 10.1111/mmi.14778] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/21/2021] [Revised: 06/24/2021] [Accepted: 06/26/2021] [Indexed: 01/12/2023]
Abstract
The lipid mediators, platelet-activating factor (PAF) and lysophosphatidylcholine (LPC), play relevant pathophysiological roles in Trypanosoma cruzi infection. Several species of LPC, including C18:1 LPC, which mimics the effects of PAF, are synthesized by T. cruzi. The present study identified a receptor in T. cruzi, which was predicted to bind to PAF, and found it to be homologous to members of the progestin and adiponectin family of receptors (PAQRs). We constructed a three-dimensional model of the T. cruzi PAQR (TcPAQR) and performed molecular docking to predict the interactions of the TcPAQR model with C16:0 PAF and C18:1 LPC. We knocked out T. cruzi PAQR (TcPAQR) gene and confirmed the identity of the expressed protein through immunoblotting and immunofluorescence assays using an anti-human PAQR antibody. Wild-type and knockout (KO) parasites were also used to investigate the in vitro cell differentiation and interactions with peritoneal mouse macrophages; TcPAQR KO parasites were unable to react to C16:0 PAF or C18:1 LPC. Our data are highly suggestive that PAF and LPC act through TcPAQR in T. cruzi, triggering its cellular differentiation and ability to infect macrophages.
Affiliation(s)
- Felipe S Coelho
- Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Mauricio M Oliveira
- Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Pedro H M Torres
- Instituto de Biofísica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Isabel C F Moreira
- Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Erica S Martins-Duarte
- Departmento de Parasitologia, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Inês C Gonçalves
- Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Adriana Cabanelas
- Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Pedro G Pascutti
- Instituto de Biofísica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Stenio P Fragoso
- Laboratório de Biologia Molecular e Sistêmica de Tripanossomatídeos, Instituto Carlos Chagas, Curitiba, Brazil
- Angela H Lopes
- Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
|
69
|
De-Simone SG, Lechuga GC, Napoleão-Pêgo P, Gomes LR, Provance DW, Nirello VD, Sodero ACR, Guedes HLDM. Small Angle X-ray Scattering, Molecular Modeling, and Chemometric Studies from a Thrombin-Like (Lmr-47) Enzyme of Lachesis m. rhombeata Venom. Molecules 2021; 26:3930. [PMID: 34203140 PMCID: PMC8271572 DOI: 10.3390/molecules26133930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2021] [Revised: 05/31/2021] [Accepted: 06/06/2021] [Indexed: 11/17/2022] Open
Abstract
INTRODUCTION: Snakebite envenomation is considered a neglected tropical disease, and SVTLEs are critical elements involved in the serious coagulopathies that occur on envenoming. Although some enzymes of this group have been structurally investigated, it is essential to characterize other proteins, such as the Lachesis muta rhombeata 47 kDa (Lmr-47) venom serine protease, to better understand their unique properties. METHODS: The structure of Lmr-47 was studied in solution, using SAXS, DLS, and CD, and in silico by homology modeling. Molecular docking experiments simulated 21 competitive inhibitors. RESULTS: At pH 8.0, Lmr-47 has an Rg of 34.5 ± 0.6 Å, Dmax of 130 Å, and SR of 50 Å, according to DLS data. Kratky plot analysis indicates a rigid shape at pH 8.0. Conversely, the pH variation does not change the center of mass's intrinsic fluorescence, possibly indicating the absence of fluorescent amino acids in the regions affected by pH variation. CD experiments show a substantially random coiled secondary structure not affected by pH. The low-resolution model of Lmr-47 presented a prolate elongated shape at pH 8.0. Using the 3D structure obtained by molecular modeling, docking experiments identified five good and three suitable competitive inhibitors. CONCLUSION: Together, our work provided insights into the structure of Lmr-47 and identified inhibitors that may enhance our understanding of thrombin-like family proteins.
Affiliation(s)
- Salvatore Giovanni De-Simone
- FIOCRUZ, Center of Technological Development in Health (CDTS), National Institute of Science and Technology for Innovation on Neglected Diseases Population (INCT-IDPN), Rio de Janeiro 21040-900, Brazil; (G.C.L.); (P.N.-P.); (L.R.G.); (D.W.P.J.)
- Department of Cellular and Molecular Biology, Biology Institute, Federal Fluminense University, Niterói 24020-141, Brazil
- Guilherme Curty Lechuga
- FIOCRUZ, Center of Technological Development in Health (CDTS), National Institute of Science and Technology for Innovation on Neglected Diseases Population (INCT-IDPN), Rio de Janeiro 21040-900, Brazil; (G.C.L.); (P.N.-P.); (L.R.G.); (D.W.P.J.)
- Paloma Napoleão-Pêgo
- FIOCRUZ, Center of Technological Development in Health (CDTS), National Institute of Science and Technology for Innovation on Neglected Diseases Population (INCT-IDPN), Rio de Janeiro 21040-900, Brazil; (G.C.L.); (P.N.-P.); (L.R.G.); (D.W.P.J.)
- Larissa Rodrigues Gomes
- FIOCRUZ, Center of Technological Development in Health (CDTS), National Institute of Science and Technology for Innovation on Neglected Diseases Population (INCT-IDPN), Rio de Janeiro 21040-900, Brazil; (G.C.L.); (P.N.-P.); (L.R.G.); (D.W.P.J.)
- David William Provance
- FIOCRUZ, Center of Technological Development in Health (CDTS), National Institute of Science and Technology for Innovation on Neglected Diseases Population (INCT-IDPN), Rio de Janeiro 21040-900, Brazil; (G.C.L.); (P.N.-P.); (L.R.G.); (D.W.P.J.)
- Interdisciplinary Medical Research Laboratory, Oswaldo Cruz Institute/FIOCRUZ, Rio de Janeiro 21040-900, Brazil;
- Vinícius Dias Nirello
- Faculty of Pharmacy, Federal of Rio de Janeiro University, Rio de Janeiro 21949-900, Brazil; (V.D.N.); (A.C.R.S.)
- Ana Carolina Rennó Sodero
- Faculty of Pharmacy, Federal of Rio de Janeiro University, Rio de Janeiro 21949-900, Brazil; (V.D.N.); (A.C.R.S.)
- Herbert Leonel de Mattos Guedes
- Interdisciplinary Medical Research Laboratory, Oswaldo Cruz Institute/FIOCRUZ, Rio de Janeiro 21040-900, Brazil;
- Laboratory of Immunopharmacology, Federal of Rio de Janeiro University, Duque de Caxias 25245-390, Brazil
|
70
|
Kingdon ADH, Alderwick LJ. Structure-based in silico approaches for drug discovery against Mycobacterium tuberculosis. Comput Struct Biotechnol J 2021; 19:3708-3719. [PMID: 34285773 PMCID: PMC8258792 DOI: 10.1016/j.csbj.2021.06.034] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/23/2021] [Revised: 06/22/2021] [Accepted: 06/22/2021] [Indexed: 12/12/2022] Open
Abstract
Mycobacterium tuberculosis is the causative agent of TB and was estimated to cause 1.4 million deaths in 2019, alongside 10 million new infections. Drug resistance is a growing issue, with multi-drug resistant infections representing 3.3% of all new infections; hence, novel antimycobacterial drugs are urgently required to combat this growing health emergency. Alongside this, increased knowledge of gene essentiality in the pathogenic organism and larger compound databases can aid in the discovery of new drug compounds. The number of protein structures, X-ray based and modelled, is increasing and now accounts for more than 80% of all predicted M. tuberculosis proteins, allowing novel targets to be investigated. This review will focus on structure-based in silico approaches for drug discovery, covering a range of complexities and computational demands, with associated antimycobacterial examples. This includes molecular docking, molecular dynamics simulations, ensemble docking and free energy calculations. Applications of machine learning to each of these approaches will be discussed. Experimental validation of computational hits is an essential component, which is unfortunately missing from many current studies. The future outlook for these approaches will also be discussed.
Key Words
- CV, collective variable
- Docking
- Drug discovery
- In silico
- LIE, Linear Interaction Energy
- MD, Molecular Dynamic
- MDR, multi-drug resistant
- MMPB(GB)SA, Molecular Mechanics with Poisson Boltzmann (or generalised Born) and Surface Area solvation
- Machine learning
- Mt, Mycobacterium tuberculosis
- Mycobacterium tuberculosis
- PTC, peptidyl transferase centre
- RMSD, root-mean square-deviation
- Tuberculosis, TB
- cMD, Classical Molecular Dynamic
- cryo-EM, cryogenic electron microscopy
- ns, nanosecond
Affiliation(s)
- Alexander D H Kingdon
- Institute of Microbiology and Infection, School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
- Luke J Alderwick
- Institute of Microbiology and Infection, School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
|
71
|
Tripathi MK, Nath A, Singh TP, Ethayathulla AS, Kaur P. Evolving scenario of big data and Artificial Intelligence (AI) in drug discovery. Mol Divers 2021; 25:1439-1460. [PMID: 34159484 PMCID: PMC8219515 DOI: 10.1007/s11030-021-10256-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/31/2021] [Accepted: 06/14/2021] [Indexed: 12/24/2022]
Abstract
The accumulation of massive data in the plethora of Cheminformatics databases has made the role of big data and artificial intelligence (AI) indispensable in drug design. This has necessitated the development of newer algorithms and architectures to mine these databases and fulfil the specific needs of various drug discovery processes such as virtual drug screening, de novo molecule design and discovery in this big data era. The development of deep learning neural networks and their variants with the corresponding increase in chemical data has resulted in a paradigm shift in information mining pertaining to the chemical space. The present review summarizes the role of big data and AI techniques currently being implemented to satisfy the ever-increasing research demands in drug discovery pipelines.
Affiliation(s)
- Manish Kumar Tripathi
- Department of Biophysics, All India Institute of Medical Sciences, New Delhi, 110029, India
- Abhigyan Nath
- Department of Biochemistry, Pt. Jawahar Lal Nehru Memorial Medical College, Raipur, 492001, India
- Tej P Singh
- Department of Biophysics, All India Institute of Medical Sciences, New Delhi, 110029, India
- A S Ethayathulla
- Department of Biophysics, All India Institute of Medical Sciences, New Delhi, 110029, India
- Punit Kaur
- Department of Biophysics, All India Institute of Medical Sciences, New Delhi, 110029, India.
|
72
|
Qin T, Zhu Z, Wang XS, Xia J, Wu S. Computational representations of protein-ligand interfaces for structure-based virtual screening. Expert Opin Drug Discov 2021; 16:1175-1192. [PMID: 34011222 DOI: 10.1080/17460441.2021.1929921] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/30/2022]
Abstract
Introduction: Structure-based virtual screening (SBVS) is an essential strategy for hit identification. SBVS primarily uses molecular docking, which exploits the protein-ligand binding mode and associated affinity score for compound ranking. Previous studies have shown that computational representation of protein-ligand interfaces and the subsequent establishment of machine learning models are efficacious in improving the accuracy of SBVS. Areas covered: The authors review the computational methods for representing protein-ligand interfaces, which include the traditional ones that use deliberately designed fingerprints and descriptors and the more recent methods that automatically extract features with deep learning. The effects of these methods on the performance of machine learning models are briefly discussed. Additionally, case studies that applied various computational representations to machine learning are cited with remarks. Expert opinion: It has become a trend to extract binding features automatically by deep learning, which uses a completely end-to-end representation. However, there is still plenty of scope for improvement. The interpretability of deep-learning models, the organization of data management, the quantity and quality of available data, and the optimization of hyperparameters could impact the accuracy of feature extraction. In addition, other important structural factors such as water molecules and protein flexibility should be considered.
Affiliation(s)
- Tong Qin
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Zihao Zhu
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Xiang Simon Wang
- Artificial Intelligence and Drug Discovery Core Laboratory for District of Columbia Center for AIDS Research (DC CFAR), Department of Pharmaceutical Sciences, College of Pharmacy, Howard University, U.S.A
- Jie Xia
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Song Wu
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Department of New Drug Research and Development, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
|
73
|
Kimber TB, Chen Y, Volkamer A. Deep Learning in Virtual Screening: Recent Applications and Developments. Int J Mol Sci 2021; 22:4435. [PMID: 33922714 PMCID: PMC8123040 DOI: 10.3390/ijms22094435] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 03/18/2021] [Revised: 04/13/2021] [Accepted: 04/14/2021] [Indexed: 01/03/2023] Open
Abstract
Drug discovery is a cost and time-intensive process that is often assisted by computational methods, such as virtual screening, to speed up and guide the design of new compounds. For many years, machine learning methods have been successfully applied in the context of computer-aided drug discovery. Recently, thanks to the rise of novel technologies as well as the increasing amount of available chemical and bioactivity data, deep learning has gained a tremendous impact in rational active compound discovery. Herein, recent applications and developments of machine learning, with a focus on deep learning, in virtual screening for active compound design are reviewed. This includes introducing different compound and protein encodings, deep learning techniques as well as frequently used bioactivity and benchmark data sets for model training and testing. Finally, the present state-of-the-art, including the current challenges and emerging problems, are examined and discussed.
Affiliation(s)
- Andrea Volkamer
- In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité-Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany; (T.B.K.); (Y.C.)
|
74
|
Kumar S, Kim MH. SMPLIP-Score: predicting ligand binding affinity from simple and interpretable on-the-fly interaction fingerprint pattern descriptors. J Cheminform 2021; 13:28. [PMID: 33766140 PMCID: PMC7993508 DOI: 10.1186/s13321-021-00507-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/07/2020] [Accepted: 03/16/2021] [Indexed: 12/13/2022] Open
Abstract
In drug discovery, rapid and accurate prediction of protein–ligand binding affinities is a pivotal task for lead optimization with acceptable on-target potency as well as pharmacological efficacy. Furthermore, researchers hope for a high correlation between docking score and pose with key interactive residues, although scoring functions as free energy surrogates of protein–ligand complexes have failed to provide collinearity. Recently, various machine learning or deep learning methods have been proposed to overcome the drawbacks of scoring functions. Despite being highly accurate, their featurization process is complex and the meaning of the embedded features cannot be interpreted directly by humans without an additional feature analysis. Here, we propose SMPLIP-Score (Substructural Molecular and Protein–Ligand Interaction Pattern Score), a directly interpretable predictor of absolute binding affinity. Our simple featurization embeds the interaction fingerprint pattern on the ligand-binding site environment and molecular fragments of ligands into an input vectorized matrix for learning layers (random forest or deep neural network). Despite using less complex features than other state-of-the-art models, SMPLIP-Score achieved comparable performance, a Pearson’s correlation coefficient up to 0.80, and a root mean square error up to 1.18 in pK units with several benchmark datasets (PDBbind v.2015, Astex Diverse Set, CSAR NRC HiQ, FEP, PDBbind NMR, and CASF-2016). For this model, generality, predictive power, ranking power, and robustness were examined using direct interpretation of feature matrices for specific targets.
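The two performance metrics quoted in this abstract, Pearson's correlation coefficient and root mean square error, can be computed as in the sketch below. This is a generic illustration in plain Python, not code from SMPLIP-Score; the `experimental` and `predicted` pK values are invented for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root mean square error between paired values."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical binding affinities in pK units.
experimental = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]

r = pearson_r(experimental, predicted)
error = rmse(experimental, predicted)
print(round(r, 3), error)  # 0.894 0.5
```

The two numbers answer different questions: the correlation reflects whether predictions track the experimental trend, while the RMSE reports the typical size of the error in pK units, so a model can score well on one and poorly on the other.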
Affiliation(s)
- Surendra Kumar
- Gachon Institute of Pharmaceutical Science & Department of Pharmacy, College of Pharmacy, Gachon University, 191 Hambakmoeiro, Yeonsu-gu, Incheon, Republic of Korea
- Mi-Hyun Kim
- Gachon Institute of Pharmaceutical Science & Department of Pharmacy, College of Pharmacy, Gachon University, 191 Hambakmoeiro, Yeonsu-gu, Incheon, Republic of Korea.
|
75
|
Williams JC, Kalyaanamoorthy S. PoseFilter: A PyMOL Plugin for filtering and analyzing small molecule docking in symmetric binding sites. Bioinformatics 2021; 37:3367-3368. [PMID: 33742661 DOI: 10.1093/bioinformatics/btab188] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/07/2020] [Revised: 02/18/2021] [Accepted: 03/17/2021] [Indexed: 11/13/2022] Open
Abstract
SUMMARY: 'PoseFilter' is a PyMOL plugin that assists in the analysis and filtering of docked poses. PoseFilter enables automatic detection of symmetric poses from docking outputs and can be accessed using both graphical user interface and command-line options within the PyMOL program. Two methods of analysis, root mean square deviations (RMSD) and interaction fingerprints, are available from this plugin. The capabilities of the plugin are demonstrated using docking outputs from different oligomeric protein-ligand complexes. AVAILABILITY AND IMPLEMENTATION: The plugin can be downloaded from the GitHub page, https://github.com/skalyaanamoorthy/PoseFilter. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
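As a rough illustration of the RMSD analysis mentioned in this summary, the sketch below computes the root mean square deviation between two poses of the same ligand. It is not PoseFilter's code and does not use PyMOL. It assumes that both poses list the same atoms in the same order and already share the receptor's coordinate frame, so no superposition is applied; the coordinates are invented.

```python
from math import sqrt


def rmsd(pose_a, pose_b):
    """RMSD between two poses given as equally ordered lists of (x, y, z)."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(pose_a, pose_b)
        for a, b in zip(atom_a, atom_b)
    )
    return sqrt(squared / len(pose_a))


# Hypothetical three-atom ligand; pose_b is pose_a shifted by 1 Å along z.
pose_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pose_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

deviation = rmsd(pose_a, pose_b)
print(deviation)  # 1.0
```

Detecting symmetric poses in an oligomeric site would additionally require mapping one pose through the symmetry operation of the complex before comparing; that step is outside the scope of this sketch.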
|
76
|
Folmsbee DL, Koes DR, Hutchison GR. Evaluation of Thermochemical Machine Learning for Potential Energy Curves and Geometry Optimization. J Phys Chem A 2021; 125:1987-1993. [PMID: 33630611 DOI: 10.1021/acs.jpca.0c10147] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/23/2022]
Abstract
While many machine learning (ML) methods, particularly deep neural networks, have been trained for density functional and quantum chemical energies and properties, the vast majority of these methods focus on single-point energies. In principle, such ML methods, once trained, offer thermochemical accuracy on par with density functional and wave function methods but at speeds comparable to traditional force fields or approximate semiempirical methods. So far, most efforts have focused on optimized equilibrium single-point energies and properties. In this work, we evaluate the accuracy of several leading ML methods across a range of bond potential energy curves and torsional potentials. The methods were trained on the existing ANI-1 training set, calculated using the ωB97X/6-31G(d) single points at nonequilibrium geometries. We find that across a range of small molecules, several methods offer both qualitative accuracy (e.g., correct minima, both repulsive and attractive bond regions, anharmonic shape, and single minima) and quantitative accuracy in terms of the mean absolute percent error near the minima. At the moment, ANI-2x, FCHL, and a new libmolgrid-based convolutional neural net, the Colorful CNN, show good performance.
Affiliation(s)
- Dakota L Folmsbee
- Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
| | - David R Koes
- Department of Computational & Systems Biology, School of Medicine, University of Pittsburgh, 3420 Forbes Avenue, Pittsburgh, Pennsylvania 15260, United States
| | - Geoffrey R Hutchison
- Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
- Department of Chemical and Petroleum Engineering, University of Pittsburgh, 3700 O'Hara Street, Pittsburgh, Pennsylvania 15261, United States
|
77
|
Tong J, Zhao S. Large-Scale Analysis of Bioactive Ligand Conformational Strain Energy by Ab Initio Calculation. J Chem Inf Model 2021; 61:1180-1192. [PMID: 33630603 DOI: 10.1021/acs.jcim.0c01197] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/31/2022]
Abstract
Ligand conformational strain energy (LCSE) plays an important role in virtual screening and lead optimization. While various studies have provided insights into LCSE for small-molecule ligands in the Protein Data Bank (PDB), conclusions are inconsistent mainly due to small datasets, poor quality control of crystal structures, and molecular mechanics (MM) or low-level quantum mechanics (QM) calculations. Here, we built a high-quality dataset (LigBoundConf) of 8145 ligand-bound conformations from PDB crystal structures and calculated LCSE at the M062X-D3/ma-TZVPP (SMD)//M062X-D3/def2-SVP(SMD) level for each case in the dataset. The mean/median LCSE is 4.6/3.7 kcal/mol for 6672 successfully calculated cases, which is significantly lower than the estimates based on molecular mechanics in many previous analyses. Especially, when removing ligands with nonaromatic ring(s) that are prone to have large LCSEs due to electron density overfitting, the mean/median LCSE was reduced to 3.3/2.5 kcal/mol. We further reveal that LCSE is correlated with several ligand properties, including formal atomic charge, molecular weight, number of rotatable bonds, and number of hydrogen-bond donors and acceptors. In addition, our results show that although summation of torsion strains is a good approximation of LCSE for most cases, for a small fraction (about 6%) of our dataset, it underestimates LCSEs if ligands could form nonlocal intramolecular interactions in the unbound state. Taken together, our work provides a comprehensive profile of LCSE for ligands in PDB, which could help ligand conformation generation, ligand docking pose evaluation, and lead optimization.
Affiliation(s)
- Jiahui Tong
- iHuman Institute, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai 201210, China
- School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai 201210, China
- University of Chinese Academy of Sciences, No. 19A, Yuquan Road, Beijing 100049, China
- Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China
| | - Suwen Zhao
- iHuman Institute, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai 201210, China
- School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai 201210, China
|
78
|
Stefaniak F, Bujnicki JM. AnnapuRNA: A scoring function for predicting RNA-small molecule binding poses. PLoS Comput Biol 2021; 17:e1008309. [PMID: 33524009 PMCID: PMC7877745 DOI: 10.1371/journal.pcbi.1008309] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 09/03/2020] [Revised: 02/11/2021] [Accepted: 12/16/2020] [Indexed: 11/22/2022] Open
Abstract
RNA is considered as an attractive target for new small molecule drugs. Designing active compounds can be facilitated by computational modeling. Most of the available tools developed for these prediction purposes, such as molecular docking or scoring functions, are parametrized for protein targets. The performance of these methods, when applied to RNA-ligand systems, is insufficient. To overcome these problems, we developed AnnapuRNA, a new knowledge-based scoring function designed to evaluate RNA-ligand complex structures, generated by any computational docking method. We also evaluated three main factors that may influence the structure prediction, i.e., the starting conformer of a ligand, the docking program, and the scoring function used. We applied the AnnapuRNA method for a post-hoc study of the recently published structures of the FMN riboswitch. Software is available at https://github.com/filipspl/AnnapuRNA.
Drug development is a lengthy and complicated process, which requires costly experiments on a very large number of chemical compounds. The identification of chemical molecules with desired properties can be facilitated by computational methods. Several methods were developed for computer-aided design of drugs that target protein molecules. However, recently the ribonucleic acid (RNA) emerged as an attractive target for the development of new drugs. Unfortunately, the portfolio of the computer methods that can be applied to study RNA and its interactions with small chemical molecules is very limited. This situation motivated us to develop a new computational method, with which to predict RNA-small molecule interactions. To this end, we collected the information on the statistics of interactions in experimentally determined structures of complexes formed by RNA with small molecules. We then used the statistical data to train machine learning methods aiming to distinguish between RNA-ligand interactions observed experimentally and other interactions that can be observed in theoretical analyses, but are not observed in nature. The resulting method called AnnapuRNA is superior to other similar tools and can be used to predict preferred ligands of RNA molecules and how RNA and small molecules interact with each other.
Affiliation(s)
- Filip Stefaniak
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland
| | - Janusz M. Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland
- Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland
|
79
|
Gally JM, Bourg S, Fogha J, Do QT, Aci-Sèche S, Bonnet P. VSPrep: A KNIME Workflow for the Preparation of Molecular Databases for Virtual Screening. Curr Med Chem 2021; 27:6480-6494. [PMID: 31242833 DOI: 10.2174/0929867326666190614160451] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/08/2018] [Revised: 04/11/2019] [Accepted: 05/24/2019] [Indexed: 01/21/2023]
Abstract
Drug discovery is a challenging and expensive field. Hence, novel in silico tools have been developed in early discovery stage to identify and prioritize novel molecules with suitable physicochemical properties. In many in silico drug design projects, molecular databases are screened by virtual screening tools to search for potential bioactive molecules. The preparation of the molecules is therefore a key step in the success of well-established techniques such as docking, similarity or pharmacophore searching. We review here the lists of several toolkits used in different steps during the cleaning of molecular databases, integrated within a KNIME workflow. During the first step of the automatic workflow, salts are removed, and mixtures are split to get one compound per entry. Then compounds with unwanted features are filtered. Duplicated entries are then deleted while considering stereochemistry. As a compromise between exhaustiveness and computational time, most distributed tautomers at physiological pH are computed. Additionally, various flags are applied to molecules by using either classical molecular descriptors, similarity search to known libraries or substructure search rules. Moreover, stereoisomers are enumerated depending on the unassigned chiral centers. Then, three-dimensional coordinates, and optionally conformers, are generated. This workflow has been already applied to several drug design projects and can be used for molecular database preparation upon request.
Affiliation(s)
- José-Manuel Gally
- Institut de Chimie Organique et Analytique (ICOA), Universite d'Orleans, UMR CNRS 7311, BP 6759, 45067 Orleans, France
| | - Stéphane Bourg
- Institut de Chimie Organique et Analytique (ICOA), Universite d'Orleans, UMR CNRS 7311, BP 6759, 45067 Orleans, France
| | - Jade Fogha
- Institut de Chimie Organique et Analytique (ICOA), Universite d'Orleans, UMR CNRS 7311, BP 6759, 45067 Orleans, France
| | - Quoc-Tuan Do
- Greenpharma S.A.S. 3, allee du Titane, 45100 Orleans, France
| | - Samia Aci-Sèche
- Institut de Chimie Organique et Analytique (ICOA), Universite d'Orleans, UMR CNRS 7311, BP 6759, 45067 Orleans, France
| | - Pascal Bonnet
- Institut de Chimie Organique et Analytique (ICOA), Universite d'Orleans, UMR CNRS 7311, BP 6759, 45067 Orleans, France
|
80
|
Amangeldiuly N, Karlov D, Fedorov MV. Baseline Model for Predicting Protein–Ligand Unbinding Kinetics through Machine Learning. J Chem Inf Model 2020; 60:5946-5956. [DOI: 10.1021/acs.jcim.0c00450] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/14/2022]
Affiliation(s)
- Nurlybek Amangeldiuly
- Center for Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Moscow 121205, Russia
| | - Dmitry Karlov
- Center for Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Moscow 121205, Russia
| | - Maxim V. Fedorov
- Center for Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Moscow 121205, Russia
- Department of Physics, Scottish Universities Physics Alliance (SUPA), University of Strathclyde, Glasgow G4 0NG, U.K
|
81
|
Selecting machine-learning scoring functions for structure-based virtual screening. DRUG DISCOVERY TODAY. TECHNOLOGIES 2020; 32-33:81-87. [PMID: 33386098 DOI: 10.1016/j.ddtec.2020.09.001] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 04/01/2020] [Revised: 09/02/2020] [Accepted: 09/07/2020] [Indexed: 12/27/2022]
Abstract
Interest in docking technologies has grown parallel to the ever increasing number and diversity of 3D models for macromolecular therapeutic targets. Structure-Based Virtual Screening (SBVS) aims at leveraging these experimental structures to discover the necessary starting points for the drug discovery process. It is now established that Machine Learning (ML) can strongly enhance the predictive accuracy of scoring functions for SBVS by exploiting large datasets from targets, molecules and their associations. However, with greater choice, the question of which ML-based scoring function is the most suitable for prospective use on a given target has gained importance. Here we analyse two approaches to select an existing scoring function for the target along with a third approach consisting in generating a scoring function tailored to the target. These analyses required discussing the limitations of popular SBVS benchmarks, the alternatives to benchmark scoring functions for SBVS and how to generate them or use them using freely-available software.
|
82
|
Adeshina YO, Deeds EJ, Karanicolas J. Machine learning classification can reduce false positives in structure-based virtual screening. Proc Natl Acad Sci U S A 2020; 117:18477-18488. [PMID: 32669436 PMCID: PMC7414157 DOI: 10.1073/pnas.2000585117] [Citation(s) in RCA: 93] [Impact Index Per Article: 23.3] [Indexed: 12/19/2022] Open
Abstract
With the recent explosion in the size of libraries available for screening, virtual screening is positioned to assume a more prominent role in early drug discovery's search for active chemical matter. In typical virtual screens, however, only about 12% of the top-scoring compounds actually show activity when tested in biochemical assays. We argue that most scoring functions used for this task have been developed with insufficient thoughtfulness into the datasets on which they are trained and tested, leading to overly simplistic models and/or overtraining. These problems are compounded in the literature because studies reporting new scoring methods have not validated their models prospectively within the same study. Here, we report a strategy for building a training dataset (D-COID) that aims to generate highly compelling decoy complexes that are individually matched to available active complexes. Using this dataset, we train a general-purpose classifier for virtual screening (vScreenML) that is built on the XGBoost framework. In retrospective benchmarks, our classifier shows outstanding performance relative to other scoring functions. In a prospective context, nearly all candidate inhibitors from a screen against acetylcholinesterase show detectable activity; beyond this, 10 of 23 compounds have IC50 better than 50 μM. Without any medicinal chemistry optimization, the most potent hit has IC50 280 nM, corresponding to Ki of 173 nM. These results support using the D-COID strategy for training classifiers in other computational biology tasks, and for vScreenML in virtual screening campaigns against other protein targets. Both D-COID and vScreenML are freely distributed to facilitate such efforts.
Affiliation(s)
- Yusuf O Adeshina
- Program in Molecular Therapeutics, Fox Chase Cancer Center, Philadelphia, PA 19111
- Center for Computational Biology, University of Kansas, Lawrence, KS 66045
| | - Eric J Deeds
- Center for Computational Biology, University of Kansas, Lawrence, KS 66045
- Department of Molecular Biosciences, University of Kansas, Lawrence, KS 66045
| | - John Karanicolas
- Program in Molecular Therapeutics, Fox Chase Cancer Center, Philadelphia, PA 19111
|
83
|
Shen C, Hu Y, Wang Z, Zhang X, Pang J, Wang G, Zhong H, Xu L, Cao D, Hou T. Beware of the generic machine learning-based scoring functions in structure-based virtual screening. Brief Bioinform 2020; 22:5850047. [PMID: 32484221 DOI: 10.1093/bib/bbaa070] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 01/28/2020] [Revised: 04/17/2020] [Accepted: 03/30/2020] [Indexed: 12/14/2022] Open
Abstract
Machine learning-based scoring functions (MLSFs) have attracted extensive attention recently and are expected to be potential rescoring tools for structure-based virtual screening (SBVS). However, a major concern nowadays is whether MLSFs trained for generic uses rather than a given target can consistently be applicable for VS. In this study, a systematic assessment was carried out to re-evaluate the effectiveness of 14 reported MLSFs in VS. Overall, most of these MLSFs could hardly achieve satisfactory results for any dataset, and they could even not outperform the baseline of classical SFs such as Glide SP. An exception was observed for RFscore-VS trained on the Directory of Useful Decoys-Enhanced dataset, which showed its superiority for most targets. However, in most cases, it clearly illustrated rather limited performance on the targets that were dissimilar to the proteins in the corresponding training sets. We also used the top three docking poses rather than the top one for rescoring and retrained the models with the updated versions of the training set, but only minor improvements were observed. Taken together, generic MLSFs may have poor generalization capabilities to be applicable for the real VS campaigns. Therefore, it should be quite cautious to use this type of methods for VS.
Affiliation(s)
- Ye Hu
- Central South University, China
| | - Lei Xu
- Central South University, China
|
84
|
Li H, Sze K, Lu G, Ballester PJ. Machine‐learning scoring functions for structure‐based virtual screening. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2020. [DOI: 10.1002/wcms.1478] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Indexed: 12/14/2022]
Affiliation(s)
- Hongjian Li
- Cancer Research Center of Marseille (INSERM U1068, Institut Paoli‐Calmettes, Aix‐Marseille Université UM105, CNRS UMR7258) Marseille France
- CUHK‐SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences Chinese University of Hong Kong Shatin Hong Kong
| | - Kam‐Heung Sze
- CUHK‐SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences Chinese University of Hong Kong Shatin Hong Kong
| | - Gang Lu
- CUHK‐SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences Chinese University of Hong Kong Shatin Hong Kong
| | - Pedro J. Ballester
- Cancer Research Center of Marseille (INSERM U1068, Institut Paoli‐Calmettes, Aix‐Marseille Université UM105, CNRS UMR7258) Marseille France
|
85
|
Karlov D, Sosnin S, Fedorov MV, Popov P. graphDelta: MPNN Scoring Function for the Affinity Prediction of Protein-Ligand Complexes. ACS OMEGA 2020; 5:5150-5159. [PMID: 32201802 PMCID: PMC7081425 DOI: 10.1021/acsomega.9b04162] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 12/06/2019] [Accepted: 02/21/2020] [Indexed: 06/04/2023]
Abstract
In this work, we present graph-convolutional neural networks for the prediction of binding constants of protein-ligand complexes. We derived the model using multi task learning, where the target variables are the dissociation constant (Kd), inhibition constant (Ki), and half maximal inhibitory concentration (IC50). Being rigorously trained on the PDBbind dataset, the model achieves the Pearson correlation coefficient of 0.87 and the RMSE value of 1.05 in pK units, outperforming recently developed 3D convolutional neural network model KDEEP.
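As a rough illustration of the two figures of merit quoted in this abstract, the following minimal sketch computes a Pearson correlation coefficient and an RMSE for paired experimental and predicted pK values. The function names `pearson_r` and `rmse` and the sample numbers are illustrative assumptions; they are not taken from the graphDelta paper or its code.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Covariance and variances are left unnormalised; the 1/n factors cancel.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the inputs (here pK)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Made-up pK values, for demonstration only (not data from the paper).
experimental = [5.2, 6.8, 7.4, 4.9, 8.1]
predicted = [5.6, 6.1, 7.9, 5.3, 7.5]
print(pearson_r(experimental, predicted), rmse(experimental, predicted))
```

The two metrics answer different questions: the correlation reflects how well the ranking and trend of affinities are reproduced, while the RMSE reflects the typical size of the error in pK units, so a model can score well on one and poorly on the other.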
Affiliation(s)
- Dmitry S. Karlov
- Skolkovo Institute of Science and Technology, Moscow 143026, Russia
| | - Sergey Sosnin
- Skolkovo Institute of Science and Technology, Moscow 143026, Russia
- Skolkovo Innovation Center, Syntelly LLC, 42 Bolshoy Boulevard, Moscow 143026, Russia
| | - Maxim V. Fedorov
- Skolkovo Institute of Science and Technology, Moscow 143026, Russia
- Skolkovo Innovation Center, Syntelly LLC, 42 Bolshoy Boulevard, Moscow 143026, Russia
- University of Strathclyde, Physics John Anderson Building, 107 Rottenrow East, Glasgow UK G4 0NG, U.K.
| | - Petr Popov
- Skolkovo Institute of Science and Technology, Moscow 143026, Russia
- Moscow Institute of Physics and Technology, Dolgoprudny 141701, Russia
|
86
|
Wójcikowski M, Kukiełka M, Stepniewska-Dziubinska MM, Siedlecki P. Development of a protein-ligand extended connectivity (PLEC) fingerprint and its application for binding affinity predictions. Bioinformatics 2020; 35:1334-1341. [PMID: 30202917 PMCID: PMC6477977 DOI: 10.1093/bioinformatics/bty757] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Received: 03/05/2018] [Revised: 07/11/2018] [Accepted: 09/06/2018] [Indexed: 01/05/2023] Open
Abstract
Motivation: Fingerprints (FPs) are the most common small molecule representation in cheminformatics. There are a wide variety of FPs, and the Extended Connectivity Fingerprint (ECFP) is one of the best-suited for general applications. Despite the overall FP abundance, only a few FPs represent the 3D structure of the molecule, and hardly any encode protein–ligand interactions. Results: Here, we present a Protein–Ligand Extended Connectivity (PLEC) FP that implicitly encodes protein–ligand interactions by pairing the ECFP environments from the ligand and the protein. PLEC FPs were used to construct different machine learning models tailored for predicting protein–ligand affinities (pKi∕d). Even the simplest linear model built on the PLEC FP achieved Rp = 0.817 on the PDBbind v2016 ‘core set’, demonstrating its descriptive power. Availability and implementation: The PLEC FP has been implemented in the Open Drug Discovery Toolkit (https://github.com/oddt/oddt). Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Maciej Wójcikowski
- Institute of Biochemistry and Biophysics PAS, Pawinskiego 5a, Warsaw, Poland
| | - Michał Kukiełka
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Banacha 2, Warsaw, Poland
| | - Pawel Siedlecki
- Institute of Biochemistry and Biophysics PAS, Pawinskiego 5a, Warsaw, Poland
- Department of Systems Biology, University of Warsaw, Miecznikowa 1, Warsaw, Poland
|
87
|
Li H, Sze K, Lu G, Ballester PJ. Machine‐learning scoring functions for structure‐based drug lead optimization. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2020. [DOI: 10.1002/wcms.1465] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Indexed: 12/17/2022]
Affiliation(s)
- Hongjian Li
- CUHK‐SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences Chinese University of Hong Kong Shatin Hong Kong
| | - Kam‐Heung Sze
- CUHK‐SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences Chinese University of Hong Kong Shatin Hong Kong
| | - Gang Lu
- CUHK‐SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences Chinese University of Hong Kong Shatin Hong Kong
| | - Pedro J. Ballester
- Cancer Research Center of Marseille (INSERM U1068, Institut Paoli‐Calmettes, Aix‐Marseille Université UM105, CNRS UMR7258) Marseille France
|
88
|
Baksheeva VE, Nemashkalova EL, Firsov AM, Zalevsky AO, Vladimirov VI, Tikhomirova NK, Philippov PP, Zamyatnin AA, Zinchenko DV, Antonenko YN, Permyakov SE, Zernii EY. Membrane Binding of Neuronal Calcium Sensor-1: Highly Specific Interaction with Phosphatidylinositol-3-Phosphate. Biomolecules 2020; 10:biom10020164. [PMID: 31973069 PMCID: PMC7072451 DOI: 10.3390/biom10020164] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/18/2019] [Revised: 01/15/2020] [Accepted: 01/17/2020] [Indexed: 12/20/2022] Open
Abstract
Neuronal calcium sensors are a family of N-terminally myristoylated membrane-binding proteins possessing a different intracellular localization and thereby targeting unique signaling partner(s). Apart from the myristoyl group, the membrane attachment of these proteins may be modulated by their N-terminal positively charged residues responsible for specific recognition of the membrane components. Here, we examined the interaction of neuronal calcium sensor-1 (NCS-1) with natural membranes of different lipid composition as well as individual phospholipids in form of multilamellar liposomes or immobilized monolayers and characterized the role of myristoyl group and N-terminal lysine residues in membrane binding and phospholipid preference of the protein. NCS-1 binds to photoreceptor and hippocampal membranes in a Ca2+-independent manner and the binding is attenuated in the absence of myristoyl group. Meanwhile, the interaction with photoreceptor membranes is less dependent on myristoylation and more sensitive to replacement of K3, K7, and/or K9 of NCS-1 by glutamic acid, reflecting affinity of the protein to negatively charged phospholipids. Consistently, among the major phospholipids, NCS-1 preferentially interacts with phosphatidylserine and phosphatidylinositol with micromolar affinity and the interaction with the former is inhibited upon mutating of N-terminal lysines of the protein. Remarkably, NCS-1 demonstrates pronounced specific binding to phosphoinositides with high preference for phosphatidylinositol-3-phosphate. The binding does not depend on myristoylation and, unexpectedly, is not sensitive to the charge inversion mutations. Instead, phosphatidylinositol-3-phosphate can be recognized by a specific site located in the N-terminal region of the protein. These data provide important novel insights into the general mechanism of membrane binding of NCS-1 and its targeting to specific phospholipids ensuring involvement of the protein in phosphoinositide-regulated signaling pathways.
Affiliation(s)
- Viktoriia E. Baksheeva
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119992 Moscow, Russia; (V.E.B.); (A.M.F.); (N.K.T.); (P.P.P.); (Y.N.A.)
| | - Ekaterina L. Nemashkalova
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Pushchino, 142290 Moscow Region, Russia; (E.L.N.); (S.E.P.)
| | - Alexander M. Firsov
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119992 Moscow, Russia; (V.E.B.); (A.M.F.); (N.K.T.); (P.P.P.); (Y.N.A.)
| | - Arthur O. Zalevsky
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119992 Moscow, Russia
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, 117997 Moscow, Russia
- Institute of Molecular Medicine, Sechenov First Moscow State Medical University, 119991 Moscow, Russia
| | - Vasily I. Vladimirov
- Branch of Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences in Pushchino, Pushchino, 142290 Moscow Region, Russia; (V.I.V.); (D.V.Z.)
| | - Natalia K. Tikhomirova
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119992 Moscow, Russia; (V.E.B.); (A.M.F.); (N.K.T.); (P.P.P.); (Y.N.A.)
| | - Pavel P. Philippov
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119992 Moscow, Russia; (V.E.B.); (A.M.F.); (N.K.T.); (P.P.P.); (Y.N.A.)
| | - Andrey A. Zamyatnin
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119992 Moscow, Russia; (V.E.B.); (A.M.F.); (N.K.T.); (P.P.P.); (Y.N.A.)
- Institute of Molecular Medicine, Sechenov First Moscow State Medical University, 119991 Moscow, Russia
| | - Dmitry V. Zinchenko
- Branch of Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences in Pushchino, Pushchino, 142290 Moscow Region, Russia; (V.I.V.); (D.V.Z.)
| | - Yuri N. Antonenko
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119992 Moscow, Russia; (V.E.B.); (A.M.F.); (N.K.T.); (P.P.P.); (Y.N.A.)
| | - Sergey E. Permyakov
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Pushchino, 142290 Moscow Region, Russia; (E.L.N.); (S.E.P.)
| | - Evgeni Yu. Zernii
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119992 Moscow, Russia; (V.E.B.); (A.M.F.); (N.K.T.); (P.P.P.); (Y.N.A.)
- Institute of Molecular Medicine, Sechenov First Moscow State Medical University, 119991 Moscow, Russia
- Correspondence: Tel.: +7-495-939-2344
|
89
|
Zhao Z, Xu Y, Zhao Y. SXGBsite: Prediction of Protein-Ligand Binding Sites Using Sequence Information and Extreme Gradient Boosting. Genes (Basel) 2019; 10:E965. [PMID: 31771119 PMCID: PMC6947422 DOI: 10.3390/genes10120965] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/05/2019] [Revised: 10/19/2019] [Accepted: 11/19/2019] [Indexed: 12/13/2022] Open
Abstract
The prediction of protein-ligand binding sites is important in drug discovery and drug design. Protein-ligand binding site prediction computational methods are inexpensive and fast compared with experimental methods. This paper proposes a new computational method, SXGBsite, which includes the synthetic minority over-sampling technique (SMOTE) and the Extreme Gradient Boosting (XGBoost). SXGBsite uses the position-specific scoring matrix discrete cosine transform (PSSM-DCT) and predicted solvent accessibility (PSA) to extract features containing sequence information. A new balanced dataset was generated by SMOTE to improve classifier performance, and a prediction model was constructed using XGBoost. The parallel computing and regularization techniques enabled high-quality and fast predictions and mitigated overfitting caused by SMOTE. An evaluation using 12 different types of ligand binding site independent test sets showed that SXGBsite performs similarly to the existing methods on eight of the independent test sets with a faster computation time. SXGBsite may be applied as a complement to biological experiments.
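The oversampling step named in this abstract can be pictured with a small sketch. The code below illustrates only the core idea of SMOTE, which is to create a synthetic minority-class sample by interpolating between a minority point and one of its nearest minority neighbours. It is not the SXGBsite implementation: the function name `smote_sketch`, its parameters, and the toy points are assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class points.

    Each synthetic point lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    Requires at least two minority points.
    """
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append(
            tuple(b + gap * (q - b) for b, q in zip(base, neighbour))
        )
    return synthetic


# Toy two-dimensional minority class (hypothetical feature vectors).
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(smote_sketch(minority_points, n_new=4))
```

Because every synthetic point is a convex combination of two real minority points, the new samples stay inside the region already occupied by the minority class. That makes the balanced training set easy to build, and it is also why a regularised learner is a sensible companion, as the abstract notes when it mentions mitigating overfitting caused by SMOTE.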
Affiliation(s)
- Yonghong Xu
- School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China
|
90
|
Ptushenko VV, Solovchenko AE, Bychkov AY, Chivkunova OB, Golovin AV, Gorelova OA, Ismagulova TT, Kulik LV, Lobakova ES, Lukyanov AA, Samoilova RI, Scherbakov PN, Selyakh IO, Semenova LR, Vasilieva SG, Baulina OI, Skulachev MV, Kirpichnikov MP. Cationic penetrating antioxidants switch off Mn cluster of photosystem II in situ. PHOTOSYNTHESIS RESEARCH 2019; 142:229-240. [PMID: 31302832 DOI: 10.1007/s11120-019-00657-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/11/2019] [Accepted: 06/27/2019] [Indexed: 06/10/2023]
Abstract
Mitochondria-targeted antioxidants (also known as 'Skulachev Ions' electrophoretically accumulated by mitochondria) exert anti-ageing and ROS-protecting effects well documented in animal and human cells. However, their effects on chloroplast in photosynthetic cells and corresponding mechanisms are scarcely known. For the first time, we describe a dramatic quenching effect of 10-(6-plastoquinonyl)decyl triphenylphosphonium (SkQ1) on chlorophyll fluorescence, apparently mediated by redox interaction of SkQ1 with Mn cluster in Photosystem II (PSII) of chlorophyte microalga Chlorella vulgaris and disabling the oxygen-evolving complex (OEC). Microalgal cells displayed a vigorous uptake of SkQ1 which internal concentration built up to a very high level. Using optical and EPR spectroscopy, as well as electron donors and in silico molecular simulation techniques, we found that SkQ1 molecule can interact with Mn atoms of the OEC in PSII. This stops water splitting giving rise to potent quencher(s), e.g. oxidized reaction centre of PSII. Other components of the photosynthetic apparatus proved to be mostly intact. This effect of the Skulachev ions might help to develop in vivo models of photosynthetic cells with impaired OEC function but essentially intact otherwise. The observed phenomenon suggests that SkQ1 can be applied to study stress-induced damages to OEC in photosynthetic organisms.
Affiliation(s)
- Vasily V Ptushenko
  - A.N. Belozersky Institute of Physical-Chemical Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
  - N.M. Emanuel Institute of Biochemical Physics of RAS, Moscow, Russia, 119334
- Alexei E Solovchenko
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
  - Peoples Friendship University of Russia (RUDN University), Moscow, Russia, 117198
- Andrew Y Bychkov
  - Faculty of Geology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
- Olga B Chivkunova
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
- Andrey V Golovin
  - Faculty of Bioengineering and Bioinformatics, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
  - Institute of Molecular Medicine, Sechenov First Moscow State Medical University, Moscow, Russia, 119991
- Olga A Gorelova
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
- Tatiana T Ismagulova
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
- Leonid V Kulik
  - V.V. Voevodsky Institute of Chemical Kinetics and Combustion of SB RAS, Novosibirsk, Russia, 630090
  - Novosibirsk State University, Pirogova Street 2, Novosibirsk, Russia, 630090
- Elena S Lobakova
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
- Alexandr A Lukyanov
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
- Rima I Samoilova
  - V.V. Voevodsky Institute of Chemical Kinetics and Combustion of SB RAS, Novosibirsk, Russia, 630090
- Pavel N Scherbakov
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
- Irina O Selyakh
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
- Larisa R Semenova
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
- Svetlana G Vasilieva
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
- Olga I Baulina
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
- Maxim V Skulachev
  - A.N. Belozersky Institute of Physical-Chemical Biology, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
  - Institute of Mitoengineering, M.V. Lomonosov Moscow State University, Moscow, Russia, 119234
|
91
|
Stepniewska-Dziubinska MM, Zielenkiewicz P, Siedlecki P. Development and evaluation of a deep learning model for protein-ligand binding affinity prediction. Bioinformatics 2019; 34:3666-3674. [PMID: 29757353 PMCID: PMC6198856 DOI: 10.1093/bioinformatics/bty374] [Citation(s) in RCA: 269] [Impact Index Per Article: 53.8] [Received: 01/03/2018] [Accepted: 05/04/2018] [Indexed: 11/24/2022] Open
Abstract
Motivation
Structure based ligand discovery is one of the most successful approaches for augmenting the drug discovery process. Currently, there is a notable shift towards machine learning (ML) methodologies to aid such procedures. Deep learning has recently gained considerable attention as it allows the model to ‘learn’ to extract features that are relevant for the task at hand.
Results
We have developed a novel deep neural network estimating the binding affinity of ligand–receptor complexes. The complex is represented with a 3D grid, and the model utilizes a 3D convolution to produce a feature map of this representation, treating the atoms of both proteins and ligands in the same manner. Our network was tested on the CASF-2013 ‘scoring power’ benchmark and Astex Diverse Set and outperformed classical scoring functions.
Availability and implementation
The model, together with usage instructions and examples, is available as a git repository at http://gitlab.com/cheminfIBB/pafnucy.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Piotr Zielenkiewicz
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland
  - Department of Systems Biology, Institute of Experimental Plant Biology and Biotechnology, University of Warsaw, Warsaw, Poland
- Pawel Siedlecki
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland
  - Department of Systems Biology, Institute of Experimental Plant Biology and Biotechnology, University of Warsaw, Warsaw, Poland
|
92
|
Gheyouche E, Launay R, Lethiec J, Labeeuw A, Roze C, Amossé A, Téletchéa S. DockNmine, a Web Portal to Assemble and Analyse Virtual and Experimental Interaction Data. Int J Mol Sci 2019; 20:E5062. [PMID: 31614716 PMCID: PMC6829441 DOI: 10.3390/ijms20205062] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/09/2019] [Revised: 10/03/2019] [Accepted: 10/07/2019] [Indexed: 12/22/2022] Open
Abstract
Scientists have to perform multiple experiments producing qualitative and quantitative data to determine if a compound is able to bind to a given target. Due to the large diversity of the potential ligand chemical space, the possibility of experimentally exploring a lot of compounds on a target rapidly becomes out of reach. Scientists therefore need to use virtual screening methods to determine the putative binding mode of ligands on a protein and then post-process the raw docking experiments with a dedicated scoring function in relation with experimental data. Two of the major difficulties for comparing docking predictions with experiments mostly come from the lack of transferability of experimental data and the lack of standardisation in molecule names. Although large portals like PubChem or ChEMBL are available for general purpose, there is no service allowing a formal expert annotation of both experimental data and docking studies. To address these issues, researchers build their own collection of data in flat files, often in spreadsheets, with limited possibilities of extensive annotations or standardisation of ligand descriptions allowing cross-database retrieval. We have conceived the dockNmine platform to provide a service allowing an expert and authenticated annotation of ligands and targets. First, this portal allows a scientist to incorporate controlled information in the database using reference identifiers for the protein (Uniprot ID) and the ligand (SMILES description), the data and the publication associated to it. Second, it allows the incorporation of docking experiments using forms that automatically parse useful parameters and results. Last, the web interface provides a lot of pre-computed outputs to assess the degree of correlations between docking experiments and experimental data.
Affiliation(s)
- Ennys Gheyouche
  - UFIP, Université de Nantes, UMR CNRS 6286, 2 rue de la Houssinière, 44322 Nantes, France
- Romain Launay
  - UFIP, Université de Nantes, UMR CNRS 6286, 2 rue de la Houssinière, 44322 Nantes, France
- Jean Lethiec
  - UFIP, Université de Nantes, UMR CNRS 6286, 2 rue de la Houssinière, 44322 Nantes, France
- Antoine Labeeuw
  - UFIP, Université de Nantes, UMR CNRS 6286, 2 rue de la Houssinière, 44322 Nantes, France
- Caroline Roze
  - UFIP, Université de Nantes, UMR CNRS 6286, 2 rue de la Houssinière, 44322 Nantes, France
- Alan Amossé
  - UFIP, Université de Nantes, UMR CNRS 6286, 2 rue de la Houssinière, 44322 Nantes, France
- Stéphane Téletchéa
  - UFIP, Université de Nantes, UMR CNRS 6286, 2 rue de la Houssinière, 44322 Nantes, France
|
93
|
Zheng L, Fan J, Mu Y. OnionNet: a Multiple-Layer Intermolecular-Contact-Based Convolutional Neural Network for Protein-Ligand Binding Affinity Prediction. ACS Omega 2019; 4:15956-15965. [PMID: 31592466 PMCID: PMC6776976 DOI: 10.1021/acsomega.9b01997] [Citation(s) in RCA: 132] [Impact Index Per Article: 26.4] [Received: 07/01/2019] [Accepted: 09/06/2019] [Indexed: 05/12/2023]
Abstract
Computational drug discovery provides an efficient tool for helping large-scale lead molecule screening. One of the major tasks of lead discovery is identifying molecules with promising binding affinities toward a target, a protein in general. The accuracies of current scoring functions that are used to predict the binding affinity are not satisfactory enough. Thus, machine learning or deep learning based methods have been developed recently to improve the scoring functions. In this study, a deep convolutional neural network model (called OnionNet) is introduced; its features are based on rotation-free element-pair-specific contacts between ligands and protein atoms, and the contacts are further grouped into different distance ranges to cover both the local and nonlocal interaction information between the ligand and the protein. The prediction power of the model is evaluated and compared with other scoring functions using the comparative assessment of scoring functions (CASF-2013) benchmark and the v2016 core set of the PDBbind database. The robustness of the model is further explored by predicting the binding affinities of the complexes generated from docking simulations instead of experimentally determined PDB structures.
Affiliation(s)
- Liangzhen Zheng
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
- Jingrong Fan
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
- Yuguang Mu
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
|
94
|
Boyles F, Deane CM, Morris GM. Learning from the ligand: using ligand-based features to improve binding affinity prediction. Bioinformatics 2019; 36:758-764. [DOI: 10.1093/bioinformatics/btz665] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 05/22/2019] [Revised: 08/14/2019] [Accepted: 08/21/2019] [Indexed: 12/27/2022] Open
Abstract
Motivation
Machine learning scoring functions for protein–ligand binding affinity prediction have been found to consistently outperform classical scoring functions. Structure-based scoring functions for universal affinity prediction typically use features describing interactions derived from the protein–ligand complex, with limited information about the chemical or topological properties of the ligand itself.
Results
We demonstrate that the performance of machine learning scoring functions is consistently improved by the inclusion of diverse ligand-based features. For example, a Random Forest (RF) combining the features of RF-Score v3 with RDKit molecular descriptors achieved Pearson correlation coefficients of up to 0.836, 0.780 and 0.821 on the PDBbind 2007, 2013 and 2016 core sets, respectively, compared to 0.790, 0.746 and 0.814 when using the features of RF-Score v3 alone. Excluding proteins and/or ligands that are similar to those in the test sets from the training set has a significant effect on scoring function performance, but does not remove the predictive power of ligand-based features. Furthermore, an RF using only ligand-based features is predictive at a level similar to classical scoring functions, and it appears to be predicting the mean binding affinity of a ligand for its protein targets.
Availability and implementation
Data and code to reproduce all the results are freely available at http://opig.stats.ox.ac.uk/resources.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Fergus Boyles
  - Department of Statistics, University of Oxford, Oxford, UK
|
95
|
Bultum LE, Woyessa AM, Lee D. ETM-DB: integrated Ethiopian traditional herbal medicine and phytochemicals database. BMC Complement Altern Med 2019; 19:212. [PMID: 31412866 PMCID: PMC6692943 DOI: 10.1186/s12906-019-2634-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 01/29/2019] [Accepted: 08/08/2019] [Indexed: 11/27/2022]
Abstract
Background
Recently, there has been an increasing tendency to go back to nature in search of new medicines. To facilitate this, a great deal of effort has been made to compile information on natural products worldwide, and as a result, many ethnic-based traditional medicine databases have been developed. In Ethiopia, there are more than 80 ethnic groups, each having their indigenous knowledge on the use of traditional medicine. About 80% of the population uses traditional medicine for primary health care. Despite this, there is no structured online database for Ethiopian traditional medicine, which limits drug discovery research using natural products from this country.
Description
To develop ETM-DB, online research articles, theses, books, and public databases containing Ethiopian herbal medicine and phytochemicals information were searched. These resources were thoroughly inspected and the necessary data were extracted. Then, we developed a comprehensive online relational database which contains information on 1054 Ethiopian medicinal herbs with 1465 traditional therapeutic uses, 573 multi-herb prescriptions, 4285 compounds, 11,621 human target gene/proteins, covering 5779 herb-phenotype, 1879 prescription-herb, 16,426 herb-compound, 105,202 compound-phenotype, 162,632 compound-gene/protein, and 16,584 phenotype-gene/protein relationships. Using various cheminformatics tools, we obtained predicted physicochemical and absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of ETM-DB compounds. We also evaluated drug-likeness properties of these compounds using the FAF-Drugs4 webserver. Of the 4285 compounds, 4080 passed the FAF-Drugs4 input data curation stage, of which 876 were found to have acceptable drug-likeness properties.
Conclusion
ETM-DB is the largest, freely accessible, web-based integrated resource on Ethiopian traditional medicine. It provides traditional herbal medicine entities and their relationships in well-structured forms including reference to the sources. The ETM-DB website interface allows users to search the entities using various options provided by the search menu. We hope that our database will expedite drug discovery and development research from Ethiopian natural products as it contains information on the chemical composition and related human target gene/proteins. The current version of ETM-DB is openly accessible at http://biosoft.kaist.ac.kr/etm.
|
96
|
Advancing Drug Discovery via Artificial Intelligence. Trends Pharmacol Sci 2019; 40:592-604. [DOI: 10.1016/j.tips.2019.06.004] [Citation(s) in RCA: 164] [Impact Index Per Article: 32.8] [Received: 03/07/2019] [Revised: 05/23/2019] [Accepted: 06/11/2019] [Indexed: 01/15/2023]
|
97
|
Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem Rev 2019; 119:10520-10594. [PMID: 31294972 DOI: 10.1021/acs.chemrev.8b00728] [Citation(s) in RCA: 340] [Impact Index Per Article: 68.0] [Indexed: 02/05/2023]
Abstract
Artificial intelligence (AI), and, in particular, deep learning as a subcategory of AI, provides opportunities for the discovery and development of innovative drugs. Various machine learning approaches have recently (re)emerged, some of which may be considered instances of domain-specific AI which have been successfully employed for drug discovery and design. This review provides a comprehensive portrayal of these machine learning techniques and of their applications in medicinal chemistry. After introducing the basic principles, alongside some application notes, of the various machine learning algorithms, the current state of the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects. Finally, several challenges and limitations of the current methods are summarized, with a view to potential future directions for AI-assisted drug discovery and design.
Affiliation(s)
- Xin Yang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Yifei Wang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Ryan Byrne
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Gisbert Schneider
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Shengyong Yang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
|
98
|
Shen C, Ding J, Wang Z, Cao D, Ding X, Hou T. From machine learning to deep learning: Advances in scoring functions for protein–ligand docking. Wiley Interdisciplinary Reviews: Computational Molecular Science 2019. [DOI: 10.1002/wcms.1429] [Citation(s) in RCA: 76] [Impact Index Per Article: 15.2] [Indexed: 12/18/2022]
Affiliation(s)
- Chao Shen
  - Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University Hangzhou P. R. China
- Junjie Ding
  - Beijing Institute of Pharmaceutical Chemistry Beijing P. R. China
- Zhe Wang
  - Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University Hangzhou P. R. China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University Changsha P. R. China
- Xiaoqin Ding
  - Beijing Institute of Pharmaceutical Chemistry Beijing P. R. China
- Tingjun Hou
  - Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University Hangzhou P. R. China
|
99
|
Skalic M, Jiménez J, Sabbadin D, De Fabritiis G. Shape-Based Generative Modeling for de Novo Drug Design. J Chem Inf Model 2019; 59:1205-1214. [PMID: 30762364 DOI: 10.1021/acs.jcim.8b00706] [Citation(s) in RCA: 112] [Impact Index Per Article: 22.4] [Indexed: 01/19/2023]
Abstract
In this work, we propose a machine learning approach to generate novel molecules starting from a seed compound, its three-dimensional (3D) shape, and its pharmacophoric features. The pipeline draws inspiration from generative models used in image analysis and represents a first example of the de novo design of lead-like molecules guided by shape-based features. A variational autoencoder is used to perturb the 3D representation of a compound, followed by a system of convolutional and recurrent neural networks that generate a sequence of SMILES tokens. The generative design of novel scaffolds and functional groups can cover unexplored regions of chemical space that still possess lead-like properties.
Affiliation(s)
- Miha Skalic
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr Aiguader 88, 08003 Barcelona, Spain
- José Jiménez
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr Aiguader 88, 08003 Barcelona, Spain
- Davide Sabbadin
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr Aiguader 88, 08003 Barcelona, Spain
- Gianni De Fabritiis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr Aiguader 88, 08003 Barcelona, Spain
  - Acellera, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain
|
100
|
PeptoGrid-Rescoring Function for AutoDock Vina to Identify New Bioactive Molecules from Short Peptide Libraries. Molecules 2019; 24:molecules24020277. [PMID: 30642123 PMCID: PMC6359344 DOI: 10.3390/molecules24020277] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/26/2018] [Revised: 01/05/2019] [Accepted: 01/09/2019] [Indexed: 11/20/2022] Open
Abstract
Peptides are promising drug candidates due to high specificity and standout safety. Identification of bioactive peptides de novo using molecular docking is a widely used approach. However, current scoring functions are poorly optimized for peptide ligands. In this work, we present a novel algorithm PeptoGrid that rescores poses predicted by AutoDock Vina according to frequency information of ligand atoms with particular properties appearing at different positions in the target protein’s ligand binding site. We explored the relevance of PeptoGrid ranking with a virtual screening of peptide libraries using angiotensin-converting enzyme and GABAB receptor as targets. A reasonable agreement between the computational and experimental data suggests that PeptoGrid is suitable for discovering functional leads.
|