1
|
Yang R, Zha X, Gao X, Wang K, Cheng B, Yan B. Multi-stage virtual screening of natural products against p38α mitogen-activated protein kinase: predictive modeling by machine learning, docking study and molecular dynamics simulation. Heliyon 2022; 8:e10495. [PMID: 36105464 PMCID: PMC9465123 DOI: 10.1016/j.heliyon.2022.e10495] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/19/2022] [Revised: 03/20/2022] [Accepted: 08/25/2022] [Indexed: 11/20/2022]
Abstract
p38α is a mitogen-activated protein kinase (MAPK), and the signaling pathways involved are closely related to the inflammation, apoptosis and differentiation of cells, which also makes it an attractive target for drug discovery. With the high efficiency and low cost, virtual screening technology is becoming an indispensable part of drug development. In this study, a novel multi-stage virtual screening method based on machine learning, molecular docking and molecular dynamics simulation was developed to identify p38α MAPK inhibitors from natural products in ZINC database, which improves the prediction accuracy by considering and utilizing both ligand and receptor information compared to any individual approach. Ultimately, we screened out two candidate inhibitors with acceptable ADMET properties (ZINC4260400 and ZINC8300300). Among the generated machine learning models, Random Forest (RF) and Support Vector Machine (SVM) performed better, with the area under the receiver operating characteristic curve (AUC) values of 0.932 and 0.931 on the test set, as well as 0.834 and 0.850 on the external validation set. In addition, the results of molecular docking and ADMET prediction showed that two compounds with appropriate pharmacokinetic properties had binding free energies less than −8.0 kcal/mol for the target protein, and the results of molecular dynamics simulations further confirmed that they were stable during the process of inhibition.
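The AUC values quoted in this abstract measure how reliably a classifier ranks active compounds above inactive ones. As a minimal sketch (not the authors' code), the Python below computes ROC AUC from its pairwise-ranking definition. The function name `roc_auc` and the toy labels and scores are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    labels: 1 for active, 0 for inactive (hypothetical toy data below).
    scores: classifier scores, higher meaning "more likely active".
    Ties between an active and an inactive count as half a correct pair.
    """
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    correct = sum((a > i) + 0.5 * (a == i) for a in actives for i in inactives)
    return correct / (len(actives) * len(inactives))


# Hypothetical example: three actives and three inactives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of compounds, so it suits small illustrations; for large screening sets a sorting-based implementation from an established library would be the usual choice.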
|
2
|
Qiu T, Qiu J, Feng J, Wu D, Yang Y, Tang K, Cao Z, Zhu R. The recent progress in proteochemometric modelling: focusing on target descriptors, cross-term descriptors and application scope. Brief Bioinform 2016; 18:125-136. [PMID: 26873661 DOI: 10.1093/bib/bbw004] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 10/26/2015] [Revised: 12/09/2015] [Indexed: 12/17/2022]
Abstract
As an extension of the conventional quantitative structure activity relationship models, proteochemometric (PCM) modelling is a computational method that can predict the bioactivity relations between multiple ligands and multiple targets. Traditional PCM modelling includes three essential elements: descriptors (including target descriptors, ligand descriptors and cross-term descriptors), bioactivity data and appropriate learning functions that link the descriptors to the bioactivity data. Since its appearance, PCM modelling has developed rapidly over the past decade by taking advantage of the progress of different descriptors and machine learning techniques, along with the increasing amounts of available bioactivity data. Specifically, the new emerging target descriptors and cross-term descriptors not only significantly increased the performance of PCM modelling but also expanded its application scope from traditional protein-ligand interaction to more abundant interactions, including protein-peptide, protein-DNA and even protein-protein interactions. In this review, target descriptors and cross-term descriptors, as well as the corresponding application scope, are intensively summarized. Additionally, we look forward to seeing PCM modelling extend into new application scopes, such as Target-Catalyst-Ligand systems, with the further development of descriptors, machine learning techniques and increasing amounts of available bioactivity data.
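To make the three descriptor blocks concrete, here is a minimal Python sketch of how a PCM feature vector might be assembled when the cross-term is the element-wise multiplication of ligand and target descriptors. The function name `pcm_features` and the descriptor values are hypothetical, and real PCM studies use many other descriptor and cross-term choices.

```python
from itertools import product


def pcm_features(ligand, target):
    """Build one PCM feature vector for a ligand-target pair.

    The vector concatenates ligand descriptors, target descriptors, and
    multiplication cross-terms (every ligand descriptor times every target
    descriptor). This is only one of several possible cross-term schemes.
    """
    cross_terms = [l * t for l, t in product(ligand, target)]
    return list(ligand) + list(target) + cross_terms


# Hypothetical descriptor values for one ligand and one target.
ligand = [0.5, 2.0]
target = [1.0, -1.0, 3.0]
features = pcm_features(ligand, target)
print(len(features))  # 2 ligand + 3 target + 2*3 cross-terms
```

Each ligand-target pair with a measured bioactivity contributes one such row, and a learning function is then fitted to the rows.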
|
3
|
Screening Ingredients from Herbs against Pregnane X Receptor in the Study of Inductive Herb-Drug Interactions: Combining Pharmacophore and Docking-Based Rank Aggregation. BIOMED RESEARCH INTERNATIONAL 2015; 2015:657159. [PMID: 26339628 PMCID: PMC4538340 DOI: 10.1155/2015/657159] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 09/04/2014] [Revised: 12/22/2014] [Accepted: 12/27/2014] [Indexed: 01/30/2023]
Abstract
The issue of herb-drug interactions has been widely reported. Herbal ingredients can activate nuclear receptors and thereby alter the expression of genes encoding drug-metabolizing enzymes and/or transporters. Consequently, herb-drug interactions can occur when herbs and drugs are coadministered; interactions of this kind are called inductive herb-drug interactions. Pregnane X Receptor (PXR) and drug-metabolizing target genes are involved in most inductive herb-drug interactions. Because the relations of drugs with their metabolizing enzymes are well studied, predicting this kind of interaction can be simplified to screening herbs for agonists of PXR. Here, a combined in silico strategy of pharmacophore modelling and docking-based rank aggregation (DRA) was employed to identify PXR agonists. First, 305 of 820 ingredients were selected as candidate PXR agonists with our pharmacophore model. Second, DRA was used to rerank the results of pharmacophore filtering. To validate our prediction, a curated herb-drug interaction database recording 380 herb-drug interactions was built. Finally, among the top 10 herb ingredients in the ranking list, 6 were reported to be involved in herb-drug interactions. The accuracy of our method is higher than that of other traditional methods. The strategy could be extended to studies of other inductive herb-drug interactions.
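The abstract does not say how the docking-based rank aggregation is computed, so the following is only a generic illustration of the idea: several ranked lists of the same candidates are merged into one consensus ranking. The sketch uses a Borda-style sum of positions, which is a swapped-in textbook technique and not necessarily the authors' method. The function name and the ingredient labels are hypothetical.

```python
def borda_aggregate(rankings):
    """Merge best-first rankings of the same items into a consensus list.

    Each item receives the sum of its zero-based positions across all
    rankings; a lower total means a better consensus rank. Ties are broken
    alphabetically so the result is deterministic.
    """
    items = rankings[0]
    totals = {item: 0 for item in items}
    for ranking in rankings:
        for position, item in enumerate(ranking):
            totals[item] += position
    return sorted(items, key=lambda item: (totals[item], item))


# Hypothetical rankings of four ingredients from three docking runs.
rankings = [
    ["A", "B", "C", "D"],
    ["B", "A", "D", "C"],
    ["A", "C", "B", "D"],
]
consensus = borda_aggregate(rankings)
print(consensus)
```

Summing positions rewards candidates that rank consistently well across lists, which is the usual motivation for aggregating over several scoring functions or receptor structures.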
|
4
|
Kumar A, Zhang KYJ. Hierarchical virtual screening approaches in small molecule drug discovery. Methods 2015; 71:26-37. [PMID: 25072167 PMCID: PMC7129923 DOI: 10.1016/j.ymeth.2014.07.007] [Citation(s) in RCA: 95] [Impact Index Per Article: 10.6] [Received: 05/29/2014] [Revised: 07/16/2014] [Accepted: 07/17/2014] [Indexed: 02/06/2023]
Abstract
Virtual screening has played a significant role in the discovery of small molecule inhibitors of therapeutic targets in the last two decades. Various ligand and structure-based virtual screening approaches are employed to identify small molecule ligands for proteins of interest. These approaches are often combined in either a hierarchical or a parallel manner to take advantage of the strengths and avoid the limitations associated with individual methods. Hierarchical combination of ligand and structure-based virtual screening approaches has received noteworthy success in numerous drug discovery campaigns. In hierarchical virtual screening, several filters using ligand and structure-based approaches are sequentially applied to reduce a large screening library to a number small enough for experimental testing. In this review, we focus on different hierarchical virtual screening strategies and their application in the discovery of small molecule modulators of important drug targets. Several virtual screening studies are discussed to demonstrate the successful application of hierarchical virtual screening in small molecule drug discovery.
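The sequential-filter idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the compound records, the `similarity` and `dock` fields, and the cutoffs are all hypothetical, and a real campaign would plug in actual ligand-based and structure-based scoring tools at each stage.

```python
def hierarchical_screen(library, stages):
    """Apply scoring stages in sequence, keeping the top compounds each time.

    stages is a list of (score_function, keep) pairs; a higher score is
    better. Returns the final survivors and the library size after each stage.
    """
    survivors = list(library)
    counts = [len(survivors)]
    for score, keep in stages:
        survivors = sorted(survivors, key=score, reverse=True)[:keep]
        counts.append(len(survivors))
    return survivors, counts


# Hypothetical library: ligand similarity (higher is better) and
# docking energy in kcal/mol (more negative is better).
library = [
    {"id": "c1", "similarity": 0.9, "dock": -7.0},
    {"id": "c2", "similarity": 0.8, "dock": -9.5},
    {"id": "c3", "similarity": 0.4, "dock": -10.0},
    {"id": "c4", "similarity": 0.7, "dock": -6.0},
    {"id": "c5", "similarity": 0.2, "dock": -8.0},
]
stages = [
    (lambda c: c["similarity"], 3),  # cheap ligand-based filter first
    (lambda c: -c["dock"], 1),       # costlier structure-based filter last
]
hits, counts = hierarchical_screen(library, stages)
print([c["id"] for c in hits], counts)
```

Ordering the stages from cheapest to most expensive is the design choice that makes the hierarchy worthwhile: the costly step only sees the compounds that survived the earlier filters. In this toy data, compound c3 has the best docking energy but is discarded at the first stage, which illustrates the trade-off a hierarchical scheme accepts.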
Affiliation(s)
- Ashutosh Kumar
  - Structural Bioinformatics Team, Center for Life Science Technologies, RIKEN, 1-7-22 Suehiro, Tsurumi, Yokohama, Kanagawa 230-0045, Japan
- Kam Y J Zhang
  - Structural Bioinformatics Team, Center for Life Science Technologies, RIKEN, 1-7-22 Suehiro, Tsurumi, Yokohama, Kanagawa 230-0045, Japan
|
5
|
Tian C, Zhu R, Zhu L, Qiu T, Cao Z, Kang T. Potassium Channels: Structures, Diseases, and Modulators. Chem Biol Drug Des 2013; 83:1-26. [DOI: 10.1111/cbdd.12237] [Citation(s) in RCA: 63] [Impact Index Per Article: 5.7] [Indexed: 12/15/2022]
Affiliation(s)
- Chuan Tian
  - School of Life Sciences and Technology; Tongji University; Shanghai 200092 China
  - School of Pharmacy; Liaoning University of Traditional Chinese Medicine; Dalian Liaoning 116600 China
- Ruixin Zhu
  - School of Life Sciences and Technology; Tongji University; Shanghai 200092 China
- Lixin Zhu
  - Department of Pediatrics; Digestive Diseases and Nutrition Center; The State University of New York at Buffalo; Buffalo NY 14226 USA
- Tianyi Qiu
  - School of Life Sciences and Technology; Tongji University; Shanghai 200092 China
- Zhiwei Cao
  - School of Life Sciences and Technology; Tongji University; Shanghai 200092 China
- Tingguo Kang
  - School of Pharmacy; Liaoning University of Traditional Chinese Medicine; Dalian Liaoning 116600 China
|
6
|
Gao J, Huang Q, Wu D, Zhang Q, Zhang Y, Chen T, Liu Q, Zhu R, Cao Z, He Y. Study on human GPCR–inhibitor interactions by proteochemometric modeling. Gene 2013; 518:124-31. [DOI: 10.1016/j.gene.2012.11.061] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 10/02/2012] [Accepted: 11/27/2012] [Indexed: 11/15/2022]
|
7
|
Wu Q, Kang H, Tian C, Huang Q, Zhu R. Binding Mechanism of Inhibitors to CDK5/p25 Complex: Free Energy Calculation and Ranking Aggregation Analysis. Mol Inform 2013; 32:251-60. [DOI: 10.1002/minf.201200139] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/11/2012] [Accepted: 01/17/2013] [Indexed: 11/11/2022]
|
8
|
Zhang Y, Baker SS, Baker RD, Zhu R, Zhu L. Systematic analysis of the gene expression in the livers of nonalcoholic steatohepatitis: implications on potential biomarkers and molecular pathological mechanism. PLoS One 2012; 7:e51131. [PMID: 23300535 PMCID: PMC3530598 DOI: 10.1371/journal.pone.0051131] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 09/07/2012] [Accepted: 10/31/2012] [Indexed: 02/07/2023]
Abstract
Non-alcoholic steatohepatitis (NASH) is a severe form of non-alcoholic fatty liver disease (NAFLD). The molecular pathological mechanism of NASH is poorly understood. Recently, high throughput data such as microarray data together with bioinformatics methods have become a powerful way to identify biomarkers and to investigate pathogenesis of diseases. Taking advantage of well characterized microarray datasets of NASH livers, we performed a systematic analysis of potential biomarkers and possible pathological mechanism of NASH from a bioinformatics perspective. CodeLink Human Whole Genome Bioarrays were analyzed to find differentially expressed genes (DEGs) between controls and NASH patients. Four methods were used to identify DEGs and the intersection of DEGs identified by these methods was subsequently used for both biomarker prediction and molecular pathological mechanism analysis. For biomarker prediction, rank aggregation was used to rank DEGs identified by all these methods according to their significance of different expression. Alcohol dehydrogenase 4 (ADH4) exhibited the highest rank suggesting the most significant differential expression between normal and disease condition. Together with the previous report demonstrating the association between ADH4 and the pathogenesis of NASH, our data suggest that ADH4 could be a potential biomarker for NASH. For molecular pathological mechanism analysis, two clusters of highly correlated annotation terms and genes in these terms were identified based on the intersection of DEGs. Then, pathways enriched with these genes were identified to construct the network. Using this network, both for the first time, amino acid catabolism is implicated to play a pivotal role and urea cycle is implicated to be involved in the development of NASH. The results of our study identified potential biomarkers and suggested possible molecular pathological mechanism of NASH. These findings provide a comprehensive and systematic understanding of the pathogenesis of NASH and may facilitate the diagnosis, prevention and treatment of NASH.
Affiliation(s)
- Yida Zhang
  - Department of Bioinformatics, Tongji University, Shanghai, P.R. China
- Susan S. Baker
  - Digestive Diseases and Nutrition Center, Department of Pediatrics, the State University of New York at Buffalo, Buffalo, New York, United States of America
- Robert D. Baker
  - Digestive Diseases and Nutrition Center, Department of Pediatrics, the State University of New York at Buffalo, Buffalo, New York, United States of America
- Ruixin Zhu
  - Department of Bioinformatics, Tongji University, Shanghai, P.R. China
- Lixin Zhu
  - Digestive Diseases and Nutrition Center, Department of Pediatrics, the State University of New York at Buffalo, Buffalo, New York, United States of America
|
9
|
Dixit A, Verkhivker GM. Integrating ligand-based and protein-centric virtual screening of kinase inhibitors using ensembles of multiple protein kinase genes and conformations. J Chem Inf Model 2012; 52:2501-15. [PMID: 22992037 DOI: 10.1021/ci3002638] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 01/06/2023]
Abstract
The rapidly growing wealth of structural and functional information about kinase genes and kinase inhibitors that is fueled by a significant therapeutic role of this protein family provides a significant impetus for development of targeted computational screening approaches. In this work, we explore an ensemble-based, protein-centric approach that allows for simultaneous virtual ligand screening against multiple kinase genes and multiple kinase receptor conformations. We systematically analyze and compare the results of ligand-based and protein-centric screening approaches using both single-receptor and ensemble-based docking protocols. A panel of protein kinase targets that includes ABL, EGFR, P38, CDK2, TK, and VEGFR2 kinases is used in this comparative analysis. By applying various performance metrics we have shown that ligand-centric shape matching can provide an effective enrichment of active compounds outperforming single-receptor docking screening. However, ligand-based approaches can be highly sensitive to the choice of inhibitor queries. Employment of multiple inhibitor queries combined with parallel selection ranking criteria can improve the performance and efficiency of ligand-based virtual screening. We also demonstrated that replica-exchange Monte Carlo docking with kinome-based ensembles of multiple crystal structures can provide a superior early enrichment on the kinase targets. The central finding of this study is that incorporation of the template-based structural information about kinase inhibitors and protein kinase structures in diverse functional states can significantly enhance the overall performance and robustness of both ligand and protein-centric screening strategies. The results of this study may be useful in virtual screening of kinase inhibitors potentially offering a beneficial spectrum of therapeutic activities across multiple disease states.
Affiliation(s)
- Anshuman Dixit
  - Department of Pharmaceutical Chemistry, School of Pharmacy, The University of Kansas, 2095 Constant Avenue, Lawrence, Kansas 66047, USA
|
10
|
Wu D, Huang Q, Zhang Y, Zhang Q, Liu Q, Gao J, Cao Z, Zhu R. Screening of selective histone deacetylase inhibitors by proteochemometric modeling. BMC Bioinformatics 2012; 13:212. [PMID: 22913517 PMCID: PMC3542186 DOI: 10.1186/1471-2105-13-212] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 03/22/2012] [Accepted: 08/16/2012] [Indexed: 12/11/2022]
Abstract
BACKGROUND: Histone deacetylase (HDAC) is a novel target for the treatment of cancer and it can be classified into three classes, i.e., classes I, II, and IV. Inhibitors that selectively target individual HDACs have proved to be better candidate antitumor drugs. To screen selective HDAC inhibitors, several proteochemometric (PCM) models based on different combinations of three kinds of protein descriptors, two kinds of ligand descriptors and multiplication cross-terms were constructed in our study. RESULTS: The results show that structure similarity descriptors are better than sequence similarity descriptors and geometry descriptors in the characterization of HDACs. Furthermore, the predictive ability was not improved by introducing the cross-terms in our models. Finally, a best PCM model based on protein structure similarity descriptors and 32-dimensional general descriptors was derived (R² = 0.9897, Q²(test) = 0.7542), which shows a powerful ability to screen selective HDAC inhibitors. CONCLUSIONS: Our best model not only predicts the activities of inhibitors for each HDAC isoform, but also screens and distinguishes class-selective inhibitors and even isoform-selective inhibitors; thus it provides a potential way to discover or design novel candidate antitumor drugs with reduced side effects.
Affiliation(s)
- Dingfeng Wu
  - School of Life Sciences and Technology, Tongji University, Shanghai, 200092, P.R. China
- Qi Huang
  - School of Life Sciences and Technology, Tongji University, Shanghai, 200092, P.R. China
- Yida Zhang
  - School of Life Sciences and Technology, Tongji University, Shanghai, 200092, P.R. China
- Qingchen Zhang
  - School of Life Sciences and Technology, Tongji University, Shanghai, 200092, P.R. China
- Qi Liu
  - School of Life Sciences and Technology, Tongji University, Shanghai, 200092, P.R. China
- Jun Gao
  - School of Life Sciences and Technology, Tongji University, Shanghai, 200092, P.R. China
  - School of Information Engineering, Shanghai Maritime University, Shanghai, 201306, P.R. China
- Zhiwei Cao
  - School of Life Sciences and Technology, Tongji University, Shanghai, 200092, P.R. China
- Ruixin Zhu
  - School of Life Sciences and Technology, Tongji University, Shanghai, 200092, P.R. China
  - Institute for Advanced Study of Translational Medicine, Tongji University, Shanghai, 200092, P.R. China
  - School of Pharmacy, Liaoning University of Traditional Chinese Medicine, Dalian, Liaoning, 116600, P.R. China
|
11
|
Proteochemometric modeling of the bioactivity spectra of HIV-1 protease inhibitors by introducing protein-ligand interaction fingerprint. PLoS One 2012; 7:e41698. [PMID: 22848570 PMCID: PMC3407198 DOI: 10.1371/journal.pone.0041698] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 04/05/2012] [Accepted: 06/25/2012] [Indexed: 01/01/2023]
Abstract
HIV-1 protease is one of the main therapeutic targets in HIV. However, a major problem in the treatment of HIV is the rapid emergence of drug-resistant strains. A method that predicts the antiviral capability of compounds against different variants would be particularly helpful to the clinical therapy of AIDS. In our study, proteochemometric (PCM) models were created to study the bioactivity spectra of 92 chemical compounds with 47 unique HIV-1 protease variants. In contrast to other PCM models, which used Multiplication of Ligands and Proteins Descriptors (MLPD) as the cross-term, one new cross-term, i.e. Protein-Ligand Interaction Fingerprint (PLIF), was introduced in our modeling. With different combinations of ligand descriptors, protein descriptors and cross-terms, nine PCM models were obtained, and six of them achieved good predictive abilities (Q²(test) > 0.7). These results showed that the performance of PCM models could be improved when ligand and protein descriptors were complemented by the newly introduced cross-term PLIF. Compared with the conventional cross-term MLPD, the newly introduced PLIF had a better predictive ability. Furthermore, our best model (GD & P & PLIF: Q²(test) = 0.8271) could select those inhibitors which have a broad antiviral activity. In conclusion, our study indicates that proteochemometric modeling with PLIF as the cross-term is a potentially useful way to solve the HIV-1 drug-resistance problem.
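The abstract reports predictive ability as a Q² value on the test set but does not define it. One common convention, assumed here, is one minus the ratio of the squared prediction error to the squared deviation of the test activities from the training-set mean. The sketch below follows that assumption; other definitions use the test-set mean instead, and the function name and numbers are hypothetical.

```python
def q2_test(y_true, y_pred, y_train_mean):
    """External Q^2 under one common convention (an assumption here).

    Q^2 = 1 - sum((y - y_hat)^2) / sum((y - mean_train)^2), where y are the
    observed test-set activities and y_hat the model predictions.
    """
    press = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    total = sum((y - y_train_mean) ** 2 for y in y_true)
    return 1.0 - press / total


# Hypothetical test-set activities (e.g. pKi values) and predictions.
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.5, 6.0, 6.5, 8.0]
q2 = q2_test(y_true, y_pred, y_train_mean=6.5)
print(round(q2, 3))
```

Under this definition a value near 1 means the predictions track the observed activities closely, while a value at or below 0 means the model does no better than predicting the training mean for every compound.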
|
12
|
Comparison of different ranking methods in protein-ligand binding site prediction. Int J Mol Sci 2012; 13:8752-8761. [PMID: 22942732 PMCID: PMC3430263 DOI: 10.3390/ijms13078752] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/14/2012] [Revised: 06/19/2012] [Accepted: 07/02/2012] [Indexed: 11/17/2022]
Abstract
In recent years, although many ligand-binding site prediction methods have been developed, there has still been a great demand to improve the prediction accuracy and compare different prediction algorithms to evaluate their performances. In this work, in order to improve the performance of the protein-ligand binding site prediction method presented in our former study, a comparison of different binding site ranking lists was studied. Four kinds of properties, i.e., pocket size, distance from the protein centroid, sequence conservation and the number of hydrophobic residues, have been chosen as the corresponding ranking criterion respectively. Our studies show that the sequence conservation information helps to rank the real pockets with the most successful accuracy compared to others. At the same time, the pocket size and the distance of binding site from the protein centroid are also found to be helpful. In addition, a multi-view ranking aggregation method, which combines the information among those four properties, was further applied in our study. The results show that a better performance can be achieved by the aggregation of the complementary properties in the prediction of ligand-binding sites.
|