1. Wu J, Chen Y, Wu J, Zhao D, Huang J, Lin M, Wang L. Large-scale comparison of machine learning methods for profiling prediction of kinase inhibitors. J Cheminform 2024; 16:13. [PMID: 38291477 PMCID: PMC10829268 DOI: 10.1186/s13321-023-00799-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 12/22/2023] [Indexed: 02/01/2024] Open
Abstract
Conventional machine learning (ML) and deep learning (DL) play a key role in the selectivity prediction of kinase inhibitors. A number of models based on available datasets can be used to predict the kinase profile of compounds, but the relative advantages and disadvantages of ML and DL for such tasks remain controversial. In this study, we constructed a comprehensive benchmark dataset of kinase inhibitors comprising 141,086 unique compounds and 216,823 well-defined bioassay data points for 354 kinases. We then systematically compared the performance of 12 ML and DL methods on the kinase profiling prediction task. Extensive experimental results reveal that (1) descriptor-based ML models generally slightly outperform fingerprint-based ML models in predictive performance, and random forest (RF), an ensemble learning approach, displays the best overall predictive performance; (2) single-task graph-based DL models are generally inferior to conventional descriptor- and fingerprint-based ML models, whereas the corresponding multi-task models generally improve the average accuracy of kinase profile prediction; for example, the multi-task FP-GNN model outperforms the conventional descriptor- and fingerprint-based ML models with an average AUC of 0.807; and (3) fusion models based on voting and stacking methods can further improve performance on the kinase profiling prediction task; specifically, the RF::AtomPairs + FP2 + RDKitDes fusion model performs best, with the highest average AUC of 0.825 on the test sets. These findings provide useful guidance for choosing ML and DL methods for kinase profiling prediction tasks. Finally, an online platform called KIPP ( https://kipp.idruglab.cn ) and Python software were developed based on the best models to support kinase profiling prediction, as well as various kinase inhibitor identification tasks including virtual screening, compound repositioning and target fishing.
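As a rough illustration of the voting-based fusion and the AUC metric mentioned in this abstract, the sketch below averages predicted probabilities from several base models (soft voting) and scores the result with a rank-based AUC. It is not the authors' implementation: the function names `roc_auc` and `soft_vote`, the three score lists, and the labels are hypothetical toy values chosen only to show the mechanics.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: the fraction of (active, inactive) pairs in which
    the active compound receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def soft_vote(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Soft voting: average the per-compound probabilities across models."""
    n_models = len(model_scores)
    return [sum(column) / n_models for column in zip(*model_scores)]


# Hypothetical toy data: 1 = active, 0 = inactive against one kinase.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]   # e.g. a model on one feature set
scores_b = [0.6, 0.8, 0.3, 0.4, 0.5, 0.2]   # a second feature set
scores_c = [0.7, 0.6, 0.6, 0.3, 0.4, 0.65]  # a third feature set

fused = soft_vote([scores_a, scores_b, scores_c])
for name, s in [("A", scores_a), ("B", scores_b), ("C", scores_c), ("fused", fused)]:
    print(name, round(roc_auc(labels, s), 3))
```

On this toy set each single model misranks at least one pair, while the averaged scores rank every active above every inactive; real data will not generally behave this cleanly.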
Affiliation(s)
- Jiangxia Wu
  - Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- Yihao Chen
  - Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- Jingxing Wu
  - Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- Duancheng Zhao
  - Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- Jindi Huang
  - Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- MuJie Lin
  - Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- Ling Wang
  - Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
2. Ren Q, Qu N, Sun J, Zhou J, Liu J, Ni L, Tong X, Zhang Z, Kong X, Wen Y, Wang Y, Wang D, Luo X, Zhang S, Zheng M, Li X. KinomeMETA: meta-learning enhanced kinome-wide polypharmacology profiling. Brief Bioinform 2023; 25:bbad461. [PMID: 38113075 PMCID: PMC10729787 DOI: 10.1093/bib/bbad461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Revised: 11/08/2023] [Accepted: 11/22/2023] [Indexed: 12/21/2023] Open
Abstract
Kinase inhibitors are crucial in cancer treatment, but drug resistance and side effects hinder the development of effective drugs. To address these challenges, it is essential to analyze the polypharmacology of kinase inhibitors and identify compounds with highly selective profiles. This study presents KinomeMETA, a framework for profiling the activity of small-molecule kinase inhibitors across a panel of 661 kinases. By training a meta-learner based on a graph neural network and fine-tuning it to create kinase-specific learners, KinomeMETA outperforms benchmark multi-task models and other kinase profiling models. It provides higher accuracy for understudied kinases with limited known data and broader coverage of kinase types, including important mutant kinases. Case studies on the discovery of new scaffold inhibitors for membrane-associated tyrosine- and threonine-specific cdc2-inhibitory kinase and selective inhibitors for fibroblast growth factor receptors demonstrate the role of KinomeMETA in virtual screening and kinome-wide activity profiling. Overall, KinomeMETA has the potential to accelerate kinase drug discovery by more effectively exploring the kinase polypharmacology landscape.
Affiliation(s)
- Qun Ren
  - Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing 210023, China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
- Ning Qu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Jingjing Sun
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Jingyi Zhou
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - School of Physical Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Lingang Laboratory, Shanghai 200031, China
- Jin Liu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
- Lin Ni
  - Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing 210023, China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
- Xiaochu Tong
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Zimei Zhang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
- Xiangtai Kong
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Yiming Wen
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Yitian Wang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Dingyan Wang
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, Hangzhou 330106, China
- Xiaomin Luo
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Sulin Zhang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Mingyue Zheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
  - Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing 210023, China
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Xutong Li
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
3. Bongers BJ, Sijben HJ, Hartog PBR, Tarnovskiy A, IJzerman AP, Heitman LH, van Westen GJP. Proteochemometric Modeling Identifies Chemically Diverse Norepinephrine Transporter Inhibitors. J Chem Inf Model 2023; 63:1745-1755. [PMID: 36926886 PMCID: PMC10052348 DOI: 10.1021/acs.jcim.2c01645] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/18/2023]
Abstract
Solute carriers (SLCs) are relatively underexplored compared to other prominent protein families such as kinases and G protein-coupled receptors. However, proteins from the SLC family play an essential role in various diseases. One such SLC is the high-affinity norepinephrine transporter (NET/SLC6A2). In contrast to most other SLCs, the NET has been relatively well studied. However, the chemical space of known ligands has a low chemical diversity, making it challenging to identify chemically novel ligands. Here, a computational screening pipeline was developed to find new NET inhibitors. The approach increases the chemical space to model for NETs using the chemical space of related proteins that were selected utilizing similarity networks. Prior proteochemometric models added data from related proteins, but here we use a data-driven approach to select the optimal proteins to add to the modeled data set. After optimizing the data set, the proteochemometric model was optimized using stepwise feature selection. The final model was created using a two-step approach combining several proteochemometric machine learning models through stacking. This model was applied to the extensive virtual compound database of Enamine, from which the top predicted 22,000 of the 600 million virtual compounds were clustered to end up with 46 chemically diverse candidates. A subselection of 32 candidates was synthesized and subsequently tested using an impedance-based assay. There were five hit compounds identified (hit rate 16%) with sub-micromolar inhibitory potencies toward NET, which are promising for follow-up experimental research. This study demonstrates a data-driven approach to diversify known chemical space to identify novel ligands and is to our knowledge the first to select this set based on the sequence similarity of related targets.
Affiliation(s)
- Brandon J Bongers
  - Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden 2333 CC, The Netherlands
- Huub J Sijben
  - Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden 2333 CC, The Netherlands
- Peter B R Hartog
  - Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden 2333 CC, The Netherlands
- Adriaan P IJzerman
  - Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden 2333 CC, The Netherlands
- Laura H Heitman
  - Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden 2333 CC, The Netherlands
  - Oncode Institute, Jaarbeursplein 6, Utrecht 3521 AL, The Netherlands
- Gerard J P van Westen
  - Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden 2333 CC, The Netherlands
4. Karasev DA, Sobolev BN, Lagunin AA, Filimonov DA, Poroikov VV. The method predicting interaction between protein targets and small-molecular ligands with the wide applicability domain. Comput Biol Chem 2022; 98:107674. [DOI: 10.1016/j.compbiolchem.2022.107674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2021] [Revised: 03/24/2022] [Accepted: 03/28/2022] [Indexed: 11/03/2022]
5. Zhang S, Lu T, Xu P, Tao Q, Li M, Lu W. Predicting the Formability of Hybrid Organic-Inorganic Perovskites via an Interpretable Machine Learning Strategy. J Phys Chem Lett 2021; 12:7423-7430. [PMID: 34337946 DOI: 10.1021/acs.jpclett.1c01939] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 05/27/2023]
Abstract
Predicting the formability of the perovskite structure for hybrid organic-inorganic perovskites (HOIPs) is a prominent challenge in the search for the required materials from a huge search space. Here, we propose an interpretable strategy combining machine learning with the Shapley additive explanations (SHAP) approach to accelerate the discovery of potential HOIPs. According to the predictions of the best classification model, the top 198 nontoxic candidates with a probability of formability (Pf) of >0.99 are screened from 18,560 virtual samples. The SHAP analysis reveals that the radius and lattice constant of the B site (rB and LCB) are positively related to formability, while the ionic radius of the A site (rA), the tolerance factor (t), and the first ionization energy of the B site (I1B) are negatively related. The significant finding is that stricter ranges of t (0.84-1.12) and of the improved tolerance factor τ (critical value of 6.20) do exist for HOIPs, which differ from those of inorganic perovskites, providing a simple and fast assessment in the design of materials with an HOIP structure.
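To make the t range quoted in this abstract concrete, the sketch below computes the standard Goldschmidt tolerance factor, t = (rA + rX) / (√2 (rB + rX)), and checks it against the 0.84-1.12 window. The abstract does not spell out the formula, so using the Goldschmidt definition here is an assumption, as are the function names and the approximate ionic radii (in Å) for a methylammonium lead iodide-like composition; the improved factor τ is not reproduced.

```python
import math


def tolerance_factor(r_a: float, r_b: float, r_x: float) -> float:
    """Goldschmidt tolerance factor for an ABX3 perovskite (radii in angstroms)."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def in_reported_range(t: float, low: float = 0.84, high: float = 1.12) -> bool:
    """Check t against the 0.84-1.12 window quoted in the abstract."""
    return low <= t <= high


# Illustrative radii only (assumed values, not taken from the source):
# A = methylammonium, B = Pb(2+), X = I(-).
r_a, r_b, r_x = 2.17, 1.19, 2.20

t = tolerance_factor(r_a, r_b, r_x)
print(f"t = {t:.3f}, inside reported range: {in_reported_range(t)}")
```

A screen of this kind is only a quick geometric filter; the study pairs such descriptors with a trained classifier and SHAP analysis.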
Affiliation(s)
- Shilin Zhang
  - Department of Chemistry, College of Sciences, Shanghai University, Shanghai 200444, China
- Tian Lu
  - Materials Genome Institute, Shanghai University, Shanghai 200444, China
- Pengcheng Xu
  - Materials Genome Institute, Shanghai University, Shanghai 200444, China
- Qiuling Tao
  - Department of Chemistry, College of Sciences, Shanghai University, Shanghai 200444, China
- Minjie Li
  - Department of Chemistry, College of Sciences, Shanghai University, Shanghai 200444, China
- Wencong Lu
  - Department of Chemistry, College of Sciences, Shanghai University, Shanghai 200444, China
  - Materials Genome Institute, Shanghai University, Shanghai 200444, China
6. Wang T, Liang L, Zhao C, Sun J, Wang H, Wang W, Lin J, Hu Y. Elucidating direct kinase targets of compound Danshen dropping pills employing archived data and prediction models. Sci Rep 2021; 11:9541. [PMID: 33953309 PMCID: PMC8100098 DOI: 10.1038/s41598-021-89035-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/17/2020] [Accepted: 04/19/2021] [Indexed: 12/17/2022] Open
Abstract
Research on the direct targets of traditional Chinese medicine (TCM) is key to studying its mechanism and material basis, but there are still no effective methods at present. We took Compound Danshen dropping pills (CDDP) as a case study to establish a strategy for identifying significant direct targets of TCM. As a result, thirty potential active kinase targets of CDDP were identified, nine of which had potential dose-dependent effects. In addition, the direct inhibitory effect of CDDP on three kinases, AURKB, MET and PIM1, was observed at both the biochemical and cellular levels, which not only sheds light on the mechanisms of action of CDDP but also suggests its potential for drug repositioning. Our results indicate that the research strategy we built, which includes both in silico models and experimental validation, is relatively efficient and reliable for identifying direct targets of TCM prescriptions, and will help elucidate the mechanisms of TCM and promote its modernization.
Affiliation(s)
- Tongxing Wang
  - GeneNet Pharmaceuticals Co. Ltd., No. 1, Tingjiang West Road, Beichen District, Tianjin, 300410, China
- Lu Liang
  - College of Pharmacy, Nankai University, Haihe Education Park, 38 Tongyan Road, Jinnan District, Tianjin, 300353, China
- Chunlai Zhao
  - GeneNet Pharmaceuticals Co. Ltd., No. 1, Tingjiang West Road, Beichen District, Tianjin, 300410, China
- Jia Sun
  - GeneNet Pharmaceuticals Co. Ltd., No. 1, Tingjiang West Road, Beichen District, Tianjin, 300410, China
- Hairong Wang
  - GeneNet Pharmaceuticals Co. Ltd., No. 1, Tingjiang West Road, Beichen District, Tianjin, 300410, China
- Wenjia Wang
  - GeneNet Pharmaceuticals Co. Ltd., No. 1, Tingjiang West Road, Beichen District, Tianjin, 300410, China
- Jianping Lin
  - College of Pharmacy, Nankai University, Haihe Education Park, 38 Tongyan Road, Jinnan District, Tianjin, 300353, China
- Yunhui Hu
  - GeneNet Pharmaceuticals Co. Ltd., No. 1, Tingjiang West Road, Beichen District, Tianjin, 300410, China
7. Karasev D, Sobolev B, Lagunin A, Filimonov D, Poroikov V. Prediction of Protein-ligand Interaction Based on Sequence Similarity and Ligand Structural Features. Int J Mol Sci 2020; 21:ijms21218152. [PMID: 33142754 PMCID: PMC7663273 DOI: 10.3390/ijms21218152] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/30/2020] [Revised: 10/28/2020] [Accepted: 10/29/2020] [Indexed: 01/09/2023] Open
Abstract
Computational prediction of protein–ligand interactions has three main directions: the search for new target proteins for ligands, the search for new ligands for targets, and the prediction of interactions between new proteins and new ligands. We proposed an approach that provides a fuzzy classification of protein sequences based on ligand structural features in order to analyze the last, most complicated case. We tested our approach on five protein groups, which represent promising targets for drug-like ligands and differ in their functional peculiarities. The training sets were built with an original procedure that overcomes data ambiguity. Our study showed effective prediction of new targets for ligands, with an average accuracy of 0.96. The prediction of new ligands for targets displayed an average accuracy of 0.95; the accuracy estimates were close to our previous results and comparable to or exceeding those of other methods. Using fuzzy coefficients reflecting target-to-ligand specificity, we predicted interactions for new proteins and new ligands; the obtained accuracy values, from 0.89 to 0.99, were acceptable for such a sophisticated task. The protein kinase family case demonstrated the ability to account for subtle features of proteins and ligands required for the specificity of protein–ligand interaction.
Affiliation(s)
- Dmitry Karasev
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
  - Correspondence:
- Boris Sobolev
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
- Alexey Lagunin
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
  - Department of Bioinformatics, Russian National Research Medical University, Moscow 117997, Russia
- Dmitry Filimonov
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
- Vladimir Poroikov
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
8.
9. Hathout RM, Metwally AA, Woodman TJ, Hardy JG. Prediction of Drug Loading in the Gelatin Matrix Using Computational Methods. ACS Omega 2020; 5:1549-1556. [PMID: 32010828 PMCID: PMC6990624 DOI: 10.1021/acsomega.9b03487] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 10/18/2019] [Accepted: 12/31/2019] [Indexed: 05/05/2023]
Abstract
The delivery of drugs is a topic of intense research activity in both academia and industry, with potential for positive economic, health, and societal impacts. Selecting the appropriate formulation (carrier and drug) with optimal delivery is a challenge investigated by researchers in academia and industry, in which millions of dollars are invested annually. Experiments involving different carriers and the determination of their capacity for drug loading are very time-consuming and therefore expensive; consequently, approaches that employ computational/theoretical chemistry to speed up this work have the potential to make hugely beneficial economic, environmental, and health impacts through savings in time and in the costs associated with chemicals (and their safe disposal). Here, we report the use of computational tools (data mining of the available literature, principal component analysis, hierarchical clustering analysis, partial least squares regression, autocovariance calculations, molecular dynamics simulations, and molecular docking) to successfully predict drug loading into model drug delivery systems (gelatin nanospheres). We believe that this methodology has the potential to lead to significant change in drug formulation studies across the world.
Affiliation(s)
- Rania M. Hathout
  - Department of Pharmaceutics and Industrial Pharmacy, Faculty of Pharmacy, Ain Shams University, Cairo 11566, Egypt
  - E-mail: (R.M.H.)
- AbdelKader A. Metwally
  - Department of Pharmaceutics and Industrial Pharmacy, Faculty of Pharmacy, Ain Shams University, Cairo 11566, Egypt
  - Department of Pharmaceutics, Faculty of Pharmacy, Health Sciences Center, Kuwait University, Kuwait 90805, Kuwait
- Timothy J. Woodman
  - Department of Pharmacy and Pharmacology, University of Bath, Bath BA2 7AY, U.K.
- John G. Hardy
  - Department of Chemistry, Lancaster University, Lancaster, Lancashire LA1 4YB, U.K.
  - Materials Science Institute, Lancaster University, Lancaster, Lancashire LA1 4YB, U.K.
  - E-mail: (J.G.H.)
10. Karasev D, Sobolev B, Lagunin A, Filimonov D, Poroikov V. Prediction of Protein-Ligand Interaction Based on the Positional Similarity Scores Derived from Amino Acid Sequences. Int J Mol Sci 2019; 21:ijms21010024. [PMID: 31861473 PMCID: PMC6981593 DOI: 10.3390/ijms21010024] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/31/2019] [Revised: 12/13/2019] [Accepted: 12/16/2019] [Indexed: 12/14/2022] Open
Abstract
The affinity of different drug-like ligands to multiple protein targets reflects general chemical–biological interactions. Computational methods estimating such interactions analyze the available information about the structure of the targets, ligands, or both. Prediction of protein–ligand interactions based on pairwise sequence alignment provides reasonable accuracy if the ligands' specificity coincides well with the phylogenetic taxonomy of the proteins. Methods using multiple alignment require an accurate match of functionally significant residues. Such conditions may not be met in the case of diverged protein families. To overcome these limitations, we propose an approach based on the analysis of local sequence similarity within the set of analyzed proteins. The positional scores, calculated by sequence fragment comparisons, are used as input data for a Bayesian classifier. Our approach provides a prediction accuracy comparable to or exceeding that of other methods. It was demonstrated on the popular Gold Standard test sets, which differ in sequence heterogeneity and range from a group including different protein families to more specific groups. A reasonable prediction accuracy was also found for protein kinases, which display weak relationships between sequence phylogeny and inhibitor specificity. Thus, our method can be applied to the broad area of protein–ligand interactions.
Affiliation(s)
- Dmitry Karasev
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
  - Correspondence:
- Boris Sobolev
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
- Alexey Lagunin
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
  - Department of Bioinformatics, Russian National Research Medical University, Moscow 117997, Russia
- Dmitry Filimonov
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
- Vladimir Poroikov
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
11. Bongers BJ, IJzerman AP, Van Westen GJP. Proteochemometrics - recent developments in bioactivity and selectivity modeling. Drug Discovery Today: Technologies 2019; 32-33:89-98. [PMID: 33386099 DOI: 10.1016/j.ddtec.2020.08.003] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 04/28/2020] [Revised: 08/18/2020] [Accepted: 08/28/2020] [Indexed: 06/12/2023]
Abstract
Proteochemometrics is a machine learning based modeling approach relying on a combination of ligand and protein descriptors. With ongoing developments in machine learning and increases in public data the technique is more frequently applied in early drug discovery, typically in ligand-target binding prediction. Common applications include improvements to single target quantitative structure-activity relationship models, protein selectivity and promiscuity modeling, and large-scale deep learning approaches. The increase in predictive power using proteochemometrics is observed in multi-target bioactivity modeling, opening the door to more extensive studies covering whole protein families. On top of that, with deep learning fueling more complex and larger scale models, proteochemometrics allows faster and higher quality computational models supporting the design, make, test cycle.
Affiliation(s)
- Brandon J Bongers
  - Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, P.O. Box 9502, 2300 RA, Leiden, The Netherlands
- Adriaan P IJzerman
  - Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, P.O. Box 9502, 2300 RA, Leiden, The Netherlands
- Gerard J P Van Westen
  - Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, P.O. Box 9502, 2300 RA, Leiden, The Netherlands
12. Li X, Li Z, Wu X, Xiong Z, Yang T, Fu Z, Liu X, Tan X, Zhong F, Wan X, Wang D, Ding X, Yang R, Hou H, Li C, Liu H, Chen K, Jiang H, Zheng M. Deep Learning Enhancing Kinome-Wide Polypharmacology Profiling: Model Construction and Experiment Validation. J Med Chem 2019; 63:8723-8737. [PMID: 31364850 DOI: 10.1021/acs.jmedchem.9b00855] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Indexed: 12/14/2022]
Abstract
The kinome-wide virtual profiling of small molecules with high-dimensional structure-activity data is a challenging task in drug discovery. Here, we present a virtual profiling model against a panel of 391 kinases based on large-scale bioactivity data and the multitask deep neural network algorithm. The obtained model yields excellent internal prediction capability with an auROC of 0.90 and consistently outperforms conventional single-task models on external tests, especially for kinases with insufficient activity data. Moreover, more rigorous experimental validations including 1410 kinase-compound pairs showed a high-quality average auROC of 0.75 and confirmed many novel predicted "off-target" activities. Given the verified generalizability, the model was further applied to various scenarios for depicting the kinome-wide selectivity and the association with certain diseases. Overall, the computational model enables us to create a comprehensive kinome interaction network for designing novel chemical modulators or drug repositioning and is of practical value for exploring previously less studied kinases.
Affiliation(s)
- Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
| | - Zhaojun Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,School of Information Management, Dezhou University, 566 West University Road, Dezhou 253023, China
| | - Xiaolong Wu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Zhaoping Xiong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,School of Life Science and Technology, ShanghaiTech University, 393 Huaxiazhong Road, Shanghai 200031, China
| | - Tianbiao Yang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
| | - Zunyun Fu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
| | - Xiaohong Liu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,School of Life Science and Technology, ShanghaiTech University, 393 Huaxiazhong Road, Shanghai 200031, China
| | - Xiaoqin Tan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
| | - Feisheng Zhong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
| | - Xiaozhe Wan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
| | - Dingyan Wang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
| | - Xiaoyu Ding
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
| | - Ruirui Yang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,School of Life Science and Technology, ShanghaiTech University, 393 Huaxiazhong Road, Shanghai 200031, China
| | - Hui Hou
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, China
| | - Chunpu Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Hong Liu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Kaixian Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,School of Life Science and Technology, ShanghaiTech University, 393 Huaxiazhong Road, Shanghai 200031, China
| | - Hualiang Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.,School of Life Science and Technology, ShanghaiTech University, 393 Huaxiazhong Road, Shanghai 200031, China
| | - Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
|
13
|
Rasti B, Mazraedoost S, Panahi H, Falahati M, Attar F. New insights into the selective inhibition of the β-carbonic anhydrases of pathogenic bacteria Burkholderia pseudomallei and Francisella tularensis: a proteochemometrics study. Mol Divers 2018; 23:263-273. [PMID: 30120657 DOI: 10.1007/s11030-018-9869-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 06/18/2018] [Accepted: 08/09/2018] [Indexed: 10/28/2022]
Abstract
Antibiotic resistance has become one of the most important health problems worldwide. Biological end point of critical enzymes induced by potent inhibitors has recently been considered a highly effective and popular strategy to defeat antibiotic-resistant pathogens. For instance, the simple but critical β-carbonic anhydrase has recently been at the center of attention in anti-pathogen drug discovery. However, no β-carbonic anhydrase selective inhibitor has yet been developed. Available β-carbonic anhydrase inhibitors are also highly potent against human carbonic anhydrases, leading to severe, unavoidable side effects when used. Therefore, developing novel inhibitors with high selectivity against pathogenic β-carbonic anhydrases is essential. Herein, for the first time, we have conducted a proteochemometric study to explore the structural and chemical aspects of the interactions between bacterial β-carbonic anhydrases and their inhibitors. We have found valuable information that can lead to the design of novel inhibitors with better selectivity for bacterial β-carbonic anhydrases.
Affiliation(s)
- Behnam Rasti
- Department of Microbiology, Faculty of Basic Sciences, Lahijan Branch, Islamic Azad University (IAU), Lahijan, Guilan, Iran.
| | - Sargol Mazraedoost
- Department of Microbiology, Faculty of Basic Sciences, Lahijan Branch, Islamic Azad University (IAU), Lahijan, Guilan, Iran
| | - Hanieh Panahi
- Department of Mathematics and Statistics, Lahijan Branch, Islamic Azad University, Lahijan, Iran
| | - Mojtaba Falahati
- Department of Nanotechnology, Faculty of Advance Science and Technology, Pharmaceutical Sciences Branch, Islamic Azad University (IAUPS), Tehran, Iran
| | - Farnoosh Attar
- Department of Biology, Faculty of Food Industry and Agriculture, Standard Research Institute (SRI), Karaj, Iran
|
14
|
Qiu T, Wu D, Qiu J, Cao Z. Finding the molecular scaffold of nuclear receptor inhibitors through high-throughput screening based on proteochemometric modelling. J Cheminform 2018; 10:21. [PMID: 29651663 PMCID: PMC5897275 DOI: 10.1186/s13321-018-0275-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/27/2017] [Accepted: 04/02/2018] [Indexed: 02/10/2023] Open
Abstract
Nuclear receptors (NR) are a class of proteins that are responsible for sensing steroid and thyroid hormones and certain other molecules. NR can therefore regulate the expression of specific genes and are associated with various diseases, which makes them essential drug targets. Approaches that can predict the inhibitory ability of compounds for different NR targets should be particularly helpful for drug development. In this study, proteochemometric modelling was introduced to analyse the bioactivity between chemical compounds and NR targets. Results illustrated the ability of our PCM model for high-throughput NR-inhibitor screening after evaluation on both internal (AUC > 0.870) and external (AUC > 0.746) validation sets. Moreover, in-silico predicted bioactive compounds were clustered according to structural similarity, and a series of representative molecular scaffolds could be derived for five major NR targets. Through scaffold analysis, the essential bioactive scaffolds of different NR targets can be detected and compared. Generally, the methods and molecular scaffolds proposed in this article can not only help the screening of potential therapeutic NR-inhibitors but also guide future NR-related drug discovery.
Affiliation(s)
- Tianyi Qiu
- School of Life Sciences and Technology, Shanghai 10th People's Hospital, Tongji University, No. 1239 SiPing Road, Shanghai, China.,The Institute of Biomedical Sciences, Fudan University, No. 138 Medical College Road, Shanghai, China
| | - Dingfeng Wu
- School of Life Sciences and Technology, Shanghai 10th People's Hospital, Tongji University, No. 1239 SiPing Road, Shanghai, China
| | - Jingxuan Qiu
- School of Life Sciences and Technology, Shanghai 10th People's Hospital, Tongji University, No. 1239 SiPing Road, Shanghai, China.,School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, No. 516 JunGong Road, Shanghai, China
| | - Zhiwei Cao
- School of Life Sciences and Technology, Shanghai 10th People's Hospital, Tongji University, No. 1239 SiPing Road, Shanghai, China.
|
15
|
Rasti B, Shahangian SS. Proteochemometric modeling of the origin of thymidylate synthase inhibition. Chem Biol Drug Des 2018; 91:1007-1016. [PMID: 29251822 DOI: 10.1111/cbdd.13163] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 07/05/2017] [Revised: 11/09/2017] [Accepted: 12/01/2017] [Indexed: 12/11/2022]
Affiliation(s)
- Behnam Rasti
- Department of Microbiology, Faculty of Basic Sciences, Lahijan Branch, Islamic Azad University (IAU), Lahijan, Guilan, Iran
|
16
|
Raghavendra NM, Pingili D, Kadasi S, Mettu A, Prasad SVUM. Dual or multi-targeting inhibitors: The next generation anticancer agents. Eur J Med Chem 2017; 143:1277-1300. [PMID: 29126724 DOI: 10.1016/j.ejmech.2017.10.021] [Citation(s) in RCA: 156] [Impact Index Per Article: 22.3] [Received: 07/03/2017] [Revised: 10/04/2017] [Accepted: 10/09/2017] [Indexed: 12/17/2022]
Abstract
Dual-targeting/multi-targeting of oncoproteins by a single drug molecule represents an efficient, logical alternative approach to drug combinations. An increasing interest in this approach is indicated by a steady upsurge in the number of articles on targeting dual/multi proteins published in the last 5 years. Combining different inhibitors that each act on a specific single target is the standard treatment for cancer. A new generation of dual or multi-targeting drugs is emerging, where a single chemical entity can act on multiple molecular targets. Dual/multi-targeting agents are beneficial for addressing the limited efficacy, poor safety and resistance profiles of single-target agents. Designing dual/multi-target inhibitors with predefined biological profiles presents a challenge. The latest advances in bioinformatic tools and the availability of detailed structural information on target proteins have shown a way of discovering multi-targeting molecules. This new strategy, which combines the molecular docking of small molecules with protein-based common pharmacophores to design multi-targeting inhibitors, is gaining great importance in anticancer drug discovery. The current review focuses on the discovery of dual-targeting agents in cancer therapy using rational, computational, proteomic, bioinformatic and polypharmacological approaches that enable the discovery and rational design of effective and safe multi-target anticancer agents.
Affiliation(s)
- Nulgumnalli Manjunathaiah Raghavendra
- Center for Technological Development in Health, National Institute of Science and Technology on Innovation on Neglected Diseases, Fiocruz, Rio de Janeiro, Brazil.
| | - Divya Pingili
- Sri Venkateshwara College of Pharmacy, Osmania University, Hyderabad, India; Department of Pharmacy, Jawaharlal Nehru Technological University, Kakinada, India
| | - Sundeep Kadasi
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Osmania University, Hyderabad, India
| | - Akhila Mettu
- Department of Pharmaceutical Chemistry, Gokaraju Rangaraju College of Pharmacy, Osmania University, Hyderabad, India
| | - S V U M Prasad
- Department of Pharmacy, Jawaharlal Nehru Technological University, Kakinada, India
|
17
|
Sorgenfrei FA, Fulle S, Merget B. Kinome-Wide Profiling Prediction of Small Molecules. ChemMedChem 2017; 13:495-499. [PMID: 28544552 DOI: 10.1002/cmdc.201700180] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 03/23/2017] [Revised: 05/20/2017] [Indexed: 12/21/2022]
Abstract
Extensive kinase profiling data, covering more than half of the human kinome, are available nowadays and allow the construction of activity prediction models of high practical utility. Proteochemometric (PCM) approaches use compound and protein descriptors, which enables the extrapolation of bioactivity values to thus far unexplored kinases. In this study, the potential of PCM to make large-scale predictions on the entire kinome is explored, considering the applicability on novel compounds and kinases, including clinically relevant mutants. A rigorous validation indicates high predictive power on left-out kinases and superiority over individual kinase QSAR models for new compounds. Furthermore, external validation on clinically relevant mutant kinases reveals an excellent predictive power for mutations spread across the ATP binding site.
Affiliation(s)
- Frieda A Sorgenfrei
- BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120, Heidelberg, Germany
| | - Simone Fulle
- BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120, Heidelberg, Germany
| | - Benjamin Merget
- BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120, Heidelberg, Germany
|
18
|
Bosc N, Wroblowski B, Meyer C, Bonnet P. Prediction of Protein Kinase-Ligand Interactions through 2.5D Kinochemometrics. J Chem Inf Model 2017; 57:93-101. [PMID: 27983837 DOI: 10.1021/acs.jcim.6b00520] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/29/2022]
Abstract
So far, 518 protein kinases have been identified in the human genome. They share a common mechanism of protein phosphorylation and are involved in many critical biological processes of eukaryotic cells. Deregulation of the kinase phosphorylation function induces severe illnesses such as cancer, diabetes, or inflammatory diseases. Many actors in the pharmaceutical domain have made significant efforts to design potent and selective protein kinase inhibitors as new potential drugs. Because the ATP binding site is highly conserved in the protein kinase family, the design of selective inhibitors remains a challenge and has negatively impacted the progression of drug candidates to late-stage clinical development. The work presented here adopts a 2.5D kinochemometrics (KCM) approach, derived from proteochemometrics (PCM), in which protein kinases are depicted by a novel 3D descriptor and the ligands by 2D fingerprints. We demonstrate in two examples that the protein descriptor successfully classified protein kinases based on their group membership and their Asp-Phe-Gly (DFG) conformation. We also compared the performance of our models with those obtained from a full 2D KCM model and QSAR models. In both cases, the internal validation of the models demonstrated good capabilities to distinguish "active" from "inactive" protein kinase-ligand pairs. However, the external validation performed on two independent data sets showed that the two statistical models tended to overestimate the number of "inactive" pairs.
Affiliation(s)
- Nicolas Bosc
- Institut de Chimie Organique et Analytique (ICOA), UMR CNRS-Université d'Orléans 7311 , Université d'Orléans BP 6759, 45067 Orléans Cedex 2, France
| | - Berthold Wroblowski
- Janssen Research & Development, Janssen Pharmaceutica N.V. , Turnhoutseweg 30, 2340 Beerse, Belgium
| | - Christophe Meyer
- Centre de Recherche Janssen-Cilag , Campus de Maigremont - CS 10615, 27106 Val de Reuil CEDEX, France
| | - Pascal Bonnet
- Institut de Chimie Organique et Analytique (ICOA), UMR CNRS-Université d'Orléans 7311 , Université d'Orléans BP 6759, 45067 Orléans Cedex 2, France
|
19
|
Rasti B, Schaduangrat N, Shahangian SS, Nantasenamat C. Exploring the origin of phosphodiesterase inhibition via proteochemometric modeling. RSC Adv 2017. [DOI: 10.1039/c7ra02332d] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/14/2022] Open
Abstract
A proteochemometric study of a set of phosphodiesterase 4B and 4D inhibitors sheds light on the origin of their inhibition and selectivities.
Affiliation(s)
- Behnam Rasti
- Department of Microbiology, Faculty of Basic Sciences, Lahijan Branch, Islamic Azad University (IAU), Lahijan
| | - Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - S. Shirin Shahangian
- Department of Biology, Faculty of Sciences, University of Guilan, Rasht 41938-33697, Iran
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
|
20
|
Rasti B, Namazi M, Karimi-Jafari MH, Ghasemi JB. Proteochemometric Modeling of the Interaction Space of Carbonic Anhydrase and its Inhibitors: An Assessment of Structure-based and Sequence-based Descriptors. Mol Inform 2016; 36. [PMID: 27860295 DOI: 10.1002/minf.201600102] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 07/19/2015] [Accepted: 10/26/2016] [Indexed: 11/08/2022]
Abstract
Due to its physiological and clinical roles, carbonic anhydrase (CA) is one of the most interesting case studies. There are different classes of CA inhibitors, including sulfonamides, polyamines, coumarins and dithiocarbamates (DTCs). However, many of them hardly act as selective inhibitors of a specific isoform. Therefore, finding highly selective inhibitors for different isoforms of CA is still an ongoing project. Proteochemometric modeling (PCM) is able to model the bioactivity of multiple compounds against different isoforms of a protein. It is therefore well suited to investigating the selectivity of different ligands towards different receptors. Given these facts, we applied PCM to investigate the interaction space and structural properties that lead to the selective inhibition of CA isoforms by some dithiocarbamates. Our models have provided interesting structural information that can be used to design compounds capable of inhibiting different isoforms of CA with improved selectivity. The validity and predictivity of the models were confirmed by both internal and external validation methods, while the Y-scrambling approach was applied to assess the robustness of the models. To demonstrate the reliability and applicability of our findings, we showed how ligand-receptor selectivity can be affected by removing any of these critical findings from the modeling process.
Affiliation(s)
- Behnam Rasti
- Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| | - Mohsen Namazi
- Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| | - M H Karimi-Jafari
- Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| | - Jahan B Ghasemi
- Department of Analytical Chemistry, School of Chemistry, College of Science, University of Tehran, Tehran, Iran
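
The Y-scrambling check named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the one-descriptor least-squares model, the toy data, and the names `fit_line`, `r_squared`, and `y_scramble` are assumptions chosen only to show the idea, which is to refit the model on randomly permuted responses and confirm that the scrambled models score far worse than the real one.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a*x + b for a single descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def r_squared(xs, ys):
    """In-sample coefficient of determination of the fitted line."""
    a, b = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def y_scramble(xs, ys, n_rounds=20, seed=0):
    """Refit on permuted responses and return the R^2 of each scrambled model."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = ys[:]          # copy so the real responses stay intact
        rng.shuffle(shuffled)
        scores.append(r_squared(xs, shuffled))
    return scores

# Hypothetical toy data: one descriptor, activity exactly linear in it.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]

true_r2 = r_squared(xs, ys)
scrambled = y_scramble(xs, ys)
mean_scrambled = sum(scrambled) / len(scrambled)
```

A model that captures a real descriptor-activity relationship should show `true_r2` well above `mean_scrambled`; comparable values would suggest chance correlation.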
|
21
|
Christmann-Franck S, van Westen GJP, Papadatos G, Beltran Escudie F, Roberts A, Overington JP, Domine D. Unprecedently Large-Scale Kinase Inhibitor Set Enabling the Accurate Prediction of Compound-Kinase Activities: A Way toward Selective Promiscuity by Design? J Chem Inf Model 2016; 56:1654-75. [PMID: 27482722 PMCID: PMC5039764 DOI: 10.1021/acs.jcim.6b00122] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 12/12/2022]
Abstract
Drug discovery programs frequently target members of the human kinome and try to identify small-molecule protein kinase inhibitors, primarily for cancer treatment, with additional indications being increasingly investigated. One of the challenges is controlling the inhibitors' degree of selectivity, assessed by in vitro profiling against panels of protein kinases. We manually extracted, compiled, and standardized such profiles published in the literature: we collected 356 908 data points corresponding to 482 protein kinases, 2106 inhibitors, and 661 patents. We then analyzed this data set in terms of kinome coverage, reproducibility of results, popularity, and degree of selectivity of both kinases and inhibitors. We used the data set to create robust proteochemometric models capable of predicting kinase activity (the ligand-target space was modeled with an externally validated RMSE of 0.41 ± 0.02 log units and R₀² of 0.74 ± 0.03), in order to account for missing or unreliable measurements. The influence on prediction quality of parameters such as the number of measurements, Murcko scaffold frequency, or inhibitor type was assessed. Interpretation of the models highlighted inhibitor and kinase properties correlated with higher affinities, and an analysis in the context of kinase crystal structures was performed. Overall, the quality of the models allows the accurate prediction of kinase-inhibitor activities and their structural interpretation, thus paving the way for the rational design of compounds with a targeted selectivity profile.
Affiliation(s)
- Gerard J P van Westen
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus , Hinxton, Cambridgeshire CB10 1SD, U.K
| | - George Papadatos
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus , Hinxton, Cambridgeshire CB10 1SD, U.K
| | - John P Overington
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus , Hinxton, Cambridgeshire CB10 1SD, U.K
| | - Daniel Domine
- Merck Serono , Chemin des Mines 9, 1202 Genève, Switzerland
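
The abstract above reports model quality as an RMSE in log units together with a squared correlation statistic. The sketch below shows how such metrics are commonly computed; it is an illustration only, not the authors' evaluation code. Note the assumption: `r_squared` here is the ordinary coefficient of determination, which may differ from the R₀² variant reported in the paper, and the activity values are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted activities."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Ordinary coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical pIC50-like values (log units) for three compound-kinase pairs.
observed = [6.0, 7.0, 8.0]
predicted = [6.5, 7.0, 7.5]

error = rmse(observed, predicted)
fit = r_squared(observed, predicted)
```

Because both activities and errors are on a log scale, an RMSE of about 0.4 corresponds to predictions that are typically within a factor of roughly 2.5 of the measured value.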
|
22
|
Rasti B, Karimi-Jafari MH, Ghasemi JB. Quantitative Characterization of the Interaction Space of the Mammalian Carbonic Anhydrase Isoforms I, II, VII, IX, XII, and XIV and their Inhibitors, Using the Proteochemometric Approach. Chem Biol Drug Des 2016; 88:341-53. [PMID: 26990115 DOI: 10.1111/cbdd.12759] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 10/13/2015] [Revised: 01/12/2016] [Accepted: 02/29/2016] [Indexed: 12/23/2022]
Affiliation(s)
- Behnam Rasti
- Department of Bioinformatics; Institute of Biochemistry and Biophysics; University of Tehran; PO Box 13145-1365 Tehran Iran
| | - Mohammad H. Karimi-Jafari
- Department of Bioinformatics; Institute of Biochemistry and Biophysics; University of Tehran; PO Box 13145-1365 Tehran Iran
| | - Jahan B. Ghasemi
- Department of Analytical Chemistry; School of Chemistry; College of Science; University of Tehran; PO Box 13145-1365 Tehran Iran
|
23
|
Qiu T, Qiu J, Feng J, Wu D, Yang Y, Tang K, Cao Z, Zhu R. The recent progress in proteochemometric modelling: focusing on target descriptors, cross-term descriptors and application scope. Brief Bioinform 2016; 18:125-136. [PMID: 26873661 DOI: 10.1093/bib/bbw004] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 10/26/2015] [Revised: 12/09/2015] [Indexed: 12/17/2022] Open
Abstract
As an extension of the conventional quantitative structure activity relationship models, proteochemometric (PCM) modelling is a computational method that can predict the bioactivity relations between multiple ligands and multiple targets. Traditional PCM modelling includes three essential elements: descriptors (including target descriptors, ligand descriptors and cross-term descriptors), bioactivity data and appropriate learning functions that link the descriptors to the bioactivity data. Since its appearance, PCM modelling has developed rapidly over the past decade by taking advantage of the progress of different descriptors and machine learning techniques, along with the increasing amounts of available bioactivity data. Specifically, the new emerging target descriptors and cross-term descriptors not only significantly increased the performance of PCM modelling but also expanded its application scope from traditional protein-ligand interaction to more abundant interactions, including protein-peptide, protein-DNA and even protein-protein interactions. In this review, target descriptors and cross-term descriptors, as well as the corresponding application scope, are intensively summarized. Additionally, we look forward to seeing PCM modelling extend into new application scopes, such as Target-Catalyst-Ligand systems, with the further development of descriptors, machine learning techniques and increasing amounts of available bioactivity data.
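
The three descriptor types listed in the abstract above (ligand, target, and cross-term) can be made concrete with a small sketch. This is one common construction, assumed here for illustration rather than taken from the review: cross-terms are formed as pairwise products of ligand and target descriptors, and the function name `pcm_features` and the descriptor values are hypothetical.

```python
from itertools import product

def pcm_features(ligand, target):
    """Concatenate ligand, target, and cross-term descriptors into one vector.

    Cross-terms are taken as all pairwise products ligand[i] * target[j],
    which lets a linear learner capture ligand-target interaction effects.
    """
    cross = [l * t for l, t in product(ligand, target)]
    return list(ligand) + list(target) + cross

# Hypothetical descriptor values for one ligand-target pair.
ligand = [1.0, 0.0, 2.0]
target = [0.5, 3.0]

x = pcm_features(ligand, target)
```

Each ligand-target pair yields one such vector, and the learning function is then trained to map these vectors to the measured bioactivity values. With 3 ligand and 2 target descriptors the vector has 3 + 2 + 6 = 11 entries, which shows how quickly full cross-term blocks grow with descriptor count.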
|
24
|
Subramanian V, Prusis P, Xhaard H, Wohlfahrt G. Predictive proteochemometric models for kinases derived from 3D protein field-based descriptors. MedChemComm 2016. [DOI: 10.1039/c5md00556f] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/21/2022]
Abstract
Proteochemometric models of kinases derived from protein fields and ligand 4-point pharmacophoric fingerprints are predictive and visually interpretable.
Affiliation(s)
- Vigneshwari Subramanian
- Computer-Aided Drug Design, Orion Pharma, FI-02101 Espoo, Finland; Division of Pharmaceutical Chemistry and Technology
| | - Peteris Prusis
- Computer-Aided Drug Design, Orion Pharma, FI-02101 Espoo, Finland
| | - Henri Xhaard
- Division of Pharmaceutical Chemistry and Technology, Faculty of Pharmacy, University of Helsinki, FI-00014 Helsinki, Finland
| | - Gerd Wohlfahrt
- Computer-Aided Drug Design, Orion Pharma, FI-02101 Espoo, Finland
|
25
|
Ain QU, Méndez-Lucio O, Ciriano IC, Malliavin T, van Westen GJP, Bender A. Modelling ligand selectivity of serine proteases using integrative proteochemometric approaches improves model performance and allows the multi-target dependent interpretation of features. Integr Biol (Camb) 2015; 6:1023-33. [PMID: 25255469 DOI: 10.1039/c4ib00175c] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 12/12/2022]
Abstract
Serine proteases, implicated in important physiological functions, have a high intra-family similarity, which leads to unwanted off-target effects of inhibitors with insufficient selectivity. However, the availability of sequence and structure data has now made it possible to develop approaches to design pharmacological agents that can discriminate successfully between their related binding sites. In this study, we have quantified the relationship between 12,625 distinct protease inhibitors and their bioactivity against 67 targets of the serine protease family (20,213 data points) in an integrative manner, using proteochemometric modelling (PCM). The benchmarking of 21 different target descriptors motivated the usage of specific binding pocket amino acid descriptors, which helped in the identification of active site residues and selective compound chemotypes affecting compound affinity and selectivity. PCM models performed better than alternative approaches (models trained using exclusively compound descriptors on all available data, QSAR) employed for comparison, with R²/RMSE values of 0.64 ± 0.23/0.66 ± 0.20 vs. 0.35 ± 0.27/1.05 ± 0.27 log units, respectively. Moreover, the interpretation of the PCM model singled out various chemical substructures responsible for bioactivity and selectivity towards particular proteases (thrombin, trypsin and coagulation factor 10) in agreement with the literature. For instance, absence of a tertiary sulphonamide was identified to be responsible for decreased selective activity (by on average 0.27 ± 0.65 pChEMBL units) on FA10. Among the binding pocket residues, the amino acids (arginine, leucine and tyrosine) at positions 35, 39, 60, 93, 140 and 207 were observed as key contributing residues for selective affinity on these three targets.
Affiliation(s)
- Qurrat U Ain
- Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, CB2 1EW, University of Cambridge, UK.
|
26
|
Chen Q, Luo H, Zhang C, Chen YPP. Bioinformatics in protein kinases regulatory network and drug discovery. Math Biosci 2015; 262:147-56. [PMID: 25656386 DOI: 10.1016/j.mbs.2015.01.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/18/2014] [Revised: 01/16/2015] [Accepted: 01/22/2015] [Indexed: 10/24/2022]
Abstract
Protein kinases have been implicated in a number of diseases, where kinases participate in many aspects of the control of cell growth, movement and death. Deregulated kinase activities and the knowledge of these disorders are of great clinical interest for drug discovery. The most critical issue is the development of safe and efficient disease diagnosis and treatment at lower cost and in less time. It is critical to develop innovative approaches that aim at the root cause of a disease, not just its symptoms. Bioinformatics, including genetic, genomic, mathematical and computational technologies, has become the most promising option for effective drug discovery, and has shown its potential in the early stages of drug-target identification and target validation. It is essential that these aspects are understood and integrated into new methods used in drug discovery for diseases arising from deregulated kinase activity. This article reviews bioinformatics techniques for protein kinase data management and analysis, kinase pathways and drug targets, and describes their potential application in the pharmaceutical industry.
Affiliation(s)
- Qingfeng Chen
- School of Computer, Electronic and Information, Guangxi University, Nanning, 530004, China; State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangxi University, China.
- Haiqiong Luo
- School of Public Health, Guangxi Medical University, Nanning, 530021, China.
- Chengqi Zhang
- Centre for Quantum Computation & Intelligent Systems, University of Technology, Sydney P.O. Box 123, Broadway, NSW 2007, Australia.
- Yi-Ping Phoebe Chen
- Department of Computer Science and Computer Engineering, La Trobe University, Vic 3086, Australia.
|
27
|
Cortés-Ciriano I, Ain QU, Subramanian V, Lenselink EB, Méndez-Lucio O, IJzerman AP, Wohlfahrt G, Prusis P, Malliavin TE, van Westen GJP, Bender A. Polypharmacology modelling using proteochemometrics (PCM): recent methodological developments, applications to target families, and future prospects. MEDCHEMCOMM 2015. [DOI: 10.1039/c4md00216d] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
Proteochemometric (PCM) modelling is a computational method to model the bioactivity of multiple ligands against multiple related protein targets simultaneously.
Affiliation(s)
- Isidro Cortés-Ciriano
- Unité de Bioinformatique Structurale
- Institut Pasteur and CNRS UMR 3825
- Structural Biology and Chemistry Department
- 75 724 Paris
- France
- Qurrat Ul Ain
- Unilever Centre for Molecular Informatics
- Department of Chemistry
- CB2 1EW Cambridge
- UK
- Eelke B. Lenselink
- Division of Medicinal Chemistry
- Leiden Academic Centre for Drug Research
- Leiden
- The Netherlands
- Oscar Méndez-Lucio
- Unilever Centre for Molecular Informatics
- Department of Chemistry
- CB2 1EW Cambridge
- UK
- Adriaan P. IJzerman
- Division of Medicinal Chemistry
- Leiden Academic Centre for Drug Research
- Leiden
- The Netherlands
- Gerd Wohlfahrt
- Computer-Aided Drug Design
- Orion Pharma
- FIN-02101 Espoo
- Finland
- Peteris Prusis
- Computer-Aided Drug Design
- Orion Pharma
- FIN-02101 Espoo
- Finland
- Thérèse E. Malliavin
- Unité de Bioinformatique Structurale
- Institut Pasteur and CNRS UMR 3825
- Structural Biology and Chemistry Department
- 75 724 Paris
- France
- Gerard J. P. van Westen
- European Molecular Biology Laboratory
- European Bioinformatics Institute
- Wellcome Trust Genome Campus
- Hinxton
- UK
- Andreas Bender
- Unilever Centre for Molecular Informatics
- Department of Chemistry
- CB2 1EW Cambridge
- UK
|
28
|
Nabu S, Nantasenamat C, Owasirikul W, Lawung R, Isarankura-Na-Ayudhya C, Lapins M, Wikberg JES, Prachayasittikul V. Proteochemometric model for predicting the inhibition of penicillin-binding proteins. J Comput Aided Mol Des 2014; 29:127-41. [DOI: 10.1007/s10822-014-9809-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2014] [Accepted: 10/21/2014] [Indexed: 12/17/2022]
|
29
|
Nantasenamat C, Simeon S, Owasirikul W, Songtawee N, Lapins M, Prachayasittikul V, Wikberg JES. Illuminating the origins of spectral properties of green fluorescent proteins via proteochemometric and molecular modeling. J Comput Chem 2014; 35:1951-66. [DOI: 10.1002/jcc.23708] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2014] [Revised: 04/28/2014] [Accepted: 07/28/2014] [Indexed: 01/06/2023]
Affiliation(s)
- Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics; Faculty of Medical Technology, Mahidol University; Bangkok 10700 Thailand
- Department of Clinical Microbiology and Applied Technology; Faculty of Medical Technology, Mahidol University; Bangkok 10700 Thailand
- Saw Simeon
- Center of Data Mining and Biomedical Informatics; Faculty of Medical Technology, Mahidol University; Bangkok 10700 Thailand
- Wiwat Owasirikul
- Center of Data Mining and Biomedical Informatics; Faculty of Medical Technology, Mahidol University; Bangkok 10700 Thailand
- Department of Radiological Technology; Faculty of Medical Technology, Mahidol University; Bangkok 10700 Thailand
- Napat Songtawee
- Center of Data Mining and Biomedical Informatics; Faculty of Medical Technology, Mahidol University; Bangkok 10700 Thailand
- Maris Lapins
- Department of Pharmaceutical Biosciences; Uppsala University; Uppsala Sweden
- Virapong Prachayasittikul
- Department of Clinical Microbiology and Applied Technology; Faculty of Medical Technology, Mahidol University; Bangkok 10700 Thailand
- Jarl E. S. Wikberg
- Department of Pharmaceutical Biosciences; Uppsala University; Uppsala Sweden
|
30
|
Ferrè F, Palmeri A, Helmer-Citterich M. Computational methods for analysis and inference of kinase/inhibitor relationships. Front Genet 2014; 5:196. [PMID: 25071826 PMCID: PMC4075008 DOI: 10.3389/fgene.2014.00196] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2014] [Accepted: 06/13/2014] [Indexed: 12/21/2022] Open
Abstract
The central role of kinases in virtually all signal transduction networks is the driving motivation for the development of compounds modulating their activity. ATP-mimetic inhibitors are essential tools for elucidating signaling pathways and are emerging as promising therapeutic agents. However, off-target ligand binding and complex and sometimes unexpected kinase/inhibitor relationships can occur for seemingly unrelated kinases, stressing that computational approaches are needed for learning the interaction determinants and for inferring the effect of small compounds on a given kinase. Recently published high-throughput profiling studies assessed the effects of thousands of small compound inhibitors, covering a substantial portion of the kinome. This wealth of data paved the way for computational resources and methods that can make a major contribution to understanding the reasons for inhibition, helping in the rational design of more specific molecules, in the in silico prediction of inhibition for neglected kinases for which no systematic analysis has yet been carried out, and in the selection of novel inhibitors with desired selectivity, and offering novel avenues for personalized therapies.
Affiliation(s)
- Fabrizio Ferrè
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata Rome, Italy
- Antonio Palmeri
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata Rome, Italy
- Manuela Helmer-Citterich
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata Rome, Italy
|
31
|
Di Luca M, Maccari G, Nifosì R. Treatment of microbial biofilms in the post-antibiotic era: prophylactic and therapeutic use of antimicrobial peptides and their design by bioinformatics tools. Pathog Dis 2014; 70:257-70. [PMID: 24515391 DOI: 10.1111/2049-632x.12151] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2013] [Revised: 01/22/2014] [Accepted: 01/30/2014] [Indexed: 12/14/2022] Open
Abstract
The treatment for biofilm infections is particularly challenging because bacteria in these conditions become refractory to antibiotic drugs. The reduced effectiveness of current therapies spurs research for the identification of novel molecules endowed with antimicrobial activities and new mechanisms of antibiofilm action. Antimicrobial peptides (AMPs) have been receiving increasing attention as potential therapeutic agents, because they represent a novel class of antibiotics with a wide spectrum of activity and a low rate in inducing bacterial resistance. Over the past decades, a large number of naturally occurring AMPs have been identified or predicted from various organisms as effector molecules of the innate immune system playing a crucial role in the first line of defense. Recent studies have shown the ability of some AMPs to act against microbial biofilms, in particular during early phases of biofilm development. Here, we provide a review of the antimicrobial peptides tested on biofilms, highlighting their advantages and disadvantages for prophylactic and therapeutic applications. In addition, we describe the strategies and methods for de novo design of potentially active AMPs and discuss how informatics and computational tools may be exploited to improve antibiofilm effectiveness.
|
32
|
Medina-Franco JL, Méndez-Lucio O, Martinez-Mayorga K. The Interplay Between Molecular Modeling and Chemoinformatics to Characterize Protein–Ligand and Protein–Protein Interactions Landscapes for Drug Discovery. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2014; 96:1-37. [DOI: 10.1016/bs.apcsb.2014.06.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
|
33
|
Schrynemackers M, Küffner R, Geurts P. On protocols and measures for the validation of supervised methods for the inference of biological networks. Front Genet 2013; 4:262. [PMID: 24348517 PMCID: PMC3848415 DOI: 10.3389/fgene.2013.00262] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2013] [Accepted: 11/13/2013] [Indexed: 11/30/2022] Open
Abstract
Networks provide a natural representation of molecular biology knowledge, in particular to model relationships between biological entities such as genes, proteins, drugs, or diseases. Because of the effort, the cost, or the lack of the experiments necessary for the elucidation of these networks, computational approaches for network inference have been frequently investigated in the literature. In this paper, we examine the assessment of supervised network inference. Supervised inference is based on machine learning techniques that infer the network from a training sample of known interacting and possibly non-interacting entities and additional measurement data. While these methods are very effective, their reliable validation in silico poses a challenge, since both prediction and validation need to be performed on the basis of the same partially known network. Cross-validation techniques need to be specifically adapted to classification problems on pairs of objects. We perform a critical review and assessment of protocols and measures proposed in the literature and derive specific guidelines how to best exploit and evaluate machine learning techniques for network inference. Through theoretical considerations and in silico experiments, we analyze in depth how important factors influence the outcome of performance estimation. These factors include the amount of information available for the interacting entities, the sparsity and topology of biological networks, and the lack of experimentally verified non-interacting pairs.
Affiliation(s)
- Marie Schrynemackers
- Systems and Modeling, Department of Electrical Engineering and Computer Science and GIGA-R, University of Liège Liège, Belgium
- Robert Küffner
- Institute for Practical Informatics and Bioinformatics, Ludwig-Maximilians-University Munich, Germany
- Pierre Geurts
- Systems and Modeling, Department of Electrical Engineering and Computer Science and GIGA-R, University of Liège Liège, Belgium
|
34
|
Subramanian V, Prusis P, Pietilä LO, Xhaard H, Wohlfahrt G. Visually interpretable models of kinase selectivity related features derived from field-based proteochemometrics. J Chem Inf Model 2013; 53:3021-30. [PMID: 24116714 DOI: 10.1021/ci400369z] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/16/2023]
Abstract
Achieving selectivity for small organic molecules toward biological targets is a main focus of drug discovery but has been proven difficult, for example, for kinases because of the high similarity of their ATP binding pockets. To support the design of more selective inhibitors with fewer side effects or with altered target profiles for improved efficacy, we developed a method combining ligand- and receptor-based information. Conventional QSAR models enable one to study the interactions of multiple ligands toward a single protein target, but in order to understand the interactions between multiple ligands and multiple proteins, we have used proteochemometrics, a multivariate statistics method that aims to combine and correlate both ligand and protein descriptions with affinity to receptors. The superimposed binding sites of 50 unique kinases were described by molecular interaction fields derived from knowledge-based potentials and Schrödinger's WaterMap software. Eighty ligands were described by Mold(2), Open Babel, and Volsurf descriptors. Partial least-squares regression including cross-terms, which describe the selectivity, was used for model building. This combination of methods allows interpretation and easy visualization of the models within the context of ligand binding pockets, which can be translated readily into the design of novel inhibitors.
|
35
|
Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets. J Cheminform 2013; 5:42. [PMID: 24059743 PMCID: PMC4015169 DOI: 10.1186/1758-2946-5-42] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2013] [Accepted: 09/18/2013] [Indexed: 11/10/2022] Open
Abstract
Background: While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 amino acid descriptor sets have been benchmarked with respect to their ability to establish bioactivity models. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI, BLOSUM, a novel protein descriptor set (termed ProtFP (4 variants)), and in addition we created and benchmarked three pairs of descriptor combinations. Prediction performance was evaluated in seven structure-activity benchmarks which comprise Angiotensin Converting Enzyme (ACE) dipeptidic inhibitor data, and three proteochemometric data sets, namely (1) GPCR ligands modeled against a GPCR panel, (2) enzyme inhibitors (NNRTIs) with associated bioactivities against a set of HIV enzyme mutants, and (3) enzyme inhibitors (PIs) with associated bioactivities on a large set of HIV enzyme mutants. Results: The amino acid descriptor sets compared here show similar performance (<0.1 log units RMSE difference and <0.1 difference in MCC), while errors for individual proteins were in some cases found to be larger than those resulting from descriptor set differences (>0.3 log units RMSE difference and >0.7 difference in MCC). Combining different descriptor sets generally leads to better modeling performance than utilizing individual sets. The best performers were Z-scales (3) combined with ProtFP (Feature), or Z-Scales (3) combined with an average Z-Scale value for each target, while ProtFP (PCA8), ST-Scales, and ProtFP (Feature) rank last. Conclusions: While amino acid descriptor sets capture different aspects of amino acids, their ability to be used for bioactivity modeling is still, on average, surprisingly similar. Still, combining sets describing complementary information consistently leads to a small improvement in modeling performance (average MCC 0.01 better, average RMSE 0.01 log units lower). Finally, performance differences exist between the targets compared, thereby underlining that choosing an appropriate descriptor set is fundamental for bioactivity modeling, from both the ligand and the protein side.
|
36
|
van Westen GJ, Swier RF, Wegner JK, Ijzerman AP, van Vlijmen HW, Bender A. Benchmarking of protein descriptor sets in proteochemometric modeling (part 1): comparative study of 13 amino acid descriptor sets. J Cheminform 2013; 5:41. [PMID: 24059694 PMCID: PMC3848949 DOI: 10.1186/1758-2946-5-41] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2013] [Accepted: 09/18/2013] [Indexed: 11/10/2022] Open
Abstract
Background: While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 different protein descriptor sets have been compared with respect to their behavior in perceiving similarities between amino acids. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI and BLOSUM, and a novel protein descriptor set termed ProtFP (4 variants). We investigate to what extent descriptor sets show collinear as well as orthogonal behavior via principal component analysis (PCA). Results: In describing amino acid similarities, MS-WHIM, T-scales and ST-scales show related behavior, as do the VHSE, FASGAI, and ProtFP (PCA3) descriptor sets. Conversely, the ProtFP (PCA5), ProtFP (PCA8), Z-Scales (Binned), and BLOSUM descriptor sets show behavior that is distinct from one another as well as from both of the clusters above. Generally, the use of more principal components (>3 per amino acid, per descriptor) leads to significant differences in the way amino acids are described, even though the later principal components capture less variation per component of the original input data. Conclusion: In this work a comparison is provided of how similarly (and differently) currently available amino acid descriptor sets behave when converting structure to property space. The results obtained enable molecular modelers to select suitable amino acid descriptor sets for structure-activity analyses, e.g. those showing complementary behavior.
Affiliation(s)
- Gerard Jp van Westen
- Division of Medicinal Chemistry, Leiden / Amsterdam Center for Drug Research, Einsteinweg 55, Leiden 2333, CC, The Netherlands.
|
37
|
Park H, Kim KK, Kim C, Shin JM, No KT. Descriptor-Based Profile Analysis of Kinase Inhibitors to Predict Inhibitory Activity and to Grasp Kinase Selectivity. B KOREAN CHEM SOC 2013. [DOI: 10.5012/bkcs.2013.34.9.2680] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
38
|
Antimicrobial peptides design by evolutionary multiobjective optimization. PLoS Comput Biol 2013; 9:e1003212. [PMID: 24039565 PMCID: PMC3764005 DOI: 10.1371/journal.pcbi.1003212] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2013] [Accepted: 07/23/2013] [Indexed: 02/03/2023] Open
Abstract
Antimicrobial peptides (AMPs) are an abundant and wide class of molecules produced by many tissues and cell types in a variety of mammals, plant and animal species. Linear alpha-helical antimicrobial peptides are among the most widespread membrane-disruptive AMPs in nature, representing a particularly successful structural arrangement in innate defense. Recently, AMPs have received increasing attention as potential therapeutic agents, owing to their broad activity spectrum and their reduced tendency to induce resistance. The introduction of non-natural amino acids will be a key requisite in order to contrast host resistance and increase compound's life. In this work, the possibility to design novel AMP sequences with non-natural amino acids was achieved through a flexible computational approach, based on chemophysical profiles of peptide sequences. Quantitative structure-activity relationship (QSAR) descriptors were employed to code each peptide and train two statistical models in order to account for structural and functional properties of alpha-helical amphipathic AMPs. These models were then used as fitness functions for a multi-objective evolutional algorithm, together with a set of constraints for the design of a series of candidate AMPs. Two ab-initio natural peptides were synthesized and experimentally validated for antimicrobial activity, together with a series of control peptides. Furthermore, a well-known Cecropin-Mellitin alpha helical antimicrobial hybrid (CM18) was optimized by shortening its amino acid sequence while maintaining its activity and a peptide with non-natural amino acids was designed and tested, demonstrating the higher activity achievable with artificial residues. In recent years, the increasing and rapid spread of pathogenic microorganisms resistant to conventional antibiotics especially in hospital settings spurred research for the identification of novel molecules endowed with antimicrobial activities and new mechanisms of action. 
Antimicrobial peptides (AMPs) have received increasing attention as potential therapeutic agents because of their wide spectrum of activity and low rate of inducing bacterial resistance. Currently, research is focused on the design and optimization of novel AMPs to improve their antimicrobial activity, minimize cytotoxicity and reduce proteolytic degradation, including in biological fluids. To this end, the introduction of non-natural amino acids will be a key requisite in order to counter host resistance and increase the compound's lifetime. However, extending the amino acid alphabet to non-natural elements makes a systematic approach to AMP design unfeasible. A rational in-silico approach can drastically reduce the number of compounds to be tested and consequently the production costs and the time required for evaluation of activity and toxicity. In this article, in-silico AMP design with non-natural amino acids was performed and a series of candidates were tested in order to demonstrate the potential of this approach.
|
39
|
Lapins M, Worachartcheewan A, Spjuth O, Georgiev V, Prachayasittikul V, Nantasenamat C, Wikberg JES. A unified proteochemometric model for prediction of inhibition of cytochrome p450 isoforms. PLoS One 2013; 8:e66566. [PMID: 23799117 PMCID: PMC3684587 DOI: 10.1371/journal.pone.0066566] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2013] [Accepted: 05/08/2013] [Indexed: 11/17/2022] Open
Abstract
A unified proteochemometric (PCM) model for the prediction of the ability of drug-like chemicals to inhibit five major drug metabolizing CYP isoforms (i.e. CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4) was created and made publicly available under the Bioclipse Decision Support open source system at www.cyp450model.org. With regard to the proteochemometric modeling, we represented the chemical compounds by molecular signature descriptors and the CYP isoforms by an alignment-independent description of the composition and transition of amino acid properties of their protein primary sequences. The entire training dataset contained 63 391 interactions and the best PCM model was obtained using signature descriptors of height 1, 2 and 3 and inducing the model with a support vector machine. The model showed excellent predictive ability with an internal AUC = 0.923 and an external AUC = 0.940, as evaluated on a large external dataset. The advantage of PCM models is their extensibility, making it possible to extend our model to new CYP isoforms and polymorphic CYP forms. A key benefit of PCM is that all proteins are confined in one single model, which makes it generally more stable and predictive as compared with single target models. The inclusion of the model in Bioclipse Decision Support makes it possible to make virtually instantaneous predictions (∼100 ms per prediction) while interactively drawing or modifying chemical structures in the Bioclipse chemical structure editor.
Affiliation(s)
- Maris Lapins
- Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
|
40
|
Zhou S, Li Y, Hou T. Feasibility of Using Molecular Docking-Based Virtual Screening for Searching Dual Target Kinase Inhibitors. J Chem Inf Model 2013; 53:982-96. [DOI: 10.1021/ci400065e] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]
Affiliation(s)
- Shunye Zhou
- Institute of Functional Nano & Soft Materials (FUNSOM) and Jiangsu Key Laboratory for Carbon-Based Functional Materials & Devices, Soochow University, Suzhou, Jiangsu 215123, China
- Youyong Li
- Institute of Functional Nano & Soft Materials (FUNSOM) and Jiangsu Key Laboratory for Carbon-Based Functional Materials & Devices, Soochow University, Suzhou, Jiangsu 215123, China
- Tingjun Hou
- Institute of Functional Nano & Soft Materials (FUNSOM) and Jiangsu Key Laboratory for Carbon-Based Functional Materials & Devices, Soochow University, Suzhou, Jiangsu 215123, China
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
|
41
|
Gao J, Huang Q, Wu D, Zhang Q, Zhang Y, Chen T, Liu Q, Zhu R, Cao Z, He Y. Study on human GPCR–inhibitor interactions by proteochemometric modeling. Gene 2013; 518:124-31. [DOI: 10.1016/j.gene.2012.11.061] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2012] [Accepted: 11/27/2012] [Indexed: 11/15/2022]
|
42
|
Proteochemometric modeling of the bioactivity spectra of HIV-1 protease inhibitors by introducing protein-ligand interaction fingerprint. PLoS One 2012; 7:e41698. [PMID: 22848570 PMCID: PMC3407198 DOI: 10.1371/journal.pone.0041698] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2012] [Accepted: 06/25/2012] [Indexed: 01/01/2023] Open
Abstract
HIV-1 protease is one of the main therapeutic targets in HIV. However, a major problem in the treatment of HIV is the rapid emergence of drug-resistant strains. It would be particularly helpful for the clinical therapy of AIDS if one method could be used to predict the antiviral capability of compounds against different variants. In our study, proteochemometric (PCM) models were created to study the bioactivity spectra of 92 chemical compounds with 47 unique HIV-1 protease variants. In contrast to other PCM models, which used Multiplication of Ligands and Proteins Descriptors (MLPD) as cross-term, one new cross-term, i.e. Protein-Ligand Interaction Fingerprint (PLIF), was introduced in our modeling. With different combinations of ligand descriptors, protein descriptors and cross-terms, nine PCM models were obtained, and six of them achieved good predictive abilities (Q(2)(test)>0.7). These results showed that the performance of PCM models could be improved when ligand and protein descriptors were complemented by the newly introduced cross-term PLIF. Compared with the conventional cross-term MLPD, the newly introduced PLIF had a better predictive ability. Furthermore, our best model (GD & P & PLIF: Q(2)(test) = 0.8271) could select those inhibitors which have a broad antiviral activity. In conclusion, our study indicates that proteochemometric modeling with PLIF as cross-term is a potentially useful way to address the HIV-1 drug-resistance problem.
|
43
|
Niijima S, Shiraishi A, Okuno Y. Dissecting Kinase Profiling Data to Predict Activity and Understand Cross-Reactivity of Kinase Inhibitors. J Chem Inf Model 2012; 52:901-12. [DOI: 10.1021/ci200607f] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023]
Affiliation(s)
- Satoshi Niijima
- Department of Systems Biosciences for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
- Akira Shiraishi
- Department of Systems Biosciences for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
- Yasushi Okuno
- Department of Systems Biosciences for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
|
44
|
Meslamani J, Rognan D. Enhancing the Accuracy of Chemogenomic Models with a Three-Dimensional Binding Site Kernel. J Chem Inf Model 2011; 51:1593-603. [DOI: 10.1021/ci200166t] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Affiliation(s)
- Jamel Meslamani
- Structural Chemogenomics, Laboratory of Therapeutical Innovation, UMR 7200 CNRS, University of Strasbourg, F-67400 Illkirch, France
- Didier Rognan
- Structural Chemogenomics, Laboratory of Therapeutical Innovation, UMR 7200 CNRS, University of Strasbourg, F-67400 Illkirch, France
|
45
|
van Westen GJP, Wegner JK, IJzerman AP, van Vlijmen HWT, Bender A. Proteochemometric modeling as a tool to design selective compounds and for extrapolating to novel targets. MEDCHEMCOMM 2011. [DOI: 10.1039/c0md00165a] [Citation(s) in RCA: 123] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
Proteochemometric modeling is founded on the principles of QSAR but is able to benefit from additional information in model training due to the inclusion of target information.
Affiliation(s)
- Gerard J. P. van Westen
- Division of Medicinal Chemistry
- Leiden/Amsterdam Center for Drug Research
- Leiden
- The Netherlands
- Adriaan P. IJzerman
- Division of Medicinal Chemistry
- Leiden/Amsterdam Center for Drug Research
- Leiden
- The Netherlands
- Herman W. T. van Vlijmen
- Division of Medicinal Chemistry
- Leiden/Amsterdam Center for Drug Research
- Leiden
- The Netherlands
- Tibotec BVBA
- A. Bender
- Division of Medicinal Chemistry
- Leiden/Amsterdam Center for Drug Research
- Leiden
- The Netherlands
- Unilever Centre for Molecular Science Informatics
|