1. Dragan P, Joshi K, Atzei A, Latek D. Keras/TensorFlow in Drug Design for Immunity Disorders. Int J Mol Sci 2023; 24:15009. [PMID: 37834457 PMCID: PMC10573944 DOI: 10.3390/ijms241915009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 09/21/2023] [Accepted: 09/29/2023] [Indexed: 10/15/2023] [Open Access]
Abstract
Homeostasis of the host immune system is regulated by white blood cells with a variety of cell surface receptors for cytokines. Chemotactic cytokines (chemokines) activate their receptors to evoke the chemotaxis of immune cells in homeostatic migrations or inflammatory conditions towards inflamed tissue or pathogens. Dysregulation of the immune system leading to disorders such as allergies, autoimmune diseases, or cancer requires efficient, fast-acting drugs to minimize the long-term effects of chronic inflammation. Here, we performed structure-based virtual screening (SBVS) assisted by the Keras/TensorFlow neural network (NN) to find novel compound scaffolds acting on three chemokine receptors: CCR2, CCR3, and one CXC receptor, CXCR3. Keras/TensorFlow NN was used here not as a typically used binary classifier but as an efficient multi-class classifier that can discard not only inactive compounds but also low- or medium-activity compounds. Several compounds proposed by SBVS and NN were tested in 100 ns all-atom molecular dynamics simulations to confirm their binding affinity. To improve the basic binding affinity of the compounds, new chemical modifications were proposed. The modified compounds were compared with known antagonists of these three chemokine receptors. Known CXCR3 compounds were among the top predicted compounds; thus, the benefits of using Keras/TensorFlow in drug discovery have been shown in addition to structure-based approaches. Furthermore, we showed that Keras/TensorFlow NN can accurately predict the receptor subtype selectivity of compounds, for which SBVS often fails. We cross-tested chemokine receptor datasets retrieved from ChEMBL and curated datasets for cannabinoid receptors. The NN model trained on the cannabinoid receptor datasets retrieved from ChEMBL was the most accurate in the receptor subtype selectivity prediction. 
Among NN models trained on the chemokine receptor datasets, the CXCR3 model showed the highest accuracy in differentiating the receptor subtype for a given compound dataset.
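The difference between binary and multi-class activity labels mentioned in the abstract can be illustrated with a minimal sketch. The pIC50 cut-offs, class names, and compound identifiers below are hypothetical and chosen only for illustration; the abstract does not state the class boundaries used by the authors, and this labeling step is not their Keras/TensorFlow model.

```python
# Hypothetical pIC50 upper bounds for each class (assumed, not from the paper).
CLASS_BOUNDS = [(5.0, "inactive"), (6.0, "low"), (7.0, "medium")]
TOP_CLASS = "high"


def activity_class(pic50: float) -> str:
    """Map a pIC50 value to one of four ordered activity classes."""
    for upper, label in CLASS_BOUNDS:
        if pic50 < upper:
            return label
    return TOP_CLASS


def keep_high_activity(compounds: dict) -> list:
    """Return the names of compounds in the top class, sorted alphabetically.

    A binary active/inactive split could not separate "low" or "medium"
    compounds from "high" ones; a multi-class label can.
    """
    return sorted(
        name for name, pic50 in compounds.items()
        if activity_class(pic50) == TOP_CLASS
    )


# Illustrative compound names and values (made up for this sketch).
example = {"cpd_a": 4.2, "cpd_b": 6.5, "cpd_c": 7.8, "cpd_d": 5.1}
labels = {name: activity_class(p) for name, p in example.items()}
selected = keep_high_activity(example)
print(labels)
print(selected)  # ['cpd_c']
```

In a workflow like the one described, labels of this kind would serve as training targets for a multi-class classifier, so that only compounds predicted in the top class are carried forward.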
Affiliation(s)
- Paulina Dragan
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-903 Warsaw, Poland
- Kavita Joshi
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-903 Warsaw, Poland
- Alessandro Atzei
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-903 Warsaw, Poland
  - Department of Life and Environmental Science, Food Toxicology Unit, University of Cagliari, University Campus of Monserrato, SS 554, 09042 Cagliari, Italy
- Dorota Latek
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-903 Warsaw, Poland
2. Gambacorta N, Ciriaco F, Amoroso N, Altomare CD, Bajorath J, Nicolotti O. CIRCE: Web-Based Platform for the Prediction of Cannabinoid Receptor Ligands Using Explainable Machine Learning. J Chem Inf Model 2023; 63:5916-5926. [PMID: 37675493 DOI: 10.1021/acs.jcim.3c00914] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 09/08/2023]
Abstract
The endocannabinoid system, which includes cannabinoid receptor 1 and 2 subtypes (CB1R and CB2R, respectively), is responsible for the onset of various pathologies including neurodegeneration, cancer, neuropathic and inflammatory pain, obesity, and inflammatory bowel disease. Given the high similarity of CB1R and CB2R, generating subtype-selective ligands is still an open challenge. In this work, the Cannabinoid Iterative Revaluation for Classification and Explanation (CIRCE) compound prediction platform has been generated based on explainable machine learning to support the design of selective CB1R and CB2R ligands. Multilayer classifiers were combined with Shapley value analysis to facilitate explainable predictions. In test calculations, CIRCE predictions reached ∼80% accuracy and structural features determining ligand predictions were rationalized. CIRCE was designed as a web-based prediction platform that is made freely available as a part of our study.
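As a rough illustration of the Shapley value analysis the abstract refers to, the sketch below computes exact Shapley values for a toy scoring function by averaging marginal contributions over all feature orderings. The feature names and the `toy_score` function are hypothetical; CIRCE's actual classifiers and its explanation pipeline are not reproduced here.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    features: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley values: mean marginal contribution over all orderings.

    Cost grows factorially with the number of features, so this is only
    practical for toy examples.
    """
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present: FrozenSet[str] = frozenset()
        for f in order:
            with_f = present | {f}
            totals[f] += value(with_f) - value(present)
            present = with_f
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_score(present: FrozenSet[str]) -> float:
    """Hypothetical model output given which structural features are present."""
    score = 0.0
    if "amide" in present:
        score += 0.2
    if "halogen" in present:
        score += 0.1
    if "amide" in present and "halogen" in present:
        score += 0.3  # interaction term, split equally by the Shapley values
    return score


phi = shapley_values(["amide", "halogen", "ether"], toy_score)
print(phi)
```

The attributions sum to the score of the full feature set, and a feature that never changes the score ("ether" here) receives zero, which is the property that makes such values usable for rationalizing individual predictions.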
Affiliation(s)
- Nicola Gambacorta
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
  - Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
- Fulvio Ciriaco
  - Dipartimento di Chimica, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
- Nicola Amoroso
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
- Cosimo Damiano Altomare
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
- Jürgen Bajorath
  - Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
- Orazio Nicolotti
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
3. Delre P, Contino M, Alberga D, Saviano M, Corriero N, Mangiatordi GF. ALPACA: A machine Learning Platform for Affinity and selectivity profiling of CAnnabinoids receptors modulators. Comput Biol Med 2023; 164:107314. [PMID: 37572442 DOI: 10.1016/j.compbiomed.2023.107314] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/13/2023] [Revised: 07/10/2023] [Accepted: 08/07/2023] [Indexed: 08/14/2023]
Abstract
The development of small molecules that selectively target the cannabinoid receptor subtype 2 (CB2R) is emerging as an intriguing therapeutic strategy to treat neurodegeneration, as well as to contrast the onset and progression of cancer. In this context, in-silico tools able to predict CB2R affinity and selectivity with respect to the subtype 1 (CB1R), whose modulation is responsible for undesired psychotropic effects, are highly desirable. In this work, we developed a series of machine learning classifiers trained on high-quality bioactivity data of small molecules acting on CB2R and/or CB1R extracted from ChEMBL v30. Our classifiers showed strong predictive power in accurately determining CB2R affinity, CB1R affinity, and CB2R/CB1R selectivity. Among the built models, those obtained using random forest as algorithm proved to be the top-performing ones (AUC in validation ≥0.96) and were made freely accessible through a user-friendly web platform developed ad hoc and called ALPACA (https://www.ba.ic.cnr.it/softwareic/alpaca/). Due to its user-friendly interface and robust predictive power, ALPACA can be a valuable tool in saving both time and resources involved in the design of selective CB2R modulators.
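The AUC figure quoted in the abstract is the area under the ROC curve, which equals the probability that a randomly chosen active compound is scored above a randomly chosen inactive one. The following is a minimal sketch of that pairwise definition, using made-up labels and scores; it is not the evaluation code used for ALPACA.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC from the pairwise (Mann-Whitney) definition.

    Counts the fraction of (active, inactive) pairs in which the active
    compound receives the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = active, 0 = inactive) and classifier scores.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(example_auc)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts a reported AUC of 0.96 or higher in context.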
Affiliation(s)
- Pietro Delre
  - CNR - Institute of Crystallography, Via Amendola 122/o, 70126, Bari, Italy
- Marialessandra Contino
  - Department of Pharmacy - Pharmaceutical Sciences, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125, Bari, Italy
- Domenico Alberga
  - CNR - Institute of Crystallography, Via Amendola 122/o, 70126, Bari, Italy
- Michele Saviano
  - CNR - Institute of Crystallography, Via Vivaldi 43, 81100, Caserta, Italy
- Nicola Corriero
  - CNR - Institute of Crystallography, Via Amendola 122/o, 70126, Bari, Italy
4. Dragan P, Merski M, Wiśniewski S, Sanmukh SG, Latek D. Chemokine Receptors-Structure-Based Virtual Screening Assisted by Machine Learning. Pharmaceutics 2023; 15:516. [PMID: 36839838 PMCID: PMC9965785 DOI: 10.3390/pharmaceutics15020516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Revised: 01/30/2023] [Accepted: 01/31/2023] [Indexed: 02/08/2023] [Open Access]
Abstract
Chemokines modulate the immune response by regulating the migration of immune cells. They are also known to participate in such processes as cell-cell adhesion, allograft rejection, and angiogenesis. Chemokines interact with two different subfamilies of G protein-coupled receptors: conventional chemokine receptors and atypical chemokine receptors. Here, we focused on the former one which has been linked to many inflammatory diseases, including: multiple sclerosis, asthma, nephritis, and rheumatoid arthritis. Available crystal and cryo-EM structures and homology models of six chemokine receptors (CCR1 to CCR6) were described and tested in terms of their usefulness in structure-based drug design. As a result of structure-based virtual screening for CCR2 and CCR3, several new active compounds were proposed. Known inhibitors of CCR1 to CCR6, acquired from ChEMBL, were used as training sets for two machine learning algorithms in ligand-based drug design. Performance of LightGBM was compared with a sequential Keras/TensorFlow model of neural network for these diverse datasets. A combination of structure-based virtual screening with machine learning allowed to propose several active ligands for CCR2 and CCR3 with two distinct compounds predicted as CCR3 actives by all three tested methods: Glide, Keras/TensorFlow NN, and LightGBM. In addition, the performance of these three methods in the prediction of the CCR2/CCR3 receptor subtype selectivity was assessed.
5. Zhou H, Shan M, Qin LP, Cheng G. Reliable prediction of cannabinoid receptor 2 ligand by machine learning based on combined fingerprints. Comput Biol Med 2023; 152:106379. [PMID: 36502694 DOI: 10.1016/j.compbiomed.2022.106379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Revised: 11/15/2022] [Accepted: 11/28/2022] [Indexed: 12/02/2022]
Abstract
Cannabinoid receptors, as part of the family of the G protein-coupled receptors (GPCRs), are involved in various physiological functions. Its subtype cannabinoid receptor subtype 2 (CB2), mainly distributed in the periphery, is a crucial therapeutic target for anti-epileptic, anti-inflammation, anti-fibrosis, and bone metabolism regulation, and it regulates these physiological functions without psychiatric side effects. Recently machine learning methods for predicting biophysics properties have attracted much attention. Successful application of machine learning usually highly depends on the appropriate representation of the compounds. In this study, we comprehensively evaluate the performance of the descriptor-based models (including XGBoost, Random Forest, and KNN) and two graph-based models (D-MPNN, MolMap) for the prediction of the CB2 regulators, and found that XGBoost offers outstanding performance for both regression tasks and classification tasks. 13 different molecular fingerprints and 12 descriptors, as well as their combination were further screened; AvalonFP + AtomPairFP + RDkitFP + MorganFP and AtomPairFP + MorganFP + AvalonFP were the optimum combinations for regression task (R2 increase to 0.667) and classification task (AUC-ROC increase to 0.933), respectively. Specifically, the best XGBoost regression model with optimum features achieves better performance than Mizera's QSAR model on the same dataset developed by Mizera (R2 0.664 versus 0.62). It also achieves optimal performance with an AUC-ROC of 0.917 on the external validation set. By comparison, MolMap and D-MPNN only provide 0.912 and 0.898. The Shapley additive explanation method was used to interpret the models, and features importance were shown for both regression and classification task. The XGBoost model equipped with essential molecular fingerprints combination in this paper may provide valuable clues to designing novel CB2 ligands and developing models for other properties prediction.
Affiliation(s)
- Hao Zhou
  - School of Pharmaceutical Sciences, Zhejiang Chinese Medical University Hangzhou, 310053, People's Republic of China
- Mengyi Shan
  - School of Pharmaceutical Sciences, Zhejiang Chinese Medical University Hangzhou, 310053, People's Republic of China
- Lu-Ping Qin
  - School of Pharmaceutical Sciences, Zhejiang Chinese Medical University Hangzhou, 310053, People's Republic of China
- Gang Cheng
  - School of Pharmaceutical Sciences, Zhejiang Chinese Medical University Hangzhou, 310053, People's Republic of China
6. Mizera M, Latek D. Ligand-Receptor Interactions and Machine Learning in GCGR and GLP-1R Drug Discovery. Int J Mol Sci 2021; 22:4060. [PMID: 33920024 PMCID: PMC8071054 DOI: 10.3390/ijms22084060] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/24/2021] [Revised: 03/31/2021] [Accepted: 04/07/2021] [Indexed: 12/03/2022] [Open Access]
Abstract
The large amount of data that has been collected so far for G protein-coupled receptors requires machine learning (ML) approaches to fully exploit its potential. Our previous ML model based on gradient boosting used for prediction of drug affinity and selectivity for a receptor subtype was compared with explicit information on ligand-receptor interactions from induced-fit docking. Both methods have proved their usefulness in drug response predictions. Yet, their successful combination still requires allosteric/orthosteric assignment of ligands from datasets. Our ligand datasets included activities of two members of the secretin receptor family: GCGR and GLP-1R. Simultaneous activation of two or three receptors of this family by dual or triple agonists is not a typical kind of information included in compound databases. A precise allosteric/orthosteric ligand assignment requires a continuous update based on new structural and biological data. This data incompleteness remains the main obstacle for current ML methods applied to class B GPCR drug discovery. Even so, for these two class B receptors, our ligand-based ML model demonstrated high accuracy (5-fold cross-validation Q2 > 0.63 and Q2 > 0.67 for GLP-1R and GCGR, respectively). In addition, we performed a ligand annotation using recent cryogenic-electron microscopy (cryo-EM) and X-ray crystallographic data on small-molecule complexes of GCGR and GLP-1R. As a result, we assigned GLP-1R and GCGR actives deposited in ChEMBL to four small-molecule binding sites occupied by positive and negative allosteric modulators and a full agonist. Annotated compounds were added to our recently released repository of GPCR data.
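The cross-validated Q2 values quoted in the abstract can be read as one minus the ratio of the out-of-fold squared prediction error (PRESS) to the total sum of squares. The sketch below computes a 5-fold Q2 for a one-variable least-squares line on synthetic data. The model, the interleaved fold assignment, and the data are assumptions made for illustration; the authors used a gradient-boosting model, and Q2 conventions vary slightly between papers (for example, in which mean is used for the denominator).

```python
from statistics import mean
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def q2_kfold(xs: Sequence[float], ys: Sequence[float], k: int = 5) -> float:
    """Q2 = 1 - PRESS / TSS, with PRESS accumulated over k held-out folds."""
    n = len(xs)
    press = 0.0
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # simple interleaved fold assignment
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        press += sum((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx)
    y_mean = mean(ys)
    tss = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - press / tss


# Synthetic data: an exact line, and the same line with alternating noise.
xs: List[float] = [float(i) for i in range(10)]
ys_exact = [2.0 * x + 1.0 for x in xs]
ys_noisy = [y + (0.5 if i % 2 == 0 else -0.5) for i, y in enumerate(ys_exact)]

q2_exact = q2_kfold(xs, ys_exact)
q2_noisy = q2_kfold(xs, ys_noisy)
print(q2_exact, q2_noisy)
```

Because every prediction is made on data the model did not see during fitting, Q2 is a stricter measure than the R2 of a fit to the full dataset.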