1
Kim D, Cho S, Jeon JJ, Choi J. Inhalation Toxicity Screening of Consumer Products Chemicals using OECD Test Guideline Data-based Machine Learning Models. JOURNAL OF HAZARDOUS MATERIALS 2024; 478:135446. [PMID: 39154469 DOI: 10.1016/j.jhazmat.2024.135446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Revised: 07/24/2024] [Accepted: 08/05/2024] [Indexed: 08/20/2024]
Abstract
This study aimed to screen the inhalation toxicity of chemicals found in consumer products such as air fresheners, fragrances, and anti-fogging agents submitted to K-REACH using machine learning models. We manually curated inhalation toxicity data based on OECD test guidelines 403 (Acute inhalation), 412 (Sub-acute inhalation), and 413 (Sub-chronic inhalation) for 1709 chemicals from the OECD eChemPortal database. Machine learning models were trained using ten algorithms, along with four molecular fingerprints (MACCS, Morgan, Topo, RDKit) and molecular descriptors, achieving F1 scores ranging from 51 % to 91 % on the test dataset. Leveraging the high-performing models, we conducted a virtual screening of chemicals, initially applying them to data-rich chemicals generally used in occupational settings to determine the prediction uncertainty. Results showed high sensitivity (75 %) but low specificity (23 %), suggesting that our models can contribute to conservative screening of chemicals. Subsequently, we applied the models to consumer product chemicals, identifying 79 as being of high concern. Most of the prioritized chemicals lacked GHS classifications related to inhalation toxicity, even though they were predicted to be used in many consumer products. This study highlights a potential regulatory blind spot concerning the inhalation risk of consumer product chemicals while also indicating the potential of artificial intelligence (AI) models to aid in prioritizing chemicals at the screening level.
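The metrics quoted in this abstract (F1, sensitivity, specificity) all derive from a confusion matrix. The sketch below shows the standard definitions in plain Python. The counts are hypothetical, chosen only to reproduce the reported 75 % / 23 % pattern; they are not taken from the paper.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, precision and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # share of toxic chemicals flagged
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # share of non-toxic chemicals cleared
    precision = tp / (tp + fp) if (tp + fp) else 0.0     # share of flagged chemicals that are toxic
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts (assumption): 100 toxic and 100 non-toxic chemicals.
m = screening_metrics(tp=75, fp=77, tn=23, fn=25)
print(m)
```

High sensitivity combined with low specificity means few toxic chemicals are missed at the cost of many false alarms, which is why the abstract describes the screening as conservative.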
Affiliation(s)
- Donghyeon Kim
- School of Environmental Engineering, University of Seoul, Seoul 02504, Republic of Korea
- Soyoung Cho
- Department of Statistics, University of Seoul, Seoul 02504, Republic of Korea
- Jong-June Jeon
- Department of Statistics, University of Seoul, Seoul 02504, Republic of Korea.
- Jinhee Choi
- School of Environmental Engineering, University of Seoul, Seoul 02504, Republic of Korea.
2
Agea MI, Čmelo I, Dehaen W, Chen Y, Kirchmair J, Sedlák D, Bartůněk P, Šícho M, Svozil D. Chemical space exploration with Molpher: Generating and assessing a glucocorticoid receptor ligand library. Mol Inform 2024; 43:e202300316. [PMID: 38979783 DOI: 10.1002/minf.202300316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 04/23/2024] [Accepted: 04/24/2024] [Indexed: 07/10/2024]
Abstract
Computational exploration of chemical space is crucial in modern cheminformatics research for accelerating the discovery of new biologically active compounds. In this study, we present a detailed analysis of the chemical library of potential glucocorticoid receptor (GR) ligands generated by the molecular generator, Molpher. To generate the targeted GR library and construct the classification models, structures from the ChEMBL database as well as from the internal IMG library, which was experimentally screened for biological activity in the primary luciferase reporter cell assay, were utilized. The composition of the targeted GR ligand library was compared with a reference library that randomly samples chemical space. A random forest model was used to determine the biological activity of ligands, incorporating its applicability domain using conformal prediction. It was demonstrated that the GR library is significantly enriched with GR ligands compared to the random library. Furthermore, a prospective analysis demonstrated that Molpher successfully designed compounds, which were subsequently experimentally confirmed to be active on the GR. A collection of 34 potential new GR ligands was also identified. Moreover, an important contribution of this study is the establishment of a comprehensive workflow for evaluating computationally generated ligands, particularly those with potential activity against targets that are challenging to dock.
Affiliation(s)
- M Isabel Agea
- Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- Ivan Čmelo
- Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- Wim Dehaen
- Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- Department of Organic Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- Ya Chen
- Center for Bioinformatics (ZBH), Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, 20146, Hamburg, Germany
- Division of Pharmaceutical Chemistry, Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, 1090, Vienna, Austria
- Johannes Kirchmair
- Center for Bioinformatics (ZBH), Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, 20146, Hamburg, Germany
- Division of Pharmaceutical Chemistry, Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, 1090, Vienna, Austria
- David Sedlák
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, 14220, Czech Republic
- Petr Bartůněk
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, 14220, Czech Republic
- Martin Šícho
- Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- Daniel Svozil
- Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, 14220, Czech Republic
3
Lv X, Wang J, Yuan Y, Pan L, Liu Q, Guo J. In Silico drug repurposing pipeline using deep learning and structure based approaches in epilepsy. Sci Rep 2024; 14:16562. [PMID: 39020064 PMCID: PMC11254927 DOI: 10.1038/s41598-024-67594-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Accepted: 07/12/2024] [Indexed: 07/19/2024] Open
Abstract
Due to its considerable global prevalence and high recurrence rate, the pursuit of effective new medications for epilepsy treatment remains an urgent and significant challenge. Drug repurposing emerges as a cost-effective and efficient strategy to combat this disorder. This study leverages transformer-based deep learning methods coupled with molecular binding affinity calculation to develop a novel in-silico drug repurposing pipeline for epilepsy. The number of candidate inhibitors against 24 target proteins encoded by gain-of-function genes implicated in epileptogenesis ranged from zero to several hundred. The candidates identified by our pipeline included most anti-epileptic drugs and nearly half of the psychiatric medications, highlighting the effectiveness of the pipeline. Furthermore, Lomitapide, a cholesterol-lowering drug, emerged as particularly noteworthy, exhibiting high binding affinity for 10 targets, as verified by molecular dynamics simulation and mechanism analysis. These findings provide a novel perspective on therapeutic strategies for other central nervous system diseases.
Affiliation(s)
- Xiaoying Lv
- Global Health Drug Discovery Institute, Beijing, China
- Jia Wang
- Cipher Gene Limited, Beijing, China
- Ying Yuan
- Global Health Drug Discovery Institute, Beijing, China
- Lurong Pan
- Global Health Drug Discovery Institute, Beijing, China
- Qi Liu
- Global Health Drug Discovery Institute, Beijing, China
- Jinjiang Guo
- Global Health Drug Discovery Institute, Beijing, China.
4
Xu Y, Liaw A, Sheridan RP, Svetnik V. Development and Evaluation of Conformal Prediction Methods for Quantitative Structure-Activity Relationship. ACS OMEGA 2024; 9:29478-29490. [PMID: 39005801 PMCID: PMC11238240 DOI: 10.1021/acsomega.4c02017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 06/10/2024] [Accepted: 06/12/2024] [Indexed: 07/16/2024]
Abstract
The quantitative structure-activity relationship (QSAR) regression model is a commonly used technique for predicting the biological activities of compounds using their molecular descriptors. Besides accurate activity estimation, obtaining a prediction uncertainty metric like a prediction interval is highly desirable. Quantifying prediction uncertainty is an active research area in statistics and machine learning (ML), but the implementation for QSAR remains challenging. Most ML algorithms with high predictive performance require add-on companions for estimating the uncertainty of their predictions. Conformal prediction (CP) is a promising approach as its main components are agnostic to the prediction models, and it produces valid prediction intervals under weak assumptions on the data distribution. We proposed computationally efficient CP algorithms tailored to the most widely used ML models, including random forests, deep neural networks, and gradient boosting. The algorithms use a novel approach to the derivation of nonconformity scores from the estimates of prediction uncertainty generated by the ensembles of point predictions. The validity and efficiency of the proposed algorithms are demonstrated on a diverse collection of QSAR data sets as well as simulation studies. The provided software implementing our algorithms can be used as stand-alone or easily incorporated into other ML software packages for QSAR modeling.
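To make the idea of a conformal prediction interval concrete, the following is a minimal sketch of generic split (inductive) conformal regression with absolute-residual nonconformity scores. It is not the ensemble-based scoring proposed in this paper. The `predict` function and the calibration data are hypothetical stand-ins for a fitted QSAR model and held-out compounds.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))   # rank of the score to use
    if k > n:
        return math.inf                     # too few calibration points for this alpha
    return sorted(scores)[k - 1]


def split_conformal_interval(predict, calib_x, calib_y, new_x, alpha=0.1):
    """Return (lower, upper) bounds around predict(new_x) at nominal coverage 1 - alpha."""
    # Nonconformity score: absolute residual on the held-out calibration set.
    scores = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(scores, alpha)
    center = predict(new_x)
    return center - q, center + q


# Hypothetical point predictor and calibration data (assumptions for illustration).
def predict(x):
    return 2.0 * x


calib_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
calib_y = [0.1, 1.8, 4.3, 5.6, 8.5, 9.9, 12.2, 13.7, 16.4]

lo, hi = split_conformal_interval(predict, calib_x, calib_y, new_x=10.0, alpha=0.2)
print(lo, hi)
```

Because the interval half-width depends only on calibration residuals, the same wrapper works for any point predictor, which is the model-agnostic property the abstract refers to.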
Affiliation(s)
- Yuting Xu
- Early Development Statistics, Merck & Co., Inc., Rahway, New Jersey 07065, United States
- Andy Liaw
- Early Development Statistics, Merck & Co., Inc., Rahway, New Jersey 07065, United States
- Robert P. Sheridan
- Modeling and Informatics, Merck & Co., Inc., Rahway, New Jersey 07033, United States
- Vladimir Svetnik
- Early Development Statistics, Merck & Co., Inc., Rahway, New Jersey 07065, United States
5
Kaveh S, Mani-Varnosfaderani A, Neiband MS. Deriving general structure-activity/selectivity relationship patterns for different subfamilies of cyclin-dependent kinase inhibitors using machine learning methods. Sci Rep 2024; 14:15315. [PMID: 38961127 PMCID: PMC11222421 DOI: 10.1038/s41598-024-66173-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Accepted: 06/27/2024] [Indexed: 07/05/2024] Open
Abstract
Cyclin-dependent kinases (CDKs) play essential roles in regulating the cell cycle and are among the most critical targets for cancer therapy and drug discovery. The primary objective of this research is to derive general structure-activity relationship (SAR) patterns for modeling the selectivity and activity levels of CDK inhibitors using machine learning methods. To accomplish this, 8592 small molecules with different binding affinities to CDK1, CDK2, CDK4, CDK5, and CDK9 were collected from BindingDB, and a diverse set of descriptors was calculated for each molecule. Supervised Kohonen network (SKN) and counter-propagation artificial neural network (CPANN) models were trained to predict the activity levels and therapeutic targets of the molecules. The validity of the models was confirmed through tenfold cross-validation and external test sets. Using selected sets of molecular descriptors (e.g. hydrophilicity and total polar surface area), we derived activity and selectivity maps to elucidate local regions in chemical space for active and selective CDK inhibitors. The SKN models exhibited prediction accuracies ranging from 0.75 to 0.94 for the external test sets. The developed multivariate classifiers were used for ligand-based virtual screening of 2 million random molecules from the PubChem database, yielding areas under the receiver operating characteristic curves ranging from 0.72 to 1.00 for the SKN model. Considering the persistent challenge of achieving CDK selectivity, this research contributes to addressing the issue and underscores the importance of developing drugs with minimized side effects.
Affiliation(s)
- Sara Kaveh
- Chemometrics and Cheminformatics Laboratory, Department of Analytical Chemistry, Tarbiat Modares University, Tehran, Iran
- Ahmad Mani-Varnosfaderani
- Chemometrics and Cheminformatics Laboratory, Department of Analytical Chemistry, Tarbiat Modares University, Tehran, Iran.
- Marzieh Sadat Neiband
- Department of Chemistry, Payame Noor University (PNU), P.O. Box 19395-4697, Tehran, Iran
6
Jha T, Jana R, Banerjee S, Baidya SK, Amin SA, Gayen S, Ghosh B, Adhikari N. Exploring different classification-dependent QSAR modelling strategies for HDAC3 inhibitors in search of meaningful structural contributors. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2024; 35:367-389. [PMID: 38757181 DOI: 10.1080/1062936x.2024.2350504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Accepted: 04/28/2024] [Indexed: 05/18/2024]
Abstract
Histone deacetylase 3 (HDAC3), a Zn2+-dependent class I HDAC, contributes to numerous disorders such as neurodegenerative disorders, diabetes, cardiovascular disease, kidney disease and several types of cancer. Therefore, the development of novel and selective HDAC3 inhibitors might be promising to combat such diseases. Here, different classification-based molecular modelling studies such as Bayesian classification, recursive partitioning (RP), SARpy and linear discriminant analysis (LDA) were conducted on a set of HDAC3 inhibitors to pinpoint essential structural requirements contributing to HDAC3 inhibition, followed by a molecular docking study and molecular dynamics (MD) simulation analyses. The current study revealed the importance of the hydroxamate function for Zn2+ chelation as well as hydrogen bonding interaction with the Tyr298 residue. The importance of the hydroxamate function for higher HDAC3 inhibition was noticed in the case of the Bayesian classification, recursive partitioning and SARpy models. Also, the importance of a substituted thiazole ring was revealed, whereas the presence of linear alkyl groups with a carboxylic acid function, any type of ester function, a benzodiazepine moiety and a methoxy group in the molecular structure can be detrimental to HDAC3 inhibition. Therefore, this study can aid in the design and discovery of effective novel HDAC3 inhibitors in the future.
Affiliation(s)
- T Jha
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- R Jana
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- S Banerjee
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- S K Baidya
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- S A Amin
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- S Gayen
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- B Ghosh
- Epigenetic Research Laboratory, Department of Pharmacy, Birla Institute of Technology and Science-Pilani, Hyderabad, India
- N Adhikari
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
7
An H, Liu X, Cai W, Shao X. Explainable Graph Neural Networks with Data Augmentation for Predicting pKa of C-H Acids. J Chem Inf Model 2024; 64:2383-2392. [PMID: 37706462 DOI: 10.1021/acs.jcim.3c00958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/15/2023]
Abstract
The pKa of C-H acids is an important parameter in the fields of organic synthesis, drug discovery, and materials science. However, the prediction of pKa is still a great challenge due to limited experimental data and a lack of chemical insight. Here, a new model for predicting the pKa values of C-H acids is proposed on the basis of graph neural networks (GNNs) and data augmentation. A message passing unit (MPU) was used to extract the topological and target-related information from the molecular graph data, and a readout layer was utilized to retrieve the information on the C atom at the ionization site. The retrieved information was then used to predict pKa by a fully connected network. Furthermore, to increase the diversity of the training data, a knowledge-infused data augmentation technique was established by replacing the H atoms in a molecule with substituents exhibiting different electronic effects. The MPU was pretrained with the augmented data. The efficacy of data augmentation was confirmed by visualizing the distribution of compounds with different substituents and by classifying compounds. The explainability of the model was studied by examining the change of pKa values when a specific atom was masked. This explainability was used to identify the key substituents for pKa. The model was evaluated on two data sets from the iBonD database. Dataset1 includes the experimental pKa values of C-H acids measured in DMSO, while dataset2 comprises the pKa values measured in water. The results show that the knowledge-infused data augmentation technique greatly improves the predictive accuracy of the model, especially when the number of samples is small.
Affiliation(s)
- Hongle An
- Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Xuyang Liu
- Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Wensheng Cai
- Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Xueguang Shao
- Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
8
Huang Z, Lou S, Wang H, Li W, Liu G, Tang Y. AttentiveSkin: To Predict Skin Corrosion/Irritation Potentials of Chemicals via Explainable Machine Learning Methods. Chem Res Toxicol 2024; 37:361-373. [PMID: 38294881 DOI: 10.1021/acs.chemrestox.3c00332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2024]
Abstract
Skin Corrosion/Irritation (Corr./Irrit.) has long been a health hazard in the Globally Harmonized System (GHS). Several in silico models have been built to predict Skin Corr./Irrit. as an alternative to the increasingly restricted animal testing. However, current studies are limited by data amount/quality and model availability. To address these issues, we compiled a traceable consensus GHS data set comprising 731 Corr., 1283 Irrit., and 1205 negative (Neg.) samples from 6 governmental databases and 2 external data sets. Then, a series of binary classifiers were developed with five machine learning (ML) algorithms and six molecular representations. For 10-fold cross-validation, the best Corr. vs Neg. classifier achieved an Area Under the Receiver Operating Characteristic Curve (AUC) of 97.1%, while the best Irrit. vs Neg. classifier achieved an AUC of 84.7%. Compared with existing in silico tools on external validation, our Attentive FP classifiers showed the highest metrics on Corr. vs Neg. and the second highest accuracy on Irrit. vs Neg. The SHapley Additive exPlanation approach was further applied to figure out important molecular features, and the attention weights were visualized to perform interpretable prediction. Structural alerts associated with Skin Corr./Irrit. were also identified. The interpretable Attentive FP classifiers were integrated into the software AttentiveSkin at https://github.com/BeeBeeWong/AttentiveSkin. The conventional ML classifiers are also provided on our platform admetSAR at http://lmmd.ecust.edu.cn/admetsar2/. Considering the data deficiency and the limited model availability of Skin Corr./Irrit., we believe that our data set and models could facilitate chemical safety assessment and relevant studies.
Affiliation(s)
- Zejun Huang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Shang Lou
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Haoqiang Wang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
9
Kwon JH, Kim J, Lim KM, Kim MG. Integration of the Natural Language Processing of Structural Information Simplified Molecular-Input Line-Entry System Can Improve the In Vitro Prediction of Human Skin Sensitizers. TOXICS 2024; 12:153. [PMID: 38393248 PMCID: PMC10892072 DOI: 10.3390/toxics12020153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 02/03/2024] [Accepted: 02/14/2024] [Indexed: 02/25/2024]
Abstract
Natural language processing (NLP) technology has recently been used to predict substance properties based on their Simplified Molecular-Input Line-Entry System (SMILES) representations. We aimed to develop a model predicting human skin sensitizers by integrating text features derived from SMILES with in vitro test outcomes. The dataset on SMILES, physicochemical properties, in vitro tests (DPRA, KeratinoSens™, h-CLAT, and SENS-IS assays), and human potency categories for 122 substances was sourced from the Cosmetics Europe database. The ChemBERTa model was employed to analyze the SMILES of substances. The last hidden layer embedding of ChemBERTa was tested with other features. Given the modest dataset size, we trained five XGBoost models using subsets of the training data, and subsequently employed bagging to create the final model. Notably, the features computed from SMILES played a pivotal role in the model for distinguishing sensitizers and non-sensitizers. The final model demonstrated a classification accuracy of 80% and an AUC-ROC of 0.82, effectively discriminating sensitizers from non-sensitizers. Furthermore, the model exhibited an accuracy of 82% and an AUC-ROC of 0.82 in classifying strong and weak sensitizers. In summary, we demonstrated that the integration of NLP of SMILES with in vitro test results can enhance the prediction of health hazards associated with chemicals.
Affiliation(s)
- Kyung-Min Lim
- College of Pharmacy, Ewha Womans University, Seoul 03760, Republic of Korea; (J.-H.K.); (J.K.)
- Myeong Gyu Kim
- College of Pharmacy, Ewha Womans University, Seoul 03760, Republic of Korea; (J.-H.K.); (J.K.)
10
Neal WM, Pandey P, Khan SI, Khan IA, Chittiboyina AG. Machine learning and traditional QSAR modeling methods: a case study of known PXR activators. J Biomol Struct Dyn 2024; 42:903-917. [PMID: 37059719 DOI: 10.1080/07391102.2023.2196701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Accepted: 03/22/2023] [Indexed: 04/16/2023]
Abstract
Pregnane X receptor (PXR), extensively expressed in human tissues related to digestion and metabolism, is responsible for recognizing and detoxifying diverse xenobiotics encountered by humans. To comprehend the promiscuous nature of PXR and its ability to bind a variety of ligands, computational approaches, viz., quantitative structure-activity relationship (QSAR) models, aid in the rapid dereplication of potential toxicological agents and mitigate the number of animals used to establish a meaningful regulatory decision. Recent advancements in machine learning techniques accommodating larger datasets are expected to aid in developing effective predictive models for complex mixtures (viz., dietary supplements) before undertaking in-depth experiments. Five hundred structurally diverse PXR ligands were used to develop traditional two-dimensional (2D) QSAR, machine-learning-based 2D-QSAR, field-based three-dimensional (3D) QSAR, and machine-learning-based 3D-QSAR models to establish the utility of predictive machine learning methods. Additionally, the applicability domain of the agonists was established to ensure the generation of robust QSAR models. A prediction set of dietary PXR agonists was used to externally-validate generated QSAR models. QSAR data analysis revealed that machine-learning 3D-QSAR techniques were more accurate in predicting the activity of external terpenes with an external validation squared correlation coefficient (R2) of 0.70 versus an R2 of 0.52 in machine-learning 2D-QSAR. Additionally, a visual summary of the binding pocket of PXR was assembled from the field 3D-QSAR models. By developing multiple QSAR models in this study, a robust groundwork for assessing PXR agonism from various chemical backbones has been established in anticipation of the identification of potential causative agents in complex mixtures.
Affiliation(s)
- William M Neal
- Division of Pharmacognosy, Department of BioMolecular Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
- Pankaj Pandey
- National Center for Natural Products Research, Research Institute of Pharmaceutical Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
- Shabana I Khan
- Division of Pharmacognosy, Department of BioMolecular Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
- National Center for Natural Products Research, Research Institute of Pharmaceutical Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
- Ikhlas A Khan
- Division of Pharmacognosy, Department of BioMolecular Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
- National Center for Natural Products Research, Research Institute of Pharmaceutical Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
- Amar G Chittiboyina
- National Center for Natural Products Research, Research Institute of Pharmaceutical Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
11
Zhao J, Shang C, Yin R. Developing a hybrid model for predicting the reaction kinetics between chlorine and micropollutants in water. WATER RESEARCH 2023; 247:120794. [PMID: 37918199 DOI: 10.1016/j.watres.2023.120794] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/20/2023] [Revised: 10/03/2023] [Accepted: 10/27/2023] [Indexed: 11/04/2023]
Abstract
Understanding the reactivities of chlorine towards micropollutants is crucial for assessing the fate of micropollutants in water chlorination. In this study, we integrated machine learning with kinetic modeling to predict the reaction kinetics between micropollutants and chlorine in deionized water and real surface water. We first established a framework to predict the apparent second-order rate constants for micropollutants with chlorine by combining Morgan molecular fingerprints with machine learning algorithms. The framework was tuned using Bayesian optimization and showed high prediction accuracy. It was validated through experiments and used to predict the unreported apparent second-order rate constants for 103 emerging micropollutants with chlorine. The framework also improved the understanding of the structure-dependence of micropollutants' reactivity with chlorine. We incorporated the predicted apparent second-order rate constants into the Kintecus software to establish a hybrid model to profile the time-dependent changes of micropollutant concentrations during chlorination. The hybrid model was validated by experiments conducted in real surface water in the presence of natural organic matter. The hybrid model could predict the extent to which micropollutants were degraded by chlorination with varied chlorine contact times and/or initial chlorine dosages. This study advances fundamental understanding of the reaction kinetics between chlorine and emerging micropollutants, and also offers a valuable tool to assess the fate of micropollutants during chlorination of drinking water.
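As a rough illustration of how an apparent second-order rate constant translates into micropollutant removal, the sketch below uses the simplest textbook case: chlorine in excess and held constant, so the decay is pseudo-first-order. This is a simplifying assumption. The hybrid model described in the abstract is built in Kintecus and validated in real surface water, where chlorine is also consumed by natural organic matter. The rate constant and dose below are hypothetical values.

```python
import math

CL2_MOLAR_MASS_G_PER_MOL = 70.906  # molar mass of Cl2


def chlorine_mg_per_l_to_molar(dose_mg_per_l):
    """Convert a chlorine dose in mg/L as Cl2 to mol/L."""
    return dose_mg_per_l / 1000.0 / CL2_MOLAR_MASS_G_PER_MOL


def fraction_remaining(k_app, chlorine_molar, contact_time_s):
    """Fraction of micropollutant left after contact_time_s seconds.

    Assumes d[P]/dt = -k_app * [chlorine] * [P] with constant [chlorine],
    so [P]/[P]0 = exp(-k_app * [chlorine] * t).
    k_app is in M^-1 s^-1, chlorine_molar in mol/L.
    """
    return math.exp(-k_app * chlorine_molar * contact_time_s)


# Hypothetical inputs (assumptions): k_app = 100 M^-1 s^-1, 2 mg/L as Cl2, 10 min contact.
chlorine = chlorine_mg_per_l_to_molar(2.0)
remaining = fraction_remaining(k_app=100.0, chlorine_molar=chlorine, contact_time_s=600.0)
print(f"fraction remaining: {remaining:.3f}")
```

Under this assumption, removal depends only on the product of rate constant, chlorine concentration, and contact time, which is why both dose and contact time appear as the variables explored in the abstract.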
Affiliation(s)
- Jing Zhao
- Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Chii Shang
- Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong; Hong Kong Branch of Chinese National Engineering Research Center for Control & Treatment of Heavy Metal Pollution, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Ran Yin
- Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
|
12
|
Liu J, Xu L, Guo W, Li Z, Khan MKH, Ge W, Patterson TA, Hong H. Developing a SARS-CoV-2 main protease binding prediction random forest model for drug repurposing for COVID-19 treatment. Exp Biol Med (Maywood) 2023; 248:1927-1936. [PMID: 37997891 PMCID: PMC10798185 DOI: 10.1177/15353702231209413] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/24/2023] [Accepted: 09/26/2023] [Indexed: 11/25/2023] Open
Abstract
The coronavirus disease 2019 (COVID-19) global pandemic resulted in millions of people becoming infected with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus and close to seven million deaths worldwide. It is essential to further explore and design effective COVID-19 treatment drugs that target the main protease of SARS-CoV-2, a major target for COVID-19 drugs. In this study, machine learning was applied for predicting the SARS-CoV-2 main protease binding of Food and Drug Administration (FDA)-approved drugs to assist in the identification of potential repurposing candidates for COVID-19 treatment. Ligands bound to the SARS-CoV-2 main protease in the Protein Data Bank and compounds experimentally tested in SARS-CoV-2 main protease binding assays in the literature were curated. These chemicals were divided into training (516 chemicals) and testing (360 chemicals) data sets. To identify SARS-CoV-2 main protease binders as potential candidates for repurposing to treat COVID-19, 1188 FDA-approved drugs from the Liver Toxicity Knowledge Base were obtained. A random forest algorithm was used for constructing predictive models based on molecular descriptors calculated using Mold2 software. Model performance was evaluated using 100 iterations of fivefold cross-validations which resulted in 78.8% balanced accuracy. The random forest model that was constructed from the whole training dataset was used to predict SARS-CoV-2 main protease binding on the testing set and the FDA-approved drugs. Model applicability domain and prediction confidence on drugs predicted as the main protease binders discovered 10 FDA-approved drugs as potential candidates for repurposing to treat COVID-19. Our results demonstrate that machine learning is an efficient method for drug repurposing and, thus, may accelerate drug development targeting SARS-CoV-2.
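Balanced accuracy, the metric reported above, is the mean of sensitivity and specificity. The sketch below shows that calculation for binary labels. The function name and the toy labels are hypothetical and are not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = binder, 0 = non-binder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return (sensitivity + specificity) / 2


# Hypothetical labels: 3 binders and 5 non-binders
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.3f}")
```

Unlike plain accuracy, this metric is not inflated when one class dominates the data set, which is presumably why it suits binder/non-binder screening.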
Affiliation(s)
- Wenjing Guo
- National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Zoe Li
- National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Md Kamrul Hasan Khan
- National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Weigong Ge
- National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Tucker A Patterson
- National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Huixiao Hong
- National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
|
13
|
Shin HK, Huang R, Chen M. In silico modeling-based new alternative methods to predict drug and herb-induced liver injury: A review. Food Chem Toxicol 2023; 179:113948. [PMID: 37460037 PMCID: PMC10640386 DOI: 10.1016/j.fct.2023.113948] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/19/2023] [Revised: 07/10/2023] [Accepted: 07/14/2023] [Indexed: 07/25/2023]
Abstract
New approach methods (NAMs) have been developed to predict a wide range of toxicities through innovative technologies. Liver injury is one of the most extensively studied endpoints due to its severity and frequency, occurring among populations that consume drugs or dietary supplements. In this review, we focus on recent developments of in silico modeling for liver injury prediction using deep learning and in vitro data based on adverse outcome pathways (AOPs). Despite these models being mainly developed using datasets generated from drug-like molecules, they were also applied to the prediction of hepatotoxicity caused by herbal products. As deep learning has achieved great success in many different fields, advanced machine learning algorithms have been actively applied to improve the accuracy of in silico models. Additionally, the development of liver AOPs, combined with big data in toxicology, has been valuable in developing in silico models with enhanced predictive performance and interpretability. Specifically, one approach involves developing structure-based models for predicting molecular initiating events of liver AOPs, while others use in vitro data with structure information as model inputs for making predictions. Even though liver injury remains a difficult endpoint to predict, advancements in machine learning algorithms and the expansion of in vitro databases with relevant biological knowledge have made a huge impact on improving in silico modeling for drug-induced liver injury prediction.
Affiliation(s)
- Hyun Kil Shin
- Department of Predictive Toxicology, Korea Institute of Toxicology (KIT), 34114, Daejeon, Republic of Korea
- Ruili Huang
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD, 20850, USA
- Minjun Chen
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research (NCTR), U.S. Food and Drug Administration, 3900 NCTR Rd., Jefferson, AR, 72079, USA
|
14
|
Maia MDS, Mendonça-Junior FJB, Rodrigues GCS, da Silva AS, de Oliveira NIP, da Silva PR, Felipe CFB, Gurgel APAD, Nayarisseri A, Scotti MT, Scotti L. Virtual Screening of Different Subclasses of Lignans with Anticancer Potential and Based on Genetic Profile. Molecules 2023; 28:6011. [PMID: 37630263 PMCID: PMC10459202 DOI: 10.3390/molecules28166011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2023] [Revised: 07/26/2023] [Accepted: 08/04/2023] [Indexed: 08/27/2023] Open
Abstract
Cancer is a multifactorial disease that continues to increase. Lignans are known to be important anticancer agents. However, due to the structural diversity of lignans, it is difficult to associate anticancer activity with a particular subclass. Therefore, the present study sought to evaluate the association of lignan subclasses with antitumor activity, considering the genetic profile of the variants of the selected targets. To do so, predictive models were built against the targets tyrosine-protein kinase ABL (ABL), epidermal growth factor receptor erbB1 (EGFR), histone deacetylase (HDAC), serine/threonine-protein kinase mTOR (mTOR) and poly [ADP-ribose] polymerase-1 (PARP1). Then, single nucleotide polymorphisms were mapped, target mutations were designed, and molecular docking was performed with the lignans with the best predicted biological activity. The results showed more anticancer activity in the dibenzocyclooctadiene, furofuran and aryltetralin subclasses. The lignans with the best predictive values of biological activity showed varying binding energy results in the presence of certain genetic variants.
Affiliation(s)
- Mayara dos Santos Maia
- Department of Molecular Biology, Federal University of Paraíba, João Pessoa 58051-900, PB, Brazil
- Francisco Jaime Bezerra Mendonça-Junior
- Laboratory of Synthesis and Drug Delivery, State University of Paraiba, João Pessoa 58071-160, PB, Brazil
- Postgraduate Program in Natural Synthetic and Bioactive Products (PgPNSB), Federal University of Paraíba, João Pessoa 58033-455, PB, Brazil
- Adriano Soares da Silva
- Program in Ecology and Environmental Monitoring, Federal University of Paraíba, João Pessoa 58059-900, PB, Brazil
- Niara Isis Pereira de Oliveira
- Program in Ecology and Environmental Monitoring, Federal University of Paraíba, João Pessoa 58059-900, PB, Brazil
- Pablo Rayff da Silva
- Postgraduate Program in Natural Synthetic and Bioactive Products (PgPNSB), Federal University of Paraíba, João Pessoa 58033-455, PB, Brazil
- Cícero Francisco Bezerra Felipe
- Postgraduate Program in Natural Synthetic and Bioactive Products (PgPNSB), Federal University of Paraíba, João Pessoa 58033-455, PB, Brazil
- Anuraj Nayarisseri
- In Silico Research Laboratory, Eminent Bioscience, Indore 452010, Madhya Pradesh, India
- Marcus Tullius Scotti
- Postgraduate Program in Natural Synthetic and Bioactive Products (PgPNSB), Federal University of Paraíba, João Pessoa 58033-455, PB, Brazil
- Laboratory of Cheminformatics, Health Sciences Center, Federal University of Paraíba, João Pessoa 58033-455, PB, Brazil
- Luciana Scotti
- Postgraduate Program in Natural Synthetic and Bioactive Products (PgPNSB), Federal University of Paraíba, João Pessoa 58033-455, PB, Brazil
- Laboratory of Cheminformatics, Health Sciences Center, Federal University of Paraíba, João Pessoa 58033-455, PB, Brazil
|
15
|
Sosnina EA, Sosnin S, Fedorov MV. Improvement of multi-task learning by data enrichment: application for drug discovery. J Comput Aided Mol Des 2023; 37:183-200. [PMID: 36943645 DOI: 10.1007/s10822-023-00500-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Accepted: 02/21/2023] [Indexed: 03/23/2023]
Abstract
Multi-task learning in deep neural networks has become a topic of growing importance in many research fields, including drug discovery. However, applying multi-task learning poses new challenges in improving prediction performance. This study investigated the potential of training data enrichment to enhance multi-task model prediction quality in drug discovery. The study evaluated four scenarios with varying degrees of information capacity of the training data and applied two types of test data to evaluate prediction performance. We used three datasets: ViralChEMBL, which consisted of binary activities of compounds against viral species, was applied for the classification task; pQSAR(159) and pQSAR(4267), which consisted of bio-activities of compounds and assays from the research of the profile-QSAR method, were applied for regression tasks. We built multi-task models based on the feed-forward DNNs using the PyTorch framework. Our findings showed that training data enrichment could be an effective means of enhancing prediction performance in multi-task learning, but the degree of improvement depends on the quality of the training data. The more unique compounds and targets the training data included, the more new compound-target interactions are required for prediction improvement. Also, we found out that even using multi-task learning, one could not predict the interactions of compounds that are highly dissimilar from those used for model training. The study provides some recommendations for effectively employing multi-task learning in drug discovery to improve prediction accuracy and facilitate the discovery of novel drug candidates.
Affiliation(s)
- Ekaterina A Sosnina
- Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Bolshoy Boulevard 30/1, Moscow, Russia, 143026
- Sergey Sosnin
- Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1190, Vienna, Austria
- Maxim V Fedorov
- Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Bolshoy Boulevard 30/1, Moscow, Russia, 143026
- Sirius University of Science and Technology, Olympiisky Prospect 1, Sochi, Russia, 354340
|
16
|
Development of QSPR-ANN models for the estimation of critical properties of pure hydrocarbons. J Mol Graph Model 2023; 121:108450. [PMID: 36907016 DOI: 10.1016/j.jmgm.2023.108450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Revised: 02/21/2023] [Accepted: 03/06/2023] [Indexed: 03/09/2023]
Abstract
The current work aimed to predict three critical properties of pure hydrocarbons: critical temperature (Tc), critical volume (Vc), and critical pressure (Pc). A multi-layer perceptron artificial neural network (MLP-ANN) was adopted as a nonlinear modeling technique based on a few relevant molecular descriptors. A set of diverse data points was used to build three QSPR-ANN models: 223 points for Tc and Vc and 221 for Pc. The entire database was randomly split into two subsets: 80% for the training set and 20% for the testing set. A total of 1666 molecular descriptors were calculated and then reduced through a multi-phase statistical procedure to a reasonable number of relevant descriptors, excluding about 99% of the initial descriptors. The quasi-Newton backpropagation (BFGS) algorithm was then applied to train the ANN structure. The three QSPR-ANN models showed good precision, confirmed by high values of the determination coefficient (R²), ranging from 0.9990 to 0.9945, and low calculated errors, such as the Mean Absolute Percentage Error (MAPE), which ranged from 2.2497% to 0.7424% for the best three models of Tc, Vc, and Pc. The weight sensitivity analysis method was applied to determine the contribution of each input descriptor, individually or by class, to each QSPR-ANN model. Moreover, the applicability domain (AD) method was used with a strict limit on standardized residual values (di = ±2); the results were promising, with nearly 88% of the data points validated within the AD range. Finally, the results of the proposed QSPR-ANN models were compared with other well-known QSPR or ANN models for each property. Our three models provided satisfactory results, outperforming most of the models in this comparison. This computational approach can be applied in petroleum engineering and other related fields to accurately determine the critical properties of pure hydrocarbons: Tc, Vc, and Pc.
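The two goodness-of-fit measures quoted above, R² and MAPE, can be computed as in the following minimal sketch. The function names and the example values are hypothetical placeholders, not data from the study.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical critical temperatures (K): reference values vs. model predictions
tc_actual = [100.0, 200.0, 300.0, 400.0]
tc_predicted = [110.0, 190.0, 300.0, 400.0]
print(f"MAPE = {mape(tc_actual, tc_predicted):.2f}%")
print(f"R2   = {r_squared(tc_actual, tc_predicted):.4f}")
```

MAPE weights errors relative to the magnitude of each reference value, whereas R² compares the residual variance with the spread of the data, so the two metrics can rank models differently.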
|
17
|
Design of New Dispersants Using Machine Learning and Visual Analytics. Polymers (Basel) 2023; 15:1324. [PMID: 36904566 PMCID: PMC10007083 DOI: 10.3390/polym15051324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 02/23/2023] [Accepted: 02/25/2023] [Indexed: 03/09/2023] Open
Abstract
Artificial intelligence (AI) is an emerging technology that is revolutionizing the discovery of new materials. One key application of AI is virtual screening of chemical libraries, which enables the accelerated discovery of materials with desired properties. In this study, we developed computational models to predict the dispersancy efficiency of oil and lubricant additives, a critical property in their design that can be estimated through a quantity named blotter spot. We propose a comprehensive approach that combines machine learning techniques with visual analytics strategies in an interactive tool that supports domain experts' decision-making. We evaluated the proposed models quantitatively and illustrated their benefits through a case study. Specifically, we analyzed a series of virtual polyisobutylene succinimide (PIBSI) molecules derived from a known reference substrate. Our best-performing probabilistic model was Bayesian Additive Regression Trees (BART), which achieved a mean absolute error of 5.50±0.34 and a root mean square error of 7.56±0.47, as estimated through 5-fold cross-validation. To facilitate future research, we have made the dataset, including the potential dispersants used for modeling, publicly available. Our approach can help accelerate the discovery of new oil and lubricant additives, and our interactive tool can aid domain experts in making informed decisions based on blotter spot and other key properties.
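The "mean ± spread over 5-fold cross-validation" style of reporting used above can be sketched as follows. A mean-value baseline stands in for the actual model purely as a placeholder, since BART is not implemented here. The function names and the target values are hypothetical.

```python
import math
import statistics


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds (no shuffling, for simplicity)."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def cross_validated_errors(y, k=5):
    """Per-fold MAE and RMSE of a mean-value baseline (placeholder for a real model)."""
    maes, rmses = [], []
    for train_idx, test_idx in k_fold_indices(len(y), k):
        prediction = statistics.mean([y[i] for i in train_idx])
        errors = [y[i] - prediction for i in test_idx]
        maes.append(statistics.mean([abs(e) for e in errors]))
        rmses.append(math.sqrt(statistics.mean([e * e for e in errors])))
    return maes, rmses


# Hypothetical blotter-spot-like target values
targets = [52.0, 61.0, 47.0, 70.0, 66.0, 58.0, 49.0, 73.0, 64.0, 55.0]
maes, rmses = cross_validated_errors(targets, k=5)
print(f"MAE  = {statistics.mean(maes):.2f} ± {statistics.stdev(maes):.2f}")
print(f"RMSE = {statistics.mean(rmses):.2f} ± {statistics.stdev(rmses):.2f}")
```

Whether the reported ± values are standard deviations or standard errors is not stated in the abstract. The sketch uses the sample standard deviation across folds as one plausible reading.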
|
18
|
Poongavanam V, Kölling F, Giese A, Göller AH, Lehmann L, Meibom D, Kihlberg J. Predictive Modeling of PROTAC Cell Permeability with Machine Learning. ACS OMEGA 2023; 8:5901-5916. [PMID: 36816707 PMCID: PMC9933238 DOI: 10.1021/acsomega.2c07717] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 12/03/2022] [Accepted: 01/19/2023] [Indexed: 06/18/2023]
Abstract
Approaches for predicting proteolysis targeting chimera (PROTAC) cell permeability are of major interest to reduce resource-demanding synthesis and testing of low-permeable PROTACs. We report a comprehensive investigation of the scope and limitations of machine learning-based binary classification models developed using 17 simple descriptors for large and structurally diverse sets of cereblon (CRBN) and von Hippel-Lindau (VHL) PROTACs. For the VHL PROTAC set, k-nearest neighbor and random forest models performed best and predicted the permeability of a blinded test set with >80% accuracy (k ≥ 0.57). Models retrained by combining the original training and the blinded test set performed equally well for a second blinded VHL set. However, models for CRBN PROTACs were less successful, mainly due to the imbalanced nature of the CRBN datasets. All descriptors contributed to the models, but size and lipophilicity were the most important. We conclude that properly trained machine learning models can be integrated as effective filters in the PROTAC design process.
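For readers unfamiliar with nearest-neighbor classification, the sketch below shows the basic idea on two descriptors. It is a toy illustration, not the authors' models: the function name, the descriptor values (assumed to be pre-scaled size and lipophilicity), and the class labels are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points closest to `query` (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, pre-scaled descriptors: (size, lipophilicity)
train_x = [(0.20, 0.30), (0.25, 0.35), (0.30, 0.20), (0.80, 0.90), (0.85, 0.80), (0.90, 0.95)]
train_y = ["high", "high", "high", "low", "low", "low"]  # permeability class

print(knn_predict(train_x, train_y, (0.28, 0.30)))
print(knn_predict(train_x, train_y, (0.82, 0.88)))
```

Because the prediction depends on raw distances, descriptors on different scales would normally be standardized first. An odd `k` avoids tied votes in binary problems.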
Affiliation(s)
- Florian Kölling
- Computational Molecular Design, Bayer AG, 42096 Wuppertal, Germany
- Anja Giese
- Drug Discovery Sciences, Bayer AG, 13342 Berlin, Germany
- Lutz Lehmann
- Drug Discovery Sciences, Bayer AG, 42113 Wuppertal, Germany
- Daniel Meibom
- Drug Discovery Sciences, Bayer AG, 42113 Wuppertal, Germany
- Jan Kihlberg
- Department of Chemistry-BMC, Box 576, Uppsala University, 75123 Uppsala, Sweden
|
19
|
Dotson JJ, van Dijk L, Timmerman JC, Grosslight S, Walroth RC, Gosselin F, Püntener K, Mack KA, Sigman MS. Data-Driven Multi-Objective Optimization Tactics for Catalytic Asymmetric Reactions Using Bisphosphine Ligands. J Am Chem Soc 2023; 145:110-121. [PMID: 36574729 DOI: 10.1021/jacs.2c08513] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 12/28/2022]
Abstract
Optimization of the catalyst structure to simultaneously improve multiple reaction objectives (e.g., yield, enantioselectivity, and regioselectivity) remains a formidable challenge. Herein, we describe a machine learning workflow for the multi-objective optimization of catalytic reactions that employ chiral bisphosphine ligands. This was demonstrated through the optimization of two sequential reactions required in the asymmetric synthesis of an active pharmaceutical ingredient. To accomplish this, a density functional theory-derived database of >550 bisphosphine ligands was constructed, and a designer chemical space mapping technique was established. The protocol used classification methods to identify active catalysts, followed by linear regression to model reaction selectivity. This led to the prediction and validation of significantly improved ligands for all reaction outputs, suggesting a general strategy that can be readily implemented for reaction optimizations where performance is controlled by bisphosphine ligands.
Affiliation(s)
- Jordan J Dotson
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Lucy van Dijk
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Jacob C Timmerman
- Department of Small Molecule Process Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
- Samantha Grosslight
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Richard C Walroth
- Department of Small Molecule Process Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
- Francis Gosselin
- Department of Small Molecule Process Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
- Kurt Püntener
- Synthetic Molecules Technical Development, Process Chemistry & Catalysis, F. Hoffmann-La Roche Limited, CH-4070 Basel, Switzerland
- Kyle A Mack
- Department of Small Molecule Process Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
- Matthew S Sigman
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
|
20
|
Togo MV, Mastrolorito F, Ciriaco F, Trisciuzzi D, Tondo AR, Gambacorta N, Bellantuono L, Monaco A, Leonetti F, Bellotti R, Altomare CD, Amoroso N, Nicolotti O. TIRESIA: An eXplainable Artificial Intelligence Platform for Predicting Developmental Toxicity. J Chem Inf Model 2023; 63:56-66. [PMID: 36520016 DOI: 10.1021/acs.jcim.2c01126] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 12/23/2022]
Abstract
Herein, a robust and reproducible eXplainable Artificial Intelligence (XAI) approach is presented, which allows prediction of developmental toxicity, a challenging human-health endpoint in toxicology. The application of XAI as an alternative method is of the utmost importance with developmental toxicity being one of the most animal-intensive areas of regulatory toxicology. In this work, the established CAESAR (Computer Assisted Evaluation of industrial chemical Substances According to Regulations) training set made of 234 chemicals for model learning is employed. Two test sets, including as a whole 585 chemicals, were instead used for validation and generalization purposes. The proposed framework favorably compares with the state-of-the-art approaches in terms of accuracy, sensitivity, and specificity, thus resulting in a reliable support system for developmental toxicity ensuring informativeness, uncertainty estimation, generalization, and transparency. Based on the eXtreme Gradient Boosting (XGB) algorithm, our predictive model provides easy interpretative keys based on specific molecular descriptors and structural alerts enabling one to distinguish toxic and nontoxic chemicals. Inspired by the Organisation for Economic Co-operation and Development (OECD) principles for the validation of Quantitative Structure-Activity Relationships (QSARs) for regulatory purposes, the results are summarized in a standard report in portable document format, enclosing also details concerned with a density-based model applicability domain and SHAP (SHapley Additive exPlanations) explainability, the latter particularly useful to better understand the effective roles played by molecular features. Notably, our model has been implemented in TIRESIA (Toxicology Intelligence and Regulatory Evaluations for Scientific and Industry Applications), a free of charge web platform available at http://tiresia.uniba.it.
Affiliation(s)
- Maria Vittoria Togo
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Fabrizio Mastrolorito
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Fulvio Ciriaco
- Dipartimento di Chimica, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Daniela Trisciuzzi
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Anna Rita Tondo
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Nicola Gambacorta
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Loredana Bellantuono
- Dipartimento di Biomedicina Traslazionale e Neuroscienze (DiBraiN), Università degli Studi di Bari Aldo Moro, 70124 Bari, Italy; Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Alfonso Monaco
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy; Dipartimento Interateneo di Fisica M. Merlin, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Francesco Leonetti
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Roberto Bellotti
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy; Dipartimento Interateneo di Fisica M. Merlin, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Cosimo Damiano Altomare
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Nicola Amoroso
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy; Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Orazio Nicolotti
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
|
21
|
Metwally AA, Nayel AA, Hathout RM. In silico prediction of siRNA ionizable-lipid nanoparticles In vivo efficacy: Machine learning modeling based on formulation and molecular descriptors. Front Mol Biosci 2022; 9:1042720. [PMID: 36619167 PMCID: PMC9811823 DOI: 10.3389/fmolb.2022.1042720] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/12/2022] [Accepted: 12/07/2022] [Indexed: 12/24/2022] Open
Abstract
In silico prediction of the in vivo efficacy of siRNA ionizable-lipid nanoparticles is desirable as it can save time and resources dedicated to wet-lab experimentation. This study aims to computationally predict the in vivo efficacy of siRNA nanoparticles. A data set containing 120 entries was prepared by combining molecular descriptors of the ionizable lipids with two nanoparticle formulation characteristics. Input descriptor combinations were selected by an evolutionary algorithm. Artificial neural networks, support vector machines and partial least squares regression were used for QSAR modeling. Two training sets and two external validation sets were prepared from two different splits of the data set; training and validation sets contained 90 and 30 entries, respectively. The validation set log (siRNA dose) was successfully predicted, with R²val = 0.86-0.89 and 0.75-0.80 for validation sets one and two, respectively. Artificial neural networks gave the best R²val for both validation sets. For predictions with high bias, R²val improved from 0.47 to 0.96 by selecting the training set lipids lying within the applicability domain. In conclusion, in vivo performance of siRNA nanoparticles was successfully predicted by combining cheminformatics with machine learning techniques.
Affiliation(s)
- Abdelkader A. Metwally
- Department of Pharmaceutics, Faculty of Pharmacy, Health Sciences Center, Kuwait University, Kuwait City, Kuwait; Department of Pharmaceutics and Industrial Pharmacy, Faculty of Pharmacy, Ain Shams University, Cairo, Egypt; correspondence: Abdelkader A. Metwally
- Amira A. Nayel
- Clinical Pharmacy Department, Alexandria Ophthalmology Hospital, Alexandria, Egypt; Department of Clinical Pharmacy and Pharmacy Practice, Faculty of Pharmacy, Alexandria University, Alexandria, Egypt
- Rania M. Hathout
- Department of Pharmaceutics and Industrial Pharmacy, Faculty of Pharmacy, Ain Shams University, Cairo, Egypt
|
22
|
Zhou S, Jiang W, Chen G, Huang G. Design and Synthesis of Novel Double-Ring Conjugated Enones as Potent Anti-rheumatoid Arthritis Agents. ACS OMEGA 2022; 7:44065-44077. [PMID: 36506211 PMCID: PMC9730744 DOI: 10.1021/acsomega.2c05492] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/25/2022] [Accepted: 11/08/2022] [Indexed: 05/25/2023]
Abstract
Rheumatoid arthritis (RA) is a chronic and systemic disease of inflammatory synovitis with unknown etiology. In previous studies, we found that the double-ring conjugated enone structure has anti-rheumatoid arthritis activity and could effectively inhibit the proliferation of rat synovial cells in vitro and has good anti-inflammatory activity in vivo. Herein, we further modified the structure, which was a novel double-ring conjugated enone, to study its anti-rheumatoid arthritis activity. Results showed that the most potent compound 32 could effectively inhibit the proliferation of rat synovial cells in vitro and has better anti-inflammatory activity compared with that of the positive control methotrexate, as shown by in vivo activity evaluation. More interestingly, compound 32 could effectively inhibit the increase of TNF-α, IL-1β, and IL-6 induced by LPS and regulate the expression of TLR4, MyD88, NF-κB, and IκB in the signaling pathway of TLR4/NF-κB. Our results provided a promising starting point for the development of highly effective small molecules for the treatment of RA.
Affiliation(s)
- Shiyang Zhou
- Chongqing Chemical Industry Vocational College, Chongqing 401228, China
- Key Laboratory of Tropical Medicinal Plant Chemistry of Hainan Province, Hainan Normal University, Haikou 571158, China
- Key Laboratory of Carbohydrate Science and Engineering, Chongqing Key Laboratory of Inorganic Functional Materials, Chongqing Normal University, Chongqing 401331, China
- Wenming Jiang
- Chongqing Chemical Industry Vocational College, Chongqing 401228, China
- Guangying Chen
- Key Laboratory of Tropical Medicinal Plant Chemistry of Hainan Province, Hainan Normal University, Haikou 571158, China
- Gangliang Huang
- Key Laboratory of Carbohydrate Science and Engineering, Chongqing Key Laboratory of Inorganic Functional Materials, Chongqing Normal University, Chongqing 401331, China
|
23
|
Tullius Scotti M, Herrera-Acevedo C, Barros de Menezes RP, Martin HJ, Muratov EN, Ítalo de Souza Silva Á, Faustino Albuquerque E, Ferreira Calado L, Coy-Barrera E, Scotti L. MolPredictX: Online Biological Activity Predictions by Machine Learning Models. Mol Inform 2022; 41:e2200133. [PMID: 35961924 DOI: 10.1002/minf.202200133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Accepted: 08/12/2022] [Indexed: 01/05/2023]
Abstract
Here we report the development of MolPredictX, an innovative and freely accessible web interface for biological activity predictions of query molecules. MolPredictX utilizes in-house QSAR models to provide 27 qualitative predictions (active or inactive), and quantitative probabilities for bioactivity against parasitic (Trypanosoma and Leishmania), viral (Dengue, SARS-CoV and Hepatitis C), pathogenic yeast (Candida albicans), bacterial (Salmonella enterica and Escherichia coli), and Alzheimer's disease enzymes. In this article, we introduce the methodology and usability of this webtool, highlighting its potential role in the development of new drugs against a variety of diseases. MolPredictX is undergoing continuous development and is freely available at https://www.molpredictx.ufpb.br/.
Affiliation(s)
- Marcus Tullius Scotti
  - Programa de Pós-Graduação de Produtos Naturais e Sintéticos Bioativos, Universidade Federal da Paraíba, 58051-900, João Pessoa-PB, Brazil
- Chonny Herrera-Acevedo
  - Programa de Pós-Graduação de Produtos Naturais e Sintéticos Bioativos, Universidade Federal da Paraíba, 58051-900, João Pessoa-PB, Brazil; Department of Chemical Engineering, Universidad ECCI, Carrera 19 # 49-20, 111311, Bogotá D.C., Colombia
- Renata Priscila Barros de Menezes
  - Programa de Pós-Graduação de Produtos Naturais e Sintéticos Bioativos, Universidade Federal da Paraíba, 58051-900, João Pessoa-PB, Brazil
- Holli-Joi Martin
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, 27599, USA
- Eugene N Muratov
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, 27599, USA
- Ávilla Ítalo de Souza Silva
  - Programa de Pós-Graduação de Produtos Naturais e Sintéticos Bioativos, Universidade Federal da Paraíba, 58051-900, João Pessoa-PB, Brazil
- Emmanuella Faustino Albuquerque
  - Programa de Pós-Graduação de Produtos Naturais e Sintéticos Bioativos, Universidade Federal da Paraíba, 58051-900, João Pessoa-PB, Brazil
- Lucas Ferreira Calado
  - Programa de Pós-Graduação de Produtos Naturais e Sintéticos Bioativos, Universidade Federal da Paraíba, 58051-900, João Pessoa-PB, Brazil
- Ericsson Coy-Barrera
  - Bioorganic Chemistry Laboratory, Facultad de Ciencias Básicas y Aplicadas, Universidad Militar Nueva Granada, Cajicá, 250247, Colombia
- Luciana Scotti
  - Programa de Pós-Graduação de Produtos Naturais e Sintéticos Bioativos, Universidade Federal da Paraíba, 58051-900, João Pessoa-PB, Brazil
|
24
|
Spake R, O’Dea RE, Nakagawa S, Doncaster CP, Ryo M, Callaghan CT, Bullock JM. Improving quantitative synthesis to achieve generality in ecology. Nat Ecol Evol 2022; 6:1818-1828. [DOI: 10.1038/s41559-022-01891-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Accepted: 08/26/2022] [Indexed: 11/05/2022]
|
25
|
Wambaugh JF, Rager JE. Exposure forecasting - ExpoCast - for data-poor chemicals in commerce and the environment. JOURNAL OF EXPOSURE SCIENCE & ENVIRONMENTAL EPIDEMIOLOGY 2022; 32:783-793. [PMID: 36347934 PMCID: PMC9742338 DOI: 10.1038/s41370-022-00492-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/28/2022] [Revised: 10/21/2022] [Accepted: 10/21/2022] [Indexed: 05/10/2023]
Abstract
Estimates of exposure are critical to prioritize and assess chemicals based on risk posed to public health and the environment. The U.S. Environmental Protection Agency (EPA) is responsible for regulating thousands of chemicals in commerce and the environment for which exposure data are limited. Since 2009 the EPA's ExpoCast ("Exposure Forecasting") project has sought to develop the data, tools, and evaluation approaches required to generate rapid and scientifically defensible exposure predictions for the full universe of existing and proposed commercial chemicals. This review article aims to summarize issues in exposure science that have been addressed through initiatives affiliated with ExpoCast. ExpoCast research has generally focused on chemical exposure as a statistical systems problem intended to inform thousands of chemicals. The project exists as a companion to EPA's ToxCast ("Toxicity Forecasting") project which has used in vitro high-throughput screening technologies to characterize potential hazard posed by thousands of chemicals for which there are limited toxicity data. Rapid prediction of chemical exposures and in vitro-in vivo extrapolation (IVIVE) of ToxCast data allow for prioritization based upon risk of adverse outcomes due to environmental chemical exposure. ExpoCast has developed (1) integrated modeling approaches to reliably predict exposure and IVIVE dose, (2) highly efficient screening tools for chemical prioritization, (3) efficient and affordable tools for generating new exposure and dose data, and (4) easily accessible exposure databases. The development of new exposure models and databases along with the application of technologies like non-targeted analysis and machine learning have transformed exposure science for data-poor chemicals. 
By developing high-throughput tools for chemical exposure analytics and translating those tools into public health decisions, ExpoCast research has served as a crucible for identifying and addressing exposure science knowledge gaps.
Affiliation(s)
- John F Wambaugh
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. EPA, Research Triangle Park, NC, USA.
  - Department of Environmental Sciences & Engineering, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
- Julia E Rager
  - Department of Environmental Sciences & Engineering, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
|
26
|
Stern N, Gacs A, Tátrai E, Flachner B, Hajdú I, Dobi K, Bágyi I, Dormán G, Lőrincz Z, Cseh S, Kígyós A, Tóvári J, Goldblum A. Dual Inhibitors of AChE and BACE-1 for Reducing Aβ in Alzheimer's Disease: From In Silico to In Vivo. Int J Mol Sci 2022; 23:13098. [PMID: 36361906 PMCID: PMC9655245 DOI: 10.3390/ijms232113098] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/15/2022] [Revised: 10/03/2022] [Accepted: 10/05/2022] [Indexed: 07/30/2023] Open
Abstract
Alzheimer's disease (AD) is a complex and widespread condition, still not fully understood and with no cure yet. Amyloid beta (Aβ) peptide is suspected to be a major cause of AD, and therefore, simultaneously blocking its formation and aggregation by inhibition of the enzymes BACE-1 (β-secretase) and AChE (acetylcholinesterase) with a single inhibitor may be an effective therapeutic approach, as compared to blocking one of these targets or combining two drugs, one for each of these targets. We used our ISE algorithm to model AChE peripheral site inhibitors and BACE-1 inhibitors on the basis of published data, and constructed classification models for each. Subsequently, we screened large molecular databases with both models. Top-scoring molecules were docked into AChE and BACE-1 crystal structures, and the 36 molecules with the best weighted scores (based on ISE indexes and docking results) were sent for inhibition studies on the two enzymes. Two of them inhibited both AChE (IC50 between 4 and 7 μM) and BACE-1 (IC50 between 50 and 65 μM). Two additional molecules inhibited only AChE, and another two molecules inhibited only BACE-1. Preliminary testing of inhibition by F681-0222 (molecule 2) on APPswe/PS1dE9 transgenic mice shows a reduction of soluble Aβ42 in brain tissue.
Affiliation(s)
- Noa Stern
  - Molecular Modeling and Drug Design Lab, Institute for Drug Research, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
- Alexandra Gacs
  - Department of Experimental Pharmacology, National Institute of Oncology, H-1122 Budapest, Hungary
- Enikő Tátrai
  - Department of Experimental Pharmacology, National Institute of Oncology, H-1122 Budapest, Hungary
  - KINETO Lab Ltd., H-1032 Budapest, Hungary
- István Hajdú
  - TargetEx Ltd., H-2120 Dunakeszi, Hungary
  - Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, H-1117 Budapest, Hungary
- József Tóvári
  - KINETO Lab Ltd., H-1032 Budapest, Hungary
  - Department of Tumor Biology, National Korányi Institute of TB and Pulmonology, H-1121 Budapest, Hungary
- Amiram Goldblum
  - Molecular Modeling and Drug Design Lab, Institute for Drug Research, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
|
27
|
Shan M, Jiang C, Qin L, Cheng G. A Review of Computational Methods in Predicting hERG Channel Blockers. ChemistrySelect 2022. [DOI: 10.1002/slct.202201221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Mengyi Shan
  - School of Pharmaceutical Sciences Zhejiang Chinese Medical University Hangzhou 310053 People's Republic of China
- Chen Jiang
  - QuanMin RenZheng (HangZhou) Technology Co. Ltd. China
- Lu‐Ping Qin
  - School of Pharmaceutical Sciences Zhejiang Chinese Medical University Hangzhou 310053 People's Republic of China
- Gang Cheng
  - School of Pharmaceutical Sciences Zhejiang Chinese Medical University Hangzhou 310053 People's Republic of China
|
28
|
Hierarchical Clustering and Target-Independent QSAR for Antileishmanial Oxazole and Oxadiazole Derivatives. Int J Mol Sci 2022; 23:8898. [PMID: 36012163 PMCID: PMC9408707 DOI: 10.3390/ijms23168898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Revised: 07/22/2022] [Accepted: 07/25/2022] [Indexed: 11/16/2022] Open
Abstract
Leishmaniasis is a neglected tropical disease that kills more than 20,000 people each year. The chemotherapy available for the treatment of the disease is limited, and novel approaches to discover novel drugs are urgently needed. Herein, 2D- and 4D-quantitative structure–activity relationship (QSAR) models were developed for a series of oxazole and oxadiazole derivatives that are active against Leishmania infantum, the causative agent of visceral leishmaniasis. A clustering strategy based on structural similarity was applied with molecular fingerprints to divide the complete set of compounds into two groups. Hierarchical clustering was followed by the development of 2D- (R2 = 0.90, R2pred = 0.82) and 4D-QSAR models (R2 = 0.80, R2pred = 0.64), which showed improved statistical robustness and predictive ability.
|
29
|
Sellami A, Réau M, Montes M, Lagarde N. Review of in silico studies dedicated to the nuclear receptor family: Therapeutic prospects and toxicological concerns. Front Endocrinol (Lausanne) 2022; 13:986016. [PMID: 36176461 PMCID: PMC9513233 DOI: 10.3389/fendo.2022.986016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022] Open
Abstract
Being at the center of both therapeutic and toxicological concerns, nuclear receptors (NRs) are widely studied for drug discovery applications and also to unravel the potential toxicity of environmental compounds such as pesticides, cosmetics or additives. High-throughput screening (HTS) campaigns are widely used to detect compounds able to interact with this protein family for both therapeutic and toxicological purposes. These methods generate a large amount of data, requiring computational approaches for robust and correct analysis and interpretation. The output data can be used to build predictive models to forecast the behavior of new chemicals based on their in vitro activities. This article is a review of the studies published in the last decade and dedicated to the in silico prediction of NR ligands for both therapeutic and toxicological purposes. Over 100 articles concerning 14 NR subfamilies were carefully read and analyzed in order to retrieve the most commonly used computational methods to develop predictive models, to retrieve the databases deployed in the model building process and to pinpoint some of the limitations they faced.
|
30
|
Abstract
Quantitative structure-activity relationship (QSAR) models are routinely applied computational tools in the drug discovery process. QSAR models are regression or classification models that predict the biological activities of molecules based on the features derived from their molecular structures. These models are usually used to prioritize a list of candidate molecules for future laboratory experiments and to help chemists gain better insights into how structural changes affect a molecule's biological activities. Developing accurate and interpretable QSAR models is therefore of the utmost importance in the drug discovery process. Deep neural networks, which are powerful supervised learning algorithms, have shown great promise for addressing regression and classification problems in various research fields, including the pharmaceutical industry. In this chapter, we briefly review the applications of deep neural networks in QSAR modeling and describe commonly used techniques to improve model performance.
|
31
|
Zhou S, Huang G. Some important inhibitors and mechanisms of rheumatoid arthritis. Chem Biol Drug Des 2021; 99:930-943. [PMID: 34942050 DOI: 10.1111/cbdd.14015] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/11/2021] [Revised: 12/17/2021] [Accepted: 12/21/2021] [Indexed: 11/29/2022]
Abstract
Rheumatoid arthritis (RA) is a chronic disease that seriously affects human health and quality of life, and it is one of the main causes of labor loss and disability. Many countries have listed rheumatoid arthritis as one of the key national diseases to tackle. The pathogenesis of RA in humans is still unknown, and medical researchers believe that it may result from a combination of genetic and environmental factors. RA is an incurable condition that can only be controlled and treated with conventional drugs. In this paper, the pathologic features and pathogenesis of RA are introduced, and the research progress of new anti-rheumatoid arthritis chemical drugs in recent years is reviewed.
Affiliation(s)
- Shiyang Zhou
  - Chongqing Chemical Industry Vocational College, Chongqing, 401228, China; College of Chemistry, Chongqing Normal University, Chongqing, 401331, China
- Gangliang Huang
  - College of Chemistry, Chongqing Normal University, Chongqing, 401331, China
|
32
|
Grebner C, Matter H, Hessler G. Artificial Intelligence in Compound Design. Methods Mol Biol 2021; 2390:349-382. [PMID: 34731477 DOI: 10.1007/978-1-0716-1787-8_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
Abstract
Artificial intelligence has seen an incredibly fast development in recent years. Many novel technologies for property prediction of drug molecules as well as for the design of novel molecules were introduced by different research groups. These artificial intelligence-based design methods can be applied for suggesting novel chemical motifs in lead generation or scaffold hopping as well as for optimization of desired property profiles during lead optimization. In lead generation, broad sampling of the chemical space for identification of novel motifs is required, while in the lead optimization phase, a detailed exploration of the chemical neighborhood of a current lead series is advantageous. These different requirements for successful design outcomes render different combinations of artificial intelligence technologies useful. Overall, we observe that a combination of different approaches with tailored scoring and evaluation schemes appears beneficial for efficient artificial intelligence-based compound design.
Affiliation(s)
- Christoph Grebner
  - Sanofi-Aventis Deutschland GmbH, R&D, Integrated Drug Discovery, Frankfurt am Main, Germany
- Hans Matter
  - Sanofi-Aventis Deutschland GmbH, R&D, Integrated Drug Discovery, Frankfurt am Main, Germany
- Gerhard Hessler
  - Sanofi-Aventis Deutschland GmbH, R&D, Integrated Drug Discovery, Frankfurt am Main, Germany.
|
33
|
Alonso-Jauregui M, Font M, González-Peñas E, López de Cerain A, Vettorazzi A. Prioritization of Mycotoxins Based on Their Genotoxic Potential with an In Silico-In Vitro Strategy. Toxins (Basel) 2021; 13:734. [PMID: 34679027 PMCID: PMC8540412 DOI: 10.3390/toxins13100734] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/29/2021] [Revised: 10/08/2021] [Accepted: 10/12/2021] [Indexed: 11/16/2022] Open
Abstract
Humans are widely exposed to a great variety of mycotoxins and their mixtures. Therefore, it is important to design strategies that allow prioritizing mycotoxins based on their toxic potential in a time and cost-effective manner. A strategy combining in silico tools (Phase 1), including an expert knowledge-based (DEREK Nexus®, Lhasa Limited, Leeds, UK) and a statistical-based platform (VEGA QSAR©, Mario Negri Institute, Milan, Italy), followed by the in vitro SOS/umu test (Phase 2), was applied to a set of 12 mycotoxins clustered according to their structure into three groups. Phase 1 allowed us to clearly classify group 1 (aflatoxin and sterigmatocystin) as mutagenic and group 3 (ochratoxin A, zearalenone and fumonisin B1) as non-mutagenic. For group 2 (trichothecenes), contradictory conclusions were obtained between the two in silico tools, being out of the applicability domain of many models. Phase 2 confirmed the results obtained in the previous phase for groups 1 and 3. It also provided extra information regarding the role of metabolic activation in aflatoxin B1 and sterigmatocystin mutagenicity. Regarding group 2, equivocal results were obtained in few experiments; however, the group was finally classified as non-mutagenic. The strategy used correlated with the published Ames tests, which detect point mutations. Few alerts for chromosome aberrations could be detected. The SOS/umu test appeared as a good screening test for mutagenicity that can be used in the absence and presence of metabolic activation and independently of Phase 1, although the in silico-in vitro combination gave more information for decision making.
Affiliation(s)
- Maria Alonso-Jauregui
  - Department of Pharmacology and Toxicology, Research Group MITOX, School of Pharmacy and Nutrition, Universidad de Navarra, 31008 Pamplona, Spain; (M.A.-J.); (A.L.d.C.)
- María Font
  - Department of Pharmaceutical Technology and Chemistry, Research Group MITOX, School of Pharmacy and Nutrition, Universidad de Navarra, 31008 Pamplona, Spain; (M.F.); (E.G.-P.)
  - IdiSNA, Navarra Institute for Health Research, 31008 Pamplona, Spain
- Elena González-Peñas
  - Department of Pharmaceutical Technology and Chemistry, Research Group MITOX, School of Pharmacy and Nutrition, Universidad de Navarra, 31008 Pamplona, Spain; (M.F.); (E.G.-P.)
- Adela López de Cerain
  - Department of Pharmacology and Toxicology, Research Group MITOX, School of Pharmacy and Nutrition, Universidad de Navarra, 31008 Pamplona, Spain; (M.A.-J.); (A.L.d.C.)
  - IdiSNA, Navarra Institute for Health Research, 31008 Pamplona, Spain
- Ariane Vettorazzi
  - Department of Pharmacology and Toxicology, Research Group MITOX, School of Pharmacy and Nutrition, Universidad de Navarra, 31008 Pamplona, Spain; (M.A.-J.); (A.L.d.C.)
  - IdiSNA, Navarra Institute for Health Research, 31008 Pamplona, Spain
|
34
|
Luo D, Tong JB, Feng Y. 3D-QSAR and Molecular Docking Analysis for Natural Aurone Derivatives as Anti-Malarial Agents. Polycycl Aromat Compd 2021. [DOI: 10.1080/10406638.2021.1973519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Ding Luo
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an, China
- Jian-Bo Tong
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an, China
- Yi Feng
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an, China
|
35
|
Mervin LH, Trapotsi MA, Afzal AM, Barrett IP, Bender A, Engkvist O. Probabilistic Random Forest improves bioactivity predictions close to the classification threshold by taking into account experimental uncertainty. J Cheminform 2021; 13:62. [PMID: 34412708 PMCID: PMC8375213 DOI: 10.1186/s13321-021-00539-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2021] [Accepted: 07/30/2021] [Indexed: 11/24/2022] Open
Abstract
Measurements of protein–ligand interactions have reproducibility limits due to experimental errors. Any model based on such assays will consequently have such unavoidable errors influencing its performance, which should ideally be factored into modelling and output predictions, such as the actual standard deviation of experimental measurements (σ) or the associated comparability of activity values between the aggregated heterogeneous activity units (i.e., Ki versus IC50 values) during dataset assimilation. However, experimental errors are usually a neglected aspect of model generation. In order to improve upon the current state-of-the-art, we herein present a novel approach toward predicting protein–ligand interactions using a Probabilistic Random Forest (PRF) classifier. The PRF algorithm was applied toward in silico protein target prediction across ~550 tasks from ChEMBL and PubChem. Predictions were evaluated by taking into account various scenarios of experimental standard deviations in both training and test sets, and performance was assessed using fivefold stratified shuffled splits for validation. The largest benefit in incorporating the experimental deviation in PRF was observed for data points close to the binary threshold boundary, when such information was not considered in any way in the original RF algorithm. For example, in cases when σ ranged between 0.4 and 0.6 log units and when ideal probability estimates were between 0.4 and 0.6, the PRF outperformed RF with a median absolute error margin of ~17%. In comparison, the baseline RF outperformed PRF for cases with high confidence to belong to the active class (far from the binary decision threshold), although the RF models gave errors smaller than the experimental uncertainty, which could indicate that they were overtrained and/or over-confident.
Finally, the PRF models trained with putative inactives decreased the performance compared to PRF models without putative inactives and this could be because putative inactives were not assigned an experimental pXC50 value, and therefore they were considered inactives with a low uncertainty (which in practice might not be true). In conclusion, PRF can be useful for target prediction models in particular for data where class boundaries overlap with the measurement uncertainty, and where a substantial part of the training data is located close to the classification threshold.
Affiliation(s)
- Lewis H Mervin
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK.
- Maria-Anna Trapotsi
  - Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
- Avid M Afzal
  - Data Sciences & Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Ian P Barrett
  - Data Sciences & Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Andreas Bender
  - Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
- Ola Engkvist
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden; Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden
|
36
|
Selvaraj C, Selvaraj G, Mohamed Ismail R, Vijayakumar R, Baazeem A, Wei DQ, Singh SK. Interrogation of Bacillus anthracis SrtA active site loop forming open/close lid conformations through extensive MD simulations for understanding binding selectivity of SrtA inhibitors. Saudi J Biol Sci 2021; 28:3650-3659. [PMID: 34220215 PMCID: PMC8241892 DOI: 10.1016/j.sjbs.2021.05.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/15/2021] [Revised: 04/25/2021] [Accepted: 05/02/2021] [Indexed: 02/07/2023] Open
Abstract
Bacillus anthracis is a Gram-positive, deadly, spore-forming bacterium that causes anthrax; it has a complex mechanism in the cell wall envelope that can adapt to changes in environmental conditions. The membrane-bound cell wall proteins are therefore said to be promising drug targets for the inhibition of Bacillus anthracis. Among the cell wall proteins, SrtA is an important mechanistic protein that mediates ligation with the LPXTG motif by forming amide bonds. SrtA plays a vital role in cell signalling, cell wall formation, and biofilm formation. Inhibition of SrtA disrupts the cell wall and biofilm formation, which in turn inhibits Bacillus anthracis; SrtA is thus a core enzyme for studying the inhibition mechanism. In this study, we examined 28 compounds with inhibitory activity against Bacillus anthracis SrtA to develop a 3D-QSAR model, and also examined the binding selectivity of the compounds for both open and closed SrtA conformations obtained from 100 ns of MD simulations. The deviation of the binding-site loop in forming the open and closed gate mechanism was investigated to understand the inhibitory profile of the reported compounds, and the results show that closed-state active-site conformations are required for ligand binding specificity. Overall, the present study may offer an opportunity for a better understanding of the mechanism of action and can aid the further design of novel and highly potent SrtA inhibitors.
Affiliation(s)
- Chandrabose Selvaraj
  - Department of Bioinformatics, Computer Aided Drug Design and Molecular Modelling Lab, Science Block, Alagappa University, Karaikudi, Tamil Nadu, India
  - Corresponding authors.
- Gurudeeban Selvaraj
  - Centre for Research in Molecular Modelling, Concordia University, 5618 Montreal, Quebec, Canada
- Randa Mohamed Ismail
  - Department of Medical Laboratory Sciences, College of Applied Medical Sciences, Majmaah University, Al Majmaah 11952, Saudi Arabia
  - Department of Microbiology and Immunology, Veterinary Research Division, National Research Center (NRC), Giza, Egypt
- Rajendran Vijayakumar
  - Department of Biology, College of Science in Zulfi, Majmaah University, Majmaah 11952, Saudi Arabia
- Alaa Baazeem
  - Department of Biology, College of Science, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Dong-Qing Wei
  - Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Sanjeev Kumar Singh
  - Department of Bioinformatics, Computer Aided Drug Design and Molecular Modelling Lab, Science Block, Alagappa University, Karaikudi, Tamil Nadu, India
  - Corresponding authors.
|
37
|
Chen CC, Guo YC. Prediction of minimum ignition energy using quantitative structure activity relationships approach. J Loss Prev Process Ind 2021. [DOI: 10.1016/j.jlp.2021.104443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
38
|
Meftahi N, Walker ML, Smith BJ. Predicting aqueous solubility by QSPR modeling. J Mol Graph Model 2021; 106:107901. [PMID: 33857890 DOI: 10.1016/j.jmgm.2021.107901] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/08/2020] [Revised: 03/09/2021] [Accepted: 03/09/2021] [Indexed: 12/26/2022]
Abstract
The aqueous solubility is predicted here using quantitative structure property relationship (QSPR) models. In this study, we examine whether descriptors that individually yield favorable models for the prediction of the Gibbs energy of solvation and sublimation can be used in combination with octanol-water partition coefficient to produce QSPR models for the prediction of aqueous solubility. Based on this strategy, applied to seven distinct datasets, all models exhibited an R2 greater than 0.7 and Q2 greater than 0.6 for the estimation of aqueous solubility. We also determined how uncoupling the descriptors used to create QSPR models in the prediction of Gibbs energy of sublimation yielded an improved model. Model refinement using an artificial neural network applying the same descriptors generated significantly better models with improved R2 and standard deviation.
Affiliation(s)
- Nastaran Meftahi
  - La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, 3086, Australia
- Michael L Walker
  - La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, 3086, Australia
- Brian J Smith
  - La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, 3086, Australia.
|
39
|
Medeiros AR, Ferreira LLG, de Souza ML, de Oliveira Rezende Junior C, Espinoza-Chávez RM, Dias LC, Andricopulo AD. Chemoinformatics Studies on a Series of Imidazoles as Cruzain Inhibitors. Biomolecules 2021; 11:579. [PMID: 33920961 PMCID: PMC8071344 DOI: 10.3390/biom11040579] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/15/2021] [Revised: 04/05/2021] [Accepted: 04/13/2021] [Indexed: 11/16/2022] Open
Abstract
Natural products based on imidazole scaffolds have inspired the discovery of a wide variety of bioactive compounds. Herein, a series of imidazoles that act as competitive and potent cruzain inhibitors was investigated using a combination of ligand- and structure-based drug design strategies. Quantitative structure-activity relationships (QSARs) were generated along with the investigation of enzyme-inhibitor molecular interactions. Predictive hologram QSAR (HQSAR, r2pred = 0.80) and AutoQSAR (q2 = 0.90) models were built, and key structural properties that underpin cruzain inhibition were identified. Moreover, comparative molecular field analysis (CoMFA, r2pred = 0.81) and comparative molecular similarity indices analysis (CoMSIA, r2pred = 0.73) revealed 3D molecular features that strongly affect the activity of the inhibitors. These findings were examined along with molecular docking studies and were highly compatible with the intermolecular contacts that take place between cruzain and the inhibitors. The results gathered herein revealed the main factors that determine the activity of the imidazoles studied and provide novel knowledge for the design of improved cruzain inhibitors.
Affiliation(s)
- Alex R. Medeiros
- Laboratório de Química Medicinal e Computacional, Centro de Pesquisa e Inovação em Biodiversidade e Fármacos, Instituto de Física de São Carlos, Universidade de São Paulo, Av. João Dagnone 1100, São Carlos, SP 13563-120, Brazil; (A.R.M.); (L.L.G.F.); (M.L.d.S.)
| | - Leonardo L. G. Ferreira
- Laboratório de Química Medicinal e Computacional, Centro de Pesquisa e Inovação em Biodiversidade e Fármacos, Instituto de Física de São Carlos, Universidade de São Paulo, Av. João Dagnone 1100, São Carlos, SP 13563-120, Brazil; (A.R.M.); (L.L.G.F.); (M.L.d.S.)
| | - Mariana L. de Souza
- Laboratório de Química Medicinal e Computacional, Centro de Pesquisa e Inovação em Biodiversidade e Fármacos, Instituto de Física de São Carlos, Universidade de São Paulo, Av. João Dagnone 1100, São Carlos, SP 13563-120, Brazil; (A.R.M.); (L.L.G.F.); (M.L.d.S.)
| | | | - Rocío Marisol Espinoza-Chávez
- Instituto de Química, Universidade Estadual de Campinas, Campinas, SP 13084-971, Brazil; (C.d.O.R.J.); (R.M.E.-C.); (L.C.D.)
| | - Luiz Carlos Dias
- Instituto de Química, Universidade Estadual de Campinas, Campinas, SP 13084-971, Brazil; (C.d.O.R.J.); (R.M.E.-C.); (L.C.D.)
| | - Adriano D. Andricopulo
- Laboratório de Química Medicinal e Computacional, Centro de Pesquisa e Inovação em Biodiversidade e Fármacos, Instituto de Física de São Carlos, Universidade de São Paulo, Av. João Dagnone 1100, São Carlos, SP 13563-120, Brazil; (A.R.M.); (L.L.G.F.); (M.L.d.S.)
- Correspondence: ; Tel.: +55-16-33739844
|
40
|
Abdizadeh R, Heidarian E, Hadizadeh F, Abdizadeh T. QSAR Modeling, Molecular Docking and Molecular Dynamics Simulations Studies of Lysine-Specific Demethylase 1 (LSD1) Inhibitors as Anticancer Agents. Anticancer Agents Med Chem 2021; 21:987-1018. [PMID: 32698753 DOI: 10.2174/1871520620666200721134010] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/07/2020] [Revised: 05/07/2020] [Accepted: 05/17/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND Histone lysine-specific demethylase 1 (LSD1) is a promising target for cancer treatment and plays a crucial role in the epigenetic modulation of gene expression. Inhibition of LSD1 with small molecules has emerged as a vital mechanism to treat cancer. OBJECTIVE In the present research, molecular modeling investigations, such as CoMFA, CoMFA-RF, CoMSIA and HQSAR, molecular docking and molecular dynamics (MD) simulations, were carried out on some tranylcypromine derivatives as LSD1 inhibitors. METHODS The QSAR models were built on a series of tranylcypromine derivatives as the data set using the SYBYL-X2.1.1 program. Molecular docking and MD simulations were carried out with the MOE software and the SYBYL program, respectively. The internal and external predictive performance of the generated models for these LSD1 inhibitors was assessed by evaluating the cross-validated correlation coefficient (q2), non-cross-validated correlation coefficient (r2ncv) and predicted correlation coefficient (r2pred) of the training and test set molecules, respectively. RESULTS The CoMFA (q2, 0.670; r2ncv, 0.930; r2pred, 0.968), CoMFA-RF (q2, 0.694; r2ncv, 0.926; r2pred, 0.927), CoMSIA (q2, 0.834; r2ncv, 0.956; r2pred, 0.958) and HQSAR models (q2, 0.854; r2ncv, 0.900; r2pred, 0.728) gave statistically significant results for both the training and test sets of LSD1 inhibitors. CONCLUSION These QSAR models were found to be robust, with good predictive ability. Contour maps of all models were generated, and molecular docking studies and molecular dynamics simulations confirmed that the hydrophobic, electrostatic and hydrogen bonding fields are crucial in these models for improving the binding affinity and determining the structure-activity relationship. These theoretical results may be beneficial for designing new potent LSD1 inhibitors with enhanced activity to treat cancer.
Affiliation(s)
- Rahman Abdizadeh
- Department of Medical Parasitology and Mycology, Faculty of Medicine, Shahrekord University of Medical Sciences, Shahrekord, Iran
| | - Esfandiar Heidarian
- Clinical Biochemistry Research Center, Basic Health Sciences Institute, Shahrekord University of Medical Sciences, Shahrekord, Iran
| | - Farzin Hadizadeh
- Biotechnology Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Tooba Abdizadeh
- Clinical Biochemistry Research Center, Basic Health Sciences Institute, Shahrekord University of Medical Sciences, Shahrekord, Iran
|
41
|
Structural investigation of isatin-based benzenesulfonamides as carbonic anhydrase isoform IX inhibitors endowed with anticancer activity using molecular modeling approaches. J Mol Struct 2021. [DOI: 10.1016/j.molstruc.2020.129735] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
|
42
|
Bosc N, Felix E, Arcila R, Mendez D, Saunders MR, Green DVS, Ochoada J, Shelat AA, Martin EJ, Iyer P, Engkvist O, Verras A, Duffy J, Burrows J, Gardner JMF, Leach AR. MAIP: a web service for predicting blood-stage malaria inhibitors. J Cheminform 2021; 13:13. [PMID: 33618772 PMCID: PMC7898753 DOI: 10.1186/s13321-021-00487-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/13/2020] [Accepted: 01/20/2021] [Indexed: 12/17/2022] Open
Abstract
Malaria is a disease affecting hundreds of millions of people across the world, mainly in developing countries and especially in sub-Saharan Africa. It is the cause of hundreds of thousands of deaths each year and there is an ever-present need to identify and develop effective new therapies to tackle the disease and overcome increasing drug resistance. Here, we extend a previous study in which a number of partners collaborated to develop a consensus in silico model that can be used to identify novel molecules that may have antimalarial properties. The performance of machine learning methods generally improves with the number of data points available for training. One practical challenge in building large training sets is that the data are often proprietary and cannot be straightforwardly integrated. Here, this was addressed by sharing QSAR models, each built on a private data set. We describe the development of an open-source software platform for creating such models, a comprehensive evaluation of methods to create a single consensus model and a web platform called MAIP available at https://www.ebi.ac.uk/chembl/maip/ . MAIP is freely available for the wider community to make large-scale predictions of potential malaria inhibiting compounds. This project also highlights some of the practical challenges in reproducing published computational methods and the opportunities that open-source software can offer to the community.
Affiliation(s)
- Nicolas Bosc
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, CB10 1SD, Hinxton, Cambridge, United Kingdom.
| | - Eloy Felix
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, CB10 1SD, Hinxton, Cambridge, United Kingdom
| | - Ricardo Arcila
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, CB10 1SD, Hinxton, Cambridge, United Kingdom
| | - David Mendez
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, CB10 1SD, Hinxton, Cambridge, United Kingdom
| | - Martin R Saunders
- Department of Molecular Design, Data and Computational Sciences, GlaxoSmithKline, Gunnels Wood Road, Hertfordshire, SG1 2NY, Stevenage, UK
| | - Darren V S Green
- Department of Molecular Design, Data and Computational Sciences, GlaxoSmithKline, Gunnels Wood Road, Hertfordshire, SG1 2NY, Stevenage, UK
| | - Jason Ochoada
- Department of Chemical Biology and Therapeutics, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Tennessee, 38105, Memphis, USA
| | - Anang A Shelat
- Department of Chemical Biology and Therapeutics, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Tennessee, 38105, Memphis, USA
| | - Eric J Martin
- Novartis Institute for Biomedical Research, 5300 Chiron Way, California, 94608-2916, Emeryville, USA
| | - Preeti Iyer
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
| | - Ola Engkvist
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
| | - Andreas Verras
- Schrodinger Inc, 120 West 45th Street, 10036-4041, New York, NY, USA
| | - James Duffy
- Medicines for Malaria Ventures Discovery, 1215, Geneva, Switzerland
| | - Jeremy Burrows
- Medicines for Malaria Ventures Discovery, 1215, Geneva, Switzerland
| | - J Mark F Gardner
- AMG Consultants Ltd, Discovery Park House, Discovery Park, Ramsgate Road, CT13 9ND, Sandwich, Kent, UK
| | - Andrew R Leach
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, CB10 1SD, Hinxton, Cambridge, United Kingdom.
|
43
|
Ojha PK, Kumar V, Roy J, Roy K. Recent advances in quantitative structure-activity relationship models of antimalarial drugs. Expert Opin Drug Discov 2021; 16:659-695. [PMID: 33356651 DOI: 10.1080/17460441.2021.1866535] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/24/2022]
Abstract
INTRODUCTION Due to emerging resistance to the first-line artemisinin-based antimalarials, the lack of efficient vaccines and limited chemotherapeutic alternatives, there is an urgent need to develop new antimalarial compounds. In this regard, quantitative structure-activity relationship (QSAR) modeling can provide essential information about the required physicochemical properties and structural parameters of antimalarial drug candidates. AREAS COVERED The authors provide an overview of recent advances in QSAR models covering different classes of antimalarial compounds, as well as molecular docking studies of compounds acting on different antimalarial targets, reported in the last 5 years (2015-2019), to explore the mode of interaction between the molecules and the receptors. We have tried to cover most of the QSAR models of antimalarials (along with results from some other related computational methods) reported during 2015-2019. EXPERT OPINION Many QSAR reports for antimalarial compounds are based on a small number of data points. This review infers that most of the present work uses an analog-based QSAR approach with a limited applicability domain (with very few cases of a wide domain), whereas novel target-based computational approaches are reported in very few cases, which leaves large gaps in computational work on novel antimalarial targets.
Affiliation(s)
- Probir Kumar Ojha
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
| | - Vinay Kumar
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
| | - Joyita Roy
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
| | - Kunal Roy
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
|
44
|
Chen X, Xie W, Yang Y, Hua Y, Xing G, Liang L, Deng C, Wang Y, Fan Y, Liu H, Lu T, Chen Y, Zhang Y. Discovery of Dual FGFR4 and EGFR Inhibitors by Machine Learning and Biological Evaluation. J Chem Inf Model 2020; 60:4640-4652. [DOI: 10.1021/acs.jcim.0c00652] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 02/02/2023]
Affiliation(s)
- Xingye Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Wuchen Xie
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Yan Yang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Yi Hua
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - GuoMeng Xing
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Li Liang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Chenglong Deng
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Yuchen Wang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Yuanrong Fan
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Haichun Liu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Tao Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- State Key Laboratory of Natural Medicines, China Pharmaceutical University, 24 Tongjiaxiang, Nanjing 210009, China
| | - Yadong Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Yanmin Zhang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
|
45
|
A quantitative structure activity relationship model for predicting minimum ignition energy of organic substance. J Loss Prev Process Ind 2020. [DOI: 10.1016/j.jlp.2020.104227] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/22/2022]
|
46
|
Zivkovic M, Zlatanovic M, Zlatanovic N, Golubović M, Veselinović AM. The Application of the Combination of Monte Carlo Optimization Method based QSAR Modeling and Molecular Docking in Drug Design and Development. Mini Rev Med Chem 2020; 20:1389-1402. [DOI: 10.2174/1389557520666200212111428] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 09/10/2019] [Revised: 10/21/2019] [Accepted: 10/28/2019] [Indexed: 01/18/2023]
Abstract
In recent years, the Monte Carlo optimization approach has emerged as one of the promising conformation-independent methods in QSAR modeling. Monte Carlo optimization has proven to be a valuable tool in chemoinformatics, and this review presents its application in drug discovery and design. The basic principles and important features of these methods are discussed, as well as the advantages of conformation-independent optimal descriptors developed from the molecular graph and the Simplified Molecular Input Line Entry System (SMILES) notation compared to commonly used descriptors in QSAR modeling. The review summarizes results obtained from Monte Carlo optimization-based QSAR modeling combined with molecular docking studies for various pharmacologically important endpoints. SMILES notation-based optimal descriptors, defined as molecular fragments and identified as the main contributors to the increase or decrease of biological activity, are presented; these are used further to design compounds with targeted activity based on computer calculation. Research papers in which molecular docking was applied as an additional method to design molecules and further validate their activity are also summarized. These papers report a very good correlation between the results obtained from Monte Carlo optimization modeling and molecular docking studies.
Affiliation(s)
| | | | | | - Mladjan Golubović
- Clinic for Anesthesiology and Intensive Care, Clinical Center Nis, Nis, Serbia
|
47
|
Jablonka K, Ongari D, Moosavi SM, Smit B. Big-Data Science in Porous Materials: Materials Genomics and Machine Learning. Chem Rev 2020; 120:8066-8129. [PMID: 32520531 PMCID: PMC7453404 DOI: 10.1021/acs.chemrev.0c00004] [Citation(s) in RCA: 153] [Impact Index Per Article: 38.3] [Received: 01/03/2020] [Indexed: 12/16/2022]
Abstract
By combining metal nodes with organic linkers we can potentially synthesize millions of possible metal-organic frameworks (MOFs). The fact that we have so many materials opens many exciting avenues but also creates new challenges. We simply have too many materials to be processed using conventional brute-force methods. In this review, we show that having so many materials allows us to use big-data methods as a powerful technique to study these materials and to discover complex correlations. The first part of the review gives an introduction to the principles of big-data science. We show how to select appropriate training sets, survey approaches that are used to represent these materials in feature space, and review different learning architectures, as well as evaluation and interpretation strategies. In the second part, we review how the different approaches of machine learning have been applied to porous materials. In particular, we discuss applications in the field of gas storage and separation, the stability of these materials, their electronic properties, and their synthesis. Given the increasing interest of the scientific community in machine learning, we expect this list to rapidly expand in the coming years.
Affiliation(s)
- Kevin Maik Jablonka
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques (ISIC), École Polytechnique Fédérale de Lausanne (EPFL), Sion, Switzerland
| | - Daniele Ongari
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques (ISIC), École Polytechnique Fédérale de Lausanne (EPFL), Sion, Switzerland
| | - Seyed Mohamad Moosavi
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques (ISIC), École Polytechnique Fédérale de Lausanne (EPFL), Sion, Switzerland
| | - Berend Smit
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques (ISIC), École Polytechnique Fédérale de Lausanne (EPFL), Sion, Switzerland
|
48
|
Abdizadeh R, Heidarian E, Hadizadeh F, Abdizadeh T. Investigation of pyrimidine analogues as xanthine oxidase inhibitors to treat of hyperuricemia and gout through combined QSAR techniques, molecular docking and molecular dynamics simulations. J Taiwan Inst Chem Eng 2020. [DOI: 10.1016/j.jtice.2020.08.028] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 10/23/2022]
|
49
|
QSAR investigations and structure-based virtual screening on a series of nitrobenzoxadiazole derivatives targeting human glutathione-S-transferases. J Mol Struct 2020. [DOI: 10.1016/j.molstruc.2020.128015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/23/2022]
|
50
|
Bonini P, Kind T, Tsugawa H, Barupal DK, Fiehn O. Retip: Retention Time Prediction for Compound Annotation in Untargeted Metabolomics. Anal Chem 2020; 92:7515-7522. [PMID: 32390414 PMCID: PMC8715951 DOI: 10.1021/acs.analchem.9b05765] [Citation(s) in RCA: 113] [Impact Index Per Article: 28.3] [Indexed: 12/17/2022]
Abstract
Unidentified peaks remain a major problem in untargeted metabolomics by LC-MS/MS. Confidence in peak annotations increases by combining MS/MS matching and retention time. Here we show how retention times can be predicted from molecular structures. Two large, publicly available data sets were used for model training in machine learning: the Fiehn hydrophilic interaction liquid chromatography data set (HILIC) of 981 primary metabolites and biogenic amines, and the RIKEN plant specialized metabolome annotation (PlaSMA) database of 852 secondary metabolites that uses reversed-phase liquid chromatography (RPLC). Five different machine learning algorithms have been integrated into the Retip R package: the random forest, Bayesian-regularized neural network, XGBoost, light gradient-boosting machine (LightGBM), and Keras algorithms for building the retention time prediction models. A complete workflow for retention time prediction was developed in R. It can be freely downloaded from the GitHub repository (https://www.retip.app). Keras outperformed other machine learning algorithms in the test set with minimum overfitting, verified by small error differences between training, test, and validation sets. Keras yielded a mean absolute error of 0.78 min for HILIC and 0.57 min for RPLC. Retip is integrated into the mass spectrometry software tools MS-DIAL and MS-FINDER, allowing a complete compound annotation workflow. In a test application on mouse blood plasma samples, we found a 68% reduction in the number of candidate structures when searching all isomers in MS-FINDER compound identification software. Retention time prediction increases the identification rate in liquid chromatography and subsequently leads to an improved biological interpretation of metabolomics data.
Affiliation(s)
- Paolo Bonini
- NGAlab, La Riera de Gaia, Tarragona 43762, Spain
| | - Tobias Kind
- West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California 95616, United States
| | - Hiroshi Tsugawa
- RIKEN Center for Sustainable Resource Science, Yokohama 230-0045, Japan
- RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan
| | - Dinesh Kumar Barupal
- West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California 95616, United States
| | - Oliver Fiehn
- West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California 95616, United States
|