1
Fluetsch A, Trunzer M, Gerebtzoff G, Rodríguez-Pérez R. Deep Learning Models Compared to Experimental Variability for the Prediction of CYP3A4 Time-Dependent Inhibition. Chem Res Toxicol 2024; 37:549-560. [PMID: 38501689 DOI: 10.1021/acs.chemrestox.3c00305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2024]
Abstract
Most drugs are mainly metabolized by cytochrome P450 (CYP450), which can lead to drug-drug interactions (DDI). Specifically, time-dependent inhibition (TDI) of CYP3A4 isoenzyme has been associated with clinically relevant DDI. To overcome potential DDI issues, high-throughput in vitro assays were established to assess the TDI of CYP3A4 during the discovery and lead optimization phases. However, in silico machine learning models would enable an earlier and larger-scale assessment of TDI potential liabilities. For CYP inhibition, most modeling efforts have focused on highly imbalanced and small data sets. Moreover, assay variability is rarely considered, which is key to understanding the model's quality and suitability for decision-making. In this work, machine learning models were built for the prediction of TDI of CYP3A4, evaluated prospectively, and compared to the variability of the experimental assay. Different modeling strategies were investigated to assess their influence on the model's performance. Through multitask learning, additional data sets were leveraged for model building, coming from public databases, in-house CYP-related assays, or other pharmaceutical companies (federated learning). Apart from the numerical prediction of inactivation rates of CYP3A4 TDI, three-class predictions were carried out, giving a negative (inactivation rate kobs < 0.01 min⁻¹), weak positive (0.01 ≤ kobs ≤ 0.025 min⁻¹), or positive (kobs > 0.025 min⁻¹) output. The final multitask graph neural network model achieved misclassification rates of 8 and 7% for positive and negative TDI, respectively. Importantly, the presented deep learning-based predictions had a similar precision to the reproducibility of in vitro experiments and thus offered great opportunities for drug design, early derisk of DDI potential, and selection of experiments. To facilitate CYP inhibition modeling efforts in the public domain, the developed model was used to annotate ∼16 000 publicly available structures, and a surrogate data set is shared as Supporting Information.
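The three-class scheme quoted in the abstract can be written down as a small threshold function. The following is only a sketch of those cutoffs; the function name `classify_tdi` is hypothetical and is not taken from the paper or its Supporting Information.

```python
def classify_tdi(k_obs: float) -> str:
    """Map an inactivation rate k_obs (in min^-1) to a TDI class.

    Thresholds follow the abstract: negative below 0.01,
    weak positive from 0.01 to 0.025 inclusive, positive above 0.025.
    The function name is illustrative, not the authors' code.
    """
    if k_obs < 0.01:
        return "negative"
    if k_obs <= 0.025:
        return "weak positive"
    return "positive"
```

Both boundary values (0.01 and 0.025 min⁻¹) fall into the weak positive class, as stated in the abstract.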
Affiliation(s)
- Andrin Fluetsch
  - Novartis Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
- Markus Trunzer
  - Novartis Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
- Grégori Gerebtzoff
  - Novartis Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
2
Xu T, Kabir M, Sakamuru S, Shah P, Padilha E, Ngan DK, Xia M, Xu X, Simeonov A, Huang R. Predictive Models for Human Cytochrome P450 3A7 Selective Inhibitors and Substrates. J Chem Inf Model 2023; 63:846-855. [PMID: 36719788 PMCID: PMC10664139 DOI: 10.1021/acs.jcim.2c01516] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 02/01/2023]
Abstract
Inappropriate use of prescription drugs is potentially more harmful in fetuses/neonates than in adults. Cytochrome P450 (CYP) 3A subfamily undergoes developmental changes in expression, such as a transition from CYP3A7 to CYP3A4 shortly after birth, which provides a potential way to distinguish medication effects on fetuses/neonates and adults. The purpose of this study was to build first-in-class predictive models for both inhibitors and substrates of CYP3A7/CYP3A4 using chemical structure analysis. Three metrics were used to evaluate model performance: area under the receiver operating characteristic curve (AUC-ROC), balanced accuracy (BA), and Matthews correlation coefficient (MCC). The performance varied for each CYP3A7/CYP3A4 inhibitor/substrate model depending on the data set type, model type, rebalancing method, and specific feature set. For the active inhibitor/substrate data set, the optimal models achieved AUC-ROC values ranging from 0.77 ± 0.01 to 0.84 ± 0.01. For the selective inhibitor/substrate data set, the optimal models achieved AUC-ROC values ranging from 0.72 ± 0.02 to 0.79 ± 0.04. The predictive power of the optimal models was validated by compounds with known potencies as CYP3A7/CYP3A4 inhibitors or substrates. In addition, we identified structural features significant for CYP3A7/CYP3A4 selective or common inhibitors and substrates. In summary, the top performing models can be further applied as a tool to rapidly evaluate the safety and efficacy of new drugs separately for fetuses/neonates and adults. The significant structural features could guide the design of new therapeutic drugs as well as aid in the optimization of existing medicine for fetuses/neonates.
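Two of the three metrics named in the abstract, balanced accuracy (BA) and the Matthews correlation coefficient (MCC), follow directly from confusion-matrix counts. The sketch below uses the standard textbook definitions; the function names and the example counts are illustrative assumptions, not values from the study.

```python
import math


def balanced_accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Mean of sensitivity and specificity.

    Assumes at least one actual positive and one actual negative.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """MCC from confusion counts; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only to exercise the functions.
ba = balanced_accuracy(tp=40, fp=10, tn=45, fn=5)
mcc = matthews_corrcoef(tp=40, fp=10, tn=45, fn=5)
```

Unlike plain accuracy, both metrics stay informative when inhibitors and non-inhibitors are unevenly represented, which is why they are commonly reported alongside AUC-ROC for imbalanced data sets.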
Affiliation(s)
- Tuan Xu
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Md Kabir
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
  - The Graduate School of Biomedical Sciences, Departments of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Srilatha Sakamuru
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Pranav Shah
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Elias Padilha
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Deborah K. Ngan
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Menghang Xia
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Xin Xu
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Anton Simeonov
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Ruili Huang
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
3
Kumar M, Nguyen TPN, Kaur J, Singh TG, Soni D, Singh R, Kumar P. Opportunities and challenges in application of artificial intelligence in pharmacology. Pharmacol Rep 2023; 75:3-18. [PMID: 36624355 PMCID: PMC9838466 DOI: 10.1007/s43440-022-00445-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/20/2022] [Revised: 12/23/2022] [Accepted: 12/25/2022] [Indexed: 01/11/2023]
Abstract
Artificial intelligence (AI) is a machine science that can mimic human behaviour, such as the intelligent analysis of data. AI functions with specialized algorithms and integrates with deep and machine learning. Living in the digital world generates a huge amount of medical data every day. Therefore, we need an automated and reliable evaluation tool that can make decisions more accurately and faster. Machine learning has the potential to learn, understand and analyse the data used in healthcare systems. In the last few years, AI has been employed in various fields of pharmaceutical science, especially in pharmacological research. It helps in the analysis of preclinical (laboratory animals) and clinical (in human) trial data. AI also plays an important role in various processes such as drug discovery/manufacturing, diagnosis of big data for disease identification, personalized treatment, clinical trial research, radiotherapy, surgical robotics, smart electronic health records, and epidemic outbreak prediction. Moreover, AI has been used in the evaluation of biomarkers and diseases. In this review, we explain various models and general processes of machine learning and their role in pharmacological science. Therefore, AI with deep learning and machine learning could be relevant in pharmacological research.
Affiliation(s)
- Mandeep Kumar
  - Department of Pharmacy, Unit of Pharmacology and Toxicology, University of Genoa, Genoa, Italy
- T P Nhung Nguyen
  - Department of Pharmacy, Unit of Pharmacology and Toxicology, University of Genoa, Genoa, Italy
  - Department of Pharmacy, Da Nang University of Medical Technology and Pharmacy, Da Nang, Vietnam
- Jasleen Kaur
  - Department of Pharmacology and Toxicology, National Institute of Pharmaceutical Education and Research (NIPER), Lucknow, Uttar Pradesh, 226002, India
- Divya Soni
  - Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India
- Randhir Singh
  - Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India
- Puneet Kumar
  - Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India
4
Qiu M, Liang X, Deng S, Li Y, Ke Y, Wang P, Mei H. A unified GCNN model for predicting CYP450 inhibitors by using graph convolutional neural networks with attention mechanism. Comput Biol Med 2022; 150:106177. [PMID: 36242811 DOI: 10.1016/j.compbiomed.2022.106177] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/20/2022] [Revised: 09/19/2022] [Accepted: 10/01/2022] [Indexed: 11/17/2022]
Abstract
Undesirable drug-drug interactions (DDIs) may lead to serious adverse side effects when more than two drugs are administered to a patient simultaneously. One of the most common DDIs is caused by unexpected inhibition of a specific human cytochrome P450 (CYP450), which plays a dominant role in the metabolism of the co-administered drugs. Therefore, a unified and reliable method for predicting the potential inhibitors of the CYP450 family is extremely important in drug development. In this work, a graph convolutional neural network (GCN) with attention mechanism and a 1-D convolutional neural network (CNN) were used to extract the features of CYP ligands and the binding sites of CYP450, respectively, which were then combined to establish a unified GCN-CNN (GCNN) model for predicting the inhibitors of 5 dominant CYP isoforms, i.e., 1A2, 2C9, 2C19, 2D6, and 3A4. Overall, the established GCNN model showed good performance on the test samples and outperformed the recently proposed iCYP-MFE model on the same datasets. Based on the heat-map analysis of the resulting molecular graphs, the key structural determinants of the CYP inhibitors were further explored.
Affiliation(s)
- Minyao Qiu
  - Key Laboratory of Biorheological Science and Technology (Ministry of Education), College of Bioengineering, Chongqing University, Chongqing, 400044, China; College of Bioengineering, Chongqing University, Chongqing, 400044, China
- Xiaoqi Liang
  - College of Bioengineering, Chongqing University, Chongqing, 400044, China
- Siyao Deng
  - College of Bioengineering, Chongqing University, Chongqing, 400044, China
- Yufang Li
  - College of Bioengineering, Chongqing University, Chongqing, 400044, China
- Yanlan Ke
  - College of Bioengineering, Chongqing University, Chongqing, 400044, China
- Pingqing Wang
  - College of Bioengineering, Chongqing University, Chongqing, 400044, China
- Hu Mei
  - Key Laboratory of Biorheological Science and Technology (Ministry of Education), College of Bioengineering, Chongqing University, Chongqing, 400044, China; College of Bioengineering, Chongqing University, Chongqing, 400044, China
5
Guttman Y, Kerem Z. Computer-Aided (In Silico) Modeling of Cytochrome P450-Mediated Food–Drug Interactions (FDI). Int J Mol Sci 2022; 23:8498. [PMID: 35955630 PMCID: PMC9369352 DOI: 10.3390/ijms23158498] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/10/2022] [Revised: 07/26/2022] [Accepted: 07/28/2022] [Indexed: 02/01/2023]
Abstract
Modifications of the activity of cytochrome P450 (CYP) enzymes by compounds in food might impair medical treatments. These CYP-mediated food–drug interactions (FDI) play a major role in drug clearance in the intestine and liver. Inter-individual variation in both CYP expression and structure is an important determinant of FDI. Traditional targeted approaches have highlighted a limited number of dietary inhibitors and single-nucleotide variations (SNVs), each determining personal CYP activity and inhibition. These approaches are costly in time, money and labor. Here, we review computational tools and databases that are already available and are relevant to predicting CYP-mediated FDIs. Computer-aided approaches such as protein–ligand interaction modeling and the virtual screening of big data narrow down hundreds of thousands of items in databanks to a few putative targets, to which the research resources could be further directed. Structure-based methods are used to explore the structural nature of the interaction between compounds and CYP enzymes. However, while collections of chemical, biochemical and genetic data are available today and call for the implementation of big-data approaches, ligand-based machine-learning approaches for virtual screening are still scarcely used for FDI studies. This review of CYP-mediated FDIs promises to attract scientists and the general public.
6
Guttman Y, Kerem Z. Dietary Inhibitors of CYP3A4 Are Revealed Using Virtual Screening by Using a New Deep-Learning Classifier. J Agric Food Chem 2022; 70:2752-2761. [PMID: 35104412 PMCID: PMC8895463 DOI: 10.1021/acs.jafc.2c00237] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/10/2022] [Revised: 01/17/2022] [Accepted: 01/20/2022] [Indexed: 05/29/2023]
Abstract
CYP3A4 is the main human enzyme responsible for phase I metabolism of dietary compounds, prescribed drugs and xenobiotics, steroid hormones, and bile acids. The inhibition of CYP3A4 activity might impair physiological mechanisms, including the endocrine system and response to drug admission. Here, we aimed to discover new CYP3A4 inhibitors from food and dietary supplements. A deep-learning model was built that classifies compounds as either an inhibitor or noninhibitor, with a high specificity of 0.997. We used this classifier to virtually screen ∼60,000 dietary compounds. Of the 115 identified potential inhibitors, only 31 were previously suggested. Many herbals, as predicted here, might cause impaired metabolism of drugs, endogenous hormones, and bile acids. Additionally, by applying Lipinski's rule of five, 17 compounds were also classified as potential intestine local inhibitors. New CYP3A4 inhibitors predicted by the model, bilobetin and picropodophyllin, were assayed in vitro.
7
Goldwaser E, Laurent C, Lagarde N, Fabrega S, Nay L, Villoutreix BO, Jelsch C, Nicot AB, Loriot MA, Miteva MA. Machine learning-driven identification of drugs inhibiting cytochrome P450 2C9. PLoS Comput Biol 2022; 18:e1009820. [PMID: 35081108 PMCID: PMC8820617 DOI: 10.1371/journal.pcbi.1009820] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/09/2021] [Revised: 02/07/2022] [Accepted: 01/10/2022] [Indexed: 11/19/2022]
Abstract
Cytochrome P450 2C9 (CYP2C9) is a major drug-metabolizing enzyme that represents 20% of the hepatic CYPs and is responsible for the metabolism of 15% of drugs. A general concern in drug discovery is to avoid the inhibition of CYP leading to toxic drug accumulation and adverse drug-drug interactions. However, the prediction of CYP inhibition remains challenging due to its complexity. We developed an original machine learning approach for the prediction of drug-like molecules inhibiting CYP2C9. We created new predictive models by integrating CYP2C9 protein structure and dynamics knowledge, an original selection of physicochemical properties of CYP2C9 inhibitors, and machine learning modeling. We tested the machine learning models on publicly available data and demonstrated that our models successfully predicted CYP2C9 inhibitors with an accuracy, sensitivity and specificity of approximately 80%. We experimentally validated the developed approach and provided the first identification of the drugs vatalanib, piriqualone, ticagrelor and cloperidone as strong inhibitors of CYP2C9 with IC50 values <18 μM and sertindole, asapiprant, duvelisib and dasatinib as moderate inhibitors with IC50 values between 40 and 85 μM. Vatalanib was identified as the strongest inhibitor with an IC50 value of 0.067 μM. Metabolism assays allowed the characterization of specific metabolites of abemaciclib, cloperidone, vatalanib and tarafenacin produced by CYP2C9. The obtained results demonstrate that such a strategy could improve the prediction of drug-drug interactions in clinical practice and could be utilized to prioritize drug candidates in drug discovery pipelines.
Affiliation(s)
- Elodie Goldwaser
  - INSERM U1268 « Medicinal Chemistry and Translational Research », UMR 8038 CiTCoM, CNRS—University of Paris, Paris, France
- Nathalie Lagarde
  - Laboratoire GBCM, EA7528, Conservatoire National des Arts et Métiers, 2 Rue Conté, Hésam Université, Paris, France
- Sylvie Fabrega
  - Viral Vector for Gene Transfer core facility, Université de Paris—Structure Fédérative de Recherche Necker, INSERM US24/CNRS UMS3633, Paris, France
- Laure Nay
  - Viral Vector for Gene Transfer core facility, Université de Paris—Structure Fédérative de Recherche Necker, INSERM US24/CNRS UMS3633, Paris, France
- Arnaud B. Nicot
  - INSERM, Nantes Université, Center for Research in Transplantation and Translational Immunology, UMR 1064, ITUN, Nantes, France
- Marie-Anne Loriot
  - University of Paris, INSERM U1138, Paris, France
  - Assistance Publique-Hôpitaux de Paris, Hôpital Européen Georges Pompidou, Service de Biochimie, Paris, France
- Maria A. Miteva
  - INSERM U1268 « Medicinal Chemistry and Translational Research », UMR 8038 CiTCoM, CNRS—University of Paris, Paris, France
8
Plonka W, Stork C, Šícho M, Kirchmair J. CYPlebrity: Machine learning models for the prediction of inhibitors of cytochrome P450 enzymes. Bioorg Med Chem 2021; 46:116388. [PMID: 34488021 DOI: 10.1016/j.bmc.2021.116388] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/05/2021] [Revised: 08/19/2021] [Accepted: 08/24/2021] [Indexed: 10/20/2022]
Abstract
The vast majority of approved drugs are metabolized by the five major cytochrome P450 (CYP) isozymes, 1A2, 2C9, 2C19, 2D6 and 3A4. Inhibition of CYP isozymes can cause drug-drug interactions with severe pharmacological and toxicological consequences. Computational methods for the fast and reliable prediction of the inhibition of CYP isozymes by small molecules are therefore of high interest and relevance to pharmaceutical companies and a host of other industries, including the cosmetics and agrochemical industries. Today, a large number of machine learning models for predicting the inhibition of the major CYP isozymes by small molecules are available. With this work we aim to go beyond the coverage of existing models, by combining data from several major public and proprietary sources. More specifically, we used up to 18815 compounds with measured bioactivities to train random forest classification models for the individual CYP isozymes. A major advantage of the new data collection over existing ones is the better representation of the minority class, the CYP inhibitors. With the new data collection we achieved inhibitor-to-non-inhibitor ratios in the order of 1:1 (CYP1A2) to 1:3 (CYP2D6). We show that our models reach competitive performance on external data, with Matthews correlation coefficients (MCCs) ranging from 0.62 (CYP2C19) to 0.70 (CYP2D6), and areas under the receiver operating characteristic curve (AUCs) between 0.89 (CYP2C19) and 0.92 (CYPs 2D6 and 3A4). Importantly, the models show a high level of robustness, reflected in a good predictivity also for compounds that are structurally dissimilar to the compounds represented in the training data. The best models presented in this work are freely accessible for academic research via a web service.
Affiliation(s)
- Wojciech Plonka
  - Universität Hamburg, Center for Bioinformatics (ZBH), Hamburg, Bundesstr. 43, 20146, Germany; FQS Poland (Fujitsu Group), Parkowa 11, 30-538 Cracow, Poland
- Conrad Stork
  - Universität Hamburg, Center for Bioinformatics (ZBH), Hamburg, Bundesstr. 43, 20146, Germany
- Martin Šícho
  - CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague, Czech Republic
- Johannes Kirchmair
  - Universität Hamburg, Center for Bioinformatics (ZBH), Hamburg, Bundesstr. 43, 20146, Germany; Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Althanstr. 14, 1090 Vienna, Austria
9
Rácz A, Bajusz D, Miranda-Quintana RA, Héberger K. Machine learning models for classification tasks related to drug safety. Mol Divers 2021; 25:1409-1424. [PMID: 34110577 PMCID: PMC8342376 DOI: 10.1007/s11030-021-10239-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 04/01/2021] [Accepted: 05/27/2021] [Indexed: 12/23/2022]
Abstract
In this review, we outline the current trends in the field of machine learning-driven classification studies related to ADME (absorption, distribution, metabolism and excretion) and toxicity endpoints from the past six years (2015-2021). The study focuses only on classification models with large datasets (i.e. more than a thousand compounds). A comprehensive literature search and meta-analysis was carried out for nine different targets: hERG-mediated cardiotoxicity, blood-brain barrier penetration, permeability glycoprotein (P-gp) substrate/inhibitor, cytochrome P450 enzyme family, acute oral toxicity, mutagenicity, carcinogenicity, respiratory toxicity and irritation/corrosion. The comparison of the best classification models was targeted to reveal the differences between machine learning algorithms and modeling types, endpoint-specific performances, dataset sizes and the different validation protocols. Based on the evaluation of the data, we can say that tree-based algorithms are (still) dominating the field, with consensus modeling being an increasing trend in drug safety predictions. Although one can already find classification models with great performances to hERG-mediated cardiotoxicity and the isoenzymes of the cytochrome P450 enzyme family, these targets are still central to ADMET-related research efforts.
Affiliation(s)
- Anita Rácz
  - Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary
- Dávid Bajusz
  - Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary
- Károly Héberger
  - Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary
10
Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, Li Q, Shoemaker BA, Thiessen PA, Yu B, Zaslavsky L, Zhang J, Bolton EE. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res 2021; 49:D1388-D1395. [PMID: 33151290 PMCID: PMC7778930 DOI: 10.1093/nar/gkaa971] [Citation(s) in RCA: 1754] [Impact Index Per Article: 584.7] [Received: 09/14/2020] [Revised: 10/06/2020] [Accepted: 10/11/2020] [Indexed: 02/06/2023]
Abstract
PubChem (https://pubchem.ncbi.nlm.nih.gov) is a popular chemical information resource that serves the scientific community as well as the general public, with millions of unique users per month. In the past two years, PubChem made substantial improvements. Data from more than 100 new data sources were added to PubChem, including chemical-literature links from Thieme Chemistry, chemical and physical property links from SpringerMaterials, and patent links from the World Intellectual Properties Organization (WIPO). PubChem's homepage and individual record pages were updated to help users find desired information faster. This update involved a data model change for the data objects used by these pages as well as by programmatic users. Several new services were introduced, including the PubChem Periodic Table and Element pages, Pathway pages, and Knowledge panels. Additionally, in response to the coronavirus disease 2019 (COVID-19) outbreak, PubChem created a special data collection that contains PubChem data related to COVID-19 and the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
Affiliation(s)
- Sunghwan Kim
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Jie Chen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Tiejun Cheng
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Asta Gindulyte
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Jia He
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Siqian He
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Qingliang Li
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Benjamin A Shoemaker
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Paul A Thiessen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Bo Yu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Leonid Zaslavsky
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Jian Zhang
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
- Evan E Bolton
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, 20894, USA
11
Banerjee P, Dunkel M, Kemmler E, Preissner R. SuperCYPsPred-a web server for the prediction of cytochrome activity. Nucleic Acids Res 2020; 48:W580-W585. [PMID: 32182358 PMCID: PMC7319455 DOI: 10.1093/nar/gkaa166] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 02/13/2020] [Revised: 02/26/2020] [Accepted: 03/05/2020] [Indexed: 02/06/2023]
Abstract
Cytochrome P450 enzyme (CYP)-mediated drug metabolism influences drug pharmacokinetics and results in adverse outcomes in patients through drug–drug interactions (DDIs). Absorption, distribution, metabolism, excretion and toxicity (ADMET) issues are the leading causes for the failure of a drug in clinical trials. As details on their metabolism are known for just half of the approved drugs, a tool for reliable prediction of CYP specificity is needed. The SuperCYPsPred web server is currently focused on five major CYP isoenzymes, namely CYP1A2, CYP2C19, CYP2D6, CYP2C9 and CYP3A4, which are responsible for more than 80% of the metabolism of clinical drugs. The prediction models for classification of CYP inhibition are based on well-established machine learning methods. The models were validated both on cross-validation and external validation sets and achieved good performance. The web server takes a 2D chemical structure as input and reports the CYP inhibition profile of the chemical for 10 models using different molecular fingerprints, along with confidence scores, similar compounds, known CYP information of drugs published in the literature, a detailed interaction profile of individual cytochromes including a DDIs table, and an overall CYPs prediction radar chart (http://insilico-cyp.charite.de/SuperCYPsPred/). The web server does not require log in or registration and is free to use.
Collapse
Affiliation(s)
- Priyanka Banerjee
- Structural Bioinformatics Group, Institute for Physiology & ECRC, Charité, University Medicine Berlin, 10115 Berlin, Germany
| | - Mathias Dunkel
- Structural Bioinformatics Group, Institute for Physiology & ECRC, Charité, University Medicine Berlin, 10115 Berlin, Germany
| | - Emanuel Kemmler
- Structural Bioinformatics Group, Institute for Physiology & ECRC, Charité, University Medicine Berlin, 10115 Berlin, Germany
| | - Robert Preissner
- Structural Bioinformatics Group, Institute for Physiology & ECRC, Charité, University Medicine Berlin, 10115 Berlin, Germany
|
12
|
Large-scale evaluation of cytochrome P450 2C9 mediated drug interaction potential with machine learning-based consensus modeling. J Comput Aided Mol Des 2020; 34:831-839. [PMID: 32221780 PMCID: PMC7320947 DOI: 10.1007/s10822-020-00308-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 02/05/2020] [Accepted: 03/09/2020] [Indexed: 11/17/2022]
Abstract
Cytochrome P450 (CYP) enzymes play an important role in the metabolism of xenobiotics. Since they are connected to drug interactions, screening for potential inhibitors is of utmost importance in drug discovery settings. Our study provides an extensive classification model for P450-drug interactions with one of the most prominent members, the 2C9 isoenzyme. Our model was built on 45,000 molecules, the largest set ever used for developing such prediction models. The models are based on three different types of descriptors: (a) typical one-, two- and three-dimensional molecular descriptors, (b) chemical and pharmacophore fingerprints and (c) interaction fingerprints with docking scores. Two machine learning algorithms, boosted trees and a multilayer feedforward network trained with resilient backpropagation, were used and compared based on their performance. The models were validated both internally and using external validation sets. The results showed that the consensus voting technique with custom probability thresholds could provide promising results even in large-scale cases without any restrictions on the applicability domain. Our best model was able to predict 2C9 inhibitory activity with an area under the receiver operating characteristic curve (AUC) of 0.85 and 0.84 for the internal and the external test sets, respectively. The chemical space covered by the largest available dataset has reached its limit, encompassing the publicly available bioactivity data for the 2C9 isoenzyme.
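The abstract does not spell out how consensus voting with custom probability thresholds was implemented. As a minimal sketch of the general idea only, assuming each model outputs a probability of inhibition and has its own cutoff, a compound could be called an inhibitor when enough models vote positive. The function name, the probabilities, the cutoffs and the vote count below are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def consensus_vote(probabilities: Sequence[float],
                   thresholds: Sequence[float],
                   min_votes: int) -> bool:
    """Return True if at least `min_votes` models vote 'inhibitor'.

    Each model votes positive when its predicted probability reaches
    its own (custom) threshold. This is an illustrative scheme, not
    the authors' published procedure.
    """
    if len(probabilities) != len(thresholds):
        raise ValueError("one threshold per model is required")
    votes = sum(p >= t for p, t in zip(probabilities, thresholds))
    return votes >= min_votes


# Hypothetical outputs of three models for a single compound.
probs = [0.62, 0.41, 0.55]
# Hypothetical per-model probability cutoffs.
cutoffs = [0.50, 0.40, 0.60]

is_inhibitor = consensus_vote(probs, cutoffs, min_votes=2)
print(is_inhibitor)
```

Per-model cutoffs make it possible to trade sensitivity against specificity separately for each model before the votes are combined, which is one plausible reason to prefer them over a single shared 0.5 cutoff.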
|
13
|
Kato H. Computational prediction of cytochrome P450 inhibition and induction. Drug Metab Pharmacokinet 2019; 35:30-44. [PMID: 31902468 DOI: 10.1016/j.dmpk.2019.11.006] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 07/23/2019] [Revised: 10/27/2019] [Accepted: 11/17/2019] [Indexed: 12/14/2022]
Abstract
Cytochrome P450 (CYP) enzymes play an important role in the phase I metabolism of many xenobiotics. Most drug-drug interactions (DDIs) associated with CYP are caused by either CYP inhibition or induction. The early detection of potential DDIs is highly desirable in the pharmaceutical industry because DDIs can cause serious adverse events, which can lead to poor patient health and drug development failures. Recently, many computational studies predicting CYP inhibition and induction have been reported. The current computational modeling approaches for CYP metabolism are classified as ligand- and structure-based; various techniques, such as quantitative structure-activity relationships, machine learning, docking, and molecular dynamics simulation, are involved in both approaches. Recently, combining these two approaches has resulted in improvements in the prediction accuracy of DDIs. In this review, we present important, recent developments in the computational prediction of the inhibition of four clinically crucial CYP isoforms (CYP1A2, 2C9, 2D6, and 3A4) and three nuclear receptors (aryl hydrocarbon receptor, constitutive androstane receptor, and pregnane X receptor) involved in the induction of CYP1A2, 2B6, and 3A4, respectively.
Affiliation(s)
- Harutoshi Kato
  - DMPK Research Laboratories, Mitsubishi Tanabe Pharma Corporation, Aoba-ku, Yokohama-shi, 227-0033, Japan.
|
14
|
Kiani YS, Ranaghan KE, Jabeen I, Mulholland AJ. Molecular Dynamics Simulation Framework to Probe the Binding Hypothesis of CYP3A4 Inhibitors. Int J Mol Sci 2019; 20:E4468. [PMID: 31510073 PMCID: PMC6769491 DOI: 10.3390/ijms20184468] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 07/24/2019] [Revised: 08/22/2019] [Accepted: 09/01/2019] [Indexed: 12/20/2022]
Abstract
The Cytochrome P450 family of heme-containing proteins plays a major role in catalyzing phase I metabolic reactions, and the CYP3A4 subtype is responsible for the metabolism of many currently marketed drugs. Additionally, CYP3A4 has an inherent affinity for a broad spectrum of structurally diverse chemical entities, often leading to drug-drug interactions mediated by the inhibition or induction of the metabolic enzyme. The current study explores the binding of selected highly efficient CYP3A4 inhibitors by docking and molecular dynamics (MD) simulation protocols, with binding free energies calculated using the WaterSwap method. The results indicate the importance of binding pocket residues including Phe57, Arg105, Arg106, Ser119, Arg212, Phe213, Thr309, Ser312, Ala370, Arg372, Glu374, Gly481 and Leu483 for interaction with CYP3A4 inhibitors. The residue-wise decomposition of the binding free energy from the WaterSwap method revealed the importance of binding site residues Arg106 and Arg372 in the stabilization of all the selected CYP3A4-inhibitor complexes. The WaterSwap binding energies were further complemented with the MM(GB/PB)SA results, and the binding energies calculated by both methods were observed not to differ significantly. Overall, our results support the use of multiple computational approaches to achieve a better understanding of CYP3A4 inhibition, subsequently leading to the design of highly specific and efficient new chemical entities with suitable ADMETox properties and reduced side effects.
Affiliation(s)
- Yusra Sajid Kiani
  - Research Center for Modeling and Simulation (RCMS), National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan.
- Kara E Ranaghan
  - Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol BS8 1TS, UK.
- Ishrat Jabeen
  - Research Center for Modeling and Simulation (RCMS), National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan.
- Adrian J Mulholland
  - Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol BS8 1TS, UK.
|
15
|
Dmitriev AV, Filimonov DA, Rudik AV, Pogodin PV, Karasev DA, Lagunin AA, Poroikov VV. Drug-drug interaction prediction using PASS. SAR QSAR Environ Res 2019; 30:655-664. [PMID: 31482727 DOI: 10.1080/1062936x.2019.1653966] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/17/2019] [Accepted: 08/06/2019] [Indexed: 06/10/2023]
Abstract
Simultaneous use of drugs may lead to undesirable Drug-Drug Interactions (DDIs) in the human body. Many DDIs are associated with changes in drug metabolism performed by Drug-Metabolizing Enzymes (DMEs). In this case, a DDI manifests itself as the effect of one drug on the biotransformation of the other drug(s), slowing it down (in the case of DME inhibition) or accelerating it (in the case of DME induction), which leads to a change in the pharmacological effect of the drug combination. We used the OpeRational ClassificAtion (ORCA) system for categorizing DDIs. ORCA divides DDIs into five classes: contraindicated (class 1), provisionally contraindicated (class 2), conditional (class 3), minimal risk (class 4), no interaction (class 5). We collected a training set consisting of several thousand drug pairs. The algorithm of the PASS program was used for prediction of DDIs of the first, second and third classes. Chemical descriptors called PoSMNA (Pairs of Substances Multilevel Neighbourhoods of Atoms) were developed and implemented in the PASS software to describe pairs of drug substances, rather than single molecules, in a machine-readable format. The average accuracy of DDI class prediction is about 0.84. A freely available web resource for DDI prediction was developed (http://way2drug.com/ddi/).
Affiliation(s)
- A V Dmitriev
  - Department for Bioinformatics, Institute of Biomedical Chemistry (IBMC), Moscow, Russia
- D A Filimonov
  - Department for Bioinformatics, Institute of Biomedical Chemistry (IBMC), Moscow, Russia
- A V Rudik
  - Department for Bioinformatics, Institute of Biomedical Chemistry (IBMC), Moscow, Russia
- P V Pogodin
  - Department for Bioinformatics, Institute of Biomedical Chemistry (IBMC), Moscow, Russia
- D A Karasev
  - Department for Bioinformatics, Institute of Biomedical Chemistry (IBMC), Moscow, Russia
- A A Lagunin
  - Department for Bioinformatics, Institute of Biomedical Chemistry (IBMC), Moscow, Russia
  - Medico-biological Faculty, Pirogov Russian National Research Medical University, Moscow, Russia
- V V Poroikov
  - Department for Bioinformatics, Institute of Biomedical Chemistry (IBMC), Moscow, Russia
|
16
|
Exploring the Chemical Space of Cytochrome P450 Inhibitors Using Integrated Physicochemical Parameters, Drug Efficiency Metrics and Decision Tree Models. Computation 2019. [DOI: 10.3390/computation7020026] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 02/05/2023]
Abstract
The cytochrome P450s (CYPs) play a central role in the metabolism of various endogenous and exogenous compounds including drugs. CYPs are vulnerable to inhibition and induction, which can lead to adverse drug reactions. Therefore, insights into the underlying mechanism of CYP450 inhibition and the estimation of overall CYP inhibitor properties might serve as valuable tools during the early phases of drug discovery. Herein, we present a large data set of inhibitors against five major metabolic CYPs (CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4) for the evaluation of important physicochemical properties and ligand efficiency metrics to define property trends across various activity levels (active, efficient and inactive). Decision tree models for CYP inhibition were developed with an accuracy >90% for both the training set and 10-fold cross-validation. Overall, molecular weight (MW), hydrogen bond acceptors/donors (HBA/HBD) and lipophilicity (clogP/logPo/w) represent important physicochemical descriptors for CYP450 inhibitors. However, highly efficient CYP inhibitors show mean MW, HBA, HBD and logP values between 294.18–482.40, 5.0–8.2, 1–7.29 and 1.68–2.57, respectively. Our results might help in the optimization of toxicological profiles associated with new chemical entities (NCEs), through a better understanding of inhibitor properties leading to CYP-mediated interactions.
|
17
|
Manavalan B, Govindaraj RG, Shin TH, Kim MO, Lee G. iBCE-EL: A New Ensemble Learning Framework for Improved Linear B-Cell Epitope Prediction. Front Immunol 2018; 9:1695. [PMID: 30100904 PMCID: PMC6072840 DOI: 10.3389/fimmu.2018.01695] [Citation(s) in RCA: 108] [Impact Index Per Article: 18.0] [Received: 04/23/2018] [Accepted: 07/10/2018] [Indexed: 11/13/2022]
Abstract
Identification of B-cell epitopes (BCEs) is a fundamental step for epitope-based vaccine development, antibody production, and disease prevention and diagnosis. Due to the avalanche of protein sequence data discovered in the postgenomic age, it is essential to develop an automated computational method to enable fast and accurate identification of novel BCEs within a vast number of candidate proteins and peptides. Although several computational methods have been developed, their accuracy is unreliable. Thus, developing a reliable model with significant prediction improvements is highly desirable. In this study, we first constructed a non-redundant data set of 5,550 experimentally validated BCEs and 6,893 non-BCEs from the Immune Epitope Database. We then developed a novel ensemble learning framework for improved linear BCE prediction called iBCE-EL, a fusion of two independent predictors, namely, extremely randomized tree (ERT) and gradient boosting (GB) classifiers, which, respectively, use a combination of physicochemical properties (PCP) and amino acid composition and a combination of dipeptide and PCP as input features. Cross-validation analysis on a benchmarking data set showed that iBCE-EL performed better than the individual classifiers (ERT and GB), with a Matthews correlation coefficient (MCC) of 0.454. Furthermore, we evaluated the performance of iBCE-EL on the independent data set. Results show that iBCE-EL significantly outperformed the state-of-the-art method with an MCC of 0.463. To the best of our knowledge, iBCE-EL is the first ensemble method for linear BCE prediction. iBCE-EL was implemented in a web-based platform, which is available at http://thegleelab.org/iBCE-EL. iBCE-EL contains two prediction modes. The first identifies peptide sequences as BCEs or non-BCEs, while the second provides users with the option of mining potential BCEs from protein sequences.
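The Matthews correlation coefficient reported above is computed from the four cells of a binary confusion matrix. The following is a minimal sketch of the standard formula; the counts in the usage lines are hypothetical and are not results from the iBCE-EL study.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    By the usual convention, 0.0 is returned when the denominator is zero.
    """
    denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts: a perfect classifier, a random-like one,
# and one that is always wrong.
print(matthews_corrcoef(tp=50, tn=50, fp=0, fn=0))
print(matthews_corrcoef(tp=25, tn=25, fp=25, fn=25))
print(matthews_corrcoef(tp=0, tn=0, fp=50, fn=50))
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, so it stays informative when the positive and negative classes differ in size, as they do for the 5,550 BCEs and 6,893 non-BCEs described here.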
Affiliation(s)
- Rajiv Gandhi Govindaraj
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
- Tae Hwan Shin
  - Department of Physiology, Ajou University School of Medicine, Suwon, South Korea
  - Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
- Myeong Ok Kim
  - Division of Life Science and Applied Life Science (BK21 Plus), College of Natural Sciences, Gyeongsang National University, Jinju, South Korea
- Gwang Lee
  - Department of Physiology, Ajou University School of Medicine, Suwon, South Korea
  - Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
|
18
|
Basith S, Cui M, Macalino SJY, Park J, Clavio NAB, Kang S, Choi S. Exploring G Protein-Coupled Receptors (GPCRs) Ligand Space via Cheminformatics Approaches: Impact on Rational Drug Design. Front Pharmacol 2018; 9:128. [PMID: 29593527 PMCID: PMC5854945 DOI: 10.3389/fphar.2018.00128] [Citation(s) in RCA: 79] [Impact Index Per Article: 13.2] [Received: 12/08/2017] [Accepted: 02/06/2018] [Indexed: 01/14/2023]
Abstract
The primary goal of rational drug discovery is the identification of selective ligands which act on single or multiple drug targets to achieve the desired clinical outcome through the exploration of total chemical space. To identify such desired compounds, computational approaches are necessary for predicting their drug-like properties. G Protein-Coupled Receptors (GPCRs) represent one of the largest and most important integral membrane protein families. These receptors serve as increasingly attractive drug targets due to their relevance in the treatment of various diseases, such as inflammatory disorders, metabolic imbalances, cardiac disorders, cancer, monogenic disorders, etc. In the last decade, a multitude of three-dimensional (3D) structures were solved for diverse GPCRs, leading this period to be called the "golden age for GPCR structural biology." Moreover, the accumulation of data about the chemical properties of GPCR ligands has garnered much interest in the exploration of GPCR chemical space. Due to the steady increase in the structural, ligand, and functional data of GPCRs, several cheminformatics approaches have been implemented in the GPCR drug discovery pipeline. In this review, we mainly focus on the cheminformatics-based paradigms in GPCR drug discovery. We provide a comprehensive view of the ligand- and structure-based cheminformatics approaches, which are best illustrated via GPCR case studies. Furthermore, we discuss the integrated approach, an appropriate combination of ligand-based and structure-based knowledge, which is emerging as a promising strategy for cheminformatics-based GPCR drug design.
Affiliation(s)
- Soosung Kang
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul, South Korea
- Sun Choi
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul, South Korea
|