1
Čivić J, McFarlane NR, Masschelein J, Harvey JN. Exploring the selectivity of cytochrome P450 for enhanced novel anticancer agent synthesis. Faraday Discuss 2024; 252:69-88. [PMID: 38855920 DOI: 10.1039/d4fd00004h] [Indexed: 06/11/2024]
Abstract
Cytochrome P450 monooxygenases are an extensive and unique class of enzymes, which can regio- and stereo-selectively functionalise hydrocarbons by way of oxidation reactions. These enzymes are naturally occurring but have also been extensively applied in a synthesis context, where they are used as efficient biocatalysts. Recently, a biosynthetic pathway where a cytochrome P450 monooxygenase catalyses a critical step of the pathway was uncovered, leading to the production of a number of products that display high antitumour potency. In this work, we use computational techniques to gain insight into the factors that determine the relative yields of the different products. We use conformational search algorithms to understand the substrate stereochemistry. On a machine-learned 3D protein structure, we use molecular docking to obtain a library of favourable poses for substrate-protein interaction. With molecular dynamics, we investigate the most favourable poses for reactivity on a molecular level, allowing us to investigate which protein-substrate interactions favour a given product and thus gain insight into the product selectivity.
Affiliation(s)
- Janko Čivić
  - Department of Chemistry, KU Leuven, Celestijnenlaan 200F, B-3001 Leuven, Belgium.
- Neil R McFarlane
  - Department of Chemistry, KU Leuven, Celestijnenlaan 200F, B-3001 Leuven, Belgium.
- Joleen Masschelein
  - Department of Biology, Vlaams Instituut voor Biotechnologie VIB-KU Leuven Center for Microbiology, Leuven, Belgium
- Jeremy N Harvey
  - Department of Chemistry, KU Leuven, Celestijnenlaan 200F, B-3001 Leuven, Belgium.
2
Gao X, Baimacheva N, Aires-de-Sousa J. Exploring Molecular Heteroencoders with Latent Space Arithmetic: Atomic Descriptors and Molecular Operators. Molecules 2024; 29:3969. [PMID: 39203047 PMCID: PMC11357237 DOI: 10.3390/molecules29163969] [Received: 07/16/2024] [Revised: 08/04/2024] [Accepted: 08/06/2024] [Indexed: 09/03/2024] Open
Abstract
A variational heteroencoder based on recurrent neural networks, trained with SMILES linear notations of molecular structures, was used to derive the following atomic descriptors: delta latent space vectors (DLSVs) obtained from the original SMILES of the whole molecule and the SMILES of the same molecule with the target atom replaced. Different replacements were explored, namely, changing the atomic element, replacement with a character of the model vocabulary not used in the training set, or the removal of the target atom from the SMILES. Unsupervised mapping of the DLSV descriptors with t-distributed stochastic neighbor embedding (t-SNE) revealed a remarkable clustering according to the atomic element, hybridization, atomic type, and aromaticity. Atomic DLSV descriptors were used to train machine learning (ML) models to predict ¹⁹F NMR chemical shifts. An R² of up to 0.89 and mean absolute errors of up to 5.5 ppm were obtained for an independent test set of 1046 molecules with random forests or a gradient-boosting regressor. Intermediate representations from a Transformer model yielded comparable results. Furthermore, DLSVs were applied as molecular operators in the latent space: the DLSV of a halogenation (H→F substitution) was added to the LSVs of 4135 new molecules with no fluorine atom and decoded into SMILES, yielding 99% of valid SMILES, with 75% of the SMILES incorporating fluorine and 56% of the structures incorporating fluorine with no other structural change.
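The latent-space arithmetic described in this abstract can be illustrated with a minimal sketch. It assumes a stand-in encoder: `toy_encode` is a hypothetical bag-of-characters function, not the trained heteroencoder, and `VOCAB`, `atomic_dlsv`, and the example SMILES are likewise illustrative. Only the subtraction (to obtain a DLSV) and the addition (to apply a DLSV as an operator) mirror the abstract; the decoding step back to SMILES is not reproduced.

```python
# Hypothetical vocabulary for the stand-in encoder (an assumption, not the paper's).
VOCAB = ["C", "N", "O", "F", "c", "=", "(", ")", "1"]


def toy_encode(smiles):
    """Stand-in for a trained encoder: a bag-of-characters count vector."""
    return [float(smiles.count(symbol)) for symbol in VOCAB]


def subtract(u, v):
    """Element-wise difference of two equal-length vectors."""
    return [a - b for a, b in zip(u, v)]


def add(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]


def atomic_dlsv(original_smiles, replaced_smiles):
    """Delta latent space vector: encoding of the original minus the atom-replaced SMILES."""
    return subtract(toy_encode(original_smiles), toy_encode(replaced_smiles))


# Atomic descriptor for the oxygen in ethanol, using an element swap (O -> N).
dlsv_oxygen = atomic_dlsv("CCO", "CCN")

# Molecular operator: the difference between a fluorinated molecule and its parent.
fluorination = subtract(toy_encode("CCF"), toy_encode("CC"))

# Apply the operator to a molecule without fluorine (benzene) in the toy latent space.
shifted_benzene = add(toy_encode("c1ccccc1"), fluorination)

print(dlsv_oxygen)
print(shifted_benzene)
```

With a real model, the shifted vector would be passed to the decoder and the resulting SMILES checked for validity, which is the step that the 99%, 75%, and 56% figures in the abstract refer to.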
Affiliation(s)
- Xinyue Gao
  - Faculty of Sciences, Université Paris Cité, 75013 Paris, France
- Natalia Baimacheva
  - Faculty of Chemistry, University of Strasbourg, 4, Blaise Pascal Str., 67081 Strasbourg, France
- Joao Aires-de-Sousa
  - LAQV and REQUIMTE, Chemistry Department, NOVA School of Science and Technology, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
3
Metz TO, Chang CH, Gautam V, Anjum A, Tian S, Wang F, Colby SM, Nunez JR, Blumer MR, Edison AS, Fiehn O, Jones DP, Li S, Morgan ET, Patti GJ, Ross DH, Shapiro MR, Williams AJ, Wishart DS. Introducing 'identification probability' for automated and transferable assessment of metabolite identification confidence in metabolomics and related studies. bioRxiv 2024:2024.07.30.605945. [PMID: 39131324 PMCID: PMC11312557 DOI: 10.1101/2024.07.30.605945] [Indexed: 08/13/2024]
Abstract
Methods for assessing compound identification confidence in metabolomics and related studies have been debated and actively researched for the past two decades. The earliest effort in 2007 focused primarily on mass spectrometry and nuclear magnetic resonance spectroscopy and resulted in four recommended levels of metabolite identification confidence - the Metabolomics Standards Initiative (MSI) Levels. In 2014, the original MSI Levels were expanded to five levels (including two sublevels) to facilitate communication of compound identification confidence in high resolution mass spectrometry studies. Further refinements in identification levels have occurred, for example to accommodate use of ion mobility spectrometry in metabolomics workflows, and alternate approaches to communicate compound identification confidence also have been developed based on identification points schema. However, neither qualitative levels of identification confidence nor quantitative scoring systems address the degree of ambiguity in compound identifications in the context of the chemical space being considered, are easily automated, or are transferable between analytical platforms. In this perspective, we propose that the metabolomics and related communities consider identification probability as an approach for automated and transferable assessment of compound identification and ambiguity in metabolomics and related studies. Identification probability is defined simply as 1/N, where N is the number of compounds in a reference library or chemical space that match to an experimentally measured molecule within user-defined measurement precision(s), for example mass measurement or retention time accuracy.
We demonstrate the utility of identification probability in an in silico analysis of multi-property reference libraries constructed from the Human Metabolome Database and computational property predictions, provide guidance to the community in transparent implementation of the concept, and invite the community to further evaluate this concept in parallel with their current preferred methods for assessing metabolite identification confidence.
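The 1/N definition in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `LibraryEntry` fields, the default tolerances (5 ppm, 0.2 min), the toy library values, and the choice to return 0.0 when nothing matches are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryEntry:
    name: str
    mass: float            # monoisotopic mass in Da (made-up values below)
    retention_time: float  # minutes (made-up values below)


def identification_probability(measured_mass, measured_rt, library,
                               mass_tol_ppm=5.0, rt_tol_min=0.2):
    """Return (1/N, matches); N counts library entries within both tolerances."""
    matches = [
        entry for entry in library
        if abs(entry.mass - measured_mass) / measured_mass * 1e6 <= mass_tol_ppm
        and abs(entry.retention_time - measured_rt) <= rt_tol_min
    ]
    # Assumption: report 0.0 when no library entry matches (N = 0).
    probability = 1.0 / len(matches) if matches else 0.0
    return probability, matches


# Toy reference library, for illustration only.
library = [
    LibraryEntry("compound_A", 180.0634, 2.10),
    LibraryEntry("compound_B", 180.0634, 2.25),
    LibraryEntry("compound_C", 180.0634, 5.80),
    LibraryEntry("compound_D", 194.0790, 2.15),
]

# Mass alone (retention-time tolerance disabled) versus mass plus retention time.
p_mass_only, _ = identification_probability(180.0634, 2.15, library,
                                            rt_tol_min=float("inf"))
p_mass_rt, hits = identification_probability(180.0634, 2.15, library)
print(p_mass_only, p_mass_rt, [entry.name for entry in hits])
```

In this toy library, mass alone leaves three candidates (probability 1/3), and adding retention time narrows the match to two (probability 1/2), which shows how each additional measured property can reduce N.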
Affiliation(s)
- Thomas O. Metz
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Christine H. Chang
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Vasuk Gautam
  - Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
- Afia Anjum
  - Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
- Siyang Tian
  - Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
- Fei Wang
  - Department of Computing Science, University of Alberta, Edmonton, AB, Canada
  - Alberta Machine Intelligence Institute, Edmonton, AB, Canada
- Sean M. Colby
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Jamie R. Nunez
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Madison R. Blumer
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Arthur S. Edison
  - Department of Biochemistry & Molecular Biology, Complex Carbohydrate Research Center and Institute of Bioinformatics, University of Georgia, Athens, GA, USA
- Oliver Fiehn
  - West Coast Metabolomics Center, University of California Davis, Davis, CA, USA
- Dean P. Jones
  - Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia, USA
- Shuzhao Li
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Edward T. Morgan
  - Department of Pharmacology and Chemical Biology, Emory University School of Medicine, Atlanta, Georgia, USA
- Gary J. Patti
  - Center for Mass Spectrometry and Metabolic Tracing, Department of Chemistry, Department of Medicine, Washington University, Saint Louis, Missouri, USA
- Dylan H. Ross
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Madelyn R. Shapiro
  - Artificial Intelligence & Data Analytics Division, Pacific Northwest National Laboratory, Richland, WA USA
- Antony J. Williams
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), Research Triangle Park, NC USA
- David S. Wishart
  - Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
4
Zhu K, Huang M, Wang Y, Gu Y, Li W, Liu G, Tang Y. MetaPredictor: in silico prediction of drug metabolites based on deep language models with prompt engineering. Brief Bioinform 2024; 25:bbae374. [PMID: 39082648 PMCID: PMC11289679 DOI: 10.1093/bib/bbae374] [Received: 04/30/2024] [Revised: 07/02/2024] [Accepted: 07/16/2024] [Indexed: 08/03/2024] Open
Abstract
Metabolic processes can transform a drug into metabolites with different properties that may affect its efficacy and safety. Therefore, investigation of the metabolic fate of a drug candidate is of great significance for drug discovery. Computational methods have been developed to predict drug metabolites, but most of them suffer from two main obstacles: the lack of model generalization due to restrictions on metabolic transformation rules or specific enzyme families, and high rate of false-positive predictions. Here, we presented MetaPredictor, a rule-free, end-to-end and prompt-based method to predict possible human metabolites of small molecules including drugs as a sequence translation problem. We innovatively introduced prompt engineering into deep language models to enrich domain knowledge and guide decision-making. The results showed that using prompts that specify the sites of metabolism (SoMs) can steer the model to propose more accurate metabolite predictions, achieving a 30.4% increase in recall and a 16.8% reduction in false positives over the baseline model. The transfer learning strategy was also utilized to tackle the limited availability of metabolic data. For the adaptation to automatic or non-expert prediction, MetaPredictor was designed as a two-stage schema consisting of automatic identification of SoMs followed by metabolite prediction. Compared to four available drug metabolite prediction tools, our method showed comparable performance on the major enzyme families and better generalization that could additionally identify metabolites catalyzed by less common enzymes. The results indicated that MetaPredictor could provide a more comprehensive and accurate prediction of drug metabolism through the effective combination of transfer learning and prompt-based learning strategies.
Affiliation(s)
- Keyun Zhu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Mengting Huang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Yimeng Wang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Yaxin Gu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Weihua Li
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Guixia Liu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Yun Tang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
5
Öeren M, Hunt PA, Wharrick CE, Tabatabaei Ghomi H, Segall MD. Predicting routes of phase I and II metabolism based on quantum mechanics and machine learning. Xenobiotica 2024; 54:379-393. [PMID: 37966132 DOI: 10.1080/00498254.2023.2284251] [Received: 08/08/2023] [Accepted: 11/13/2023] [Indexed: 11/16/2023]
Abstract
Unexpected metabolism could lead to the failure of many late-stage drug candidates or even the withdrawal of approved drugs. Thus, it is critical to predict and study the dominant routes of metabolism in the early stages of research. We describe the development and validation of a 'WhichEnzyme' model that accurately predicts the enzyme families most likely to be responsible for a drug-like molecule's metabolism. Furthermore, we combine this model with our previously published regioselectivity models for Cytochromes P450, Aldehyde Oxidases, Flavin-containing Monooxygenases, UDP-glucuronosyltransferases and Sulfotransferases - the most important Phase I and Phase II drug metabolising enzymes - and a 'WhichP450' model that predicts the Cytochrome P450 isoform(s) responsible for a compound's metabolism. The regioselectivity models are based on a mechanistic understanding of these enzymes' actions and use quantum mechanical simulations with machine learning methods to accurately predict sites of metabolism and the resulting metabolites. We train heuristics based on the outputs of the 'WhichEnzyme', 'WhichP450', and regioselectivity models to determine the most likely routes of metabolism and metabolites to be observed experimentally. Finally, we demonstrate that this combination delivers high sensitivity in identifying experimentally reported metabolites and higher precision than other methods for predicting in vivo metabolite profiles.
Affiliation(s)
- Mario Öeren
  - Optibrium Limited, Cambridge Innovation Park, Cambridge, UK
- Peter A Hunt
  - Optibrium Limited, Cambridge Innovation Park, Cambridge, UK
6
Groff L, Williams A, Shah I, Patlewicz G. MetSim: Integrated Programmatic Access and Pathway Management for Xenobiotic Metabolism Simulators. Chem Res Toxicol 2024; 37:685-697. [PMID: 38598715 PMCID: PMC11325951 DOI: 10.1021/acs.chemrestox.3c00398] [Indexed: 04/12/2024]
Abstract
Xenobiotic metabolism is a key consideration in evaluating the hazards and risks posed by environmental chemicals. A number of software tools exist that are capable of simulating metabolites, but each reports its predictions in a different format and with varying levels of detail. This makes comparing the performance and coverage of the tools a practical challenge. To address this shortcoming, we developed a metabolic simulation framework called MetSim, which comprises three main components. A graph-based schema was developed to allow metabolism information to be harmonized. The schema was implemented in MongoDB to store and retrieve metabolic graphs for subsequent analysis. MetSim currently includes an application programming interface for four metabolic simulators: BioTransformer, the OECD Toolbox, EPA's chemical transformation simulator (CTS), and tissue metabolism simulator (TIMES). Lastly, MetSim provides functions to help evaluate simulator performance for specific data sets. In this study, a set of 112 drugs with 432 reported metabolites were compiled, and predictions were made using the 4 simulators. Fifty-nine of the 112 drugs were taken from the Small Molecule Pathway Database, with the remainder sourced from the literature. The human models within BioTransformer and CTS (Phase I only) and the rat models within TIMES and the OECD Toolbox (Phase I only) were used to make predictions for the chemicals in the data set. The recall and precision (recall, precision) ranked in order of highest recall for each individual tool were CTS (0.54, 0.017), BioTransformer (0.50, 0.008), Toolbox in vitro (0.40, 0.144), TIMES in vivo (0.40, 0.133), Toolbox in vivo (0.40, 0.118), and TIMES in vitro (0.39, 0.128). Combining all of the model predictions together increased the overall recall (0.73, 0.008). 
MetSim enabled insights into the performance and coverage of in silico metabolic simulators to be more efficiently derived, which in turn should aid future efforts to evaluate other data sets.
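The (recall, precision) pairs reported in this abstract, and the effect of pooling several simulators, can be illustrated with a minimal sketch. The metabolite identifiers and the two tool outputs below are invented for illustration, and how the study aggregates results across the 112 drugs is not stated in the abstract, so this treats a single compound only.

```python
def recall_precision(predicted, reported):
    """Recall = hits / reported metabolites; precision = hits / predicted metabolites."""
    true_positives = len(predicted & reported)
    recall = true_positives / len(reported) if reported else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return recall, precision


# Hypothetical data: experimentally reported metabolites and two simulators' predictions.
reported = {"M1", "M2", "M3", "M4"}
tool_a = {"M1", "M2", "X1", "X2", "X3", "X4"}  # 2 hits among 6 predictions
tool_b = {"M3", "X5", "X6", "X7"}              # 1 hit among 4 predictions

# Pooling predictions is a set union: hits accumulate, but so do false positives.
combined = tool_a | tool_b

for label, predictions in [("tool_a", tool_a), ("tool_b", tool_b), ("combined", combined)]:
    print(label, recall_precision(predictions, reported))
```

In this toy case the union raises recall to 0.75 while precision stays below that of the better single tool, the same qualitative pattern as the combined result (0.73, 0.008) reported above.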
Affiliation(s)
- Louis Groff
  - Center for Computational Toxicology and Exposure (CCTE), Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Antony Williams
  - Center for Computational Toxicology and Exposure (CCTE), Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Imran Shah
  - Center for Computational Toxicology and Exposure (CCTE), Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Grace Patlewicz
  - Center for Computational Toxicology and Exposure (CCTE), Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
7
Yang H, Liu J, Chen K, Cong S, Cai S, Li Y, Jia Z, Wu H, Lou T, Wei Z, Yang X, Xiao H. D-CyPre: a machine learning-based tool for accurate prediction of human CYP450 enzyme metabolic sites. PeerJ Comput Sci 2024; 10:e2040. [PMID: 38855237 PMCID: PMC11157575 DOI: 10.7717/peerj-cs.2040] [Received: 11/17/2023] [Accepted: 04/15/2024] [Indexed: 06/11/2024]
Abstract
The advancement of graph neural networks (GNNs) has made it possible to accurately predict metabolic sites. Despite the combination of GNNs with XGBOOST showing impressive performance, this technology has not yet been applied in the realm of metabolic site prediction. Previous metabolic site prediction tools focused on bonds and atoms, regardless of the overall molecular skeleton. This study introduces a novel tool, named D-CyPre, that amalgamates atom, bond, and molecular skeleton information via two directed message-passing neural networks (D-MPNN) to predict the metabolic sites of the nine cytochrome P450 enzymes using XGBOOST. In D-CyPre Precision Mode, the model produces fewer, but more accurate results (Jaccard score: 0.497, F1: 0.660, and precision: 0.737 in the test set). In D-CyPre Recall Mode, the model produces less accurate, but more comprehensive results (Jaccard score: 0.506, F1: 0.669, and recall: 0.720 in the test set). In the test set of 68 reactants, D-CyPre outperformed BioTransformer on all isoenzymes and CyProduct on most isoenzymes (5/9). For the subtypes where D-CyPre outperformed CyProduct, the Jaccard score and F1 scores increased by 24% and 16% in Precision Mode (4/9) and 19% and 12% in Recall Mode (5/9), respectively, relative to the second-best CyProduct. Overall, D-CyPre provides more accurate prediction results for human CYP450 enzyme metabolic sites.
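The Jaccard score, F1, precision, and recall quoted in this abstract can be computed for one molecule as set-overlap metrics between predicted and annotated metabolic sites. The sketch below is illustrative only: the atom indices are invented, the convention of returning 1.0 for the Jaccard score when both sets are empty is an assumption, and the abstract does not say how the study averages these scores over the test set.

```python
def site_metrics(predicted_sites, true_sites):
    """Set-overlap metrics between predicted and annotated sites of one molecule."""
    true_positives = len(predicted_sites & true_sites)
    union = len(predicted_sites | true_sites)
    # Assumption: two empty sets count as perfect agreement.
    jaccard = true_positives / union if union else 1.0
    precision = true_positives / len(predicted_sites) if predicted_sites else 0.0
    recall = true_positives / len(true_sites) if true_sites else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"jaccard": jaccard, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical atom indices: annotated sites versus a model's predicted sites.
true_sites = {3, 7, 12}
predicted_sites = {3, 7, 9}

metrics = site_metrics(predicted_sites, true_sites)
print(metrics)
```

A stricter decision threshold would typically shrink `predicted_sites`, trading recall for precision, which is the kind of trade-off the abstract describes between Precision Mode and Recall Mode.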
Affiliation(s)
- Haolan Yang
  - School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
- Jie Liu
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
- Kui Chen
  - School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
- Shiyu Cong
  - School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
- Shengnan Cai
  - School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
- Yueting Li
  - School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
- Zhixin Jia
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
- Hao Wu
  - School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
- Tianyu Lou
  - School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
- Zuying Wei
  - School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
- Xiaoqin Yang
  - School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
- Hongbin Xiao
  - Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China
8
Dvořák Z, Vyhlídalová B, Pečinková P, Li H, Anzenbacher P, Špičáková A, Anzenbacherová E, Chow V, Liu J, Krause H, Wilson D, Berés T, Tarkowski P, Chen D, Mani S. In vitro safety signals for potential clinical development of the anti-inflammatory pregnane X receptor agonist FKK6. Bioorg Chem 2024; 144:107137. [PMID: 38245951 DOI: 10.1016/j.bioorg.2024.107137] [Received: 11/24/2023] [Revised: 12/25/2023] [Accepted: 01/14/2024] [Indexed: 01/23/2024]
Abstract
Based on the mimicry of microbial metabolites, functionalized indoles were demonstrated as the ligands and agonists of the pregnane X receptor (PXR). The lead indole, FKK6, displayed PXR-dependent protective effects in DSS-induced colitis in mice and in vitro cytokine-treated intestinal organoid cultures. Here, we report on the initial in vitro pharmacological profiling of FKK6. FKK6-PXR interactions were characterized by hydrogen-deuterium exchange mass spectrometry. Screening FKK6 against potential cellular off-targets (G protein-coupled receptors, steroid and nuclear receptors, ion channels, and xenobiotic membrane transporters) revealed high PXR selectivity. FKK6 has poor aqueous solubility but was highly soluble in simulated gastric and intestinal fluids. A large fraction of FKK6 was bound to plasma proteins and chemically stable in plasma. The partition coefficient of FKK6 was 2.70, and FKK6 moderately partitioned into red blood cells. In Caco2 cells, FKK6 displayed high permeability (A-B: 22.8 × 10⁻⁶ cm·s⁻¹) and no active efflux. These data are indicative of essentially complete in vivo absorption of FKK6. The data from human liver microsomes indicated that FKK6 is rapidly metabolized by cytochromes P450 (t₁/₂ 5 min), notably by CYP3A4. Two oxidized FKK6 derivatives, including DC73 (N6-oxide) and DC97 (C19-phenol), were detected, and these metabolites had 5-7× lower potency as PXR agonists than FKK6. This implies that despite high intestinal absorption, FKK6 is rapidly eliminated by the liver, and its PXR effects are predicted to be predominantly in the intestines. In conclusion, the PXR ligand and agonist FKK6 has a suitable pharmacological profile supporting its potential preclinical development.
Affiliation(s)
- Zdeněk Dvořák
  - Department of Cell Biology and Genetics, Faculty of Science, Palacký University, Šlechtitelů 27, 783 71 Olomouc, Czech Republic.
- Barbora Vyhlídalová
  - Department of Cell Biology and Genetics, Faculty of Science, Palacký University, Šlechtitelů 27, 783 71 Olomouc, Czech Republic
- Petra Pečinková
  - Department of Cell Biology and Genetics, Faculty of Science, Palacký University, Šlechtitelů 27, 783 71 Olomouc, Czech Republic
- Hao Li
  - Department of Medicine and Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Pavel Anzenbacher
  - Department of Pharmacology, Faculty of Medicine and Dentistry, Palacký University, Hněvotínská 5, 779 00 Olomouc, Czech Republic
- Alena Špičáková
  - Department of Pharmacology, Faculty of Medicine and Dentistry, Palacký University, Hněvotínská 5, 779 00 Olomouc, Czech Republic
- Eva Anzenbacherová
  - Department of Medical Chemistry and Biochemistry, Faculty of Medicine and Dentistry, Palacký University, Hněvotínská 5, 779 00 Olomouc, Czech Republic
- Vimanda Chow
  - Department of Chemistry, York University, 6 Thompson Road, M3J 1L3, ON, Toronto, Canada
- Jiabao Liu
  - Department of Molecular Genetics, Donnelly Centre for Cellular and Biomolecular Research, 160 College Street, M5S 3E1, ON, Toronto, Canada
- Henry Krause
  - Department of Molecular Genetics, Donnelly Centre for Cellular and Biomolecular Research, 160 College Street, M5S 3E1, ON, Toronto, Canada
- Derek Wilson
  - Department of Chemistry, York University, 6 Thompson Road, M3J 1L3, ON, Toronto, Canada
- Tibor Berés
  - Czech Advanced Technology and Research Institute, Palacký University, Šlechtitelů 27, 783 71 Olomouc, Czech Republic
- Petr Tarkowski
  - Czech Advanced Technology and Research Institute, Palacký University, Šlechtitelů 27, 783 71 Olomouc, Czech Republic
  - Department of Genetic Resources for Vegetables, Medicinal and Special Plants, Centre of the Region Haná for Biotechnological and Agricultural Research, Crop Research Institute, Šlechtitelů 27, 783 71 Olomouc, Czech Republic
- Dajun Chen
  - Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Sridhar Mani
  - Department of Medicine and Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA.
9
Chen Y, Seidel T, Jacob RA, Hirte S, Mazzolari A, Pedretti A, Vistoli G, Langer T, Miljković F, Kirchmair J. Active Learning Approach for Guiding Site-of-Metabolism Measurement and Annotation. J Chem Inf Model 2024; 64:348-358. [PMID: 38170877 PMCID: PMC10806800 DOI: 10.1021/acs.jcim.3c01588] [Received: 09/30/2023] [Revised: 11/30/2023] [Accepted: 12/18/2023] [Indexed: 01/05/2024]
Abstract
The ability to determine and predict metabolically labile atom positions in a molecule (also called "sites of metabolism" or "SoMs") is of high interest to the design and optimization of bioactive compounds, such as drugs, agrochemicals, and cosmetics. In recent years, several in silico models for SoM prediction have become available, many of which include a machine-learning component. The bottleneck in advancing these approaches is the coverage of distinct atom environments and rare and complex biotransformation events with high-quality experimental data. Pharmaceutical companies typically have measured metabolism data available for several hundred to several thousand compounds. However, even for metabolism experts, interpreting these data and assigning SoMs are challenging and time-consuming. Therefore, a significant proportion of the potential of the existing metabolism data, particularly in machine learning, remains dormant. Here, we report on the development and validation of an active learning approach that identifies the most informative atoms across molecular data sets for SoM annotation. The active learning approach, built on a highly efficient reimplementation of SoM predictor FAME 3, enables experts to prioritize their SoM experimental measurements and annotation efforts on the most rewarding atom environments. We show that this active learning approach yields competitive SoM predictors while requiring the annotation of only 20% of the atom positions required by FAME 3. The source code of the approach presented in this work is publicly available.
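The idea of prioritizing the most informative atoms for annotation can be sketched with uncertainty sampling, a common active learning strategy. This is an assumption made for illustration: the abstract does not state which selection criterion the authors use, and the function name, the atom identifiers, and the probabilities below are all hypothetical.

```python
def select_atoms_for_annotation(predicted_probabilities, budget):
    """Pick the `budget` atoms whose predicted SoM probability is closest to 0.5.

    `predicted_probabilities` maps (molecule_id, atom_index) to the current
    model's probability that the atom is a site of metabolism.
    """
    ranked = sorted(
        predicted_probabilities.items(),
        # Most uncertain first; the atom identifier breaks ties deterministically.
        key=lambda item: (abs(item[1] - 0.5), item[0]),
    )
    return [atom_id for atom_id, _ in ranked[:budget]]


# Hypothetical model output for five unlabeled atoms across three molecules.
probabilities = {
    ("mol1", 0): 0.02,
    ("mol1", 3): 0.48,
    ("mol2", 1): 0.91,
    ("mol2", 5): 0.55,
    ("mol3", 2): 0.30,
}

to_annotate = select_atoms_for_annotation(probabilities, budget=2)
print(to_annotate)
```

In a full loop, the selected atoms would be annotated by an expert, added to the training set, and the model retrained before the next selection round.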
Collapse
Affiliation(s)
- Ya Chen
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
| | - Thomas Seidel
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna, 1090 Vienna, Austria
| | - Roxane Axel Jacob
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna, 1090 Vienna, Austria
- Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, 1090 Vienna, Austria
| | - Steffen Hirte
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, 1090 Vienna, Austria
| | - Angelica Mazzolari
- Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, I-20133 Milano, Italy
| | - Alessandro Pedretti
- Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, I-20133 Milano, Italy
| | - Giulio Vistoli
- Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, I-20133 Milano, Italy
| | - Thierry Langer
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna, 1090 Vienna, Austria
| | - Filip Miljković
- Medicinal Chemistry, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Pepparedsleden 1, SE-43183 Gothenburg, Sweden
| | - Johannes Kirchmair
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna, 1090 Vienna, Austria
| |
|
10
|
Scholz VA, Stork C, Frericks M, Kirchmair J. Computational prediction of the metabolites of agrochemicals formed in rats. Sci Total Environ 2023; 895:165039. [PMID: 37355108 DOI: 10.1016/j.scitotenv.2023.165039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Revised: 06/17/2023] [Accepted: 06/19/2023] [Indexed: 06/26/2023]
Abstract
Today, computational tools for the prediction of the metabolite structures of xenobiotics are widely available and employed in small-molecule research. Reflecting the availability of measured data, these in silico tools are trained and validated primarily on drug metabolism data. In this work, we assessed the capacity of five leading metabolite structure predictors to represent the metabolism of agrochemicals observed in rats. More specifically, we tested the ability of SyGMa, GLORY, GLORYx, BioTransformer 3.0, and MetaTrans to correctly predict and rank the experimentally observed metabolites of a set of 85 parent compounds. We found that the models were able to recover about one-third to two-thirds of the experimentally observed first-generation, second-generation and third-generation metabolites, confirming their value in applications such as metabolite identification. However, precision was low for all investigated tools and did not exceed approximately 18% for the pool of first-generation metabolites and 2% for the pool of compounds representing the first three generations of metabolites. The variance in prediction success rates was high across the individual metabolic maps, meaning that outcomes depend strongly on the specific compound under investigation. We also found that the predictions for individual parent compounds differed strongly between the tools, particularly between those built on orthogonal technologies (e.g., rule-based and end-to-end machine learning approaches). This renders ensemble model strategies promising for improving success rates. Overall, the results of this benchmark study show that there is still considerable room for improving metabolite structure predictors. Our discussion points out several avenues to progress. The bottleneck in method development certainly has been, and for the foreseeable future will remain, the limited quantity and quality of available measured data on small-molecule metabolism.
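As an illustration of the two figures of merit quoted in this abstract (recovery of observed metabolites and precision), the following minimal Python sketch computes both for a single parent compound. The metabolite identifiers and the function name are hypothetical and are not taken from the study; real evaluations would compare canonical structure representations rather than labels.

```python
def recall_and_precision(predicted: set[str], observed: set[str]) -> tuple[float, float]:
    """Return (recall, precision) for one parent compound.

    recall    = correctly predicted metabolites / experimentally observed metabolites
    precision = correctly predicted metabolites / all predicted metabolites
    """
    hits = len(predicted & observed)
    recall = hits / len(observed) if observed else 0.0
    precision = hits / len(predicted) if predicted else 0.0
    return recall, precision


# Hypothetical metabolite labels, for illustration only.
predicted = {"M1", "M2", "M3", "M4", "M5"}
observed = {"M1", "M2", "M6"}

recall, precision = recall_and_precision(predicted, observed)
print(f"recall={recall:.2f}, precision={precision:.2f}")
```

A predictor that enumerates many candidate structures can reach high recall while precision stays low, which is the pattern the abstract reports.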
Affiliation(s)
- Vincent-Alexander Scholz
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria; Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, 1090 Vienna, Austria
| | | | | | - Johannes Kirchmair
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria; Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna, 1090 Vienna, Austria.
| |
|
11
|
Kate A, Seth E, Singh A, Chakole CM, Chauhan MK, Singh RK, Maddalwar S, Mishra M. Artificial Intelligence for Computer-Aided Drug Discovery. Drug Res (Stuttg) 2023; 73:369-377. [PMID: 37276884 DOI: 10.1055/a-2076-3359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
The continuous implementation of Artificial Intelligence (AI) in multiple scientific domains and the rapid advancement in computer software and hardware, along with other parameters, have rapidly fuelled this development. The technology can contribute effectively to solving many challenges and constraints in traditional drug development. Traditionally, large-scale chemical libraries are screened to find one promising medicine. In recent years, more rational structure-based drug design approaches have avoided the first screening phases while still requiring chemists to design, synthesize, and test a wide range of compounds to produce possible novel medications. The process of turning a promising chemical into a medicinal candidate can be expensive and time-consuming. Additionally, a new medication candidate may still fail in clinical trials even after demonstrating promise in laboratory research. In fact, less than 10% of medication candidates that undergo Phase I trials actually reach the market. As a consequence, the unmatched data processing power of AI systems may expedite and enhance the drug development process in four different ways: by opening up links to novel biological systems, superior or distinctive chemistry, greater success rates, and faster and less expensive innovation trials. Since these technologies may be used to address a variety of discovery scenarios and biological targets, it is essential to comprehend and distinguish between use cases. As a result, we have emphasized how AI may be used in a variety of areas of the pharmaceutical sciences, including in-depth opportunities for drug research and development.
Affiliation(s)
- Aditya Kate
- Amity Institute of Biotechnology, Amity University, Chhattisgarh, India
| | - Ekkita Seth
- Amity Institute of Biotechnology, Amity University, Chhattisgarh, India
| | - Ananya Singh
- Amity Institute of Biotechnology, Amity University, Chhattisgarh, India
| | - Chandrashekhar Mahadeo Chakole
- Bajiraoji Karanjekar college of Pharmacy, Sakoli, Dist-Bhandara, India
- NDDS Research Lab, Delhi Institute of Pharmaceutical Sciences and Research, DPSR-University, New Delhi
| | - Meenakshi Kanwar Chauhan
- NDDS Research Lab, Delhi Institute of Pharmaceutical Sciences and Research, DPSR-University, New Delhi
| | - Ravi Kant Singh
- Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India
| | | | - Mohit Mishra
- Amity Institute of Biotechnology, Amity University, Chhattisgarh, India
| |
|
12
|
Feng Y, Gong C, Zhu J, Liu G, Tang Y, Li W. Prediction of Sites of Metabolism of CYP3A4 Substrates Utilizing Docking-Derived Geometric Features. J Chem Inf Model 2023. [PMID: 37336765 DOI: 10.1021/acs.jcim.3c00549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2023]
Abstract
Cytochrome P450 3A4 (CYP3A4) is one of the major drug-metabolizing enzymes in the human body and is responsible for the metabolism of ∼50% of clinically used drugs. Therefore, the identification of the compound's sites of metabolism (SOMs) mediated by CYP3A4 is of utmost importance in the early stage of drug discovery and development. Herein, docking-based approaches incorporating geometric features were used for SOMs prediction of CYP3A4 substrates. The cross-docking poses of a relatively large data set containing 474 substrates were analyzed in depth, and a widely observed geometric pattern called the close proximity of SOMs was derived from the poses. On the basis of the close proximity, several structure-based models have been constructed, which demonstrated better performance than those structure-based models using the criterion of Fe-SOM distance. For further improving the prediction performance, the structure-based models were also combined with the well-known ligand-based model SMARTCyp. One combined model exhibited good performance on the SOMs prediction of an external substrate set containing kinase inhibitors, PROTACs, approved drugs, and some lead compounds.
Affiliation(s)
- Yanjun Feng
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Changda Gong
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Jieyu Zhu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| |
|
13
|
Öeren M, Kaempf SC, Ponting DJ, Hunt PA, Segall MD. Predicting Regioselectivity of Cytosolic Sulfotransferase Metabolism for Drugs. J Chem Inf Model 2023. [PMID: 37229540 DOI: 10.1021/acs.jcim.3c00275] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/27/2023]
Abstract
Cytosolic sulfotransferases (SULTs) are a family of enzymes responsible for the sulfation of small endogenous and exogenous compounds. SULTs contribute to the conjugation phase of metabolism and share substrates with the uridine 5'-diphospho-glucuronosyltransferase (UGT) family of enzymes. UGTs are considered to be the most important enzymes in the conjugation phase, and SULTs are an auxiliary enzyme system to them. Understanding how the regioselectivity of SULTs differs from that of UGTs is essential from the perspective of developing novel drug candidates. We present a general ligand-based SULT model trained and tested using high-quality experimental regioselectivity data. The current study suggests that, unlike other metabolic enzymes in the modification and conjugation phases, the SULT regioselectivity is not strongly influenced by the activation energy of the rate-limiting step of the catalysis. Instead, the prominent role is played by the substrate binding site of SULT. Thus, the model is trained only on steric and orientation descriptors, which mimic the binding pocket of SULT. The resulting classification model, which predicts whether a site is metabolized, achieved a Cohen's kappa of 0.71.
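For readers unfamiliar with the metric reported above, Cohen's kappa measures agreement between predicted and observed labels after correcting for the agreement expected by chance. The sketch below is a generic implementation on made-up binary site labels (1 = metabolized, 0 = not metabolized); it is not the authors' evaluation code, and returning 1.0 when chance agreement is already total is an assumption made here to avoid a division by zero.

```python
from collections import Counter


def cohens_kappa(y_true: list[int], y_pred: list[int]) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of sites where both labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement from the marginal label frequencies.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    p_e = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    if p_e == 1.0:
        return 1.0  # assumption: both lists hold one identical constant label
    return (p_o - p_e) / (1.0 - p_e)


# Made-up labels for ten candidate sites, for illustration only.
observed_sites = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted_sites = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(observed_sites, predicted_sites)
print(f"kappa={kappa:.3f}")
```

In this toy case 8 of 10 labels agree, but chance alone would give 58% agreement, so kappa is noticeably lower than the raw accuracy.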
Affiliation(s)
- Mario Öeren
- Cambridge Innovation Park, Optibrium Limited, Denny End Road, Cambridge CB25 9GL, U.K
| | - Sylvia C Kaempf
- Cambridge Innovation Park, Optibrium Limited, Denny End Road, Cambridge CB25 9GL, U.K
- School of Chemistry, North Haugh, University of St Andrews, St Andrews KY16 9ST, U.K
| | - David J Ponting
- Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds LS11 5PS, U.K
| | - Peter A Hunt
- Cambridge Innovation Park, Optibrium Limited, Denny End Road, Cambridge CB25 9GL, U.K
| | - Matthew D Segall
- Cambridge Innovation Park, Optibrium Limited, Denny End Road, Cambridge CB25 9GL, U.K
| |
|
14
|
Tran TTV, Tayara H, Chong KT. Artificial Intelligence in Drug Metabolism and Excretion Prediction: Recent Advances, Challenges, and Future Perspectives. Pharmaceutics 2023; 15:1260. [PMID: 37111744 PMCID: PMC10143484 DOI: 10.3390/pharmaceutics15041260] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 03/08/2023] [Revised: 04/07/2023] [Accepted: 04/14/2023] [Indexed: 04/29/2023]
Abstract
Drug metabolism and excretion play crucial roles in determining the efficacy and safety of drug candidates, and predicting these processes is an essential part of drug discovery and development. In recent years, artificial intelligence (AI) has emerged as a powerful tool for predicting drug metabolism and excretion, offering the potential to speed up drug development and improve clinical success rates. This review highlights recent advances in AI-based drug metabolism and excretion prediction, including deep learning and machine learning algorithms. We provide a list of public data sources and free prediction tools for the research community. We also discuss the challenges associated with the development of AI models for drug metabolism and excretion prediction and explore future perspectives in the field. We hope this will be a helpful resource for anyone who is researching in silico drug metabolism, excretion, and pharmacokinetic properties.
Affiliation(s)
- Thi Tuyet Van Tran
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea;
- Faculty of Information Technology, An Giang University, Long Xuyen 880000, Vietnam
- Vietnam National University—Ho Chi Minh City, Ho Chi Minh 700000, Vietnam
| | - Hilal Tayara
- School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, Republic of Korea
| | - Kil To Chong
- Advances Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Republic of Korea
| |
|
15
|
Öeren M, Walton PJ, Suri J, Ponting DJ, Hunt PA, Segall MD. Predicting Regioselectivity of AO, CYP, FMO, and UGT Metabolism Using Quantum Mechanical Simulations and Machine Learning. J Med Chem 2022; 65:14066-14081. [PMID: 36239985 DOI: 10.1021/acs.jmedchem.2c01303] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/28/2022]
Abstract
Unexpected metabolism in modification and conjugation phases can lead to the failure of many late-stage drug candidates or even withdrawal of approved drugs. Thus, it is critical to predict the sites of metabolism (SoM) for enzymes, which interact with drug-like molecules, in the early stages of the research. This study presents methods for predicting the isoform-specific metabolism for human AOs, FMOs, and UGTs and general CYP metabolism for preclinical species. The models use semi-empirical quantum mechanical simulations, validated using experimentally obtained data and DFT calculations, to estimate the reactivity of each SoM in the context of the whole molecule. Ligand-based models, trained and tested using high-quality regioselectivity data, combine the reactivity of the potential SoM with the orientation and steric effects of the binding pockets of the different enzyme isoforms. The resulting models achieve κ values of up to 0.94 and AUC of up to 0.92.
Affiliation(s)
- Mario Öeren
- Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9GL, U.K
| | - Peter J Walton
- Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9GL, U.K.,School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, U.K
| | - James Suri
- Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9GL, U.K.,School of Chemistry, University of St Andrews, North Haugh, St Andrews KY16 9ST, U.K
| | - David J Ponting
- Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds LS11 5PS, U.K
| | - Peter A Hunt
- Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9GL, U.K
| | - Matthew D Segall
- Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9GL, U.K
| |
|
16
|
Goncalves R, Pelletier R, Couette A, Gicquel T, Le Daré B. Suitability of high-resolution mass spectrometry in analytical toxicology: Focus on drugs of abuse. Toxicol Anal Clin 2022. [DOI: 10.1016/j.toxac.2021.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
17
|
Ertl P, Gerebtzoff G, Lewis RA, Muenkler H, Schneider N, Sirockin F, Stiefl N, Tosco P. Chemical reactivity prediction: current methods and different application areas. Mol Inform 2021; 41:e2100277. [PMID: 34964302 DOI: 10.1002/minf.202100277] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/14/2021] [Accepted: 12/28/2021] [Indexed: 11/10/2022]
Abstract
The ability to predict the chemical reactivity of a molecule is highly desirable in drug discovery, both ex vivo (synthetic route planning, formulation, stability) and in vivo: metabolic reactions determine pharmacodynamics, pharmacokinetics and potential toxic effects, and early assessment of liabilities is vital to reduce attrition rates in later stages of development. Quantum mechanics offers a precise description of the interactions between electrons and orbitals in the breaking and forming of bonds. Modern algorithms and faster computers have allowed the study of more complex systems in a timely and accurate fashion, and answers to chemical questions around stability and reactivity can now be provided. Through machine learning, predictive models can be built out of descriptors derived from quantum mechanics and cheminformatics, even in the absence of experimental data to train on. In this article, current progress on computational reactivity prediction is reviewed: applications to problems in drug design, such as modelling of metabolism and covalent inhibition, are highlighted and unmet challenges are posed.
Affiliation(s)
| | | | - Richard A Lewis
- Computer-Aided Drug Design, Eli Lilly and Company Limited, Windlesham, SWITZERLAND
| | - Hagen Muenkler
- Novartis Institutes for BioMedical Research Inc, SWITZERLAND
| | | | | | | | - Paolo Tosco
- Novartis Institutes for BioMedical Research Inc, SWITZERLAND
| |
|
18
|
Muller C, Rabal O, Diaz Gonzalez C. Artificial Intelligence, Machine Learning, and Deep Learning in Real-Life Drug Design Cases. Methods Mol Biol 2021; 2390:383-407. [PMID: 34731478 DOI: 10.1007/978-1-0716-1787-8_16] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 02/06/2023]
Abstract
The discovery and development of drugs is a long and expensive process with a high attrition rate. Computational drug discovery contributes to ligand discovery and optimization, by using models that describe the properties of ligands and their interactions with biological targets. In recent years, artificial intelligence (AI) has made remarkable modeling progress, driven by new algorithms and by the increase in computing power and storage capacities, which allow the processing of large amounts of data in a short time. This review provides the current state of the art of AI methods applied to drug discovery, with a focus on structure- and ligand-based virtual screening, library design and high-throughput analysis, drug repurposing and drug sensitivity, de novo design, chemical reactions and synthetic accessibility, ADMET, and quantum mechanics.
Affiliation(s)
- Christophe Muller
- Evotec (France) SAS, Computational Drug Discovery, Integrated Drug Discovery, Toulouse, France
| | - Obdulia Rabal
- Evotec (France) SAS, Computational Drug Discovery, Integrated Drug Discovery, Toulouse, France
| | | |
|
19
|
Thevis M, Piper T, Thomas A. Recent advances in identifying and utilizing metabolites of selected doping agents in human sports drug testing. J Pharm Biomed Anal 2021; 205:114312. [PMID: 34391136 DOI: 10.1016/j.jpba.2021.114312] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/11/2021] [Revised: 08/03/2021] [Accepted: 08/05/2021] [Indexed: 12/29/2022]
Abstract
Probing for evidence of the administration of prohibited therapeutics, drugs and/or drug candidates as well as the use of methods of doping in doping control samples is a central assignment of anti-doping laboratories. In order to accomplish the desired analytical sensitivity, retrospectivity, and comprehensiveness, a considerable portion of anti-doping research has been invested into studying metabolic biotransformation and elimination profiles of doping agents. As these doping agents include lower molecular mass drugs such as stimulants and anabolic androgenic steroids, some of which further necessitate the differentiation of their natural/endogenous or xenobiotic origin, but also higher molecular mass substances such as insulins, growth hormone, or siRNA/anti-sense oligonucleotides, a variety of different strategies towards the identification of employable and informative metabolites have been developed. In this review, approaches supporting the identification, characterization, and implementation of metabolites into routine doping controls are presented, exemplified by selected doping agents, and challenges as well as solutions reported and published between 2010 and 2020 are discussed.
Affiliation(s)
- Mario Thevis
- Center for Preventive Doping Research - Institute of Biochemistry, German Sport University Cologne, Am Sportpark Müngersdorf 6, 50933, Cologne, Germany; European Monitoring Center for Emerging Doping Agents (EuMoCEDA), Cologne, Bonn, Germany.
| | - Thomas Piper
- Center for Preventive Doping Research - Institute of Biochemistry, German Sport University Cologne, Am Sportpark Müngersdorf 6, 50933, Cologne, Germany
| | - Andreas Thomas
- Center for Preventive Doping Research - Institute of Biochemistry, German Sport University Cologne, Am Sportpark Müngersdorf 6, 50933, Cologne, Germany
| |
|
20
|
Tian S, Cao X, Greiner R, Li C, Guo A, Wishart DS. CyProduct: A Software Tool for Accurately Predicting the Byproducts of Human Cytochrome P450 Metabolism. J Chem Inf Model 2021; 61:3128-3140. [PMID: 34038112 DOI: 10.1021/acs.jcim.1c00144] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/30/2022]
Abstract
In silico metabolism prediction is a cheminformatic task of autonomously predicting the set of metabolic byproducts produced from a specified molecule and a set of enzymes or reactions. Here, we describe a novel machine learned in silico cytochrome P450 (CYP450) metabolism prediction suite, called CyProduct, that accurately predicts metabolic byproducts for a specified molecule and a human CYP450 isoform. It includes three modules: (1) CypReact, a tool that predicts if the query compound reacts with a given CYP450 enzyme, (2) CypBoM, a tool that accurately predicts the "bond site" of the reaction (i.e., which specific bonds within the query molecule react with the CYP isoform), and (3) MetaboGen, a tool that generates the metabolic byproducts based on CypBoM's bond-site prediction. CyProduct predicts metabolic biotransformation products for each of the nine most important human CYP450 enzymes. CypBoM uses an important new concept called "bond of metabolism" (BoM), which extends the traditional "site of metabolism" (SoM) by specifying the information about the set of chemical bonds that is modified or formed in a metabolic reaction (rather than the specific atom). We created a BoM database for 1845 CYP450-mediated Phase I reactions, then used this to train the CypBoM Predictor to predict the reactive bond locations on substrate molecules. CypBoM Predictor's cross-validated Jaccard score for reactive bond prediction ranged from 0.380 to 0.452 over the nine CYP450 enzymes. Over variants of a test set of 68 known CYP450 substrates and 30 nonreactants, CyProduct outperformed the other packages, including ADMET Predictor, BioTransformer, and GLORY, by an average of 200% (with respect to Jaccard score) in terms of predicting metabolites. The CyProduct suite and the data sets are freely available at https://bitbucket.org/wishartlab/cyproduct/src/master/.
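The Jaccard score used above to rate reactive-bond predictions is the size of the intersection of the predicted and observed sets divided by the size of their union. The sketch below shows the calculation on made-up bonds written as atom-index pairs; this encoding and the handling of two empty sets (scored as 1.0) are assumptions for illustration, not details taken from CyProduct.

```python
Bond = tuple[int, int]


def jaccard_score(predicted: set[Bond], observed: set[Bond]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two sets of bonds."""
    union = predicted | observed
    if not union:
        return 1.0  # assumption: two empty sets count as perfect agreement
    return len(predicted & observed) / len(union)


# Made-up bonds of metabolism, each given as a pair of atom indices.
predicted_bonds = {(1, 2), (4, 5), (7, 8)}
observed_bonds = {(1, 2), (7, 8), (9, 10)}

score = jaccard_score(predicted_bonds, observed_bonds)
print(f"Jaccard score = {score:.3f}")
```

Because both missed bonds and spurious bonds enlarge the union, the score penalizes under-prediction and over-prediction alike.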
Affiliation(s)
- Siyang Tian
- Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada T6G 2E8.,Department of Biological Science, University of Alberta, Edmonton, Alberta, Canada T6G 2E9.,Alberta Machine Intelligence Institute (AMII), University of Alberta, 2-21 Athabasca Hall, Edmonton, Alberta Canada T6G 2E8
| | - Xuan Cao
- Department of Biological Science, University of Alberta, Edmonton, Alberta, Canada T6G 2E9
| | - Russell Greiner
- Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada T6G 2E8.,Alberta Machine Intelligence Institute (AMII), University of Alberta, 2-21 Athabasca Hall, Edmonton, Alberta Canada T6G 2E8
| | | | - AnChi Guo
- Department of Biological Science, University of Alberta, Edmonton, Alberta, Canada T6G 2E9
| | - David S Wishart
- Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada T6G 2E8.,Department of Biological Science, University of Alberta, Edmonton, Alberta, Canada T6G 2E9.,Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
| |
|
21
|
Schleiff MA, Dhaware D, Sodhi JK. Recent advances in computational metabolite structure predictions and altered metabolic pathways assessment to inform drug development processes. Drug Metab Rev 2021; 53:173-187. [PMID: 33840322 DOI: 10.1080/03602532.2021.1910292] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
Abstract
Many drug candidates fail during preclinical and clinical trials due to variable or unexpected metabolism which may lead to variability in drug efficacy or adverse drug reactions. The drug metabolism field aims to address this important issue from many angles which range from the study of drug-drug interactions, pharmacogenomics, computational metabolic modeling, and others. This manuscript aims to provide brief but comprehensive manuscript summaries highlighting the conclusions and scientific importance of seven exceptional manuscripts published in recent years within the field of drug metabolism. Two main topics within the field are reviewed: novel computational metabolic modeling approaches which provide complex outputs beyond site of metabolism predictions, and experimental approaches designed to discern the impacts of interindividual variability and species differences on drug metabolism. The computational approaches discussed provide novel outputs in metabolite structure and formation likelihood and/or extend beyond the saturated field of drug phase I metabolism, while the experimental metabolic pathways assessments aim to highlight the impacts of genetic polymorphisms and clinical animal model metabolic differences on human metabolism and subsequent health outcomes.
Affiliation(s)
- Mary Alexandra Schleiff
- Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - Deepika Dhaware
- Biotransformation and ADME, Research and Development, Orion Corporation, Espoo, Finland
| | - Jasleen K Sodhi
- Department of Bioengineering and Therapeutic Sciences, Schools of Pharmacy and Medicine, University of California San Francisco, San Francisco, CA, USA
| |
|
22
|
Raju B, Choudhary S, Narendra G, Verma H, Silakari O. Molecular modeling approaches to address drug-metabolizing enzymes (DMEs) mediated chemoresistance: a review. Drug Metab Rev 2021; 53:45-75. [PMID: 33535824 DOI: 10.1080/03602532.2021.1874406] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/27/2022]
Abstract
Resistance against clinically approved anticancer drugs is the main roadblock in cancer treatment. Drug-metabolizing enzymes (DMEs) that are capable of metabolizing a variety of xenobiotics are overexpressed in malignant cells and thereby catalyze drug inactivation. As is evident from literature reports, the levels of DMEs increase in cancer cells, which ultimately leads to drug inactivation followed by drug resistance. To address this issue, several strategies, including analog design, prodrug design, and inhibitor design, have been devised. On that front, the implementation of computational tools can be considered a promising approach to address the problem of chemoresistance. Various research groups have adopted different molecular modeling tools for the investigation of DME-mediated toxicity problems. However, the use of these in silico tools to tackle DME-mediated chemoresistance has received little attention and remains to be explored. These tools can be employed in the design of chemotherapeutic agents that are devoid of the resistance problem. The current review canvasses various molecular modeling approaches that can be implemented to address this issue. Special focus is placed on the development of specific inhibitors of DMEs. Additionally, strategies to bypass DME-mediated drug metabolism, including analog and prodrug design, are also considered in this report. The different strategies discussed in the review will be beneficial in designing novel chemotherapeutic agents that mitigate the resistance problem.
Affiliation(s)
- Baddipadige Raju
- Molecular Modeling Lab (MML), Department of Pharmaceutical Sciences and Drug Research, Punjabi University, Patiala, India
| | - Shalki Choudhary
- Molecular Modeling Lab (MML), Department of Pharmaceutical Sciences and Drug Research, Punjabi University, Patiala, India
| | - Gera Narendra
- Molecular Modeling Lab (MML), Department of Pharmaceutical Sciences and Drug Research, Punjabi University, Patiala, India
| | - Himanshu Verma
- Molecular Modeling Lab (MML), Department of Pharmaceutical Sciences and Drug Research, Punjabi University, Patiala, India
| | - Om Silakari
- Molecular Modeling Lab (MML), Department of Pharmaceutical Sciences and Drug Research, Punjabi University, Patiala, India
|
23
|
|
24
|
Kim H, Kim E, Lee I, Bae B, Park M, Nam H. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. BIOTECHNOL BIOPROC E 2021; 25:895-930. [PMID: 33437151 PMCID: PMC7790479 DOI: 10.1007/s12257-020-0049-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 02/13/2020] [Revised: 05/27/2020] [Accepted: 06/03/2020] [Indexed: 02/07/2023]
Abstract
As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered as a viable solution for unstable drug research and development. Generally, AI is applied to fields with sufficient data such as computer vision and natural language processing, but there are many efforts to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of the recent research trends in the AI-guided drug discovery process, including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized in this review. In addition, an in-depth analysis of the remaining challenges and limitations is provided, along with proposals for promising future directions in each of the aforementioned areas.
Affiliation(s)
- Hyunho Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Eunyoung Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Ingoo Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Bongsung Bae
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Minsu Park
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
|
25
|
Don CG, Smieško M. Deciphering Reaction Determinants of Altered-Activity CYP2D6 Variants by Well-Tempered Metadynamics Simulation and QM/MM Calculations. J Chem Inf Model 2020; 60:6642-6653. [PMID: 33269921 DOI: 10.1021/acs.jcim.0c01091] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
The xenobiotic metabolizing enzyme CYP2D6 is the P450 cytochrome family member with the highest rate of polymorphism. This causes changes in the enzyme activity and specificity, which can ultimately lead to adverse reactions during drug treatment. To avoid or lower CYP-related toxicity risks, prediction of the most likely positions within a molecule where a metabolic reaction might occur is paramount. In order to obtain accurate predictions, it is crucial to understand all phenomena within the active site of the enzyme that contribute to an efficient substrate recognition and the subsequent catalytic reaction together with their relative weight within the overall thermodynamic context. This study aims to define the weight of the driving forces upon the C-H bond activation within CYP2D6 wild-type and a clinically relevant allelic variant with increased activity (CYP2D6*53) featuring two amino acid mutations in close vicinity of the heme. First, we investigated the steric and electrostatic complementarity of the substrate bufuralol using well-tempered metadynamics simulations with the aim to obtain the free energy profiles for each site of metabolism (SoM) within the different active sites. Second, the stereoelectronic complementarity was determined for each SoM within the two different active-site environments. Relying on the well-tempered metadynamics simulation energy profiles of each SoM, we identified the binding mode that was closest to the preferred transition-state geometry for efficient C-H bond activation. The binding modes were then used as starting structures for the quantum mechanics/molecular mechanics calculations performed to quantify the corresponding activation barriers. Our results show the relevance of the steric component in orienting the SoM in an energetically accessible position toward the heme. 
However, the corresponding intrinsic reactivity and electronic complementarity within the active site must be accurately evaluated in order to obtain a meaningful reaction prediction, from which the predominant SoM can be determined. The F120I mutation lowered the activation barrier for the major site and one of the minor SoMs. However, it had an impact neither on the CYP2D6 enantioselectivity preference of the oxidation reaction nor on the stereoselectivity from the substrate point of view.
Affiliation(s)
- Charleen G Don
- Computational Pharmacy Group, Department of Pharmaceutical Sciences, University of Basel, Klingelbergstrasse 50, 4056 Basel, Switzerland
| | - Martin Smieško
- Computational Pharmacy Group, Department of Pharmaceutical Sciences, University of Basel, Klingelbergstrasse 50, 4056 Basel, Switzerland
|
26
|
de Bruyn Kops C, Šícho M, Mazzolari A, Kirchmair J. GLORYx: Prediction of the Metabolites Resulting from Phase 1 and Phase 2 Biotransformations of Xenobiotics. Chem Res Toxicol 2020; 34:286-299. [PMID: 32786543 PMCID: PMC7887798 DOI: 10.1021/acs.chemrestox.0c00224] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Indexed: 11/30/2022]
Abstract
Predicting the structures of metabolites formed in humans can provide advantageous insights for the development of drugs and other compounds. Here we present GLORYx, which integrates machine learning-based site of metabolism (SoM) prediction with reaction rule sets to predict and rank the structures of metabolites that could potentially be formed by phase 1 and/or phase 2 metabolism. GLORYx extends the approach from our previously developed tool GLORY, which predicted metabolite structures for cytochrome P450-mediated metabolism only. A robust approach to ranking the predicted metabolites is attained by using the SoM probabilities predicted by the FAME 3 machine learning models to score the predicted metabolites. On a manually curated test data set containing both phase 1 and phase 2 metabolites, GLORYx achieves a recall of 77% and an area under the receiver operating characteristic curve (AUC) of 0.79. Separate analysis of performance on a large amount of freely available phase 1 and phase 2 metabolite data indicates that achieving a meaningful ranking of predicted metabolites is more difficult for phase 2 than for phase 1 metabolites. GLORYx is freely available as a web server at https://nerdd.zbh.uni-hamburg.de/ and is also provided as a software package upon request. The data sets as well as all the reaction rules from this work are also made freely available.
Affiliation(s)
- Christina de Bruyn Kops
- Center for Bioinformatics (ZBH), Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, 20146 Hamburg, Germany
| | - Martin Šícho
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, 166 28 Prague 6, Czech Republic
| | - Angelica Mazzolari
- Facoltà di Scienze del Farmaco, Dipartimento di Scienze Farmaceutiche "Pietro Pratesi", Università degli Studi di Milano, I-20133 Milan, Italy
| | - Johannes Kirchmair
- Center for Bioinformatics (ZBH), Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, 20146 Hamburg, Germany.,Department of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, 1090 Vienna, Austria
|
27
|
Öeren M, Walton PJ, Hunt PA, Ponting DJ, Segall MD. Predicting reactivity to drug metabolism: beyond P450s-modelling FMOs and UGTs. J Comput Aided Mol Des 2020; 35:541-555. [PMID: 32533369 DOI: 10.1007/s10822-020-00321-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/31/2020] [Accepted: 06/07/2020] [Indexed: 11/28/2022]
Abstract
We present a study based on density functional theory calculations to explore the rate limiting steps of product formation for oxidation by Flavin-containing Monooxygenase (FMO) and glucuronidation by the UDP-glucuronosyltransferase (UGT) family of enzymes. FMOs are responsible for the modification phase of metabolism of a wide diversity of drugs, working in conjunction with Cytochrome P450 (CYP) family of enzymes, and UGTs are the most important class of drug conjugation enzymes. Reactivity calculations are important for prediction of metabolism by CYPs and reactivity alone explains around 70-85% of the experimentally observed sites of metabolism within CYP substrates. In the current work we extend this approach to propose model systems which can be used to calculate the activation energies, i.e. reactivity, for the rate-limiting steps for both FMO oxidation and glucuronidation of potential sites of metabolism. These results are validated by comparison with the experimentally observed reaction rates and sites of metabolism, indicating that the presented models are suitable to provide the basis of a reactivity component within generalizable models to predict either FMO or UGT metabolism.
Affiliation(s)
- Mario Öeren
- Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge, CB25 9PB, UK.
| | - Peter J Walton
- Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge, CB25 9PB, UK.,School of Chemistry, University of Nottingham, University Park, Nottingham, NG7 2RD, UK
| | - Peter A Hunt
- Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge, CB25 9PB, UK
| | - David J Ponting
- Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK
| | - Matthew D Segall
- Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge, CB25 9PB, UK
|
28
|
RF-MaloSite and DL-Malosite: Methods based on random forest and deep learning to identify malonylation sites. Comput Struct Biotechnol J 2020; 18:852-860. [PMID: 32322367 PMCID: PMC7160427 DOI: 10.1016/j.csbj.2020.02.012] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/16/2019] [Revised: 01/27/2020] [Accepted: 02/19/2020] [Indexed: 12/19/2022]
Abstract
Malonylation, which has recently emerged as an important lysine modification, regulates diverse biological activities and has been implicated in several pervasive disorders, including cardiovascular disease and cancer. However, conventional global proteomics analysis using tandem mass spectrometry can be time-consuming, expensive and technically challenging. Therefore, to complement and extend existing experimental methods for malonylation site identification, we developed two novel computational methods for malonylation site prediction based on random forest and deep learning machine learning algorithms, RF-MaloSite and DL-MaloSite, respectively. DL-MaloSite requires the primary amino acid sequence as an input and RF-MaloSite utilizes a diverse set of biochemical, physicochemical and sequence-based features. While systematic assessment of performance metrics suggests that both ‘RF-MaloSite’ and ‘DL-MaloSite’ perform well in all metrics tested, our methods perform particularly well in the areas of accuracy, sensitivity and overall method performance (assessed by the Matthews correlation coefficient, MCC). For instance, RF-MaloSite exhibited MCC scores of 0.42 and 0.40 using 10-fold cross-validation and an independent test set, respectively. Meanwhile, DL-MaloSite was characterized by MCC scores of 0.51 and 0.49 based on 10-fold cross-validation and an independent set, respectively. Importantly, both methods exhibited efficiency scores that were on par with or better than those achieved by existing malonylation site prediction methods. The identification of these sites may also provide important insights into the mechanisms of crosstalk between malonylation and other lysine modifications, such as acetylation, glutarylation and succinylation. To facilitate their use, both methods have been made freely available to the research community at https://github.com/dukkakc/DL-MaloSite-and-RF-MaloSite.
|
29
|
Dang NL, Matlock MK, Hughes TB, Swamidass SJ. The Metabolic Rainbow: Deep Learning Phase I Metabolism in Five Colors. J Chem Inf Model 2020; 60:1146-1164. [DOI: 10.1021/acs.jcim.9b00836] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 02/08/2023]
Affiliation(s)
- Na Le Dang
- Department of Pathology and Immunology, Washington University School of Medicine, Campus Box 8118, 660 S. Euclid Ave., St. Louis, Missouri 63110, United States
| | - Matthew K. Matlock
- Department of Pathology and Immunology, Washington University School of Medicine, Campus Box 8118, 660 S. Euclid Ave., St. Louis, Missouri 63110, United States
| | - Tyler B. Hughes
- Department of Pathology and Immunology, Washington University School of Medicine, Campus Box 8118, 660 S. Euclid Ave., St. Louis, Missouri 63110, United States
| | - S. Joshua Swamidass
- Department of Pathology and Immunology, Washington University School of Medicine, Campus Box 8118, 660 S. Euclid Ave., St. Louis, Missouri 63110, United States
|
30
|
4mCpred-EL: An Ensemble Learning Framework for Identification of DNA N4-methylcytosine Sites in the Mouse Genome. Cells 2019; 8:cells8111332. [PMID: 31661923 PMCID: PMC6912380 DOI: 10.3390/cells8111332] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Received: 08/21/2019] [Revised: 10/21/2019] [Accepted: 10/24/2019] [Indexed: 12/24/2022]
Abstract
DNA N4-methylcytosine (4mC) is one of the key epigenetic alterations, playing essential roles in DNA replication, differentiation, cell cycle, and gene expression. To better understand 4mC biological functions, it is crucial to gain knowledge on its genomic distribution. In recent times, a few computational studies, in particular machine learning (ML) approaches, have been applied to 4mC site prediction. Although ML-based methods are promising for 4mC identification in other species, none are available for detecting 4mCs in the mouse genome. Our novel computational approach, called 4mCpred-EL, is the first method for identifying 4mC sites in the mouse genome; it utilizes four different ML algorithms with seven feature encodings. Subsequently, the probabilistic values predicted from those feature encodings are used as a feature vector and are once again input to ML algorithms, whose corresponding models are integrated into ensemble learning. Our benchmarking results demonstrated that 4mCpred-EL achieved accuracy and MCC values of 0.795 and 0.591, which significantly outperformed seven other classifiers by 1.5–5.9% and 3.2–11.7%, respectively. Additionally, 4mCpred-EL attained an overall accuracy of 79.80%, which is 1.8–5.1% higher than that yielded by seven other classifiers in the independent evaluation. We provide a user-friendly web server, namely 4mCpred-EL, which could be implemented as a pre-screening tool for the identification of potential 4mC sites in the mouse genome.
|
31
|
Šícho M, Stork C, Mazzolari A, de Bruyn Kops C, Pedretti A, Testa B, Vistoli G, Svozil D, Kirchmair J. FAME 3: Predicting the Sites of Metabolism in Synthetic Compounds and Natural Products for Phase 1 and Phase 2 Metabolic Enzymes. J Chem Inf Model 2019; 59:3400-3412. [PMID: 31361490 DOI: 10.1021/acs.jcim.9b00376] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Indexed: 12/22/2022]
Abstract
In this work we present the third generation of FAst MEtabolizer (FAME 3), a collection of extra trees classifiers for the prediction of sites of metabolism (SoMs) in small molecules such as drugs, druglike compounds, natural products, agrochemicals, and cosmetics. FAME 3 was derived from the MetaQSAR database ( Pedretti et al. J. Med. Chem. 2018 , 61 , 1019 ), a recently published data resource on xenobiotic metabolism that contains more than 2100 substrates annotated with more than 6300 experimentally confirmed SoMs related to redox reactions, hydrolysis and other nonredox reactions, and conjugation reactions. In tests with holdout data, FAME 3 models reached competitive performance, with Matthews correlation coefficients (MCCs) ranging from 0.50 for a global model covering phase 1 and phase 2 metabolism, to 0.75 for a focused model for phase 2 metabolism. A model focused on cytochrome P450 metabolism yielded an MCC of 0.57. Results from case studies with several synthetic compounds, natural products, and natural product derivatives demonstrate the agreement between model predictions and literature data even for molecules with structural patterns clearly distinct from those present in the training data. The applicability domains of the individual models were estimated by a new, atom-based distance measure (FAMEscore) that is based on a nearest-neighbor search in the space of atom environments. FAME 3 is available via a public web service at https://nerdd.zbh.uni-hamburg.de/ and as a self-contained Java software package, free for academic and noncommercial research.
Affiliation(s)
- Martin Šícho
- Faculty of Mathematics, Informatics and Natural Sciences, Department of Informatics, Center for Bioinformatics , Universität Hamburg , 20146 Hamburg , Germany.,Faculty of Chemical Technology, Department of Informatics and Chemistry, CZ-OPENSCREEN: National Infrastructure for Chemical Biology , University of Chemistry and Technology Prague , 166 28 Prague 6 , Czech Republic
| | - Conrad Stork
- Faculty of Mathematics, Informatics and Natural Sciences, Department of Informatics, Center for Bioinformatics , Universität Hamburg , 20146 Hamburg , Germany
| | - Angelica Mazzolari
- Facoltà di Scienze del Farmaco, Dipartimento di Scienze Farmaceutiche "Pietro Pratesi" , Università degli Studi di Milano , I-20133 Milan , Italy
| | - Christina de Bruyn Kops
- Faculty of Mathematics, Informatics and Natural Sciences, Department of Informatics, Center for Bioinformatics , Universität Hamburg , 20146 Hamburg , Germany
| | - Alessandro Pedretti
- Facoltà di Scienze del Farmaco, Dipartimento di Scienze Farmaceutiche "Pietro Pratesi" , Università degli Studi di Milano , I-20133 Milan , Italy
| | - Bernard Testa
- University of Lausanne , 1015 Lausanne , Switzerland
| | - Giulio Vistoli
- Facoltà di Scienze del Farmaco, Dipartimento di Scienze Farmaceutiche "Pietro Pratesi" , Università degli Studi di Milano , I-20133 Milan , Italy
| | - Daniel Svozil
- Faculty of Chemical Technology, Department of Informatics and Chemistry, CZ-OPENSCREEN: National Infrastructure for Chemical Biology , University of Chemistry and Technology Prague , 166 28 Prague 6 , Czech Republic
| | - Johannes Kirchmair
- Faculty of Mathematics, Informatics and Natural Sciences, Department of Informatics, Center for Bioinformatics , Universität Hamburg , 20146 Hamburg , Germany
|
32
|
Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem Rev 2019; 119:10520-10594. [PMID: 31294972 DOI: 10.1021/acs.chemrev.8b00728] [Citation(s) in RCA: 351] [Impact Index Per Article: 70.2] [Indexed: 02/05/2023]
Abstract
Artificial intelligence (AI), and, in particular, deep learning as a subcategory of AI, provides opportunities for the discovery and development of innovative drugs. Various machine learning approaches have recently (re)emerged, some of which may be considered instances of domain-specific AI which have been successfully employed for drug discovery and design. This review provides a comprehensive portrayal of these machine learning techniques and of their applications in medicinal chemistry. After introducing the basic principles, alongside some application notes, of the various machine learning algorithms, the current state-of-the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects. Finally, several challenges and limitations of the current methods are summarized, with a view to potential future directions for AI-assisted drug discovery and design.
Affiliation(s)
- Xin Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital , Sichuan University , Chengdu , Sichuan 610041 , China
| | - Yifei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital , Sichuan University , Chengdu , Sichuan 610041 , China
| | - Ryan Byrne
- ETH Zurich , Department of Chemistry and Applied Biosciences , Vladimir-Prelog-Weg 4 , CH-8093 Zurich , Switzerland
| | - Gisbert Schneider
- ETH Zurich , Department of Chemistry and Applied Biosciences , Vladimir-Prelog-Weg 4 , CH-8093 Zurich , Switzerland
| | - Shengyong Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital , Sichuan University , Chengdu , Sichuan 610041 , China
|
33
|
Kaitoh K, Kotera M, Funatsu K. Novel Electrotopological Atomic Descriptors for the Prediction of Xenobiotic Cytochrome P450 Reactions. Mol Inform 2019; 38:e1900010. [PMID: 31187601 DOI: 10.1002/minf.201900010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/25/2019] [Accepted: 04/28/2019] [Indexed: 01/06/2023]
Abstract
Cytochrome P450 (CYP) is an enzyme family that plays a crucial role in metabolism, mainly metabolizing xenobiotics to produce non-toxic structures; however, some metabolized products can cause hepatotoxicity. Hence, predicting the structures of CYP products is an important task in designing non-hepatotoxic drugs. Here, we have developed novel atomic descriptors to predict the sites of metabolism (SoM) in CYP substrates. We proposed descriptors that describe topological and electrostatic characteristics of CYP substrates using Gasteiger charges. The proposed descriptors were applied to CYP3A4 data analysis as a case study. As a result of the descriptor selection, we obtained a gradient boosting decision tree-based SoM classification model that used 139 existing descriptors and the 45 proposed descriptors, and the model performed well in terms of the Matthews correlation coefficient. We also developed a structure converter to predict CYP products. This converter correctly generated 51 structural formulas of experimentally observed CYP3A4 products according to a manual evaluation.
Affiliation(s)
- Kazuma Kaitoh
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
| | - Masaaki Kotera
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
| | - Kimito Funatsu
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
|
34
|
de Bruyn Kops C, Stork C, Šícho M, Kochev N, Svozil D, Jeliazkova N, Kirchmair J. GLORY: Generator of the Structures of Likely Cytochrome P450 Metabolites Based on Predicted Sites of Metabolism. Front Chem 2019; 7:402. [PMID: 31249827 PMCID: PMC6582643 DOI: 10.3389/fchem.2019.00402] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 03/28/2019] [Accepted: 05/17/2019] [Indexed: 01/10/2023]
Abstract
Computational prediction of xenobiotic metabolism can provide valuable information to guide the development of drugs, cosmetics, agrochemicals, and other chemical entities. We have previously developed FAME 2, an effective tool for predicting sites of metabolism (SoMs). In this work, we focus on the prediction of the chemical structures of metabolites, in particular metabolites of xenobiotics. To this end, we have developed a new tool, GLORY, which combines SoM prediction with FAME 2 and a new collection of rules for metabolic reactions mediated by the cytochrome P450 enzyme family. GLORY has two modes: MaxEfficiency and MaxCoverage. For MaxEfficiency mode, the use of predicted SoMs to restrict the locations in the molecule at which the reaction rules could be applied was explored. For MaxCoverage mode, the predicted SoM probabilities were instead used to develop a new scoring approach for the predicted metabolites. With this scoring approach, GLORY achieves a recall of 0.83 and can predict at least one known metabolite within the top three ranked positions for 76% of the molecules of a new, manually curated test set. GLORY is freely available as a web server at https://acm.zbh.uni-hamburg.de/glory/, and the datasets and reaction rules are provided in the Supplementary Material.
Affiliation(s)
- Christina de Bruyn Kops
- Department of Computer Science, Center for Bioinformatics (ZBH), Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, Hamburg, Germany
| | - Conrad Stork
- Department of Computer Science, Center for Bioinformatics (ZBH), Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, Hamburg, Germany
| | - Martin Šícho
- Department of Computer Science, Center for Bioinformatics (ZBH), Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, Hamburg, Germany.,CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Prague, Czechia
| | - Nikolay Kochev
- Ideaconsult Ltd., Sofia, Bulgaria.,Department of Analytical Chemistry and Computer Chemistry, University of Plovdiv, Plovdiv, Bulgaria
| | - Daniel Svozil
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Prague, Czechia
| | | | - Johannes Kirchmair
- Department of Computer Science, Center for Bioinformatics (ZBH), Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, Hamburg, Germany.,Department of Chemistry, University of Bergen, Bergen, Norway.,Computational Biology Unit (CBU), University of Bergen, Bergen, Norway
|
35
|
Ferreira LL, Andricopulo AD. ADMET modeling approaches in drug discovery. Drug Discov Today 2019; 24:1157-1165. [DOI: 10.1016/j.drudis.2019.03.015] [Citation(s) in RCA: 102] [Impact Index Per Article: 20.4] [Received: 01/17/2019] [Revised: 02/08/2019] [Accepted: 03/14/2019] [Indexed: 12/31/2022]
|
36
|
Manavalan B, Basith S, Shin TH, Wei L, Lee G. Meta-4mCpred: A Sequence-Based Meta-Predictor for Accurate DNA 4mC Site Prediction Using Effective Feature Representation. MOLECULAR THERAPY. NUCLEIC ACIDS 2019; 16:733-744. [PMID: 31146255 PMCID: PMC6540332 DOI: 10.1016/j.omtn.2019.04.019] [Citation(s) in RCA: 164] [Impact Index Per Article: 32.8] [Received: 12/10/2018] [Revised: 04/16/2019] [Accepted: 04/22/2019] [Indexed: 11/19/2022]
Abstract
DNA N4-methylcytosine (4mC) is an important genetic modification and plays crucial roles in differentiation between self and non-self DNA and in controlling DNA replication, cell cycle, and gene-expression levels. Accurate 4mC site identification is fundamental to improve the understanding of 4mC biological functions and mechanisms. Hence, it is necessary to develop in silico approaches for efficient and high-throughput 4mC site identification. Although some bioinformatic tools have been developed in this regard, their prediction accuracy and generalizability require improvement to optimize their usability in practical applications. For this purpose, we propose Meta-4mCpred, a meta-predictor for 4mC site prediction. In Meta-4mCpred, we employed a feature representation learning scheme and generated 56 probabilistic features based on four different machine-learning algorithms and seven feature encodings covering diverse sequence information, including compositional, physicochemical, and position-specific information. Subsequently, the probabilistic features were used as input to a support vector machine to develop the final meta-predictor. To the best of our knowledge, this is the first meta-predictor for 4mC site prediction. Cross-validation results show that Meta-4mCpred achieved an overall average accuracy of 84.2% from six different species, which is ∼2%–4% higher than those attainable using the state-of-the-art predictors. Furthermore, Meta-4mCpred achieved an overall average accuracy of 86% on independent dataset evaluation, which is over 4% higher than those yielded by the state-of-the-art predictors. The user-friendly webserver employed to implement the proposed Meta-4mCpred is freely accessible at http://thegleelab.org/Meta-4mCpred.
Affiliation(s)
| | - Shaherin Basith
- Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
| | - Tae Hwan Shin
- Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea; Institute of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
| | - Leyi Wei
- School of Computer Science and Technology, Tianjin University, China.
| | - Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea; Institute of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea.
|
37
|
Mazzolari A, Afzal AM, Pedretti A, Testa B, Vistoli G, Bender A. Prediction of UGT-mediated Metabolism Using the Manually Curated MetaQSAR Database. ACS Med Chem Lett 2019; 10:633-638. [PMID: 30996809 DOI: 10.1021/acsmedchemlett.8b00603] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/30/2018] [Accepted: 02/12/2019] [Indexed: 11/30/2022]
Abstract
Even though glucuronidations are the most frequent metabolic reactions of conjugation, both in quantitative and qualitative terms, they have rather seldom been investigated using computational approaches. To fill this gap, we have used the manually collected MetaQSAR metabolic reaction database to generate two models for the prediction of UGT-mediated metabolism, both based on molecular descriptors and implementing the Random Forest algorithm. The first model predicts the occurrence of the reaction and was internally validated with a Matthews correlation coefficient (MCC) of 0.76 and an area under the ROC curve (AUC) of 0.94, and further externally validated using a test set composed of 120 additional xenobiotics (MCC of 0.70 and AUC of 0.90). The second model distinguishes between O- and N-glucuronidations and was optimized by the random undersampling procedure to improve the predictive accuracy during the internal validation, with the recall of the minority class increasing from 0.55 to 0.78.
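The two validation measures quoted in this abstract, the MCC and the recall of a class, can both be computed from the four cells of a binary confusion matrix. The following is only a minimal sketch of the standard definitions; the function names and the convention of returning 0.0 for an undefined ratio are assumptions made for illustration and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (an illustrative convention,
    not something specified in the abstract).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that were predicted positive."""
    total = tp + fn
    return tp / total if total else 0.0
```

An MCC of 1 corresponds to perfect agreement, 0 to chance-level prediction, and -1 to complete disagreement, which is why it is often preferred over accuracy on imbalanced data sets.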
Affiliation(s)
- Angelica Mazzolari
- Dipartimento di Scienze Farmaceutiche, Facoltà di Scienze del Farmaco, Università degli Studi di Milano, Via Mangiagalli, I-20133 Milano, Italy
- Avid M. Afzal
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, CB2 1EW Cambridge, U.K.
- Alessandro Pedretti
- Dipartimento di Scienze Farmaceutiche, Facoltà di Scienze del Farmaco, Università degli Studi di Milano, Via Mangiagalli, I-20133 Milano, Italy
- Giulio Vistoli
- Dipartimento di Scienze Farmaceutiche, Facoltà di Scienze del Farmaco, Università degli Studi di Milano, Via Mangiagalli, I-20133 Milano, Italy
- Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, CB2 1EW Cambridge, U.K.
|
38
|
Zhang Y, Wang Y, Zhou W, Fan Y, Zhao J, Zhu L, Lu S, Lu T, Chen Y, Liu H. A combined drug discovery strategy based on machine learning and molecular docking. Chem Biol Drug Des 2019; 93:685-699. [PMID: 30688405 DOI: 10.1111/cbdd.13494] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 11/01/2018] [Revised: 01/04/2019] [Accepted: 01/19/2019] [Indexed: 12/14/2022]
Abstract
Data mining methods based on machine learning play an increasingly important role in drug design and discovery. In the current work, eight machine learning methods, including decision trees, k-nearest neighbors, support vector machines, random forests, extremely randomized trees, AdaBoost, gradient boosting trees, and XGBoost, were evaluated comprehensively through a case study of ACC inhibitor data sets. Internal and external data sets were employed for cross-validation of the eight machine learning methods. Results showed that the extremely randomized trees model performed best, and it was adopted as the first step of virtual screening. Together with structure-based virtual screening in the second step, this combined strategy obtained desirable results. This work indicates that combining machine learning methods with traditional structure-based virtual screening can effectively strengthen the ability to find potential hits in large compound databases for a given target.
Affiliation(s)
- Yanmin Zhang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Yuchen Wang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Weineng Zhou
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Yuanrong Fan
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Junnan Zhao
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Lu Zhu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Shuai Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Tao Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China; State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing, China
- Yadong Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Haichun Liu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
|
39
|
Fast Methods for Prediction of Aldehyde Oxidase-Mediated Site-of-Metabolism. Comput Struct Biotechnol J 2019; 17:345-351. [PMID: 30949305 PMCID: PMC6429535 DOI: 10.1016/j.csbj.2019.03.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/30/2018] [Revised: 02/26/2019] [Accepted: 03/01/2019] [Indexed: 12/11/2022] Open
Abstract
Aldehyde Oxidase (AO) is an enzyme involved in the metabolism of aldehydes and N-containing heterocyclic compounds. Many drug compounds contain heterocyclic moieties, and AO metabolism has led to the failure of several late-stage drug candidates. Therefore, it is important to take AO-mediated metabolism into account early in the drug discovery process, and thus to have fast and reliable models to predict the site of metabolism (SOM). We have collected a dataset of 78 substrates of human AO with a total of 89 SOMs and 347 non-SOMs and determined atomic descriptors for each compound. The descriptors comprise NMR shielding and ESP charges from density functional theory (DFT), NMR chemical shift from ChemBioDraw, and Gasteiger charges from RDKit. Additionally, atomic accessibility was considered using 2D-SASA and relative span descriptors from SMARTCyp. Finally, the stability of the product, the metabolite, was determined with DFT and also used as a descriptor. All descriptors have an AUC larger than 0.75. In particular, descriptors related to the chemical shielding and chemical shift (AUC = 0.96) and ESP charges (AUC = 0.96) proved to be good descriptors. We recommend two simple methods to identify the SOM for a given molecule: 1) use ChemBioDraw to calculate the chemical shift or 2) calculate ESP charges or chemical shift using DFT. The first approach is fast but somewhat difficult to automate, while the second is more time-consuming but can easily be automated. The two methods correctly predict 93% and 91%, respectively, of the 89 experimentally observed SOMs.
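Scoring a single atomic descriptor by its AUC, as this abstract does, amounts to asking how often a randomly chosen SOM atom has a higher descriptor value than a randomly chosen non-SOM atom. The sketch below shows that pairwise (Mann-Whitney) form of the AUC. The function name is hypothetical, and the assumption that a larger value marks a more likely SOM is made only for illustration; for a descriptor where smaller values indicate reactivity, the values would be negated first.

```python
from typing import Sequence


def descriptor_auc(som_values: Sequence[float],
                   non_som_values: Sequence[float]) -> float:
    """AUC of one descriptor as a pairwise ranking statistic.

    Counts the fraction of (SOM, non-SOM) atom pairs in which the SOM atom
    has the larger descriptor value; ties count as half a win.
    Assumes both sequences are non-empty.
    """
    wins = 0.0
    for s in som_values:
        for n in non_som_values:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(som_values) * len(non_som_values))
```

An AUC of 0.5 means the descriptor does not separate SOMs from non-SOMs at all, while 1.0 means every SOM atom ranks above every non-SOM atom.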
|
40
|
Tyzack JD, Kirchmair J. Computational methods and tools to predict cytochrome P450 metabolism for drug discovery. Chem Biol Drug Des 2019; 93:377-386. [PMID: 30471192 PMCID: PMC6590657 DOI: 10.1111/cbdd.13445] [Citation(s) in RCA: 95] [Impact Index Per Article: 19.0] [Received: 10/11/2018] [Revised: 11/05/2018] [Accepted: 11/11/2018] [Indexed: 01/08/2023]
Abstract
In this review, we present important, recent developments in the computational prediction of cytochrome P450 (CYP) metabolism in the context of drug discovery. We discuss in silico models for the various aspects of CYP metabolism prediction, including CYP substrate and inhibitor predictors, site of metabolism predictors (i.e., metabolically labile sites within potential substrates) and metabolite structure predictors. We summarize the different approaches taken by these models, such as rule‐based methods, machine learning, data mining, quantum chemical methods, molecular interaction fields, and docking. We highlight the scope and limitations of each method and discuss future implications for the field of metabolism prediction in drug discovery.
Affiliation(s)
- Johannes Kirchmair
- Department of Chemistry, University of Bergen, Bergen, Norway; Computational Biology Unit (CBU), University of Bergen, Bergen, Norway; Center for Bioinformatics, Universität Hamburg, Hamburg, Germany
|
41
|
Galster M, Löppenberg M, Galla F, Börgel F, Agoglitta O, Kirchmair J, Holl R. Phenylethylene glycol-derived LpxC inhibitors with diverse Zn2+-binding groups. Tetrahedron 2019. [DOI: 10.1016/j.tet.2018.12.011] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 01/01/2023]
|
42
|
AL-barakati HJ, Saigo H, Newman RH, KC DB. RF-GlutarySite: a random forest based predictor for glutarylation sites. Mol Omics 2019; 15:189-204. [DOI: 10.1039/c9mo00028c] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 12/12/2022]
Abstract
Glutarylation, which is a newly identified posttranslational modification that occurs on lysine residues, has recently emerged as an important regulator of several metabolic and mitochondrial processes. Here, we describe the development of RF-GlutarySite, a random forest-based predictor designed to predict glutarylation sites based on protein primary amino acid sequence.
Affiliation(s)
- Hussam J. AL-barakati
- Department of Computational Science and Engineering, North Carolina Agricultural & Technical State University, Greensboro, USA
- Hiroto Saigo
- Department of Informatics, Kyushu University, Fukuoka 819-0395, Japan
- Robert H. Newman
- Department of Biology, North Carolina Agricultural & Technical State University, Greensboro, USA
- Dukka B. KC
- Department of Computational Science and Engineering, North Carolina Agricultural & Technical State University, Greensboro, USA
|
43
|
Finkelmann AR, Goldmann D, Schneider G, Göller AH. MetScore: Site of Metabolism Prediction Beyond Cytochrome P450 Enzymes. ChemMedChem 2018; 13:2281-2289. [PMID: 30184341 DOI: 10.1002/cmdc.201800309] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 05/07/2018] [Revised: 08/31/2018] [Indexed: 12/20/2022]
Abstract
The metabolism of xenobiotics by humans and other organisms is a complex process involving numerous enzymes that catalyze phase I (functionalization) and phase II (conjugation) reactions. Herein we introduce MetScore, a machine learning model that can predict both phase I and phase II reaction sites of drugs in a single prediction run. We developed cheminformatics workflows to filter and process reactions to obtain suitable phase I and phase II data sets for model training. Employing a recently developed molecular representation based on quantum chemical partial charges, we constructed random forest machine learning models for phase I and phase II reactions. After combining these models with our previous cytochrome P450 model and calibrating the combination against Bayer in-house data, we obtained the MetScore model that shows good performance, with Matthews correlation coefficients of 0.61 and 0.76 for diverse phase I and phase II reaction types, respectively. We validated its potential applicability to lead optimization campaigns for a new and independent data set compiled from recent publications. The results of this study demonstrate the usefulness of quantum-chemistry-derived molecular representations for reactivity prediction.
Affiliation(s)
- Arndt R Finkelmann
- ETH Zurich, Department of Chemistry and Applied Biosciences, Zurich, Switzerland
- Daria Goldmann
- KNIME GmbH, Reichenaustrasse 11, 78467, Konstanz, Germany
- Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Zurich, Switzerland
- Andreas H Göller
- Bayer AG, Pharmaceuticals, Research & Development, 42096, Wuppertal, Germany
|
44
|
Manavalan B, Govindaraj RG, Shin TH, Kim MO, Lee G. iBCE-EL: A New Ensemble Learning Framework for Improved Linear B-Cell Epitope Prediction. Front Immunol 2018; 9:1695. [PMID: 30100904 PMCID: PMC6072840 DOI: 10.3389/fimmu.2018.01695] [Citation(s) in RCA: 113] [Impact Index Per Article: 18.8] [Received: 04/23/2018] [Accepted: 07/10/2018] [Indexed: 11/13/2022] Open
Abstract
Identification of B-cell epitopes (BCEs) is a fundamental step for epitope-based vaccine development, antibody production, and disease prevention and diagnosis. Due to the avalanche of protein sequence data discovered in the postgenomic age, it is essential to develop an automated computational method to enable fast and accurate identification of novel BCEs within a vast number of candidate proteins and peptides. Although several computational methods have been developed, their accuracy is unreliable. Thus, developing a reliable model with significant prediction improvements is highly desirable. In this study, we first constructed a non-redundant data set of 5,550 experimentally validated BCEs and 6,893 non-BCEs from the Immune Epitope Database. We then developed a novel ensemble learning framework for improved linear BCE prediction, called iBCE-EL, a fusion of two independent predictors, namely, extremely randomized tree (ERT) and gradient boosting (GB) classifiers, which, respectively, use a combination of physicochemical properties (PCP) and amino acid composition and a combination of dipeptide and PCP as input features. Cross-validation analysis on a benchmarking data set showed that iBCE-EL performed better than the individual classifiers (ERT and GB), with a Matthews correlation coefficient (MCC) of 0.454. Furthermore, we evaluated the performance of iBCE-EL on the independent data set. Results show that iBCE-EL significantly outperformed the state-of-the-art method, with an MCC of 0.463. To the best of our knowledge, iBCE-EL is the first ensemble method for linear BCE prediction. iBCE-EL was implemented in a web-based platform, which is available at http://thegleelab.org/iBCE-EL. iBCE-EL contains two prediction modes. The first identifies peptide sequences as BCEs or non-BCEs, while the second is aimed at providing users with the option of mining potential BCEs from protein sequences.
Affiliation(s)
- Rajiv Gandhi Govindaraj
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
- Tae Hwan Shin
- Department of Physiology, Ajou University School of Medicine, Suwon, South Korea; Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
- Myeong Ok Kim
- Division of Life Science and Applied Life Science (BK21 Plus), College of Natural Sciences, Gyeongsang National University, Jinju, South Korea
- Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon, South Korea; Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
|