1
Yarish D, Garkot S, Grygorenko OO, Radchenko DS, Moroz YS, Gurbych O. Advancing molecular graphs with descriptors for the prediction of chemical reaction yields. J Comput Chem 2022; 44:76-92. [PMID: 36264601 DOI: 10.1002/jcc.27016] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/05/2022] [Revised: 08/31/2022] [Accepted: 09/05/2022] [Indexed: 11/08/2022]
Abstract
Chemical yield is the percentage of the reactants converted to the desired products. Chemists use predictive algorithms to select high-yielding reactions and score synthesis routes, saving time and reagents. This study suggests a novel graph neural network architecture for chemical yield prediction. The network combines structural information about participants of the transformation as well as molecular and reaction-level descriptors. It works with incomplete chemical reactions and generates reactant-product atom mapping. We show that the network benefits from advanced information by comparing it with several machine learning models and molecular representations. Models included logistic regression, support vector machine, CatBoost, and Bidirectional Encoder Representations from Transformers. Molecular representations included extended-connectivity fingerprints, Morgan fingerprints, SMILESVec embeddings, and textual representations. Classification and regression objectives were assessed for each model and feature set. The goal of each classification model was to separate zero- and non-zero-yielding reactions. The models were trained and evaluated on a proprietary dataset of 10 reaction types. The models were also benchmarked on two public single-reaction-type datasets. The study was supplemented with analysis of data, results, and errors, as well as the impact of steric factors, side reactions, isolation, and purification efficiency. The supplementary code is available at https://github.com/SoftServeInc/yield-paper.
Affiliation(s)
- Sofiya Garkot
  - SoftServe, Inc., Lviv, Ukraine; Ukrainian Catholic University, Lviv, Ukraine
- Oleksandr O Grygorenko
  - Enamine Ltd., Kyiv, Ukraine; Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
- Dmytro S Radchenko
  - Enamine Ltd., Kyiv, Ukraine; Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
- Yurii S Moroz
  - Taras Shevchenko National University of Kyiv, Kyiv, Ukraine; Chemspace LLC, Kyiv, Ukraine
- Oleksandr Gurbych
  - Lviv Polytechnic National University, Lviv, Ukraine; Blackthorn AI, Ltd., London, UK
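The yield definition and the zero versus non-zero classification target described in the abstract of entry 1 can be illustrated with a minimal sketch. The function names, the mole-based inputs, and the toy numbers are assumptions made for illustration; they are not taken from the paper or its repository.

```python
def percent_yield(actual_mol: float, theoretical_mol: float) -> float:
    """Percentage of the theoretical product amount actually obtained."""
    if theoretical_mol <= 0:
        raise ValueError("theoretical amount must be positive")
    return 100.0 * actual_mol / theoretical_mol


def yield_class(yield_percent: float) -> int:
    """Binary target: 0 for a zero-yielding reaction, 1 otherwise."""
    return int(yield_percent > 0.0)


# Toy (actual, theoretical) product amounts in moles; illustrative values only.
reactions = [(0.0, 2.0), (0.5, 1.0), (1.5, 2.0)]
yields = [percent_yield(a, t) for a, t in reactions]
labels = [yield_class(y) for y in yields]
print(yields, labels)
```

A regression model would be trained on `yields` directly, while a classifier would be trained on `labels`.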
2
Artificial intelligence and machine-learning approaches in structure and ligand-based discovery of drugs affecting central nervous system. Mol Divers 2022; 27:959-985. [PMID: 35819579 DOI: 10.1007/s11030-022-10489-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/27/2022] [Accepted: 06/21/2022] [Indexed: 12/11/2022]
Abstract
CNS disorders are indications with very high unmet medical need, a relatively small number of available drugs, and a subpar satisfaction level among patients and caregivers. Discovery of CNS drugs is an extremely expensive affair with its own unique challenges, leading to extremely high attrition rates and low efficiency. With the explosion of data in the information age, there is hardly any aspect of life that has not been touched by data-driven technologies such as artificial intelligence (AI) and machine learning (ML). Drug discovery is no exception: the emergence of big data via genomic, proteomic, biological, and chemical technologies has driven pharmaceutical giants to collaborate with AI-oriented companies to revolutionise drug discovery, with the goal of increasing the efficiency of the process. In recent years, many examples of innovative applications of AI and ML techniques in CNS drug discovery have been reported. Research on therapeutics for diseases such as schizophrenia, Alzheimer's and Parkinsonism has been provided with a new direction and thrust from these developments. AI and ML have been applied to both ligand-based and structure-based drug discovery and design of CNS therapeutics. In this review, we have summarised the general aspects of AI and ML from the perspective of drug discovery, followed by comprehensive coverage of the recent developments in the applications of AI/ML techniques in CNS drug discovery.
3
Kariofillis SK, Jiang S, Żurański AM, Gandhi SS, Martinez Alvarado JI, Doyle AG. Using Data Science To Guide Aryl Bromide Substrate Scope Analysis in a Ni/Photoredox-Catalyzed Cross-Coupling with Acetals as Alcohol-Derived Radical Sources. J Am Chem Soc 2022; 144:1045-1055. [PMID: 34985904 PMCID: PMC8810294 DOI: 10.1021/jacs.1c12203] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Indexed: 12/13/2022]
Abstract
Ni/photoredox catalysis has emerged as a powerful platform for C(sp2)-C(sp3) bond formation. While many of these methods typically employ aryl bromides as the C(sp2) coupling partner, a variety of aliphatic radical sources have been investigated. In principle, these reactions enable access to the same product scaffolds, but it can be hard to discern which method to employ because nonstandardized sets of aryl bromides are used in scope evaluation. Herein, we report a Ni/photoredox-catalyzed (deutero)methylation and alkylation of aryl halides where benzaldehyde di(alkyl) acetals serve as alcohol-derived radical sources. Reaction development, mechanistic studies, and late-stage derivatization of a biologically relevant aryl chloride, fenofibrate, are presented. Then, we describe the integration of data science techniques, including DFT featurization, dimensionality reduction, and hierarchical clustering, to delineate a diverse and succinct collection of aryl bromides that is representative of the chemical space of the substrate class. By superimposing scope examples from published Ni/photoredox methods on this same chemical space, we identify areas of sparse coverage and high versus low average yields, enabling comparisons between prior art and this new method. Additionally, we demonstrate that the systematically selected scope of aryl bromides can be used to quantify population-wide reactivity trends and reveal sources of possible functional group incompatibility with supervised machine learning.
Affiliation(s)
- Stavros K. Kariofillis
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
  - Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
- Shutian Jiang
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Andrzej M. Żurański
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Shivaani S. Gandhi
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
  - Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
- Abigail G. Doyle
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
  - Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
4
Rinehart NI, Zahrt AF, Henle JJ, Denmark SE. Dreams, False Starts, Dead Ends, and Redemption: A Chronicle of the Evolution of a Chemoinformatic Workflow for the Optimization of Enantioselective Catalysts. Acc Chem Res 2021; 54:2041-2054. [PMID: 33856771 DOI: 10.1021/acs.accounts.0c00826] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/20/2022]
Abstract
Catalyst design in enantioselective catalysis has historically been driven by empiricism. In this endeavor, experimentalists attempt to qualitatively identify trends in structure that lead to a desired catalyst function. In this body of work, we lay the groundwork for an improved, alternative workflow that uses quantitative methods to inform decision making at every step of the process. At the outset, we define a library of synthetically accessible permutations of a catalyst scaffold with the philosophy that the library contains every potential catalyst we are willing to make. To represent these chiral molecules, we have developed general 3D representations, which can be calculated for tens of thousands of structures. This defines the total chemical space of a given catalyst scaffold; it is constructed on the basis of catalyst structure only without regard to a specific reaction or mechanism. As such, any algorithmic subset selection method, which is unsupervised (i.e., only considers catalyst structure), should provide an ideal initial screening set for any new reaction that can be catalyzed by that scaffold. Notably, because of this design strategy, the same set of catalysts can be used for any reaction that can be catalyzed with that parent catalyst scaffold. These are tested experimentally, and statistical learning tools can be used to create a model relating catalyst structure to catalyst function. Further, this model can be used to predict the performance of each catalyst candidate in the greater database of virtual catalyst candidates. In this way, it is possible to estimate the performance of tens of thousands of catalysts by experimentally testing a smaller subset. Using error assessment metrics, it is possible to understand the confidence in new predictions. An experimentalist using this tool can balance the predicted results (reward) with the prediction confidence (risk) when deciding which catalysts to synthesize next in an optimization campaign.
These catalysts are synthesized and tested experimentally. At this stage, either the optimization is a success or the predicted values were incorrect and further optimization is required. In the case of the latter, the information can be fed back into the statistical learning model to refine the model, and this iterative process can be used to determine the optimal catalyst. In this body of work, we not only establish this workflow but quantitatively establish how best to execute each step. Herein, we evaluate several 3D molecular representations to determine how best to represent molecules. Several selection protocols are examined to best decide which set of molecules can be used to represent the library of interest. In addition, the number of reactions needed to make accurate, statistical learning models is evaluated. Taken together these components establish a tool ready to progress from the development stage to the utility stage. As such, current research endeavors focus on applying these tools to optimize new reactions.
Affiliation(s)
- N. Ian Rinehart
  - Roger Adams Laboratory, Department of Chemistry, University of Illinois, Urbana, Illinois 61801, United States
- Andrew F. Zahrt
  - Roger Adams Laboratory, Department of Chemistry, University of Illinois, Urbana, Illinois 61801, United States
- Jeremy J. Henle
  - Roger Adams Laboratory, Department of Chemistry, University of Illinois, Urbana, Illinois 61801, United States
- Scott E. Denmark
  - Roger Adams Laboratory, Department of Chemistry, University of Illinois, Urbana, Illinois 61801, United States
5
Żurański AM, Martinez Alvarado JI, Shields BJ, Doyle AG. Predicting Reaction Yields via Supervised Learning. Acc Chem Res 2021; 54:1856-1865. [PMID: 33788552 DOI: 10.1021/acs.accounts.0c00770] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Indexed: 12/15/2022]
Abstract
Numerous disciplines, such as image recognition and language translation, have been revolutionized by using machine learning (ML) to leverage big data. In organic synthesis, providing accurate chemical reactivity predictions with supervised ML could assist chemists with reaction prediction, optimization, and mechanistic interrogation. To apply supervised ML to chemical reactions, one needs to define the object of prediction (e.g., yield, enantioselectivity, solubility, or a recommendation) and represent reactions with descriptive data. Our group's effort has focused on representing chemical reactions using DFT-derived physical features of the reacting molecules and conditions, which serve as features for building supervised ML models. In this Account, we present a review and perspective on three studies conducted by our group where ML models have been employed to predict reaction yield. First, we focus on a small reaction data set where 16 phosphine ligands were evaluated in a single Ni-catalyzed Suzuki-Miyaura cross-coupling reaction, and the reaction yield was modeled with linear regression. In this setting, where the regression complexity is strongly limited by the amount of available data, we emphasize the importance of identifying single features that are directly relevant to reactivity. Next, we focus on models trained on two larger data sets obtained with high-throughput experimentation (HTE). With hundreds to thousands of reactions available, more complex models can be explored, for example, models that algorithmically perform feature selection from a broad set of candidate features. We examine how a variety of ML algorithms model these data sets and how well these models generalize to out-of-sample substrates.
Specifically, we compare the ML models that use DFT-based featurization to a baseline model that is obtained with features that carry no physical information, that is, random features, and to a naive non-ML model that averages yields of reactions that share the same conditions and substrate combinations. We find that for only one of the two data sets, DFT-based featurization leads to a significant, although moderate, out-of-sample prediction improvement. The source of this improvement was further isolated to specific features which allowed us to formulate a testable mechanistic hypothesis that was validated experimentally. Finally, we offer remarks on supervised ML model building on HTE data sets focusing on algorithmic improvements in model training. Statistical methods in chemistry have a rich history, but only recently has ML gained widespread attention in reaction development. As the untapped potential of ML is explored, novel tools are likely to arise from future research. Our studies suggest that supervised ML can lead to improved predictions of reaction yield over simpler modeling methods and facilitate mechanistic understanding of reaction dynamics. However, further research and development are required to establish ML as an indispensable tool in reactivity modeling.
Affiliation(s)
- Andrzej M. Żurański
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Benjamin J. Shields
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Abigail G. Doyle
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
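The naive non-ML baseline mentioned in the abstract of entry 5, which averages the yields of reactions that share the same conditions and substrate combinations, might look roughly like the sketch below. The grouping key (a ligand and aryl halide pair), the fallback to the global mean for unseen keys, and the toy yields are all assumptions made for illustration and are not details confirmed by the Account.

```python
from collections import defaultdict
from statistics import mean


def fit_mean_baseline(records):
    """Build a predictor from a list of (key, yield_percent) pairs.

    `key` is any hashable description of the shared conditions and
    substrates (hypothetical here). Unseen keys fall back to the global
    mean, which is an assumption of this sketch.
    """
    groups = defaultdict(list)
    for key, y in records:
        groups[key].append(y)
    global_mean = mean(y for _, y in records)
    table = {key: mean(ys) for key, ys in groups.items()}
    return lambda key: table.get(key, global_mean)


# Toy training data; the labels and yields are invented for illustration.
train = [
    (("ligand_A", "ArBr_1"), 80.0),
    (("ligand_A", "ArBr_1"), 60.0),
    (("ligand_B", "ArBr_2"), 10.0),
]
predict = fit_mean_baseline(train)
print(predict(("ligand_A", "ArBr_1")), predict(("ligand_C", "ArBr_9")))
```

A learned model is only informative to the extent that it beats a lookup of this kind on held-out reactions.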
6
Xie L, Xu L, Kong R, Chang S, Xu X. Improvement of Prediction Performance With Conjoint Molecular Fingerprint in Deep Learning. Front Pharmacol 2021; 11:606668. [PMID: 33488387 PMCID: PMC7819282 DOI: 10.3389/fphar.2020.606668] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 09/21/2020] [Accepted: 11/23/2020] [Indexed: 12/27/2022] [Open Access]
Abstract
Accurate prediction of the physical properties and bioactivity of drug molecules in deep learning depends on how molecules are represented. Many types of molecular descriptors have been developed for quantitative structure-activity/property relationships (QSAR/QSPR). However, each molecular descriptor is optimized for a specific application with encoding preference. Considering that standalone featurization methods may only cover parts of the information of the chemical molecules, we proposed to build the conjoint fingerprint by combining two supplementary fingerprints. The impact of the conjoint fingerprint and each standalone fingerprint on predictive performance was systematically evaluated in predicting the logarithm of the partition coefficient (logP) and protein-ligand binding affinity by using machine learning/deep learning (ML/DL) methods, including random forest (RF), support vector regression (SVR), extreme gradient boosting (XGBoost), long short-term memory network (LSTM), and deep neural network (DNN). The results demonstrated that the conjoint fingerprint yielded improved predictive performance, even outperforming the consensus model using two standalone fingerprints in four of the five examined methods. Given that the conjoint fingerprint scheme shows easy extensibility and high applicability, we expect that the proposed conjoint scheme would create new opportunities for continuously improving predictive performance of deep learning by harnessing the complementarity of various types of fingerprints.
Affiliation(s)
- Liangxu Xie
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China; Jiangsu Sino-Israel Industrial Technology Research Institute, Changzhou, China
- Lei Xu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Ren Kong
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Shan Chang
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
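One simple way to read "combining two fingerprints" in the abstract of entry 6 is to concatenate their bit vectors so that a model sees both encodings at once. The sketch below assumes concatenation, and it uses short invented bit vectors in place of real fingerprints; the abstract itself does not state how the combination is performed, so this is an illustration rather than the authors' procedure.

```python
def conjoint_fingerprint(fp_a, fp_b):
    """Concatenate two binary fingerprints into one feature vector."""
    combined = list(fp_a) + list(fp_b)
    if any(bit not in (0, 1) for bit in combined):
        raise ValueError("fingerprints must contain only 0/1 bits")
    return combined


# Invented toy vectors standing in for a substructure-key fingerprint
# and a circular fingerprint of the same molecule.
key_based = [1, 0, 0, 1]
circular = [0, 1, 1, 0, 1, 0]
conjoint = conjoint_fingerprint(key_based, circular)
print(len(conjoint), conjoint)
```

The combined vector has length equal to the sum of the two inputs, so downstream models need no change other than a wider input layer or feature matrix.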
7
Hsiao Y, Su BH, Tseng YJ. Current development of integrated web servers for preclinical safety and pharmacokinetics assessments in drug development. Brief Bioinform 2020; 22:5881374. [PMID: 32770190 DOI: 10.1093/bib/bbaa160] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/06/2020] [Revised: 06/22/2020] [Accepted: 06/24/2020] [Indexed: 12/27/2022] [Open Access]
Abstract
In drug development, preclinical safety and pharmacokinetics assessments of candidate drugs to ensure the safety profile are a must. While in vivo and in vitro tests are traditionally used, experimental determinations have disadvantages, as they are usually time-consuming and costly. In silico predictions of these preclinical endpoints have each been developed in the past decades. However, only a few web-based tools have integrated different models to provide a simple one-step platform to help researchers thoroughly evaluate potential drug candidates. To efficiently achieve this approach, a platform for preclinical evaluation must not only predict key ADMET (absorption, distribution, metabolism, excretion and toxicity) properties but also provide some guidance on structural modifications to improve the undesired properties. In this review, we organized and compared several existing integrated web servers that can be adopted in preclinical drug development projects to evaluate the subject of interest. We also introduced our new web server, Virtual Rat, as an alternative choice to profile the properties of drug candidates. In Virtual Rat, we provide not only predictions of important ADMET properties but also possible reasons as to why the model made those structural predictions. Multiple models were implemented into Virtual Rat, including models for predicting human ether-a-go-go-related gene (hERG) inhibition, cytochrome P450 (CYP) inhibition, mutagenicity (Ames test), blood-brain barrier penetration, cytotoxicity and Caco-2 permeability. Virtual Rat is free and has been made publicly available at https://virtualrat.cmdm.tw/.
8
Sandfort F, Strieth-Kalthoff F, Kühnemund M, Beecks C, Glorius F. A Structure-Based Platform for Predicting Chemical Reactivity. Chem 2020. [DOI: 10.1016/j.chempr.2020.02.017] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Indexed: 12/11/2022]
9
Schaduangrat N, Lampa S, Simeon S, Gleeson MP, Spjuth O, Nantasenamat C. Towards reproducible computational drug discovery. J Cheminform 2020; 12:9. [PMID: 33430992 PMCID: PMC6988305 DOI: 10.1186/s13321-020-0408-x] [Citation(s) in RCA: 78] [Impact Index Per Article: 19.5] [Received: 07/17/2019] [Accepted: 01/02/2020] [Indexed: 12/11/2022] [Open Access]
Abstract
The reproducibility of experiments has been a long-standing impediment to further scientific progress. Computational methods have been instrumental in drug discovery efforts owing to their multifaceted utilization for data collection, pre-processing, analysis and inference. This article provides in-depth coverage of the reproducibility of computational drug discovery. This review explores the following topics: (1) the current state-of-the-art on reproducible research, (2) research documentation (e.g. electronic laboratory notebook, Jupyter notebook, etc.), (3) science of reproducible research (i.e. comparison and contrast with related concepts such as replicability, reusability and reliability), (4) model development in computational drug discovery, (5) computational issues on model development and deployment, (6) use case scenarios for streamlining the computational drug discovery protocol. In computational disciplines, it has become common practice to share data and programming codes used for numerical calculations, not only to facilitate reproducibility but also to foster collaborations (i.e. to drive the project further by introducing new ideas, growing the data, augmenting the code, etc.). It is therefore inevitable that the field of computational drug design would adopt an open approach towards the collection, curation and sharing of data/code.
Affiliation(s)
- Nalini Schaduangrat
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, 10700, Bangkok, Thailand
- Samuel Lampa
  - Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden
- Saw Simeon
  - Interdisciplinary Graduate Program in Bioscience, Faculty of Science, Kasetsart University, 10900, Bangkok, Thailand
- Matthew Paul Gleeson
  - Department of Biomedical Engineering, Faculty of Engineering, King Mongkut's Institute of Technology Ladkrabang, 10520, Bangkok, Thailand
- Ola Spjuth
  - Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden
- Chanin Nantasenamat
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, 10700, Bangkok, Thailand
10
Bellera CL, Talevi A. Quantitative structure-activity relationship models for compounds with anticonvulsant activity. Expert Opin Drug Discov 2019; 14:653-665. [PMID: 31072145 DOI: 10.1080/17460441.2019.1613368] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/18/2023]
Abstract
Introduction: Third-generation antiepileptic drugs have seemingly failed to improve the global figures of seizure control and can still be regarded as symptomatic treatments. Quantitative structure-activity relationships (QSAR) can be used to guide hit-to-lead and lead optimization projects and applied to the large-scale virtual screening of chemical libraries. Areas covered: In this review, the authors cover reports on QSAR models related to antiepileptic drugs and drug targets in epilepsy, analyzing whether they refer to classic or non-classic QSAR and if they apply QSAR as a descriptive or predictive approach, among other considerations. The article finally focuses on a more detailed discussion of those predictive studies which include some sort of experimental validation, i.e. papers in which the reported models have been used to identify novel active compounds which have been tested in vitro and/or in vivo. Expert opinion: There are significant opportunities to apply the QSAR methodology to assist the discovery of more efficacious antiepileptic drugs. Considering the intrinsic complexity of the disorder, such applications should focus on state-of-the-art approximations (e.g. systemic, multi-target and multi-scale QSAR as well as ensemble and deep learning) and modeling the effects on novel drug targets and modern screening tools.
Affiliation(s)
- Carolina L Bellera
  - Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, University of La Plata (UNLP), La Plata, Buenos Aires, Argentina; CCT La Plata, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires, Argentina
- Alan Talevi
  - Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, University of La Plata (UNLP), La Plata, Buenos Aires, Argentina; CCT La Plata, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires, Argentina
11
Hanser T, Steinmetz FP, Plante J, Rippmann F, Krier M. Avoiding hERG-liability in drug design via synergetic combinations of different (Q)SAR methodologies and data sources: a case study in an industrial setting. J Cheminform 2019; 11:9. [PMID: 30712151 PMCID: PMC6689868 DOI: 10.1186/s13321-019-0334-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/26/2018] [Accepted: 01/25/2019] [Indexed: 11/25/2022] [Open Access]
Abstract
In this paper, we explore the impact of combining different in silico prediction approaches and data sources on the predictive performance of the resulting system. We use inhibition of the hERG ion channel target as the endpoint for this study as it constitutes a key safety concern in drug development and a potential cause of attrition. We will show that combining data sources can improve the relevance of the training set with regard to the target chemical space, leading to improved performance. Similarly, we will demonstrate that combining multiple statistical models together, and with expert systems, can lead to positive synergistic effects when taking into account the confidence in the predictions of the merged systems. The best combinations analyzed display a good hERG predictivity. Finally, this work demonstrates the suitability of the SOHN methodology for building models in the context of receptor-based endpoints like hERG inhibition when using the appropriate pharmacophoric descriptors.
12
Zahrt AF, Henle JJ, Rose BT, Wang Y, Darrow WT, Denmark SE. Prediction of higher-selectivity catalysts by computer-driven workflow and machine learning. Science 2019; 363:363/6424/eaau5631. [PMID: 30655414 DOI: 10.1126/science.aau5631] [Citation(s) in RCA: 236] [Impact Index Per Article: 47.2] [Received: 06/22/2018] [Accepted: 12/03/2018] [Indexed: 12/18/2022]
Abstract
Catalyst design in asymmetric reaction development has traditionally been driven by empiricism, wherein experimentalists attempt to qualitatively recognize structural patterns to improve selectivity. Machine learning algorithms and chemoinformatics can potentially accelerate this process by recognizing otherwise inscrutable patterns in large datasets. Herein we report a computationally guided workflow for chiral catalyst selection using chemoinformatics at every stage of development. Robust molecular descriptors that are agnostic to the catalyst scaffold allow for selection of a universal training set on the basis of steric and electronic properties. This set can be used to train machine learning methods to make highly accurate predictive models over a broad range of selectivity space. Using support vector machines and deep feed-forward neural networks, we demonstrate accurate predictive modeling in the chiral phosphoric acid-catalyzed thiol addition to N-acylimines.
Affiliation(s)
- Andrew F Zahrt
  - Roger Adams Laboratory, Department of Chemistry, University of Illinois, Urbana, IL 61801, USA
- Jeremy J Henle
  - Roger Adams Laboratory, Department of Chemistry, University of Illinois, Urbana, IL 61801, USA
- Brennan T Rose
  - Roger Adams Laboratory, Department of Chemistry, University of Illinois, Urbana, IL 61801, USA
- Yang Wang
  - Roger Adams Laboratory, Department of Chemistry, University of Illinois, Urbana, IL 61801, USA
- William T Darrow
  - Roger Adams Laboratory, Department of Chemistry, University of Illinois, Urbana, IL 61801, USA
- Scott E Denmark
  - Roger Adams Laboratory, Department of Chemistry, University of Illinois, Urbana, IL 61801, USA
13
Türkmenoğlu B, Güzel Y. Molecular docking and 4D-QSAR studies of metastatic cancer inhibitor thiazoles. Comput Biol Chem 2018; 76:327-337. [PMID: 30145406 DOI: 10.1016/j.compbiolchem.2018.07.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/13/2018] [Revised: 06/29/2018] [Accepted: 07/03/2018] [Indexed: 11/28/2022]
Abstract
This study uses molecular docking and 4D-QSAR analysis to find the interaction points in the receptor binding site of transforming growth factor-beta (TGF-beta), a target for inhibiting invasion and metastasis. To elucidate the interaction points of the receptor, different types of local reactive descriptors (LRD) of the ligands were used. Activity values related to the ligand-receptor (L-R) interaction energy were determined by nonlinear least squares (NLLS) using the Levenberg-Marquardt (LM) algorithm. Using the Molecule Comparative Electron Topology (MCET) method, the 3D pharmacophore model (3D-PhaM) was obtained after alignment and superimposition of the molecules, and was also confirmed by molecular docking. With leave-one-out cross-validation (LOO-CV), the best predictions are $q^2$ (or $r_{CV}^2$) = 0.789 for the 51 compounds in the internal training set and $r^2$ = 0.785 for the 13 compounds in the external test set. Furthermore, the predictive capability of the advanced QSAR model is more precisely calculated with the $r_m^2$ metric ($r_m^2$ = 0.769).
Collapse
Affiliation(s)
- Burçin Türkmenoğlu
- Department of Chemistry, Faculty of Science, Erciyes University, 38039, Kayseri, Turkey.
| | - Yahya Güzel
- Department of Chemistry, Faculty of Science, Erciyes University, 38039, Kayseri, Turkey

14
Türkmenoğlu B, Yilmaz H, Su EM, Alp Tokat T, Güzel Y. 4D-QSAR study of flavonoid derivatives with MCET method. ACTA ACUST UNITED AC 2017. [DOI: 10.32571/ijct.338920] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/17/2022]

15
Kumar RP, Kulkarni N. A receptor dependent-4D QSAR approach to predict the activity of mutated enzymes. Sci Rep 2017; 7:6273. [PMID: 28740233 PMCID: PMC5524700 DOI: 10.1038/s41598-017-06625-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/07/2016] [Accepted: 06/15/2017] [Indexed: 11/29/2022] Open
Abstract
Screening and selection tools to obtain focused libraries play a key role in successfully engineering enzymes of desired qualities. The quality of screening depends on efficient assays; however, a focused library generated with a priori information plays a major role in effectively identifying the right enzyme. As a proof of concept, for the first time, receptor dependent - 4D Quantitative Structure Activity Relationship (RD-4D-QSAR) has been implemented to predict kinetic properties of an enzyme. The novelty of this study is that the mutated enzymes also form a part of the training data set. The mutations were modeled in a serine protease and molecular dynamics simulations were conducted to derive enzyme-substrate (E-S) conformations. The E-S conformations were enclosed in a high resolution grid consisting of 156,250 grid points that stores interaction energies to generate QSAR models to predict the enzyme activity. The QSAR predictions showed similar results as reported in the kinetic studies with >80% specificity and >50% sensitivity revealing that the top ranked models unambiguously differentiated enzymes with high and low activity. The interaction energy descriptors of the best QSAR model were used to identify residues responsible for enzymatic activity and substrate specificity.
Affiliation(s)
- R Pravin Kumar
- Polyclone Bioservices, #437, 40th Cross, Jayanagar 5th Block, Bangalore, 560041, India.
- Naveen Kulkarni
- Polyclone Bioservices, #437, 40th Cross, Jayanagar 5th Block, Bangalore, 560041, India

16
Quantitative Structure-Activity Relationship Studies for Potential Rho-Associated Protein Kinase Inhibitors. J CHEM-NY 2016. [DOI: 10.1155/2016/9198582] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 11/17/2022] Open
Abstract
A series of pyridylthiazole derivatives developed by Lawrence et al. as Rho-associated protein kinase inhibitors were subjected to four-dimensional quantitative structure-activity relationship (4D-QSAR) analysis. The models were generated by applying genetic algorithm (GA) optimization combined with partial least squares (PLS) regression. The best model presented validation values of r² = 0.773, q²_CV = 0.672, r²_pred = 0.503, Δr²_m = 0.197, r²_m(test) = 0.520, r²_Y-rand = 0.19, and R²_p = 0.590. Furthermore, analysis of the descriptors made it possible to propose new compounds predicted to have higher inhibitory concentration values than the most active compound of the series.

17
Esposito EX, Hopfinger AJ, Shao CY, Su BH, Chen SZ, Tseng YJ. Exploring possible mechanisms of action for the nanotoxicity and protein binding of decorated nanotubes: interpretation of physicochemical properties from optimal QSAR models. Toxicol Appl Pharmacol 2015. [PMID: 26200234 DOI: 10.1016/j.taap.2015.07.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 01/12/2023]
Abstract
Carbon nanotubes have become widely used in a variety of applications including biosensors and drug carriers. Therefore, the issue of carbon nanotube toxicity is increasingly an area of focus and concern. While previous studies have focused on the gross mechanisms of action relating to nanomaterials interacting with biological entities, this study proposes detailed mechanisms of action, relating to nanotoxicity, for a series of decorated (functionalized) carbon nanotube complexes based on previously reported QSAR models. Possible mechanisms of nanotoxicity for six endpoints (bovine serum albumin, carbonic anhydrase, chymotrypsin, hemoglobin along with cell viability and nitrogen oxide production) have been extracted from the corresponding optimized QSAR models. The molecular features relevant to each endpoint's respective mechanism of action for the decorated nanotubes are also discussed. Based on the molecular information contained within the optimal QSAR models for each nanotoxicity endpoint, either the decorator attached to the nanotube is directly responsible for the expression of a particular activity, irrespective of the decorator's 3D-geometry and independent of the nanotube, or those decorators having structures that place the functional groups of the decorators as far as possible from the nanotube surface most strongly influence the biological activity. These molecular descriptors are further used to hypothesize specific interactions involved in the expression of each of the six biological endpoints.
Affiliation(s)
- Emilio Xavier Esposito
- exeResearch, LLC, 32 University Drive, East Lansing, MI 48823, USA; The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, IL 60045, USA.
- Anton J Hopfinger
- The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, IL 60045, USA; College of Pharmacy MSC09 5360, 1 University of New Mexico, Albuquerque, NM, 87131, USA.
- Chi-Yu Shao
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei 106, Taiwan
- Bo-Han Su
- Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei 106, Taiwan
- Sing-Zuo Chen
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei 106, Taiwan
- Yufeng Jane Tseng
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei 106, Taiwan; Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei 106, Taiwan.

18
Khedkar VM, Coutinho EC. CoRILISA: A Local Similarity Based Receptor Dependent QSAR Method. J Chem Inf Model 2015; 55:194-205. [DOI: 10.1021/ci5006367] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
Affiliation(s)
- Vijay M. Khedkar
- Department of Pharmaceutical Chemistry, Bombay College of Pharmacy, Kalina, Santacruz (E), Mumbai 400098, India
- Evans C. Coutinho
- Department of Pharmaceutical Chemistry, Bombay College of Pharmacy, Kalina, Santacruz (E), Mumbai 400098, India

19
Hamza A, Wagner JM, Wei NN, Kwiatkowski S, Zhan CG, Watt DS, Korotkov KV. Application of the 4D fingerprint method with a robust scoring function for scaffold-hopping and drug repurposing strategies. J Chem Inf Model 2014; 54:2834-45. [PMID: 25229183 PMCID: PMC4210175 DOI: 10.1021/ci5003872] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]
Abstract
Two factors contribute to the inefficiency associated with screening pharmaceutical library collections as a means of identifying new drugs: [1] the limited success of virtual screening (VS) methods in identifying new scaffolds; [2] the limited accuracy of computational methods in predicting off-target effects. We recently introduced a 3D shape-based similarity algorithm of the SABRE program, which encodes a consensus molecular shape pattern of a set of active ligands into a 4D fingerprint descriptor. Here, we report a mathematical model for shape similarity comparisons and ligand database filtering using this 4D fingerprint method and benchmark the scoring function HWK (Hamza–Wei–Korotkov) using the 81 targets of the DEKOIS database. Subsequently, we applied our combined 4D fingerprint and HWK scoring function VS approach in scaffold-hopping and drug repurposing using the National Cancer Institute (NCI) and Food and Drug Administration (FDA) databases, and we identified new inhibitors with different scaffolds of MycP1 protease from the mycobacterial ESX-1 secretion system. In experimental evaluation, nine compounds from the NCI database and three from the FDA database displayed IC50 values ranging from 70 to 100 μM against MycP1 and possessed high structural diversity, providing departure points for further structure–activity relationship (SAR) optimization. In addition, this study demonstrates that the combination of our 4D fingerprint algorithm and the HWK scoring function may provide a means for identifying repurposed drugs for the treatment of infectious diseases and may be used in the drug-target profile strategy.
Affiliation(s)
- Adel Hamza
- Department of Molecular and Cellular Biochemistry, ‡Center for Structural Biology, §Center for Pharmaceutical Research and Innovation, College of Pharmacy, ∥Molecular Modeling and Biopharmaceutical Center, and ⊥Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky , Lexington, Kentucky 40536, United States

20
Eriksson M, Chen H, Carlsson L, Nissink JWM, Cumming JG, Nilsson I. Beyond the Scope of Free-Wilson Analysis. 2: Can Distance Encoded R-Group Fingerprints Provide Interpretable Nonlinear Models? J Chem Inf Model 2014; 54:1117-28. [DOI: 10.1021/ci500075q] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/09/2023]
Affiliation(s)
- Mats Eriksson
- Chemistry Innovation Center, Discovery Sciences, ‡CVMD Innovative Medicines and §Computational Toxicology, Global Safety Assessment, AstraZeneca R&D, Mölndal 431 83, Sweden
- Oncology Innovative Medicines and ⊥Chemistry Innovation Center, Discovery Sciences, AstraZeneca R&D, Alderley Park, Macclesfield SK10 4TG, U.K
- Hongming Chen
- Chemistry Innovation Center, Discovery Sciences, ‡CVMD Innovative Medicines and §Computational Toxicology, Global Safety Assessment, AstraZeneca R&D, Mölndal 431 83, Sweden
- Oncology Innovative Medicines and ⊥Chemistry Innovation Center, Discovery Sciences, AstraZeneca R&D, Alderley Park, Macclesfield SK10 4TG, U.K
- Lars Carlsson
- Chemistry Innovation Center, Discovery Sciences, ‡CVMD Innovative Medicines and §Computational Toxicology, Global Safety Assessment, AstraZeneca R&D, Mölndal 431 83, Sweden
- Oncology Innovative Medicines and ⊥Chemistry Innovation Center, Discovery Sciences, AstraZeneca R&D, Alderley Park, Macclesfield SK10 4TG, U.K
- J. Willem M. Nissink
- Chemistry Innovation Center, Discovery Sciences, ‡CVMD Innovative Medicines and §Computational Toxicology, Global Safety Assessment, AstraZeneca R&D, Mölndal 431 83, Sweden
- Oncology Innovative Medicines and ⊥Chemistry Innovation Center, Discovery Sciences, AstraZeneca R&D, Alderley Park, Macclesfield SK10 4TG, U.K
- John G. Cumming
- Chemistry Innovation Center, Discovery Sciences, ‡CVMD Innovative Medicines and §Computational Toxicology, Global Safety Assessment, AstraZeneca R&D, Mölndal 431 83, Sweden
- Oncology Innovative Medicines and ⊥Chemistry Innovation Center, Discovery Sciences, AstraZeneca R&D, Alderley Park, Macclesfield SK10 4TG, U.K
- Ingemar Nilsson
- Chemistry Innovation Center, Discovery Sciences, ‡CVMD Innovative Medicines and §Computational Toxicology, Global Safety Assessment, AstraZeneca R&D, Mölndal 431 83, Sweden
- Oncology Innovative Medicines and ⊥Chemistry Innovation Center, Discovery Sciences, AstraZeneca R&D, Alderley Park, Macclesfield SK10 4TG, U.K

21
Chang CY, Hsu MT, Esposito EX, Tseng YJ. Oversampling to Overcome Overfitting: Exploring the Relationship between Data Set Composition, Molecular Descriptors, and Predictive Modeling Methods. J Chem Inf Model 2013; 53:958-71. [DOI: 10.1021/ci4000536] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 01/22/2023]
Affiliation(s)
- Chia-Yun Chang
- School of Pharmacy, College of Medicine, National Taiwan University, No.1, Sec.1, Jen-Ai Road, Taipei, Taiwan 100
- Ming-Tsung Hsu
- Genome and Systems Biology Degree Program, College of Life Science, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106
- Yufeng J. Tseng
- School of Pharmacy, College of Medicine, National Taiwan University, No.1, Sec.1, Jen-Ai Road, Taipei, Taiwan 100
- Genome and Systems Biology Degree Program, College of Life Science, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106
- Department of Computer Science and Information Engineering, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106

22
Palacios-Bejarano B, Cerruela García G, Luque Ruiz I, Gómez-Nieto MÁ. QSAR model based on weighted MCS trees approach for the representation of molecule data sets. J Comput Aided Mol Des 2013; 27:185-201. [DOI: 10.1007/s10822-013-9637-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/12/2012] [Accepted: 02/01/2013] [Indexed: 11/28/2022]

23
Gleeson MP, Montanari D. Strategies for the generation, validation and application of in silico ADMET models in lead generation and optimization. Expert Opin Drug Metab Toxicol 2012; 8:1435-46. [PMID: 22849616 DOI: 10.1517/17425255.2012.711317] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/05/2022]
Abstract
INTRODUCTION The most desirable chemical starting point in drug discovery is a hit or lead with a good overall profile; where there are issues, a clear SAR strategy should be identifiable to minimize them. Filtering based on drug-likeness concepts is a first step, but more accurate theoretical methods are needed to i) estimate the biological profile of the molecule in question and ii) based on the underlying structure-activity relationships used by the model, estimate whether it is likely that the molecule in question can be altered to remove these liabilities. AREAS COVERED In this paper, the authors discuss the generation of ADMET models and their practical use in decision making. They discuss the issues surrounding data collation, experimental errors, the model assessment and validation steps, as well as the different types of descriptors and statistical models that can be used. This is followed by a discussion on how the model accuracy will dictate when and where it can be used in the drug discovery process. The authors also discuss how models can be developed to more effectively enable multiple parameter optimization. EXPERT OPINION Models can be applied in lead generation and lead optimization steps to i) rank order a collection of hits, ii) prioritize the experimental assays needed for different hit series, iii) assess the likelihood of resolving a problem that might be present in a particular series in lead optimization and iv) screen a virtual library based on a hit or lead series to assess the impact of diverse structural changes on the predicted properties.
Affiliation(s)
- Matthew Paul Gleeson
- Kasetsart University, Faculty of Science, Department of Chemistry, 50 Phaholyothin Rd, Chatuchak, Bangkok 10900, Thailand.

24
The great descriptor melting pot: mixing descriptors for the common good of QSAR models. J Comput Aided Mol Des 2011; 26:39-43. [DOI: 10.1007/s10822-011-9511-4] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 11/15/2011] [Accepted: 12/02/2011] [Indexed: 10/14/2022]

25
Jahn A, Rosenbaum L, Hinselmann G, Zell A. 4D Flexible Atom-Pairs: An efficient probabilistic conformational space comparison for ligand-based virtual screening. J Cheminform 2011; 3:23. [PMID: 21733172 PMCID: PMC3156737 DOI: 10.1186/1758-2946-3-23] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 04/29/2011] [Accepted: 07/06/2011] [Indexed: 01/28/2023] Open
Abstract
BACKGROUND The performance of 3D-based virtual screening similarity functions is affected by the applied conformations of compounds. Therefore, the results of 3D approaches are often less robust than 2D approaches. The application of 3D methods on multiple conformer data sets normally reduces this weakness, but entails a significant computational overhead. Therefore, we developed a special conformational space encoding by means of Gaussian mixture models and a similarity function that operates on these models. The application of a model-based encoding allows an efficient comparison of the conformational space of compounds. RESULTS Comparisons of our 4D flexible atom-pair approach with over 15 state-of-the-art 2D- and 3D-based virtual screening similarity functions on the 40 data sets of the Directory of Useful Decoys show a robust performance of our approach. Even 3D-based approaches that operate on multiple conformers yield inferior results. The 4D flexible atom-pair method achieves an averaged AUC value of 0.78 on the filtered Directory of Useful Decoys data sets. The best 2D- and 3D-based approaches of this study yield an AUC value of 0.74 and 0.72, respectively. As a result, the 4D flexible atom-pair approach achieves an average rank of 1.25 with respect to 15 other state-of-the-art similarity functions and four different evaluation metrics. CONCLUSIONS Our 4D method yields a robust performance on 40 pharmaceutically relevant targets. The conformational space encoding enables an efficient comparison of the conformational space. Therefore, the weakness of the 3D-based approaches on single conformations is circumvented. With over 100,000 similarity calculations on a single desktop CPU, the utilization of the 4D flexible atom-pair in real-world applications is feasible.
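The AUC figures reported in this abstract can be read as the probability that a randomly chosen active is scored above a randomly chosen decoy, averaged over data sets. The sketch below implements that pairwise definition only; the function names and toy scores are assumptions for illustration and are not taken from the paper, which does not describe its evaluation code here.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (active, decoy) pairs ranked correctly.

    labels use 1 for actives and 0 for decoys; ties count as half a win.
    """
    actives = [s for s, lab in zip(scores, labels) if lab == 1]
    decoys = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def mean_auc(datasets):
    """Average the per-data-set AUC over a list of (scores, labels) pairs."""
    return sum(roc_auc(s, lab) for s, lab in datasets) / len(datasets)


# Two toy data sets (assumed values): similarity scores and activity labels.
toy = [
    ([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]),  # every active outranks every decoy
    ([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]),  # one active falls below a decoy
]
average = mean_auc(toy)
```

The pairwise loop is quadratic in the data set size, which is fine for a sketch; a rank-based formulation gives the same value more cheaply on large screening sets.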
Affiliation(s)
- Andreas Jahn
- University of Tübingen, Center for Bioinformatics Tübingen (ZBIT), Sand 1, 72076 Tübingen, Germany
- Lars Rosenbaum
- University of Tübingen, Center for Bioinformatics Tübingen (ZBIT), Sand 1, 72076 Tübingen, Germany
- Georg Hinselmann
- University of Tübingen, Center for Bioinformatics Tübingen (ZBIT), Sand 1, 72076 Tübingen, Germany
- Andreas Zell
- University of Tübingen, Center for Bioinformatics Tübingen (ZBIT), Sand 1, 72076 Tübingen, Germany

26
Shen MY, Su BH, Esposito EX, Hopfinger AJ, Tseng YJ. A Comprehensive Support Vector Machine Binary hERG Classification Model Based on Extensive but Biased End Point hERG Data Sets. Chem Res Toxicol 2011; 24:934-49. [DOI: 10.1021/tx200099j] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 02/04/2023]

27
Su BH, Shen MY, Esposito EX, Hopfinger AJ, Tseng YJ. In Silico Binary Classification QSAR Models Based on 4D-Fingerprints and MOE Descriptors for Prediction of hERG Blockage. J Chem Inf Model 2010; 50:1304-18. [DOI: 10.1021/ci100081j] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Indexed: 11/29/2022]
Affiliation(s)
- Bo-Han Su
- Department of Computer Science and Information Engineering, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106, exeResearch, LLC, 32 University Drive, East Lansing, Michigan 48823, Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106, The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, Illinois 60045, and College of Pharmacy MSC09 5360, 1 University of New Mexico, Albuquerque, New Mexico
- Meng-yu Shen
- Department of Computer Science and Information Engineering, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106, exeResearch, LLC, 32 University Drive, East Lansing, Michigan 48823, Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106, The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, Illinois 60045, and College of Pharmacy MSC09 5360, 1 University of New Mexico, Albuquerque, New Mexico
- Emilio Xavier Esposito
- Department of Computer Science and Information Engineering, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106, exeResearch, LLC, 32 University Drive, East Lansing, Michigan 48823, Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106, The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, Illinois 60045, and College of Pharmacy MSC09 5360, 1 University of New Mexico, Albuquerque, New Mexico
- Anton J. Hopfinger
- Department of Computer Science and Information Engineering, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106, exeResearch, LLC, 32 University Drive, East Lansing, Michigan 48823, Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106, The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, Illinois 60045, and College of Pharmacy MSC09 5360, 1 University of New Mexico, Albuquerque, New Mexico
- Yufeng J. Tseng
- Department of Computer Science and Information Engineering, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106, exeResearch, LLC, 32 University Drive, East Lansing, Michigan 48823, Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan 106, The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, Illinois 60045, and College of Pharmacy MSC09 5360, 1 University of New Mexico, Albuquerque, New Mexico

28
Andrade CH, Pasqualoto KFM, Ferreira EI, Hopfinger AJ. 4D-QSAR: perspectives in drug design. Molecules 2010; 15:3281-94. [PMID: 20657478 PMCID: PMC6263259 DOI: 10.3390/molecules15053281] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Received: 03/01/2010] [Revised: 03/30/2010] [Accepted: 04/06/2010] [Indexed: 12/05/2022] Open
Abstract
Drug design is a process driven by innovation and technological breakthroughs involving a combination of advanced experimental and computational methods. A broad variety of medicinal chemistry approaches can be used to identify hits, generate leads, and accelerate the optimization of leads into drug candidates. The quantitative structure–activity relationship (QSAR) formalisms are among the most important strategies that can be applied for the successful design of new molecules. This review provides a comprehensive overview of the evolution and current status of 4D-QSAR, highlighting present challenges and new opportunities in drug design.
Affiliation(s)
- Carolina H. Andrade
- Laboratory of Molecular Modeling, Faculty of Pharmacy, Federal University of Goiás, 1ª Av. c/ Praça Universitária, S/N., Goiânia, Goiás, 74605-220, Brazil
- College of Pharmacy, MSC09 5360, 1 University of New Mexico, Albuquerque, New Mexico 87131-0001, USA
- Author to whom correspondence should be addressed
- Kerly F. M. Pasqualoto
- Faculty of Pharmaceutical Sciences, Av. Prof. Lineu Prestes, 580, University of Sao Paulo, Sao Paulo, 05508-900, Brazil
- Elizabeth I. Ferreira
- Faculty of Pharmaceutical Sciences, Av. Prof. Lineu Prestes, 580, University of Sao Paulo, Sao Paulo, 05508-900, Brazil
- Anton J. Hopfinger
- College of Pharmacy, MSC09 5360, 1 University of New Mexico, Albuquerque, New Mexico 87131-0001, USA
- The Chem21 Group, Inc., 17870 Wilson Drive. Lake Forest, IL 60045, USA

29
Filimonov DA, Zakharov AV, Lagunin AA, Poroikov VV. QNA-based 'Star Track' QSAR approach. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2009; 20:679-709. [PMID: 20024804 DOI: 10.1080/10629360903438370] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.3] [Indexed: 05/21/2023]
Abstract
In the existing quantitative structure-activity relationship (QSAR) methods any molecule is represented as a single point in a many-dimensional space of molecular descriptors. We propose a new QSAR approach based on Quantitative Neighbourhoods of Atoms (QNA) descriptors, which characterize each atom of a molecule and depend on the whole molecule structure. In the 'Star Track' methodology any molecule is represented as a set of points in a two-dimensional space of QNA descriptors. With our new method the estimate of the target property of a chemical compound is calculated as the average value of the function of QNA descriptors in the points of the atoms of a molecule in QNA descriptor space. Substantially, we propose the use of only two descriptors rather than more than 3000 molecular descriptors that apply in the QSAR method. On the basis of this approach we have developed the computer program GUSAR and compared it with several widely used QSAR methods including CoMFA, CoMSIA, Golpe/GRID, HQSAR and others, using ten data sets representing various chemical series and diverse types of biological activity. We show that in the majority of cases the accuracy and predictivity of GUSAR models appears to be better than those for the reference QSAR methods. High predictive ability and robustness of GUSAR are also shown in the leave-20%-out cross-validation procedure.
Affiliation(s)
- D A Filimonov
- Institute of Biomedical Chemistry of Russian Academy of Medical Sciences, Moscow, Russia.

30
Thipnate P, Liu J, Hannongbua S, Hopfinger AJ. 3D pharmacophore mapping using 4D QSAR analysis for the cytotoxicity of lamellarins against human hormone-dependent T47D breast cancer cells. J Chem Inf Model 2009; 49:2312-22. [PMID: 19799437 PMCID: PMC2798151 DOI: 10.1021/ci9002427] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
4D quantitative structure-activity relationship (QSAR) and 3D pharmacophore models were built and investigated for cytotoxicity using a training set of 25 lamellarins against human hormone-dependent T47D breast cancer cells. Receptor-independent (RI) 4D QSAR models were first constructed from the exploration of eight possible receptor-binding alignments for the entire training set. Since the training set is small (25 compounds), the generality of the 4D QSAR paradigm was then exploited to devise a strategy to maximize the extraction of binding information from the training set and to also permit virtual screening of diverse lamellarin chemistry. 4D QSAR models were sought for only six of the most potent lamellarins of the training set as well as another subset composed of lamellarins with constrained ranges in molecular weight and lipophilicity. This overall modeling strategy has permitted maximizing 3D pharmacophore information from this small set of structurally complex lamellarins that can be used to drive future analog synthesis and the selection of alternate scaffolds. Overall, it was found that the formation of an intermolecular hydrogen bond and the hydrophobic interactions for substituents on the E ring most modulate the cytotoxicity against T47D breast cancer cells. Hydrophobic substitutions on the F-ring can also enhance cytotoxic potency. A complementary high-throughput virtual screen to the 3D pharmacophore models, a 4D fingerprint QSAR model, was constructed using absolute molecular similarity. This 4D fingerprint virtual high-throughput screen permits a larger range of chemistry diversity to be assayed than with the 4D QSAR models. The optimized 4D QSAR 3D pharmacophore model has a leave-one-out cross-correlation value of xv-r² = 0.947, while the optimized 4D fingerprint virtual screening model has a value of xv-r² = 0.719.
This work reveals that it is possible to develop significant QSAR, 3D pharmacophore, and virtual screening models for a small set of lamellarins showing cytotoxic behavior in breast cancer screens that can guide future drug development based upon lamellarin chemistry.
Affiliation(s)
- Poonsiri Thipnate
- Department of Chemistry, Faculty of Science, Kasetsart University, Chatuchak, Bangkok 10900, Thailand
- Center of Nanotechnology KU, Kasetsart University, Chatuchak, Bangkok 10900, Thailand
- Jianzhong Liu
- College of Pharmacy, MSC09 5360, 1 University of New Mexico, Albuquerque, New Mexico 87131-000, USA
- The Chem21 Group, Incorporated, 1780 Wilson Drive, Lake Forest, IL 60045
- Supa Hannongbua
- Department of Chemistry, Faculty of Science, Kasetsart University, Chatuchak, Bangkok 10900, Thailand
- Center of Nanotechnology KU, Kasetsart University, Chatuchak, Bangkok 10900, Thailand
- A. J. Hopfinger
- College of Pharmacy, MSC09 5360, 1 University of New Mexico, Albuquerque, New Mexico 87131-000, USA
- The Chem21 Group, Incorporated, 1780 Wilson Drive, Lake Forest, IL 60045

31
Krier M, Hutter MC. Bioisosteric Similarity of Molecules Based on Structural Alignment and Observed Chemical Replacements in Drugs. J Chem Inf Model 2009; 49:1280-97. [DOI: 10.1021/ci8003418] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/28/2022]
Affiliation(s)
- Markus Krier
- Center for Bioinformatics, Saarland University, Campus Building C7.1, D-66123 Saarbruecken, Germany
- Michael C. Hutter
- Center for Bioinformatics, Saarland University, Campus Building C7.1, D-66123 Saarbruecken, Germany

32
Nigsch F, Bender A, Jenkins JL, Mitchell JBO. Ligand-Target Prediction Using Winnow and Naive Bayesian Algorithms and the Implications of Overall Performance Statistics. J Chem Inf Model 2008; 48:2313-25. [DOI: 10.1021/ci800079x] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.1] [Indexed: 11/29/2022]
Affiliation(s)
- Florian Nigsch
- Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom; Lead Discovery Informatics, Center for Proteomic Chemistry, Novartis Institutes for BioMedical Research, 250 Massachusetts Avenue, Cambridge, Massachusetts 02139; and Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- Andreas Bender
- Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom; Lead Discovery Informatics, Center for Proteomic Chemistry, Novartis Institutes for BioMedical Research, 250 Massachusetts Avenue, Cambridge, Massachusetts 02139; and Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- Jeremy L. Jenkins
- Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom; Lead Discovery Informatics, Center for Proteomic Chemistry, Novartis Institutes for BioMedical Research, 250 Massachusetts Avenue, Cambridge, Massachusetts 02139; and Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- John B. O. Mitchell
- Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom; Lead Discovery Informatics, Center for Proteomic Chemistry, Novartis Institutes for BioMedical Research, 250 Massachusetts Avenue, Cambridge, Massachusetts 02139; and Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
|
33
|
von Korff M, Freyss J, Sander T. Flexophore, a New Versatile 3D Pharmacophore Descriptor That Considers Molecular Flexibility. J Chem Inf Model 2008; 48:797-810. [DOI: 10.1021/ci700359j] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Affiliation(s)
- Modest von Korff
- Department of Research Informatics, Actelion Ltd., Gewerbestrasse 16, CH-4123 Allschwil, Switzerland
- Joel Freyss
- Department of Research Informatics, Actelion Ltd., Gewerbestrasse 16, CH-4123 Allschwil, Switzerland
- Thomas Sander
- Department of Research Informatics, Actelion Ltd., Gewerbestrasse 16, CH-4123 Allschwil, Switzerland
|
34
|
Categorical QSAR models for skin sensitization based on local lymph node assay measures and both ground and excited state 4D-fingerprint descriptors. J Comput Aided Mol Des 2008; 22:345-66. [DOI: 10.1007/s10822-008-9190-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 09/28/2007] [Accepted: 01/30/2008] [Indexed: 10/22/2022]
|
35
|
Nigsch F, Mitchell JBO. How to winnow actives from inactives: introducing molecular orthogonal sparse bigrams (MOSBs) and multiclass Winnow. J Chem Inf Model 2008; 48:306-18. [PMID: 18220378 DOI: 10.1021/ci700350n] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/30/2022]
Abstract
In the present paper we combine the Winnow algorithm and an advanced scheme for feature generation into a tool for multiclass classification. The Winnow algorithm, specifically designed in the late 1980s to work well with high-dimensional data, by design ignores most of the irrelevant features for the scoring of each single training/test case. To augment the pool of available molecular features we use the Winnow algorithm in conjunction with a process that creates additional features from a set of given ones. We adapt a technique formerly employed in text classification termed "orthogonal sparse bigrams" and extend the use of that method to the domain of cheminformatics. Using circular molecular fingerprints as initial features, we create "molecular orthogonal sparse bigrams" (MOSBs) and report their successful application to the task of classification of bioactive molecules. Additionally, we introduce a memory-efficient way of bagging individual classifiers, avoiding the need to hold the complete training data set in memory. To compare the performance of our method with published results, we use the Hert data set of 8293 active molecules in 11 classes. We compare our method to Random Forest and find that our method not only is comparable or better in classification accuracy (up to 50% higher in MCC [Matthews correlation coefficient], 98% higher in fraction of correct predictions) but also is quicker to train (by a factor between 2 and 18, depending on the feature generation), more memory efficient, and able to cope more easily with large data sets when we seeded the actives into a pool of 94290 inactive molecules. It is shown that this method can be used with different fingerprints.
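For orientation, the mistake-driven multiplicative update at the core of Winnow can be sketched as below. This is a minimal sketch of the textbook binary Winnow2 variant, not the multiclass MOSB implementation described in the abstract; the function names, the promotion factor `alpha = 2`, the threshold `n_features / 2`, and the toy data are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[int], int]  # (binary feature vector, label in {0, 1})


def winnow_predict(weights: List[float], x: Sequence[int], threshold: float) -> int:
    """Predict 1 if the summed weight of the active features reaches the threshold."""
    score = sum(w for w, xi in zip(weights, x) if xi)
    return 1 if score >= threshold else 0


def winnow_train(samples: Sequence[Sample], n_features: int,
                 alpha: float = 2.0, epochs: int = 10) -> Tuple[List[float], float]:
    """Textbook Winnow2: weights change only when a mistake is made."""
    weights = [1.0] * n_features      # all weights start at 1
    threshold = n_features / 2.0      # assumed threshold choice
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            if winnow_predict(weights, x, threshold) == y:
                continue
            mistakes += 1
            # Promote active features on a false negative, demote on a false positive.
            factor = alpha if y == 1 else 1.0 / alpha
            for i, xi in enumerate(x):
                if xi:
                    weights[i] *= factor
        if mistakes == 0:             # stop after a full pass without errors
            break
    return weights, threshold


# Hypothetical toy data: the label equals feature 0, the other features are noise.
TOY_SAMPLES: List[Sample] = [
    ([1, 0, 0, 0], 1), ([0, 1, 1, 0], 0), ([1, 1, 0, 1], 1),
    ([0, 0, 1, 1], 0), ([1, 0, 1, 0], 1), ([0, 1, 0, 1], 0),
]
TOY_WEIGHTS, TOY_THRESHOLD = winnow_train(TOY_SAMPLES, n_features=4)
```

Because only the features that are active in a misclassified example are rescaled, weights on irrelevant features tend to shrink relative to the informative ones, which is the property the abstract refers to for high-dimensional data.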
Affiliation(s)
- Florian Nigsch
- Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
|
36
|
Santos-Filho OA, Hopfinger AJ. Combined 4D‐Fingerprint and Clustering Based Membrane‐Interaction QSAR Analyses for Constructing Consensus Caco‐2 Cell Permeation Virtual Screens. J Pharm Sci 2008; 97:566-83. [PMID: 17696143 DOI: 10.1002/jps.21086] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022]
Abstract
A set of 30 structurally diverse molecules, for which Caco-2 cell permeation coefficients were determined, formed the training set for construction of Caco-2 cell permeation models based upon membrane-interaction (MI) QSAR analysis and a new QSAR method called 4D-fingerprint QSAR analysis. The descriptor terms of the 4D-fingerprints equation are molecular similarity eigenvalues, and this set of descriptors is being evaluated as a potential "universal" QSAR descriptor set. The 4D-fingerprint model suggests that Caco-2 cell permeation is governed by the spatial distribution of hydrogen bonding and nonpolar groups over the molecular shape of a molecule. Moreover, a complementary resampling of the original Caco-2 cell permeation training set, followed by the construction of several "clustered" MI-QSAR models, led to a consensus model consistent in interpretation with the 4D-fingerprint model.
Affiliation(s)
- Osvaldo A Santos-Filho
- Division of Infectious Diseases, Faculty of Medicine, University of British Columbia, 2733 Heather Street, Vancouver, British Columbia, Canada.
|
37
|
Vainio MJ, Johnson MS. Generating Conformer Ensembles Using a Multiobjective Genetic Algorithm. J Chem Inf Model 2007; 47:2462-74. [PMID: 17892278 DOI: 10.1021/ci6005646] [Citation(s) in RCA: 279] [Impact Index Per Article: 16.4] [Indexed: 01/02/2023]
Abstract
The task of generating a nonredundant set of low-energy conformations for small molecules is of fundamental importance for many molecular modeling and drug-design methodologies. Several approaches to conformer generation have been published. Exhaustive searches suffer from the exponential growth of the search space with increasing degrees of conformational freedom (number of rotatable bonds). Stochastic algorithms do not suffer as much from the exponential increase of search space and provide a good coverage of the energy minima. Here, the use of a multiobjective genetic algorithm in the generation of conformer ensembles is investigated. Distance geometry is used to generate an initial conformer, which is then subject to geometric modifications encoded by the individuals of the genetic algorithm. The geometric modifications apply to torsion angles about rotatable bonds, stereochemistry of double bonds and tetrahedral chiral centers, and ring conformations. The geometric diversity of the evolving conformer ensemble is preserved by a fitness-sharing mechanism based on the root-mean-square distance of the atomic coordinates. Molecular symmetry is taken into account in the distance calculation. The geometric modifications introduce strain into the structures. The strain is relaxed using an MMFF94-like force field in a postprocessing step that also removes conformational duplicates and structures whose strain energy remains above a predefined window from the minimum energy value found in the set. The implementation, called Balloon, is available free of charge on the Internet ( http://www.abo.fi/~mivainio/balloon/).
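The diversity-preserving idea mentioned in the abstract can be illustrated with the standard form of fitness sharing, in which a conformer's fitness is divided by a count of how crowded its neighborhood is. The sketch below is an assumption-laden illustration and not Balloon's implementation: it uses a plain coordinate RMSD with fixed atom ordering (no superposition and no symmetry handling, both of which the abstract says the real method accounts for), a linear sharing kernel, a hypothetical niche radius `sigma`, and it assumes higher fitness is better.

```python
import math
from typing import List, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]  # one (x, y, z) tuple per atom


def rmsd(a: Coords, b: Coords) -> float:
    """Root-mean-square distance between two conformers with identical atom order."""
    if len(a) != len(b) or not a:
        raise ValueError("conformers must be non-empty and have the same atom count")
    squared = sum((p - q) ** 2 for atom_a, atom_b in zip(a, b)
                  for p, q in zip(atom_a, atom_b))
    return math.sqrt(squared / len(a))


def shared_fitness(raw: Sequence[float], conformers: Sequence[Coords],
                   sigma: float) -> List[float]:
    """Divide each raw fitness by its niche count (linear sharing kernel)."""
    result = []
    for i, ci in enumerate(conformers):
        niche = 0.0
        for cj in conformers:
            d = rmsd(ci, cj)
            if d < sigma:
                niche += 1.0 - d / sigma   # self-distance 0 contributes exactly 1
        result.append(raw[i] / niche)
    return result


# Hypothetical two-atom conformers: the first two are identical, the third is far away.
CONF_A: Coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
CONF_B: Coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
CONF_C: Coords = [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]
SHARED = shared_fitness([1.0, 1.0, 1.0], [CONF_A, CONF_B, CONF_C], sigma=1.0)
```

In this toy ensemble the two duplicate conformers split their fitness while the isolated one keeps its full value, so selection pressure favors geometrically distinct structures.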
Affiliation(s)
- Mikko J Vainio
- Structural Bioinformatics Laboratory, Department of Biochemistry and Pharmacy, Abo Akademi University, Tykistökatu 6A (BioCity), Turku, Finland.
|
38
|
Li Y, Pan D, Liu J, Kern PS, Gerberick GF, Hopfinger AJ, Tseng YJ. Categorical QSAR Models for Skin Sensitization based upon Local Lymph Node Assay Classification Measures Part 2: 4D-Fingerprint Three-State and Two-2-State Logistic Regression Models. Toxicol Sci 2007; 99:532-44. [PMID: 17675333 DOI: 10.1093/toxsci/kfm185] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022]
Abstract
Three and four state categorical quantitative structure-activity relationship (QSAR) models for skin sensitization have been constructed using data from the murine Local Lymph Node Assay studies. These are the same data we previously used to build two-state (sensitizer, nonsensitizer) QSAR models (Li et al., 2007, Chem. Res. Toxicol. 20, 114-128). 4D-fingerprint descriptors derived from the 4D-molecular similarity paradigm are used to generate these models. A training set of 196 and a test set of 22 structurally diverse compounds were used in this study. Logistic regression, and partial least square coupled logistic regression were used to build the models. The three-state QSAR model gives a classification accuracy of 73.4% for the training set and 63.6% for the test set, while the random average value of classification accuracy for any three-state data set is 33.3%. The two-2-state [four categories in total] QSAR model gives a classification accuracy of 83.2% for the training set and 54.6% for the test set, while the random average value of classification accuracy for any two-2-state data set is 25%. An analysis of the skin-sensitization models developed in this study, as well as the two-state QSAR models developed in our previous analysis, suggests that the "moderate" sensitizers may be the main source of limited model accuracy.
Affiliation(s)
- Yi Li
- Laboratory of Molecular Modeling and Design (MC 781), College of Pharmacy, University of Illinois at Chicago, Chicago, Illinois 60612-7231, USA
|
39
|
Abstract
QSAR models for four skin penetration enhancer data sets of 61, 44, 42, and 17 compounds were constructed using classic QSAR descriptors and 4D-fingerprints. Three data sets involved skin penetration enhancement of hydrocortisone and hydrocortisone acetate. The other data set involved skin penetration enhancement of fluorouracil. The measure of penetration enhancement is the ratio of the net permeation of the penetrant with and without a common fixed concentration of enhancer. Significant QSAR models could be built using multidimensional linear regression fitting and genetic function model optimization for all four data sets when both classic and 4D-fingerprint descriptors were used in the trial descriptor pool. Reasonable QSAR models could be built when only 4D-fingerprint descriptors were employed, and no significant QSAR models could be built using only classic descriptors for two of the four data sets. Comparison analyses of the descriptor terms, and their respective regression coefficients, across the pairs of the best QSAR models of the four skin penetration enhancer data sets did not reveal any significant extent of similar terms. Overall, the QSAR models for the penetration-enhancer systems appear meaningfully different from one another, suggesting that there are distinct mechanisms of skin penetration enhancement that depend on the chemistry of both the enhancer and the penetrant.
Affiliation(s)
- Manisha Iyer
- Division of Clinical Chemistry, Department of Pathology, Children's Hospital of Pittsburgh, 5834 Main Tower, 200 Lothrop Street, Pittsburgh, Pennsylvania 15213, USA
|
40
|
Li Y, Tseng YJ, Pan D, Liu J, Kern PS, Gerberick GF, Hopfinger AJ. 4D-fingerprint categorical QSAR models for skin sensitization based on the classification of local lymph node assay measures. Chem Res Toxicol 2007; 20:114-28. [PMID: 17226934 PMCID: PMC2553001 DOI: 10.1021/tx6002535] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
Currently, the only validated methods to identify skin sensitization effects are in vivo models, such as the local lymph node assay (LLNA) and guinea pig studies. There is a tremendous need, in particular due to novel legislation, to develop animal alternatives, for example, quantitative structure-activity relationship (QSAR) models. Here, QSAR models for skin sensitization using LLNA data have been constructed. The descriptors used to generate these models are derived from the 4D-molecular similarity paradigm and are referred to as universal 4D-fingerprints. A training set of 132 structurally diverse compounds and a test set of 15 structurally diverse compounds were used in this study. The statistical methodologies used to build the models are logistic regression (LR) and partial least-square coupled logistic regression (PLS-LR), which prove to be effective tools for studying skin sensitization measures expressed in the two categorical terms of sensitizer and non-sensitizer. QSAR models with low values of the Hosmer-Lemeshow goodness-of-fit statistic, χ²_HL, are significant and predictive. For the training set, the cross-validated prediction accuracy of the logistic regression models ranges from 77.3% to 78.0%, whereas that of the PLS-logistic regression models ranges from 87.1% to 89.4%. For the test set, the prediction accuracy of logistic regression models ranges from 80.0% to 86.7%, whereas that of the PLS-logistic regression models ranges from 73.3% to 80.0%. The QSAR models are made up of 4D-fingerprints related to aromatic atoms, hydrogen bond acceptors, and negatively partially charged atoms.
Affiliation(s)
- Yi Li
- Laboratory of Molecular Modeling and Design (MC 781), College of Pharmacy, University of Illinois at Chicago, 833 South Wood Street, Chicago, IL 60612-7231
- Yufeng J. Tseng
- The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, IL 60045
- Dept. of Computer Science and Information Engineering, National Taiwan University, No.1 Sec. 4, Roosevelt Road, Taipei, Taiwan 106
- Dahua Pan
- Laboratory of Molecular Modeling and Design (MC 781), College of Pharmacy, University of Illinois at Chicago, 833 South Wood Street, Chicago, IL 60612-7231
- Jianzhong Liu
- College of Pharmacy, MSC09 5360, 1 University of New Mexico, Albuquerque, NM 87131-0001
- The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, IL 60045
- Petra S. Kern
- Procter & Gamble Eurocor, Temselaan 100, B-1853 Strombeek-Bever, Belgium
- G. Frank Gerberick
- The Procter & Gamble Company, Miami Valley Innovation Center, P.O. Box 538707, Cincinnati, OH 45253-8707
- Anton J. Hopfinger
- College of Pharmacy, MSC09 5360, 1 University of New Mexico, Albuquerque, NM 87131-0001
- The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, IL 60045
- Corresponding Author: Voice: 505.272.8474, Fax: 505.272.0704
|
41
|
|
42
|
Beger RD. Computational modeling of biologically active molecules using NMR spectra. Drug Discov Today 2006; 11:429-35. [PMID: 16635805 DOI: 10.1016/j.drudis.2006.03.014] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 11/18/2005] [Revised: 01/30/2006] [Accepted: 03/21/2006] [Indexed: 11/29/2022]
Abstract
The molecular structure and NMR chemical shift information of a compound can be combined to form powerful models of biological activity. NMR spectral data and structure information can be combined on a structural template analogous to 3D-QSAR methodology or orientation independently in spectral space. Surprisingly, quantitative spectrometric data-activity relationship (QSDAR) models built on structure templates are inferior to multi-dimensional QSDAR models built in spectral space. 3D-QSDAR modeling could be useful for estimating chemical toxicity, risk assessment of environmental contaminants and drug lead-compound identifications.
Affiliation(s)
- Richard D Beger
- Division of Systems Toxicology, National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR 72079, USA.
|