1
Lange JJ, Anelli A, Alsenz J, Kuentz M, O'Dwyer PJ, Saal W, Wyttenbach N, Griffin BT. Comparative Analysis of Chemical Descriptors by Machine Learning Reveals Atomistic Insights into Solute-Lipid Interactions. Mol Pharm 2024. [PMID: 38780534 DOI: 10.1021/acs.molpharmaceut.4c00080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
This study explores the research area of drug solubility in lipid excipients, an area persistently complex despite recent advancements in understanding and predicting solubility based on molecular structure. To this end, this research investigated novel descriptor sets, employing machine learning techniques to understand the determinants governing interactions between solutes and medium-chain triglycerides (MCTs). Quantitative structure-property relationships (QSPR) were constructed on an extended solubility data set comprising 182 experimental values of structurally diverse drug molecules, including both development and marketed drugs to extract meaningful property relationships. Four classes of molecular descriptors, ranging from traditional representations to complex geometrical descriptions, were assessed and compared in terms of their predictive accuracy and interpretability. These include two-dimensional (2D) and three-dimensional (3D) descriptors, Abraham solvation parameters, extended connectivity fingerprints (ECFPs), and the smooth overlap of atomic position (SOAP) descriptor. Through testing three distinct regularized regression algorithms alongside various preprocessing schemes, the SOAP descriptor enabled the construction of a superior performing model in terms of interpretability and accuracy. Its atom-centered characteristics allowed contributions to be estimated at the atomic level, thereby enabling the ranking of prevalent molecular motifs and their influence on drug solubility in MCTs. The performance on a separate test set demonstrated high predictive accuracy (RMSE = 0.50) for 2D and 3D, SOAP, and Abraham Solvation descriptors. The model trained on ECFP4 descriptors resulted in inferior predictive accuracy. 
Lastly, uncertainty estimations for each model were introduced to assess their applicability domains and provide information on where the models may extrapolate in chemical space and, thus, where more data may be necessary to refine a data-driven approach to predict solubility in MCTs. Overall, the presented approaches further enable computationally informed formulation development by introducing a novel in silico approach for rational drug development and prediction of dose loading in lipids.
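As a rough illustration of the regularized-regression workflow this abstract describes, the sketch below fits a closed-form ridge model on synthetic data and reports the RMSE on a held-out split. The descriptor matrix, the coefficients, the split, and the `alpha` value are all assumptions made for illustration; they are not the study's data, descriptors, or settings.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weights w.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic stand-in data: 182 rows and 5 hypothetical descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(182, 5))
true_w = np.array([0.8, -0.5, 0.0, 0.3, 0.0])
y = X @ true_w + 1.0 + rng.normal(scale=0.1, size=182)

# Hold out the last 36 rows as a separate test set.
X_train, X_test = X[:146], X[146:]
y_train, y_test = y[:146], y[146:]

w, b = fit_ridge(X_train, y_train, alpha=1.0)
test_rmse = rmse(y_test, X_test @ w + b)
print(f"test RMSE = {test_rmse:.2f}")
```

In practice `alpha` would be chosen by cross-validation on the training set only, so that the test RMSE stays an honest estimate.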
Affiliation(s)
- Justus Johann Lange
  - School of Pharmacy, University College Cork, College Road, Cork T12 R229, Cork County, Ireland
- Andrea Anelli
  - Roche Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Limited, Grenzacherstrasse 124, Basel 4070, Switzerland
- Jochem Alsenz
  - Roche Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Limited, Grenzacherstrasse 124, Basel 4070, Switzerland
- Martin Kuentz
  - Institute of Pharma Technology, University of Applied Sciences and Arts Northwestern Switzerland, Hofackerstrasse 30, Muttenz CH-4231, Basel City, Switzerland
- Patrick J O'Dwyer
  - School of Pharmacy, University College Cork, College Road, Cork T12 R229, Cork County, Ireland
- Wiebke Saal
  - Roche Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Limited, Grenzacherstrasse 124, Basel 4070, Switzerland
- Nicole Wyttenbach
  - Roche Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Limited, Grenzacherstrasse 124, Basel 4070, Switzerland
- Brendan T Griffin
  - School of Pharmacy, University College Cork, College Road, Cork T12 R229, Cork County, Ireland
2
Wu Z, Chen J, Li Y, Deng Y, Zhao H, Hsieh CY, Hou T. From Black Boxes to Actionable Insights: A Perspective on Explainable Artificial Intelligence for Scientific Discovery. J Chem Inf Model 2023; 63:7617-7627. [PMID: 38079566 DOI: 10.1021/acs.jcim.3c01642] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/26/2023]
Abstract
The application of Explainable Artificial Intelligence (XAI) in the field of chemistry has garnered growing interest for its potential to justify the predictions of black-box machine learning models and provide actionable insights. We first survey a range of XAI techniques adapted for chemical applications and categorize them based on the technical details of each methodology. We then present a few case studies to illustrate the practical utility of XAI, such as identifying carcinogenic molecules and guiding molecular optimizations, in order to provide chemists with concrete examples of ways to take full advantage of XAI-augmented machine learning for chemistry. Despite the initial success of XAI in chemistry, we still face the challenges of developing more reliable explanations, assuring robustness against adversarial actions, and customizing the explanation for different applications and needs of the diverse scientific community. Finally, we discuss the emerging role of large language models like GPT in generating natural language explanations and the specific challenges associated with them. We advocate that addressing the aforementioned challenges and actively embracing new techniques may contribute to establishing machine learning as an indispensable technique for chemistry in this digital era.
Affiliation(s)
- Zhenxing Wu
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, P. R. China
  - CarbonSilicon AI Technology Company, Limited, Hangzhou, 310018 Zhejiang, P. R. China
- Jihong Chen
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, P. R. China
  - CarbonSilicon AI Technology Company, Limited, Hangzhou, 310018 Zhejiang, P. R. China
- Yitong Li
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, P. R. China
- Yafeng Deng
  - CarbonSilicon AI Technology Company, Limited, Hangzhou, 310018 Zhejiang, P. R. China
- Haitao Zhao
  - Center for Intelligent and Biomimetic Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 440305 Guangdong, P. R. China
- Chang-Yu Hsieh
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, P. R. China
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, P. R. China
3
Park GJ, Kang NS. ADis-QSAR: a machine learning model based on biological activity differences of compounds. J Comput Aided Mol Des 2023:10.1007/s10822-023-00517-1. [PMID: 37382799 DOI: 10.1007/s10822-023-00517-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 06/26/2023] [Indexed: 06/30/2023]
Abstract
Drug candidates identified by the pharmaceutical industry typically have unique structural characteristics to ensure they interact strongly and specifically with their biological targets. Identifying these characteristics is a key challenge for developing new drugs, and quantitative structure-activity relationship (QSAR) analysis has generally been used to perform this task. QSAR models with good predictive power improve the cost and time efficiencies invested in compound development. Generating these good models depends on how well differences between "active" and "inactive" compound groups can be conveyed to the model to be learned. Efforts to solve this difference issue have been made, including generating a "molecular descriptor" that compressively expresses the structural characteristics of compounds. From the same perspective, we succeeded in developing the Activity Differences-Quantitative Structure-Activity Relationship (ADis-QSAR) model by generating molecular descriptors that more explicitly convey features of the group through a pair system that performs direct connections between active and inactive groups. We used popular machine learning algorithms, such as Support Vector Machine, Random Forest, XGBoost and Multi-Layer Perceptron for model learning and evaluated the model using scores such as accuracy, area under curve, precision and specificity. The results showed that the Support Vector Machine performed better than the others. Notably, the ADis-QSAR model showed significant improvements in meaningful scores such as precision and specificity compared to the baseline model, even in datasets with dissimilar chemical spaces. This model reduces the risk of selecting false positive compounds, improving the efficiency of drug development.
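The scores named in this abstract (accuracy, precision, specificity) all follow from the confusion-matrix counts. The sketch below computes them for a toy set of labels; the function name and the example labels are hypothetical and are not taken from the ADis-QSAR study.

```python
def classification_scores(y_true, y_pred):
    """Accuracy, precision and specificity for binary labels (1 = active)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Precision: fraction of predicted actives that are truly active.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Specificity: fraction of true inactives predicted as inactive.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

# Toy example: 3 actives and 5 inactives, with one miss and one false positive.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = classification_scores(y_true, y_pred)
print(scores)
```

Both precision and specificity fall as false positives rise, which is why they are the natural scores to watch when the goal is to avoid selecting false positive compounds.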
Affiliation(s)
- Gyoung Jin Park
  - Graduate School of New Drug Discovery and Development, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon, 34134, Korea
- Nam Sook Kang
  - Graduate School of New Drug Discovery and Development, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon, 34134, Korea
4
Rodríguez-Pérez R, Miljković F, Bajorath J. Machine Learning in Chemoinformatics and Medicinal Chemistry. Annu Rev Biomed Data Sci 2022; 5:43-65. [PMID: 35440144 DOI: 10.1146/annurev-biodatasci-122120-124216] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Abstract
In chemoinformatics and medicinal chemistry, machine learning has evolved into an important approach. In recent years, increasing computational resources and new deep learning algorithms have put machine learning onto a new level, addressing previously unmet challenges in pharmaceutical research. In silico approaches for compound activity predictions, de novo design, and reaction modeling have been further advanced by new algorithmic developments and the emergence of big data in the field. Herein, novel applications of machine learning and deep learning in chemoinformatics and medicinal chemistry are reviewed. Opportunities and challenges for new methods and applications are discussed, placing emphasis on proper baseline comparisons, robust validation methodologies, and new applicability domains.
Affiliation(s)
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
  - Current affiliation: Novartis Institutes for Biomedical Research, Novartis Campus, Basel, Switzerland
- Filip Miljković
  - Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
  - Current affiliation: Data Science and AI, Imaging and Data Analytics, Clinical Pharmacology and Safety Sciences, R&D AstraZeneca, Gothenburg, Sweden
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
5
Gini G. QSAR Methods. Methods Mol Biol 2022; 2425:1-26. [PMID: 35188626 DOI: 10.1007/978-1-0716-1960-5_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
This chapter introduces the basis of computational chemistry and discusses how computational methods have been extended from modeling physical properties to modeling biological properties, toxicology in particular. For about three decades, chemical experimentation has been increasingly replaced by modeling and virtual experimentation, using a large core of mathematics, chemistry, physics, and algorithms. Animal and wet experiments, aimed at providing a standardized result about a biological property, can be mimicked by modeling methods, globally called in silico methods, all characterized by deducing properties starting from the chemical structures. Two main streams of such models are available: models that consider the whole molecular structure to predict a value, namely QSAR (quantitative structure-activity relationships), and models that check relevant substructures to predict a class, namely SAR. The term in silico discovery is applied to chemical design, to computational toxicology, and to drug discovery. Virtual experiments confirm hypotheses, provide data for regulation, and help in designing new chemicals.
6
Rica E, Álvarez S, Serratosa F. Ligand-Based Virtual Screening Based on the Graph Edit Distance. Int J Mol Sci 2021; 22:12751. [PMID: 34884555 PMCID: PMC8658044 DOI: 10.3390/ijms222312751] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2021] [Revised: 11/12/2021] [Accepted: 11/13/2021] [Indexed: 11/25/2022] Open
Abstract
Chemical compounds can be represented as attributed graphs. An attributed graph is a mathematical model of an object composed of two types of representations: nodes and edges. Nodes are individual components, and edges are relations between these components. In this case, pharmacophore-type node descriptions are represented by nodes and chemical bonds by edges. If we want to obtain the bioactivity dissimilarity between two chemical compounds, a distance between attributed graphs can be used. The Graph Edit Distance allows computing this distance, and it is defined as the cost of transforming one graph into another. Nevertheless, to define this dissimilarity, the transformation cost must be properly tuned. The aim of this paper is to analyse the structural-based screening methods to verify the quality of the Harper transformation costs proposal and to present an algorithm to learn these transformation costs such that the bioactivity dissimilarity is properly defined in a ligand-based virtual screening application. The goodness of the dissimilarity is represented by the classification accuracy. Six publicly available datasets (CAPST, DUD-E, GLL&GDD, NRLiSt-BDB, MUV and ULS-UDS) have been used to validate our methodology and show that with our learned costs, we obtain the highest ratios in identifying the bioactivity similarity in a structurally diverse group of molecules.
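To make the definition of graph edit distance in this abstract concrete, the following brute-force sketch computes it for very small attributed graphs by trying every node matching. The cost parameters `c_sub`, `c_node`, and `c_edge` are hypothetical placeholders for the transformation costs that the paper tunes; the uniform values used here are assumptions, not the Harper costs or the learned ones. Padding the smaller graph only up to the size of the larger one is exact as long as a substitution costs no more than a deletion plus an insertion.

```python
from itertools import permutations

def graph_edit_distance(labels1, edges1, labels2, edges2,
                        c_sub=1.0, c_node=1.0, c_edge=1.0):
    """Brute-force graph edit distance for tiny undirected labelled graphs."""
    n = max(len(labels1), len(labels2))
    # Pad the smaller graph with dummy nodes (None) so sizes match.
    a = list(labels1) + [None] * (n - len(labels1))
    b = list(labels2) + [None] * (n - len(labels2))
    e2 = {frozenset(e) for e in edges2}
    best = float("inf")
    for perm in permutations(range(n)):  # perm[i]: graph-2 node matched to node i
        cost = 0.0
        for i, j in enumerate(perm):
            if a[i] is None and b[j] is None:
                continue                  # dummy matched to dummy: free
            if a[i] is None or b[j] is None:
                cost += c_node            # node insertion or deletion
            elif a[i] != b[j]:
                cost += c_sub             # node label substitution
        # Edges present in only one graph after the mapping are edited.
        mapped = {frozenset((perm[u], perm[v])) for u, v in edges1}
        cost += c_edge * len(mapped ^ e2)
        best = min(best, cost)
    return best

# Toy pharmacophore-type graphs (hypothetical labels).
g1 = (["donor", "acceptor", "aromatic"], [(0, 1), (1, 2)])
g2 = (["donor", "acceptor", "aromatic", "hydrophobic"], [(0, 1), (1, 2), (2, 3)])
distance = graph_edit_distance(*g1, *g2)
print(distance)
```

The search is factorial in the number of nodes, so this is only suitable for illustration; practical screening relies on approximate or optimized solvers.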
Affiliation(s)
- Elena Rica
  - Departament d’Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, 43007 Tarragona, Spain
7
Cao C, Yan L, Cao C. Determination and application of the excited‐state substituent constants of pyridyl and substituted phenyl groups. J PHYS ORG CHEM 2021. [DOI: 10.1002/poc.4246] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
Affiliation(s)
- Chao‐Tun Cao
  - Key Laboratory of Theoretical Organic Chemistry and Function Molecule, Ministry of Education, Key Laboratory of QSAR/QSPR of Hunan Provincial University, School of Chemistry and Chemical Engineering, Hunan University of Science and Technology, Xiangtan, China
- Lu Yan
  - Key Laboratory of Theoretical Organic Chemistry and Function Molecule, Ministry of Education, Key Laboratory of QSAR/QSPR of Hunan Provincial University, School of Chemistry and Chemical Engineering, Hunan University of Science and Technology, Xiangtan, China
- Chenzhong Cao
  - Key Laboratory of Theoretical Organic Chemistry and Function Molecule, Ministry of Education, Key Laboratory of QSAR/QSPR of Hunan Provincial University, School of Chemistry and Chemical Engineering, Hunan University of Science and Technology, Xiangtan, China
8
Rybenkov VV, Zgurskaya HI, Ganguly C, Leus IV, Zhang Z, Moniruzzaman M. The Whole Is Bigger than the Sum of Its Parts: Drug Transport in the Context of Two Membranes with Active Efflux. Chem Rev 2021; 121:5597-5631. [PMID: 33596653 DOI: 10.1021/acs.chemrev.0c01137] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Indexed: 02/06/2023]
Abstract
The cell envelope plays a dual role in the life of bacteria by simultaneously protecting them from a hostile environment and facilitating access to beneficial molecules. At the heart of this ability lie the restrictive properties of the cellular membrane augmented by efflux transporters, which preclude intracellular penetration of most molecules except with the help of specialized uptake mediators. Recently, kinetic properties of the cell envelope came into focus, driven on the one hand by the urgent need for new antibiotics and, on the other hand, by experimental and theoretical advances in studies of transmembrane transport. A notable result from these studies is the development of a kinetic formalism that integrates the Michaelis-Menten behavior of individual transporters with transmembrane diffusion and offers a quantitative basis for the analysis of intracellular penetration of bioactive compounds. This review surveys key experimental and computational approaches to the investigation of transport by individual translocators and in whole cells, summarizes key findings from these studies, and outlines implications for antibiotic discovery. Special emphasis is placed on Gram-negative bacteria, whose envelope contains two separate membranes. This feature sets these organisms apart from Gram-positive bacteria and eukaryotic cells by providing them with the full benefits of the synergy between slow transmembrane diffusion and active efflux.
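The interplay between passive diffusion and Michaelis-Menten efflux mentioned in this abstract can be illustrated with a deliberately simplified single-membrane model, which is an assumption of this sketch and not the two-membrane formalism of the review. At steady state, diffusive influx `p_diff * (c_out - c_in)` balances efflux `v_max * c_in / (k_m + c_in)`, which gives a quadratic in the internal concentration. All parameter names and values below are hypothetical.

```python
import math

def steady_state_internal_conc(c_out, p_diff, v_max, k_m):
    """Internal concentration where passive influx equals saturable efflux.

    Solves p_diff*(c_out - c_in) = v_max*c_in/(k_m + c_in), i.e. the
    positive root of
    p_diff*c_in**2 + (p_diff*k_m + v_max - p_diff*c_out)*c_in - p_diff*c_out*k_m = 0.
    """
    b = p_diff * k_m + v_max - p_diff * c_out
    disc = b * b + 4.0 * p_diff * p_diff * c_out * k_m
    return (-b + math.sqrt(disc)) / (2.0 * p_diff)

# Hypothetical parameters in arbitrary, mutually consistent units.
c_in_no_efflux = steady_state_internal_conc(c_out=10.0, p_diff=1.0, v_max=0.0, k_m=5.0)
c_in_efflux = steady_state_internal_conc(c_out=10.0, p_diff=1.0, v_max=50.0, k_m=5.0)
print(c_in_no_efflux, c_in_efflux)
```

Without efflux the inside simply equilibrates with the outside; with a fast pump the internal concentration drops well below the external one, and the drop is larger when diffusion across the membrane is slower.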
Affiliation(s)
- Valentin V Rybenkov
  - Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
- Helen I Zgurskaya
  - Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
- Chhandosee Ganguly
  - Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
- Inga V Leus
  - Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
- Zhen Zhang
  - Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
- Mohammad Moniruzzaman
  - Department of Chemistry and Biochemistry, Stephenson Life Sciences Research Center, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
9
Zhang XC, Wu CK, Yang ZJ, Wu ZX, Yi JC, Hsieh CY, Hou TJ, Cao DS. MG-BERT: leveraging unsupervised atomic representation learning for molecular property prediction. Brief Bioinform 2021; 22:6265201. [PMID: 33951729 DOI: 10.1093/bib/bbab152] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 01/27/2021] [Revised: 03/11/2021] [Accepted: 04/01/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Accurate and efficient prediction of molecular properties is one of the fundamental issues in drug design and discovery pipelines. Traditional feature engineering-based approaches require extensive expertise in the feature design and selection process. With the development of artificial intelligence (AI) technologies, data-driven methods exhibit unparalleled advantages over the feature engineering-based methods in various domains. Nevertheless, when applied to molecular property prediction, AI models usually suffer from the scarcity of labeled data and show poor generalization ability.
RESULTS: In this study, we proposed molecular graph BERT (MG-BERT), which integrates the local message passing mechanism of graph neural networks (GNNs) into the powerful BERT model to facilitate learning from molecular graphs. Furthermore, an effective self-supervised learning strategy named masked atoms prediction was proposed to pretrain the MG-BERT model on a large amount of unlabeled data to mine context information in molecules. We found the MG-BERT model can generate context-sensitive atomic representations after pretraining and transfer the learned knowledge to the prediction of a variety of molecular properties. The experimental results show that the pretrained MG-BERT model with a little extra fine-tuning can consistently outperform the state-of-the-art methods on all 11 ADMET datasets. Moreover, the MG-BERT model leverages attention mechanisms to focus on atomic features essential to the target property, providing excellent interpretability for the trained model. The MG-BERT model does not require any hand-crafted feature as input and is more reliable due to its excellent interpretability, providing a novel framework to develop state-of-the-art models for a wide range of drug discovery tasks.
Affiliation(s)
- Xiao-Chen Zhang
  - State Key Laboratory of High-Performance Computing, School of Computer Science, National University of Defense Technology, China
- Cheng-Kun Wu
  - State Key Laboratory of High-Performance Computing, School of Computer Science, National University of Defense Technology, China
- Zhi-Jiang Yang
  - Xiangya School of Pharmaceutical Sciences, Central South University, China
- Zhen-Xing Wu
  - College of Pharmaceutical Sciences, Zhejiang University, China
- Jia-Cai Yi
  - State Key Laboratory of High-Performance Computing, School of Computer Science, National University of Defense Technology, China
- Chang-Yu Hsieh
  - Tencent Quantum Laboratory
- Ting-Jun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, China
- Dong-Sheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, China
10
Wu Z, Zhu M, Kang Y, Leung ELH, Lei T, Shen C, Jiang D, Wang Z, Cao D, Hou T. Do we need different machine learning algorithms for QSAR modeling? A comprehensive assessment of 16 machine learning algorithms on 14 QSAR data sets. Brief Bioinform 2020; 22:6032614. [PMID: 33313673 DOI: 10.1093/bib/bbaa321] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 09/17/2020] [Revised: 10/09/2020] [Accepted: 10/19/2020] [Indexed: 12/18/2022] Open
Abstract
Although a wide variety of machine learning (ML) algorithms have been utilized to learn quantitative structure-activity relationships (QSARs), there is no agreed single best algorithm for QSAR learning. Therefore, a comprehensive understanding of the performance characteristics of popular ML algorithms used in QSAR learning is highly desirable. In this study, five linear algorithms [linear function Gaussian process regression (linear-GPR), linear function support vector machine (linear-SVM), partial least squares regression (PLSR), multiple linear regression (MLR) and principal component regression (PCR)], three analogizers [radial basis function support vector machine (rbf-SVM), K-nearest neighbor (KNN) and radial basis function Gaussian process regression (rbf-GPR)], six symbolists [extreme gradient boosting (XGBoost), Cubist, random forest (RF), multiple adaptive regression splines (MARS), gradient boosting machine (GBM), and classification and regression tree (CART)] and two connectionists [principal component analysis artificial neural network (pca-ANN) and deep neural network (DNN)] were employed to learn the regression-based QSAR models for 14 public data sets comprising nine physicochemical properties and five toxicity endpoints. The results show that rbf-SVM, rbf-GPR, XGBoost and DNN generally illustrate better performances than the other algorithms. The overall performances of different algorithms can be ranked from the best to the worst as follows: rbf-SVM > XGBoost > rbf-GPR > Cubist > GBM > DNN > RF > pca-ANN > MARS > linear-GPR ≈ KNN > linear-SVM ≈ PLSR > CART ≈ PCR ≈ MLR. In terms of prediction accuracy and computational efficiency, SVM and XGBoost are recommended to the regression learning for small data sets, and XGBoost is an excellent choice for large data sets. We then investigated the performances of the ensemble models by integrating the predictions of multiple ML algorithms. 
The results illustrate that the ensembles of two or three algorithms in different categories can indeed improve the predictions of the best individual ML algorithms.
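One simple way to integrate the predictions of several algorithms, as this abstract describes, is to average them. The toy sketch below shows the idea with two hypothetical models whose errors point in opposite directions; the numbers are constructed for illustration, and plain averaging is an assumption here, not necessarily the combination scheme used in the study.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def ensemble_mean(*predictions):
    """Average the predictions of several models, sample by sample."""
    return [sum(vals) / len(vals) for vals in zip(*predictions)]

y_true = [1.0, 2.0, 3.0, 4.0]
pred_a = [1.4, 2.4, 3.4, 4.4]  # hypothetical model A, biased high
pred_b = [0.8, 1.8, 2.8, 3.8]  # hypothetical model B, biased low
pred_ens = ensemble_mean(pred_a, pred_b)

print(rmse(y_true, pred_a), rmse(y_true, pred_b), rmse(y_true, pred_ens))
```

Averaging helps here because the two models err in different directions, which is more likely when the algorithms come from different categories; models with strongly correlated errors gain little from being combined.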
Affiliation(s)
- Zhenxing Wu
  - College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Minfeng Zhu
  - Xiangya School of Pharmaceutical Sciences, Central South University, P. R. China
- Yu Kang
  - College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Elaine Lai-Han Leung
  - State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, P. R. China
- Tailong Lei
  - College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Chao Shen
  - College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Dejun Jiang
  - College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Zhe Wang
  - College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Tingjun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, China
11
Sala D, Cosentino U, Ranaudo A, Greco C, Moro G. Dynamical Behavior and Conformational Selection Mechanism of the Intrinsically Disordered Sic1 Kinase-Inhibitor Domain. Life (Basel) 2020; 10:life10070110. [PMID: 32664566 PMCID: PMC7399826 DOI: 10.3390/life10070110] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/03/2020] [Revised: 07/02/2020] [Accepted: 07/08/2020] [Indexed: 01/04/2023] Open
Abstract
Intrinsically Disordered Peptides and Proteins (IDPs) in solution can span a broad range of conformations that often are hard to characterize by both experimental and computational methods. However, obtaining a significant representation of the conformational space is important to understand mechanisms underlying protein functions such as partner recognition. In this work, we investigated the behavior of the Sic1 Kinase-Inhibitor Domain (KID) in solution by Molecular Dynamics (MD) simulations. Our results point out that application of common descriptors of molecular shape such as Solvent Accessible Surface (SAS) area can lead to misleading outcomes. Instead, more appropriate molecular descriptors can be used to define 3D structures. In particular, we exploited Weighted Holistic Invariant Molecular (WHIM) descriptors to get a coarse-grained but accurate definition of the variegated Sic1 KID conformational ensemble. We found that Sic1 is able to form a variable amount of folded structures even in absence of partners. Among them, there were some conformations very close to the structure that Sic1 is supposed to assume in the binding with its physiological complexes. Therefore, our results support the hypothesis that this protein relies on the conformational selection mechanism to recognize the correct molecular partners.
Affiliation(s)
- Davide Sala
  - Dipartimento di Biotecnologie e Bioscienze, Università di Milano-Bicocca, P.zza della Scienza 2, 20126 Milano, Italy
- Ugo Cosentino
  - Dipartimento di Scienze dell’Ambiente e della Terra, Università di Milano-Bicocca, P.zza della Scienza 1, 20126 Milano, Italy
- Anna Ranaudo
  - Dipartimento di Scienze dell’Ambiente e della Terra, Università di Milano-Bicocca, P.zza della Scienza 1, 20126 Milano, Italy
- Claudio Greco
  - Dipartimento di Scienze dell’Ambiente e della Terra, Università di Milano-Bicocca, P.zza della Scienza 1, 20126 Milano, Italy
  - Corresponding author
- Giorgio Moro
  - Dipartimento di Biotecnologie e Bioscienze, Università di Milano-Bicocca, P.zza della Scienza 2, 20126 Milano, Italy
  - Corresponding author
12
Abstract
At the end of her academic career, the author summarizes the main aspects of QSAR modeling, giving comments and suggestions according to her 23 years' experience in QSAR research on environmental topics. The focus is mainly on Multiple Linear Regression, particularly Ordinary Least Squares, using a Genetic Algorithm for variable selection from various theoretical molecular descriptors, but the comments can be useful also for other QSAR methods. The need for rigorous validation, also external, and for applicability domain check to guarantee predictivity and reliability of QSAR models is particularly highlighted. The commented approach is the “predictive” one, based on chemometrics, and is usefully applied to the prioritization of environmental pollutants. All the discussed points and the author's ideas are implemented in the software QSARINS, as a legacy to the QSAR community.
13
Benigni R, Bassan A, Pavan M. In silico models for genotoxicity and drug regulation. Expert Opin Drug Metab Toxicol 2020; 16:651-662. [DOI: 10.1080/17425255.2020.1785428] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/06/2023]
14
|
Garcia-Hernandez C, Fernández A, Serratosa F. Learning the Edit Costs of Graph Edit Distance Applied to Ligand-Based Virtual Screening. Curr Top Med Chem 2020; 20:1582-1592. [PMID: 32493194 PMCID: PMC7536799 DOI: 10.2174/1568026620666200603122000] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/23/2019] [Revised: 11/19/2019] [Accepted: 12/07/2019] [Indexed: 11/22/2022]
Abstract
BACKGROUND Graph edit distance is a methodology used to solve error-tolerant graph matching. This methodology estimates a distance between two graphs by determining the minimum number of modifications required to transform one graph into the other. These modifications, known as edit operations, each have an associated edit cost that has to be determined depending on the problem. OBJECTIVE This study focuses on the use of optimization techniques to learn the edit costs used when comparing graphs by means of the graph edit distance. METHODS Graphs represent reduced structural representations of molecules using pharmacophore-type node descriptions to encode the relevant molecular properties. This reduction technique is known as extended reduced graphs. The screening and statistical tools available on the ligand-based virtual screening benchmarking platform and the RDKit were used. RESULTS In the experiments, the graph edit distance using learned costs performed better than or as well as using predefined costs. This is exemplified with six publicly available datasets: DUD-E, MUV, GLL&GDD, CAPST, NRLiSt BDB, and ULS-UDS. CONCLUSION This study shows that the graph edit distance along with learned edit costs is useful to identify bioactivity similarities in a structurally diverse group of molecules. Furthermore, the target-specific edit costs might provide useful structure-activity information for future drug-design efforts.
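To make the role of edit costs concrete, the following minimal sketch computes an exact graph edit distance for very small labelled graphs by brute force over all node mappings. It is not the authors' implementation: the function name, the three cost parameters, and the pharmacophore-style labels are hypothetical, edge labels are ignored, and the exhaustive search is only feasible for a handful of nodes.

```python
from itertools import permutations


def graph_edit_distance(nodes1, edges1, nodes2, edges2,
                        node_sub=1.0, node_indel=1.0, edge_indel=1.0):
    """Exact edit distance between two tiny labelled graphs (brute force).

    nodes*: list of node labels; edges*: set of frozenset({i, j}) index pairs.
    """
    n1, n2 = len(nodes1), len(nodes2)
    size = n1 + n2
    # Pad each graph with dummy nodes (None) so deletions and insertions
    # become mappings to or from a dummy node.
    padded1 = list(nodes1) + [None] * n2
    padded2 = list(nodes2) + [None] * n1
    best = float("inf")
    for perm in permutations(range(size)):
        cost = 0.0
        for i, j in enumerate(perm):
            a, b = padded1[i], padded2[j]
            if a is None and b is None:
                continue  # dummy mapped to dummy: free
            if a is None or b is None:
                cost += node_indel  # node insertion or deletion
            elif a != b:
                cost += node_sub  # label substitution
        for i in range(size):
            for k in range(i + 1, size):
                in_g1 = frozenset((i, k)) in edges1
                in_g2 = frozenset((perm[i], perm[k])) in edges2
                if in_g1 != in_g2:
                    cost += edge_indel  # edge present in only one graph
        best = min(best, cost)
    return best


# Hypothetical reduced graphs: a three-node chain and a two-node graph.
g1_nodes = ["donor", "acceptor", "ring"]
g1_edges = {frozenset((0, 1)), frozenset((1, 2))}
g2_nodes = ["donor", "ring"]
g2_edges = {frozenset((0, 1))}

print(graph_edit_distance(g1_nodes, g1_edges, g2_nodes, g2_edges))
print(graph_edit_distance(g1_nodes, g1_edges, g2_nodes, g2_edges, node_sub=3.0))
```

With unit costs the cheapest transformation relabels one node; making substitution more expensive shifts the optimum to a different edit path and changes the distance, which is why the choice (or learning) of costs affects the resulting similarity ranking.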
Affiliation(s)
- Alberto Fernández
- Department of Chemical Engineering, Rovira i Virgili University, Tarragona, Spain
- Francesc Serratosa
- Department of Computer Engineering and Mathematics, Rovira i Virgili University, Tarragona, Spain
|
15
|
Suay‐Garcia B, Bueso‐Bordils JI, Falcó A, Pérez‐Gracia MT, Antón‐Fos G, Alemán‐López P. Quantitative structure–activity relationship methods in the discovery and development of antibacterials. WIREs Computational Molecular Science 2020. [DOI: 10.1002/wcms.1472] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/08/2023]
Affiliation(s)
- Beatriz Suay‐Garcia
- Departamento de Matemáticas, Física y Ciencias Tecnológicas Universidad Cardenal Herrera‐CEU, CEU Universities Alfara del Patriarca, Valencia Spain
- Jose Ignacio Bueso‐Bordils
- Departamento de Farmacia, Universidad Cardenal Herrera‐CEU CEU Universities Alfara del Patriarca, Valencia Spain
- Antonio Falcó
- Departamento de Matemáticas, Física y Ciencias Tecnológicas Universidad Cardenal Herrera‐CEU, CEU Universities Alfara del Patriarca, Valencia Spain
- María Teresa Pérez‐Gracia
- Departamento de Farmacia, Universidad Cardenal Herrera‐CEU CEU Universities Alfara del Patriarca, Valencia Spain
- Gerardo Antón‐Fos
- Departamento de Farmacia, Universidad Cardenal Herrera‐CEU CEU Universities Alfara del Patriarca, Valencia Spain
- Pedro Alemán‐López
- Departamento de Farmacia, Universidad Cardenal Herrera‐CEU CEU Universities Alfara del Patriarca, Valencia Spain
|
16
|
Wang D, Liu W, Shen Z, Jiang L, Wang J, Li S, Li H. Deep Learning Based Drug Metabolites Prediction. Front Pharmacol 2020; 10:1586. [PMID: 32082146 PMCID: PMC7003989 DOI: 10.3389/fphar.2019.01586] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/03/2019] [Accepted: 12/09/2019] [Indexed: 11/13/2022]
Abstract
Drug metabolism research plays a key role in the discovery and development of drugs. Based on the discovery of drug metabolites, new chemical entities can be identified and potential safety hazards caused by reactive or toxic metabolites can be minimized. Nowadays, computational methods are usually complementary tools for experiments. However, current metabolite prediction methods tend to have high false positive rates with low accuracy and are usually only used for specific enzyme systems. To overcome this difficulty, a method was developed in this paper by first establishing a database with broad coverage of SMARTS-coded metabolic reaction rules, and then extracting the molecular fingerprints of compounds to construct a classification model based on deep learning algorithms. The metabolic reaction rule database we built can supplement chemically reasonable negative reaction examples. Based on deep learning algorithms, the model could determine which reaction types are more likely to occur than others. On the test set, our method achieves an accuracy of 70% (Top-10), which is significantly higher than that of random guessing and the rule-based method SyGMa. The results demonstrate that our method has a certain predictive ability and application value.
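The Top-10 figure quoted above is a top-k accuracy: a prediction counts as correct when the true reaction type is among the k highest-ranked candidates. A minimal sketch of that metric, assuming each sample comes with a score per candidate label; the function name, reaction-type labels, and scores below are hypothetical and not taken from the paper:

```python
def top_k_accuracy(scores, true_labels, k=10):
    """Fraction of samples whose true label is among the k best-scored labels.

    scores: one dict per sample, mapping candidate label -> model score.
    """
    hits = 0
    for sample_scores, truth in zip(scores, true_labels):
        # Rank candidate labels from highest to lowest score.
        ranked = sorted(sample_scores, key=sample_scores.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical scores for two compounds over three reaction types.
scores = [
    {"hydroxylation": 0.7, "N-dealkylation": 0.2, "glucuronidation": 0.1},
    {"hydroxylation": 0.5, "N-dealkylation": 0.1, "glucuronidation": 0.4},
]
observed = ["hydroxylation", "glucuronidation"]

print(top_k_accuracy(scores, observed, k=1))
print(top_k_accuracy(scores, observed, k=2))
```

Larger k makes the criterion more lenient, so a Top-10 accuracy should be read alongside the total number of candidate reaction types.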
Affiliation(s)
- Disha Wang
- Shanghai Key Laboratory of New Drug Design, State Key Laboratory of Bioreactor Engineering, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Wenjun Liu
- Research and Development Department, Jiangzhong Pharmaceutical Co., Ltd., Nanchang, China
- Zihao Shen
- Shanghai Key Laboratory of New Drug Design, State Key Laboratory of Bioreactor Engineering, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Lei Jiang
- Shanghai Key Laboratory of New Drug Design, State Key Laboratory of Bioreactor Engineering, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Jie Wang
- Shanghai Key Laboratory of New Drug Design, State Key Laboratory of Bioreactor Engineering, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Shiliang Li
- Shanghai Key Laboratory of New Drug Design, State Key Laboratory of Bioreactor Engineering, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Honglin Li
- Shanghai Key Laboratory of New Drug Design, State Key Laboratory of Bioreactor Engineering, School of Pharmacy, East China University of Science and Technology, Shanghai, China
|
17
|
Abdolmaleki A, Ghasemi JB. Inhibition activity prediction for a dataset of candidates' drug by combining fuzzy logic with MLR/ANN QSAR models. Chem Biol Drug Des 2019; 93:1139-1157. [PMID: 31343121 DOI: 10.1111/cbdd.13511] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/07/2018] [Revised: 02/03/2019] [Accepted: 02/16/2019] [Indexed: 11/28/2022]
Abstract
A simple, low-computational-cost QSAR approach based on a hybrid of artificial intelligence methods was used. Approximately 90 pyridinylimidazole-based drug candidates with a range of potencies against p38R MAP kinase were investigated. To obtain more flexibility and a more effective capability for handling and processing real-world information, fuzzy set theory was introduced into the QSAR. An integration of multiple linear regression and artificial neural networks with adaptive neuro-fuzzy inference systems (ANFIS) was developed to predict the inhibition activity. The ANFIS algorithm was applied to identify suitable variables and then to find the optimal descriptors. A gradient descent with momentum backpropagation ANN was used to establish the nonlinear multivariate relationships between the chemical structural parameters and the biological response. A comparison between the results of the proposed linear and nonlinear regressions showed the superiority of QSAR modeling by the ANFIS-ANN method over MLR. The results demonstrated that ANFIS could be applied successfully for feature selection. The appearance of the Diam, Homo, and LogP descriptors in the model showed the importance of the steric, electronic, and thermodynamic interactions between a drug and its target site in the distribution of a compound within a biosystem and its interaction with competing binding sites.
Affiliation(s)
- Azizeh Abdolmaleki
- Department of Chemistry, Tuyserkan Branch, Islamic Azad University, Tuyserkan, Iran
- Jahan B Ghasemi
- Drug Design in Silico Lab., Chemistry Faculty, University of Tehran, Tehran, Iran
|
18
|
Garcia-Hernandez C, Fernández A, Serratosa F. Ligand-Based Virtual Screening Using Graph Edit Distance as Molecular Similarity Measure. J Chem Inf Model 2019; 59:1410-1421. [PMID: 30920214 PMCID: PMC6668628 DOI: 10.1021/acs.jcim.8b00820] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 11/30/2022]
Abstract
Extended reduced graphs provide summary representations of chemical structures using pharmacophore-type node descriptions to encode the relevant molecular properties. Commonly used similarity measures using reduced graphs convert these graphs into 2D vectors like fingerprints, before chemical comparisons are made. This study investigates the effectiveness of a graph-only driven molecular comparison by using extended reduced graphs along with graph edit distance methods for molecular similarity calculation as a tool for ligand-based virtual screening applications, which estimate the bioactivity of a chemical on the basis of the bioactivity of similar compounds. The results proved to be very stable and the graph edit distance method performed better than other methods previously used on reduced graphs. This is exemplified with six publicly available data sets: DUD-E, MUV, GLL&GDD, CAPST, NRLiSt BDB, and ULS-UDS. The screening and statistical tools available on the ligand-based virtual screening benchmarking platform and the RDKit were also used. In the experiments, our method performed better than other molecular similarity methods which use array representations in most cases. Overall, it is shown that extended reduced graphs along with graph edit distance is a combination of methods that has numerous applications and can identify bioactivity similarities in a structurally diverse group of molecules.
Affiliation(s)
- Carlos Garcia-Hernandez
- Departament d'Enginyeria Química, Universitat Rovira i Virgili, Tarragona, Catalunya 43007, Spain
- Alberto Fernández
- Departament d'Enginyeria Química, Universitat Rovira i Virgili, Tarragona, Catalunya 43007, Spain
- Francesc Serratosa
- Departament d'Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Tarragona, Catalunya 43007, Spain
|
19
|
A Toolbox for the Identification of Modes of Action of Natural Products. Progress in the Chemistry of Organic Natural Products 110 2019; 110:73-97. [DOI: 10.1007/978-3-030-14632-0_3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022]
|
20
|
Duran‐Frigola M, Fernández‐Torras A, Bertoni M, Aloy P. Formatting biological big data for modern machine learning in drug discovery. Wiley Interdisciplinary Reviews: Computational Molecular Science 2018. [DOI: 10.1002/wcms.1408] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/15/2022]
Affiliation(s)
- Miquel Duran‐Frigola
- Joint IRB‐BSC‐CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona) Barcelona Institute of Science and Technology Barcelona Spain
- Adrià Fernández‐Torras
- Joint IRB‐BSC‐CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona) Barcelona Institute of Science and Technology Barcelona Spain
- Martino Bertoni
- Joint IRB‐BSC‐CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona) Barcelona Institute of Science and Technology Barcelona Spain
- Patrick Aloy
- Joint IRB‐BSC‐CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona) Barcelona Institute of Science and Technology Barcelona Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA) Barcelona Spain
|
21
|
Cardoso‐Silva J, Papadatos G, Papageorgiou LG, Tsoka S. Optimal Piecewise Linear Regression Algorithm for QSAR Modelling. Mol Inform 2018; 38:e1800028. [DOI: 10.1002/minf.201800028] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 03/06/2018] [Accepted: 08/02/2018] [Indexed: 12/20/2022]
Affiliation(s)
- Jonathan Cardoso‐Silva
- Department of Informatics, Faculty of Natural and Mathematical Sciences, King's College London, Bush House, London WC2B 4BG, UK
- George Papadatos
- European Molecular Biology Laboratory – European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- GlaxoSmithKline, Gunnels Wood Road, Stevenage, Hertfordshire SG1 2NY, UK
- Lazaros G. Papageorgiou
- Centre for Process Systems Engineering, Department of Chemical Engineering, University College London, Torrington Place, London WC1E 7JE, UK
- Sophia Tsoka
- Department of Informatics, Faculty of Natural and Mathematical Sciences, King's College London, Bush House, London WC2B 4BG, UK
|
22
|
Gozalbes R, Vicente de Julián-Ortiz J. Applications of Chemoinformatics in Predictive Toxicology for Regulatory Purposes, Especially in the Context of the EU REACH Legislation. International Journal of Quantitative Structure-Property Relationships 2018. [DOI: 10.4018/ijqspr.2018010101] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/08/2022]
Abstract
Chemoinformatics methodologies such as QSAR/QSPR have been used for decades in drug discovery projects, especially for finding new compounds with therapeutic properties and optimizing the ADME properties of chemical series. The application of computational techniques in predictive toxicology is much more recent, and they are attracting increasing interest because of the new legal requirements imposed by national and international regulations. In the pharmaceutical field, the US Food and Drug Administration (FDA) supports the use of predictive models for regulatory decision-making when assessing the genotoxic and carcinogenic potential of drug impurities. In Europe, the REACH legislation promotes the use of QSAR in order to reduce the huge amount of animal testing needed to demonstrate the safety of new chemical entities subject to registration, provided the models meet specific conditions to ensure their quality and predictive power. In this review, the authors summarize the state of the art of in silico methods for regulatory purposes, with special emphasis on QSAR models.
|
23
|
Abstract
In this chapter, we introduce the basis of computational chemistry and discuss how computational methods have been extended to some biological properties, and to toxicology in particular. For about 20 years, chemical experimentation has increasingly been replaced by modeling and virtual experimentation, using a large core of mathematics, chemistry, physics, and algorithms. We then see how animal experiments, aimed at providing a standardized result about a biological property, can be mimicked by new in silico methods. Our emphasis here is on toxicology and on predicting properties through chemical structures. Two main streams of such models are available: models that consider the whole molecular structure to predict a value, namely QSAR (Quantitative Structure-Activity Relationships), and models that find relevant substructures to predict a class, namely SAR. The term in silico discovery is applied to chemical design, to computational toxicology, and to drug discovery. We discuss how experimental practice in biological science is moving more and more toward modeling and simulation. Such virtual experiments confirm hypotheses, provide data for regulation, and help in designing new chemicals.
|
24
|
Singh A, Singh R, Gupta N. Role of Supercomputers in Bioinformatics. Oncology 2017. [DOI: 10.4018/978-1-5225-0549-5.ch021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Thanks to effective and user-friendly components (i.e., supercomputers), rapid data analysis is now being accomplished. In bioinformatics, this is expanding many areas of research such as genomics, proteomics, and metabolomics. Structure-based drug design is one of the major areas of research aimed at curing human disease. This chapter opens with a discussion of supercomputing in sequence analysis, with a detailed table summarizing the software and Web-based programs used for sequence analysis. A brief discussion of supercomputing in virtual screening follows, in which resources such as DOCK, ZINC, and EDULISS are introduced. The chapter then presents the intricacies of advanced Quantitative Structure-Activity Relationship technologies such as Fragment-Based 2D QSAR, Multiple-Field 3D QSAR, and Amino Acid-Based Peptide Prediction in a manner similar to the concept of abstraction. Supercomputing in docking studies is emphasized, and docking software for Protein-Ligand docking, Protein-Protein docking, and Multi-Protein docking is listed. The chapter ends with the applications of supercomputing in widely used microarray data analysis.
Affiliation(s)
- Anamika Singh
- Maitreyi College, India & University of Delhi, India
- Rajeev Singh
- Division of RCH, Indian Council of Medical Research, India
- Neha Gupta
- Northeastern University, USA & Osmania University, India
|
25
|
Zoete V, Daina A, Bovigny C, Michielin O. SwissSimilarity: A Web Tool for Low to Ultra High Throughput Ligand-Based Virtual Screening. J Chem Inf Model 2016; 56:1399-404. [PMID: 27391578 DOI: 10.1021/acs.jcim.6b00174] [Citation(s) in RCA: 182] [Impact Index Per Article: 22.8] [Indexed: 11/28/2022]
Abstract
SwissSimilarity is a new web tool for rapid ligand-based virtual screening of small to unprecedented ultralarge libraries of small molecules. Screenable compounds include drugs, bioactive and commercial molecules, as well as 205 million virtual compounds readily synthesizable from commercially available synthetic reagents. Predictions can be carried out on-the-fly using six different screening approaches, including 2D molecular fingerprints as well as superpositional and fast nonsuperpositional 3D similarity methodologies. SwissSimilarity is part of a large initiative of the SIB Swiss Institute of Bioinformatics to provide online tools for computer-aided drug design, such as SwissDock, SwissBioisostere or SwissTargetPrediction with which it can interoperate, and is linked to other well-established online tools and databases. User interface and backend have been designed for simplicity and ease of use, to provide proficient virtual screening capabilities to specialists and nonexperts in the field. SwissSimilarity is accessible free of charge or login at http://www.swisssimilarity.ch.
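As a rough illustration of fingerprint-based ligand screening in general, a query fingerprint can be compared with every library fingerprint and the library ranked by similarity. The sketch below assumes the Tanimoto coefficient on sets of "on" bit positions, which is a common choice for 2D fingerprints; the abstract does not state which measure SwissSimilarity uses, and the function names, compound names, and fingerprints are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    # Convention assumed here: two empty fingerprints have similarity 0.
    return len(a & b) / union if union else 0.0


def screen(query_fp, library, top_n=3):
    """Rank library compounds by similarity to the query, best first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical fingerprints (positions of set bits).
query = {1, 4, 7, 9}
library = {
    "compound_A": {1, 4, 7, 9},
    "compound_B": {1, 4, 8},
    "compound_C": {2, 3, 5},
}

for name, similarity in screen(query, library):
    print(name, round(similarity, 2))
```

Representing fingerprints as sets keeps the sketch short; production tools typically use packed bit vectors so that hundreds of millions of comparisons stay fast.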
Affiliation(s)
- Vincent Zoete
- SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode , CH-1015 Lausanne, Switzerland
- Antoine Daina
- SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, CH-1015 Lausanne, Switzerland
- Christophe Bovigny
- SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, CH-1015 Lausanne, Switzerland
- Olivier Michielin
- SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, CH-1015 Lausanne, Switzerland; Ludwig Institute for Cancer Research, Centre Hospitalier Universitaire Vaudois, CH-1011 Lausanne, Switzerland; Department of Oncology, University of Lausanne and Centre Hospitalier Universitaire Vaudois, CH-1011 Lausanne, Switzerland
|
26
|
Bains W. Low potency toxins reveal dense interaction networks in metabolism. BMC Syst Biol 2016; 10:19. [PMID: 26897366 PMCID: PMC4761184 DOI: 10.1186/s12918-016-0262-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/04/2015] [Accepted: 01/29/2016] [Indexed: 11/13/2022]
Abstract
BACKGROUND The chemicals of metabolism are constructed of a small set of atoms and bonds. This may be because chemical structures outside the chemical space in which life operates are incompatible with biochemistry, or because mechanisms to make or utilize such excluded structures have not evolved. In this paper I address the extent to which biochemistry is restricted to a small fraction of the chemical space of possible chemicals, a restricted subset that I call Biochemical Space. I explore evidence that this restriction is at least in part due to selection against specific structures, and suggest a mechanism by which this occurs. RESULTS Chemicals that contain structures that are outside Biochemical Space (UnBiological groups) are more likely to be toxic to a wide range of organisms, even though they have no specifically toxic groups and no obvious mechanism of toxicity. This correlation of UnBiological with toxicity is stronger for low potency (millimolar) toxins. I relate this to the observation that most chemicals interact with many biological structures at low millimolar toxicity. I hypothesise that life has to select its components not only to have a specific set of functions but also to avoid interactions with all the other components of life that might degrade their function. CONCLUSIONS The chemistry of life has to form a dense, self-consistent network of chemical structures, and cannot easily be arbitrarily extended. The toxicity of arbitrary chemicals is a reflection of the disruption to that network occasioned by trying to insert a chemical into it without also selecting all the other components to tolerate that chemical. This suggests new ways to test for the toxicity of chemicals, and that engineering organisms to make high concentrations of materials such as chemical precursors or fuels may require more substantial engineering than just of the synthetic pathways involved.
Affiliation(s)
- William Bains
- Earth, Atmospheric and Planetary Sciences Department, MIT, 77 Mass Avenue, Cambridge, MA, 02139, USA.
- Rufus Scientific Ltd., 37 The Moor, Melbourn, Royston, Herts, SG8 6ED, UK.
|
27
|
Ferreira RDQ, Greco SJ, Delarmelina M, Weber KC. Electrochemical quantification of the structure/antioxidant activity relationship of flavonoids. Electrochim Acta 2015. [DOI: 10.1016/j.electacta.2015.02.164] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 01/31/2023]
|
28
|
Roy K, Kar S, Das RN. QSAR/QSPR Modeling: Introduction. SpringerBriefs in Molecular Science 2015. [DOI: 10.1007/978-3-319-17281-1_1] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
|
29
|
Gini G, Franchi AM, Manganaro A, Golbamaki A, Benfenati E. ToxRead: a tool to assist in read across and its use to assess mutagenicity of chemicals. SAR QSAR Environ Res 2014; 25:999-1011. [PMID: 25511972 DOI: 10.1080/1062936x.2014.976267] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 08/05/2014] [Accepted: 09/15/2014] [Indexed: 06/04/2023]
Abstract
Life sciences, and toxicology in particular, are heavily impacted by the development of methods for data collection and data analysis; they are moving from an analytical approach to a modelling approach. The scarce availability of experimental data is a known bottleneck in assessing the properties of new chemicals. Even when a model is available, the resulting predictions have to be assessed by close scrutiny of the chemicals and the biological properties of the compounds concerned. To avoid unnecessary testing, a read across strategy is often suggested and used. In this paper we discuss how to improve and standardize read across activity using ad hoc visualization and data search methods which use similarity measures and fragment search to organize in a chart a picture of all the relevant information that the expert needs to make an assessment. We show in particular how to apply our system to the case of mutagenicity.
Affiliation(s)
- G Gini
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
|
30
|
Kumar V, Krishna S, Siddiqi MI. Virtual screening strategies: recent advances in the identification and design of anti-cancer agents. Methods 2014; 71:64-70. [PMID: 25171960 DOI: 10.1016/j.ymeth.2014.08.010] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 06/12/2014] [Revised: 07/31/2014] [Accepted: 08/19/2014] [Indexed: 01/29/2023]
Abstract
Virtual screening (VS) is a well-established technique that is now routinely employed in the computer-aided drug design process. VS can be broadly classified into two categories, i.e., ligand-based and structure-based approaches. In recent years, VS has emerged as a time-saving and cost-effective technique, capable of screening millions of compounds in a user-friendly manner. In the area of cancer drug design, VS methods have been widely used and have helped in identifying novel molecules as potential anti-cancer agents. Both ligand-based VS (LBVS) and structure-based VS (SBVS) methods have been highly useful in the identification of a number of potential anti-cancer agents exhibiting activities in the nanomolar range. In tune with the rapid progress in computational power, VS has witnessed significant change in terms of speed and hit rate, and in the future it is expected that VS will be a preferred alternative to high throughput screening (HTS). This review discusses recent trends and the contribution of VS in the area of anti-cancer drug discovery.
Affiliation(s)
- Vikash Kumar
- Molecular & Structural Biology Division, CSIR-Central Drug Research Institute, Lucknow, India
- Shagun Krishna
- Molecular & Structural Biology Division, CSIR-Central Drug Research Institute, Lucknow, India
- Mohammad Imran Siddiqi
- Molecular & Structural Biology Division, CSIR-Central Drug Research Institute, Lucknow, India; Academy of Scientific and Innovative Research, New Delhi, India.
|
31
|
Beck B, Geppert T. Industrial applications of in silico ADMET. J Mol Model 2014; 20:2322. [PMID: 24972798 DOI: 10.1007/s00894-014-2322-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 03/31/2014] [Accepted: 05/27/2014] [Indexed: 11/26/2022]
Abstract
Quantitative structure-activity relationship (QSAR) modeling has been in use for several decades now. One branch of it, in silico ADMET, has become increasingly important since the late 1990s, as studies indicated that poor pharmacokinetics and toxicity were important causes of costly late-stage failures in drug development. In this paper we describe some of the available methods and best practices for the different stages of the in silico model building process. We also describe some more recent developments, like automated model building and the prediction probability. Finally, we discuss the use of in silico ADMET for "big data" and the importance and possible further development of interpretable models.
Affiliation(s)
- Bernd Beck
- Department of Lead Identification and Optimization Support, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorferstrasse 65, 88397, Biberach an der Riss, Germany
|
32
|
|
33
|
Ong CE, Pan Y, Mak JW, Ismail R. In vitro approaches to investigate cytochrome P450 activities: update on current status and their applicability. Expert Opin Drug Metab Toxicol 2013; 9:1097-113. [PMID: 23682848 DOI: 10.1517/17425255.2013.800482] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 11/05/2022]
Abstract
INTRODUCTION Cytochromes P450 (CYPs) play a central role in the Phase I metabolism of drugs and other xenobiotics. It is estimated that CYPs can metabolize up to two-thirds of drugs present in humans. Over the past two decades, there have been numerous advances in in vitro methodologies to characterize drug metabolism and interaction involving CYPs. AREAS COVERED This review focuses on the use of in vitro methodologies to examine CYPs' role in drug metabolism and interaction. There is an emphasis on their current development, applicability, advantages and limitations as well as the use of in silico approaches in complementing and supporting in vitro data. The article also highlights the challenges in extrapolating in vitro data to in vivo situations. EXPERT OPINION Advances in in vitro methodologies have been made such that data can be used for in vivo prediction with a comfortable degree of confidence. Improved assay designs and analytical techniques have permitted the development of miniaturized assay formats and automated systems with improved sensitivity and throughput capacity. High-quality experimental designs and scientifically rigorous assessment/validation protocols remain crucial in developing reliable and robust in vitro models. With continued progress in the field, in vitro methodologies will continue to be employed in evaluating CYP activities in pharmaceutical industries and laboratories.
Affiliation(s)
- Chin Eng Ong
- Monash University Sunway Campus, Jeffrey Cheah School of Medicine and Health Sciences, Jalan Lagoon Selatan, 46150 Bandar Sunway, Selangor, Malaysia.
|
34
|
|
35
|
Benigni R, Battistelli CL, Bossa C, Colafranceschi M, Tcheremenskaia O. Mutagenicity, carcinogenicity, and other end points. Methods Mol Biol 2013; 930:67-98. [PMID: 23086838 DOI: 10.1007/978-1-62703-059-5_4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 02/21/2023]
Abstract
Aiming at understanding the structural and physical chemical basis of the biological activity of chemicals, the science of structure-activity relationships has seen dramatic progress in the last decades. Coarse-grain, qualitative approaches (e.g., the structural alerts), and fine-tuned quantitative structure-activity relationship models have been developed and used to predict the toxicological properties of untested chemicals. More recently, a number of approaches and concepts have been developed as support to, and corollary of, the structure-activity methods. These approaches (e.g., chemical relational databases, expert systems, software tools for manipulating the chemical information) have dramatically expanded the reach of the structure-activity work; at present, they are powerful and inescapable tools for computer chemists, toxicologists, and regulators. This chapter, after a general overview of traditional and well-known approaches, gives a detailed presentation of the latter more recent support tools freely available in the public domain.
Affiliation(s)
- Romualdo Benigni
- Environment and Health Department, Istituto Superiore di Sanita', Rome, Italy.
|
36
|
Chakraborty A, Pan S, Chattaraj PK. Biological Activity and Toxicity: A Conceptual DFT Approach. STRUCTURE AND BONDING 2013. [DOI: 10.1007/978-3-642-32750-6_5] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 12/01/2022]
|
37
|
Ferrari T, Cattaneo D, Gini G, Golbamaki Bakhtyari N, Manganaro A, Benfenati E. Automatic knowledge extraction from chemical structures: the case of mutagenicity prediction. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2013; 24:365-83. [PMID: 23710765 DOI: 10.1080/1062936x.2013.773376] [Citation(s) in RCA: 98] [Impact Index Per Article: 8.9] [Indexed: 05/18/2023]
Abstract
This work proposes a new structure-activity relationship (SAR) approach to mine molecular fragments that act as structural alerts for biological activity. The entire process is designed to fit with human reasoning, not only to make the predictions more reliable but also to permit clear control by the user in order to meet customized requirements. This approach has been tested on the mutagenicity endpoint, showing marked prediction skills and, more interestingly, bringing to the surface much of the knowledge already collected in the literature as well as new evidence.
Affiliation(s)
- T Ferrari
- Department of Electronics and Information, Politecnico di Milano, Milan, Italy
|
38
|
Structure-based predictions of 13C-NMR chemical shifts for a series of 2-functionalized 5-(methylsulfonyl)-1-phenyl-1H-indoles derivatives using GA-based MLR method. J Mol Struct 2012. [DOI: 10.1016/j.molstruc.2012.06.042] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
|
39
|
Antitumor structure–activity relationship in bis-stannoxane derivatives from pyridine dicarboxylic and benzoic acids. Inorganica Chim Acta 2012. [DOI: 10.1016/j.ica.2012.06.029] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
|
40
|
Gleeson MP, Montanari D. Strategies for the generation, validation and application of in silico ADMET models in lead generation and optimization. Expert Opin Drug Metab Toxicol 2012; 8:1435-46. [PMID: 22849616 DOI: 10.1517/17425255.2012.711317] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/05/2022]
Abstract
INTRODUCTION The most desirable chemical starting point in drug discovery is a hit or lead with a good overall profile and, where there may be issues, a clear SAR strategy should be identifiable to minimize them. Filtering based on drug-likeness concepts is a first step, but more accurate theoretical methods are needed to i) estimate the biological profile of the molecule in question and ii) estimate, based on the underlying structure-activity relationships used by the model, whether it is likely that the molecule in question can be altered to remove these liabilities. AREAS COVERED In this paper, the authors discuss the generation of ADMET models and their practical use in decision making. They discuss the issues surrounding data collation, experimental errors, the model assessment and validation steps, as well as the different types of descriptors and statistical models that can be used. This is followed by a discussion on how the model accuracy will dictate when and where it can be used in the drug discovery process. The authors also discuss how models can be developed to more effectively enable multiple parameter optimization. EXPERT OPINION Models can be applied in lead generation and lead optimization steps to i) rank order a collection of hits, ii) prioritize the experimental assays needed for different hit series, iii) assess the likelihood of resolving a problem that might be present in a particular series in lead optimization and iv) screen a virtual library based on a hit or lead series to assess the impact of diverse structural changes on the predicted properties.
Affiliation(s)
- Matthew Paul Gleeson
- Kasetsart University, Faculty of Science, Department of Chemistry, 50 Phaholyothin Rd, Chatuchak, Bangkok 10900, Thailand.
|
41
|
Dashtbozorgi Z, Golmohammadi H. Prediction of gas to water partition coefficient of some organic compounds using theoretically derived molecular descriptors. J STRUCT CHEM+ 2012. [DOI: 10.1134/s0022476612020096] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]
|
42
|
Ghasemi JB, Zolfonoun E. A New Variable Selection Method Based on Mutual Information Maximization by Replacing Collinear Variables for Nonlinear Quantitative Structure-Property Relationship Models. B KOREAN CHEM SOC 2012. [DOI: 10.5012/bkcs.2012.33.5.1527] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022]
|
43
|
Abstract
Computational methods now play an integral role in modern drug discovery, and include the design and management of small molecule libraries, initial hit identification through virtual screening, optimization of the affinity and selectivity of hits, and improving the physicochemical properties of the lead compounds. In this chapter, we survey the most important data sources for the discovery of new molecular entities, and discuss the key considerations and guidelines for virtual chemical library design.
Affiliation(s)
- Paul H Bernardo
- Institute of Chemical and Engineering Sciences, Agency for Science Technology and Research (A STAR), Singapore, Singapore
|
44
|
Singh R. Learning and Prediction of Complex Molecular Structure-Property Relationships. Mach Learn 2012. [DOI: 10.4018/978-1-60960-818-7.ch518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
The problem of modeling and predicting complex structure-property relationships, such as the absorption, distribution, metabolism, and excretion of putative drug molecules, is a fundamental one in contemporary drug discovery. An accurate model can be used not only to predict the behavior of a molecule and understand how structural variations may influence molecular property, but also to identify regions of molecular space that hold promise in the context of a specific investigation. However, a variety of factors contribute to the difficulty of constructing robust structure-activity models for such complex properties. These include conceptual issues related to how well the true bio-chemical property is accounted for by formulation of the specific learning strategy, algorithmic issues associated with determining the proper molecular descriptors, access to small quantities of data, possibly on tens of molecules only, due to the high cost and complexity of the experimental process, and the complex nature of bio-chemical phenomena underlying the data. This chapter attempts to address this problem from the rudiments: the authors first identify and discuss the salient computational issues that span (and complicate) structure-property modeling formulations and present a brief review of the state-of-the-art. The authors then consider a specific problem: that of modeling intestinal drug absorption, where many of the aforementioned factors play a role. In addressing them, their solution uses a novel characterization of molecular space based on the notion of surface-based molecular similarity. This is followed by identifying a statistically relevant set of molecular descriptors, which along with an appropriate machine learning technique, is used to build the structure-property model. The authors propose simultaneous use of both ratio and ordinal error-measures for model construction and validation. The applicability of the approach is demonstrated in a real world case study.
|
45
|
SHIGA M, TAKAHASHI Y. Compression of Topological Fragment Spectra (TFS) for Accelerating Chemical Data Mining. JOURNAL OF COMPUTER CHEMISTRY-JAPAN 2012. [DOI: 10.2477/jccj.2012-0002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
46
|
Park K, Kim D. Drug-drug relationship based on target information: application to drug target identification. BMC SYSTEMS BIOLOGY 2011; 5 Suppl 2:S12. [PMID: 22784569 PMCID: PMC3287478 DOI: 10.1186/1752-0509-5-s2-s12] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/26/2022]
Abstract
Background Drugs that bind to common targets likely exert similar activities. In this target-centric view, the inclusion of richer target information may better represent the relationships between drugs and their activities. Under this assumption, we expanded the “common binding rule” assumption of QSAR to create a new drug-drug relationship score (DRS). Method Our method uses various chemical features to encode drug target information into the drug-drug relationship information. Specifically, drug pairs were transformed into numerical vectors containing the basal drug properties and their differences. After that, machine learning techniques such as data cleaning, dimension reduction, and ensemble classifier were used to prioritize drug pairs bound to a common target. In other words, the estimation of the drug-drug relationship is restated as a large-scale classification problem, which provides the framework for using state-of-the-art machine learning techniques with thousands of chemical features for newly defining drug-drug relationships. Conclusions Various aspects of the presented score were examined to determine its reliability and usefulness: the abundance of common domains for the predicted drug pairs, ca. 80% coverage for known targets, successful identifications of unknown targets, and a meaningful correlation with another cutting-edge method for analyzing drug similarities. The most significant strength of our method is that the DRS can be used to describe phenotypic similarities, such as pharmacological effects.
Affiliation(s)
- Keunwan Park
- Department of Bio and Brain Engineering, KAIST, 373-1, Guseong-dong, Yuseong-gu, Daejeon, 305-701, Republic of Korea
|
47
|
Lagorce D, Villoutreix BO, Miteva MA. Three-dimensional structure generators of drug-like compounds: DG-AMMOS, an open-source package. Expert Opin Drug Discov 2011; 6:339-51. [DOI: 10.1517/17460441.2011.554393] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/05/2022]
|
48
|
Benigni R, Bossa C. Mechanisms of Chemical Carcinogenicity and Mutagenicity: A Review with Implications for Predictive Toxicology. Chem Rev 2011; 111:2507-36. [PMID: 21265518 DOI: 10.1021/cr100222q] [Citation(s) in RCA: 239] [Impact Index Per Article: 18.4] [Indexed: 01/03/2023]
Affiliation(s)
- Romualdo Benigni
- Istituto Superiore di Sanita’, Environment and Health Department, Viale Regina Elena, 299 00161 Rome, Italy
- Cecilia Bossa
- Istituto Superiore di Sanita’, Environment and Health Department, Viale Regina Elena, 299 00161 Rome, Italy
|
49
|
Lange KM, Hodeck KF, Schade U, Aziz EF. Nature of the Hydrogen Bond of Water in Solvents of Different Polarities. J Phys Chem B 2010; 114:16997-7001. [DOI: 10.1021/jp109790z] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
Affiliation(s)
- Kathrin M. Lange
- Helmholtz-Zentrum Berlin für Materialien und Energie, c/o BESSY GmbH, Albert-Einstein-Strasse 15, 12489 Berlin, Germany, and FB Physik, Freie Universität Berlin, Arnimallee 14, D-14195 Berlin, Germany
- Kai F. Hodeck
- Helmholtz-Zentrum Berlin für Materialien und Energie, c/o BESSY GmbH, Albert-Einstein-Strasse 15, 12489 Berlin, Germany, and FB Physik, Freie Universität Berlin, Arnimallee 14, D-14195 Berlin, Germany
- Ulrich Schade
- Helmholtz-Zentrum Berlin für Materialien und Energie, c/o BESSY GmbH, Albert-Einstein-Strasse 15, 12489 Berlin, Germany, and FB Physik, Freie Universität Berlin, Arnimallee 14, D-14195 Berlin, Germany
- Emad F. Aziz
- Helmholtz-Zentrum Berlin für Materialien und Energie, c/o BESSY GmbH, Albert-Einstein-Strasse 15, 12489 Berlin, Germany, and FB Physik, Freie Universität Berlin, Arnimallee 14, D-14195 Berlin, Germany
|
50
|
Naik PK, Alam A, Malhotra A, Rizvi O. Molecular Modeling and Structure-Activity Relationship of Podophyllotoxin and Its Congeners. J Biomol Screen 2010; 15:528-40. [DOI: 10.1177/1087057110368994] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
A quantitative structure-activity relationship (QSAR) model has been developed between cytotoxic activity and structural properties by considering a data set of 119 podophyllotoxin analogs based on 2D and 3D structural descriptors. A systematic stepwise searching approach of zero tests, a missing value test, a simple correlation test, a multicollinearity test, and a genetic algorithm method of variable selection was used to generate the model. A statistically significant model ($r^2_{\mathrm{train}} = 0.906$; $q^2_{\mathrm{cv}} = 0.893$) was obtained with the molecular descriptors. The robustness of the QSAR model was characterized by the values of the internal leave-one-out cross-validated regression coefficient ($q^2_{\mathrm{cv}}$) for the training set and $r^2_{\mathrm{test}}$ for the test set. The overall root mean square error (RMSE) between the experimental and predicted pIC50 value was 0.265 and $r^2_{\mathrm{test}} = 0.824$, revealing good predictability of the QSAR model. For an external data set of 16 podophyllotoxin analogs, the QSAR model was able to predict the tubulin polymerization inhibition and mechanistically cytotoxic activity with an RMSE value of 0.295 in comparison to experimental values. The QSAR model developed in this study shall aid further design of novel potent podophyllotoxin derivatives.
Affiliation(s)
- Pradeep Kumar Naik
- Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh, India.
|