1
|
Prediction and Screening Model for Products Based on Fusion Regression and XGBoost Classification. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:4987639. [PMID: 35958779 PMCID: PMC9357736 DOI: 10.1155/2022/4987639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Revised: 06/14/2022] [Accepted: 06/27/2022] [Indexed: 11/18/2022]
Abstract
Predicting the performance of candidates and screening them by predicted performance are at the core of product development. For example, performance prediction and screening of equipment components and parts are an important guarantee of equipment reliability, and the prediction and screening of drug bioactivity and performance are key to pharmaceutical product development. The main reasons for failure in pharmaceutical discovery are the low bioactivity of candidate compounds and deficiencies in their efficacy and safety, which are related to the absorption, distribution, metabolism, excretion, and toxicity (ADMET) of the compounds. It is therefore necessary to perform systematic bioactivity prediction and ADMET property evaluation for candidate compounds quickly and effectively in the early stage of drug discovery. In this paper, a data-driven screening and prediction model for pharmaceutical products is proposed to screen drug candidates with higher bioactivity and better ADMET properties. First, a quantitative prediction method for bioactivity value is proposed using the fusion regression of LGBM and a neural network based on backpropagation (BP-NN). Then, an ADMET property prediction method is proposed using XGBoost. Based on the predicted bioactivity value and ADMET properties, the BVAP method is defined to screen the drug candidates. The screening model is validated on a dataset of active compounds that antagonize ERα, on which the mean square error (MSE) of the fusion regression is 1.1496 and the XGBoost prediction accuracies for the ADMET properties are 94.0% for Caco-2, 95.7% for CYP3A4, 89.4% for hERG, 88.6% for HOB, and 96.2% for MN.
Compared with methods commonly used for ADMET property prediction, such as SVM, RF, KNN, LDA, and NB, the XGBoost model in this paper has the highest prediction accuracy and AUC value. It therefore offers better guidance and can help screen pharmaceutical product candidates with good bioactivity, pharmacokinetic properties, and safety.
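The abstract does not say how the LGBM and BP-NN outputs are combined, so the following is only a minimal sketch of one common fusion scheme: a weighted average whose weight is chosen to minimise the MSE on held-out data. The function names (`mse`, `fuse`, `best_weight`) and the toy numbers are hypothetical and are not taken from the paper.

```python
def mse(y_true, y_pred):
    """Mean square error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def fuse(pred_a, pred_b, weight):
    """Weighted average of two models' predictions (assumed fusion rule)."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(pred_a, pred_b)]


def best_weight(y_true, pred_a, pred_b, steps=20):
    """Grid-search the fusion weight in [0, 1] that minimises the MSE."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: mse(y_true, fuse(pred_a, pred_b, w)))


# Toy held-out data: model A overshoots by 1, model B undershoots by 1.
y_val = [1.0, 2.0, 3.0]
pred_model_a = [2.0, 3.0, 4.0]  # stand-in for a gradient-boosting regressor
pred_model_b = [0.0, 1.0, 2.0]  # stand-in for a neural-network regressor

w = best_weight(y_val, pred_model_a, pred_model_b)
fused_error = mse(y_val, fuse(pred_model_a, pred_model_b, w))
```

In practice the two prediction lists would come from fitted regressors, and the weight would be selected on a validation split that is kept separate from the test data used to report the final MSE.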
|
2
|
Artificial intelligence and machine-learning approaches in structure and ligand-based discovery of drugs affecting central nervous system. Mol Divers 2022; 27:959-985. [PMID: 35819579 DOI: 10.1007/s11030-022-10489-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/27/2022] [Accepted: 06/21/2022] [Indexed: 12/11/2022]
Abstract
CNS disorders are indications with very high unmet medical need, a relatively small number of available drugs, and a subpar satisfaction level among patients and caregivers. Discovery of CNS drugs is an extremely expensive affair with its own unique challenges, leading to extremely high attrition rates and low efficiency. With the explosion of data in the information age, there is hardly any aspect of life that has not been touched by data-driven technologies such as artificial intelligence (AI) and machine learning (ML). Drug discovery is no exception: the emergence of big data via genomic, proteomic, biological, and chemical technologies has driven pharmaceutical giants to collaborate with AI-oriented companies to revolutionise drug discovery, with the goal of increasing the efficiency of the process. In recent years, many examples of innovative applications of AI and ML techniques in CNS drug discovery have been reported. Research on therapeutics for diseases such as schizophrenia, Alzheimer's and Parkinsonism has been given a new direction and thrust by these developments. AI and ML have been applied to both ligand-based and structure-based drug discovery and design of CNS therapeutics. In this review, we summarise the general aspects of AI and ML from the perspective of drug discovery, followed by comprehensive coverage of recent developments in the applications of AI/ML techniques in CNS drug discovery.
|
3
|
Cottura N, Kinvig H, Grañana-Castillo S, Wood A, Siccardi M. Drug-Drug Interactions in People Living with HIV at Risk of Hepatic and Renal Impairment: Current Status and Future Perspectives. J Clin Pharmacol 2022; 62:835-846. [PMID: 34990024 PMCID: PMC9304147 DOI: 10.1002/jcph.2025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/01/2021] [Accepted: 01/03/2022] [Indexed: 11/10/2022]
Abstract
Despite the advancement of antiretroviral therapy (ART) for the treatment of human immunodeficiency virus (HIV), drug–drug interactions (DDIs) remain a relevant clinical issue for people living with HIV receiving ART. Antiretroviral (ARV) drugs can be victims and perpetrators of DDIs, and a detailed investigation during drug discovery and development is required to determine whether dose adjustments are necessary or coadministrations are contraindicated. Maintaining therapeutic ARV plasma concentrations is essential for successful ART, and changes resulting from potential DDIs could lead to toxicity, treatment failure, or the emergence of ARV-resistant HIV. The challenges surrounding DDI management are complex in special populations of people living with HIV, and management often lacks evidence-based guidance as a result of the underrepresentation of these populations in clinical investigations. Specifically, the prevalence of hepatic and renal impairment in people living with HIV is between 5 and 10 times greater than in people who are HIV-negative, with each condition constituting approximately 15% of non-AIDS-related mortality. Therapeutic strategies tend to revolve around the treatment of risk factors that lead to hepatic and renal impairment, such as hepatitis C, hepatitis B, hypertension, hyperlipidemia, and diabetes. These strategies result in a diverse range of potential DDIs with ART. The purpose of this review was 2-fold: first, to summarize current pharmacokinetic DDIs and their mechanisms between ARVs and co-medications used for the prevention and treatment of hepatic and renal impairment in people living with HIV; second, to identify existing knowledge gaps surrounding DDIs related to these special populations and suggest areas and techniques to focus upon in future research efforts.
Affiliation(s)
- Nicolas Cottura
  - Department of Pharmacology and Therapeutics, University of Liverpool, Liverpool, UK
- Hannah Kinvig
  - Department of Pharmacology and Therapeutics, University of Liverpool, Liverpool, UK
- Adam Wood
  - Department of Pharmacology and Therapeutics, University of Liverpool, Liverpool, UK
- Marco Siccardi
  - Department of Pharmacology and Therapeutics, University of Liverpool, Liverpool, UK
|
4
|
Castro LHE, Sant'Anna CMR. Molecular Modeling Techniques Applied to the Design of Multitarget Drugs: Methods and Applications. Curr Top Med Chem 2021; 22:333-346. [PMID: 34844540 DOI: 10.2174/1568026621666211129140958] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2021] [Revised: 10/23/2021] [Accepted: 10/28/2021] [Indexed: 11/22/2022]
Abstract
Multifactorial diseases, such as cancer and diabetes, present a challenge for the traditional "one-target, one-disease" paradigm due to their complex pathogenic mechanisms. Although a combination of drugs can be used, a multitarget drug may be a better choice in view of its efficacy, fewer adverse effects, and lower chance of resistance development. The computer-based design of multitarget drugs can exploit the same techniques used for single-target drug design, but the difficulty of obtaining drugs capable of modulating two or more targets with similar efficacy imposes new challenges, whose solutions involve adapting known techniques and developing new ones, including machine-learning approaches. In this review, some structure-based (SBDD) and ligand-based (LBDD) drug design techniques for multitarget drug design are discussed, together with some cases where the application of such techniques led to effective multitarget ligands.
Affiliation(s)
- Carlos Mauricio R Sant'Anna
  - Programa de Pós-Graduação em Química, Instituto de Química, Universidade Federal Rural do Rio de Janeiro, Seropédica, Brazil
|
5
|
Nayarisseri A, Khandelwal R, Tanwar P, Madhavi M, Sharma D, Thakur G, Speck-Planche A, Singh SK. Artificial Intelligence, Big Data and Machine Learning Approaches in Precision Medicine & Drug Discovery. Curr Drug Targets 2021; 22:631-655. [PMID: 33397265 DOI: 10.2174/1389450122999210104205732] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 05/23/2020] [Revised: 08/21/2020] [Accepted: 09/14/2020] [Indexed: 11/22/2022]
Abstract
Artificial intelligence is revolutionizing the drug development process, as it can quickly identify potential biologically active compounds from millions of candidates. The present review gives an overview of applications of machine-learning-based tools, such as GOLD, Deep PVP, and LIBSVM, and of the algorithms involved, such as support vector machine (SVM), random forest (RF), decision tree, and artificial neural network (ANN), at various stages of drug design and development. These techniques can be employed in SNP discovery, drug repurposing, ligand-based drug design (LBDD), ligand-based virtual screening (LBVS), structure-based virtual screening (SBVS), lead identification, quantitative structure-activity relationship (QSAR) modeling, and ADMET analysis. SVM is reported to have exhibited better performance, indicating that the classification model will have great applications in human intestinal absorption (HIA) prediction. Successful cases have been reported that demonstrate the efficiency of SVM and RF models in identifying JFD00950 as a novel compound targeting a colon cancer cell line, DLD-1, by inhibition of FEN1 cytotoxic and cleavage activity. Furthermore, a QSAR model using ANN was used to predict flavonoid inhibitory effects on AR activity as a potent treatment for diabetes mellitus (DM). Hence, in the era of big data, ML approaches have evolved into a powerful and efficient way to deal with the huge amounts of data generated in modern drug discovery, to model small-molecule drugs and gene biomarkers, and to identify novel drug targets for various diseases.
Affiliation(s)
- Anuraj Nayarisseri
  - In silico Research Laboratory, Eminent Biosciences, Mahalakshmi Nagar, Indore - 452010, Madhya Pradesh, India
- Ravina Khandelwal
  - In silico Research Laboratory, Eminent Biosciences, Mahalakshmi Nagar, Indore - 452010, Madhya Pradesh, India
- Poonam Tanwar
  - In silico Research Laboratory, Eminent Biosciences, Mahalakshmi Nagar, Indore - 452010, Madhya Pradesh, India
- Maddala Madhavi
  - Department of Zoology, Nizam College, Osmania University, Hyderabad - 500001, Telangana State, India
- Diksha Sharma
  - In silico Research Laboratory, Eminent Biosciences, Mahalakshmi Nagar, Indore - 452010, Madhya Pradesh, India
- Garima Thakur
  - In silico Research Laboratory, Eminent Biosciences, Mahalakshmi Nagar, Indore - 452010, Madhya Pradesh, India
- Alejandro Speck-Planche
  - Programa Institucional de Fomento a la Investigacion, Desarrollo e Innovacion, Universidad Tecnologica Metropolitana, Ignacio Valdivieso 2409, P.O. 8940577, San Joaquin, Santiago, Chile
- Sanjeev Kumar Singh
  - Computer Aided Drug Designing and Molecular Modeling Lab, Department of Bioinformatics, Alagappa University, Karaikudi-630003, Tamil Nadu, India
|
6
|
Ulenberg S, Belka M, Georgiev P, Ślifirski G, Król M, Herold F, Bączek T. The influence of phase II enzymes on in vitro half-life of pirydo[1,2-c]pirymidine derivatives as structural analogues of arylpiperazine. Microchem J 2020. [DOI: 10.1016/j.microc.2020.105550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
7
|
Soubraylu S, Rajalakshmi R. Hybrid convolutional bidirectional recurrent neural network based sentiment analysis on movie reviews. Comput Intell 2020. [DOI: 10.1111/coin.12400] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Affiliation(s)
- Sivakumar Soubraylu
  - School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India
- Ratnavel Rajalakshmi
  - School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India
|
8
|
Lambard G, Gracheva E. SMILES-X: autonomous molecular compounds characterization for small datasets without descriptors. MACHINE LEARNING: SCIENCE AND TECHNOLOGY 2020. [DOI: 10.1088/2632-2153/ab57f3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/18/2022] Open
Abstract
There is more and more evidence that machine learning can be successfully applied in materials science and related fields. However, datasets in these fields are often quite small (from tens to several thousands of samples). This means the most advanced machine learning techniques remain neglected, as they are considered to be applicable to big data only. Moreover, materials informatics methods often rely on human-engineered descriptors, that should be carefully chosen, or even created, to fit the physicochemical property that one intends to predict. In this article, we propose a new method that tackles both the issue of small datasets and the difficulty of developing task-specific descriptors. The SMILES-X is an autonomous pipeline for molecular compounds characterisation based on a {Embed-Encode-Attend-Predict} neural architecture with a data-specific Bayesian hyper-parameters optimisation. The only input to the architecture, the SMILES strings, are de-canonicalised in order to efficiently augment the data. One of the key features of the architecture is the attention mechanism, which enables the interpretation of output predictions without extra computational cost. The SMILES-X achieves state-of-the-art results in the inference of aqueous solubility ($\overline{\mathrm{RMSE}}_{\mathrm{test}} \simeq 0.57 \pm 0.07$ mols/L), hydration free energy ($\overline{\mathrm{RMSE}}_{\mathrm{test}} \simeq 0.81 \pm 0.22$ kcal/mol, which is ∼24.5% better than molecular dynamics simulations), and octanol/water distribution coefficient ($\overline{\mathrm{RMSE}}_{\mathrm{test}} \simeq 0.59 \pm 0.02$ for LogD at pH 7.4) of molecular compounds. The SMILES-X is intended to become an important asset in the toolkit of materials scientists and chemists. The source code for the SMILES-X is available at github.com/GLambard/SMILES-X.
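The figures quoted above have the form of a mean test RMSE plus or minus a spread. A minimal sketch of how such a summary can be computed follows, assuming the spread is the sample standard deviation over repeated train/test splits (the abstract does not state this). The helper names and the per-split numbers are made up for illustration and are not values from the paper.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def summarize(rmse_per_split):
    """Return (mean, sample standard deviation) of per-split RMSE values."""
    return statistics.mean(rmse_per_split), statistics.stdev(rmse_per_split)


# Hypothetical per-split test RMSEs, not taken from the paper.
splits = [0.50, 0.60, 0.70]
mean_rmse, spread = summarize(splits)
print(f"RMSE_test = {mean_rmse:.2f} +/- {spread:.2f}")
```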
|
9
|
Hong J, Luo Y, Zhang Y, Ying J, Xue W, Xie T, Tao L, Zhu F. Protein functional annotation of simultaneously improved stability, accuracy and false discovery rate achieved by a sequence-based deep learning. Brief Bioinform 2019; 21:1437-1447. [PMID: 31504150 PMCID: PMC7412958 DOI: 10.1093/bib/bbz081] [Citation(s) in RCA: 90] [Impact Index Per Article: 18.0] [Received: 05/06/2019] [Revised: 05/27/2019] [Accepted: 06/10/2019] [Indexed: 11/12/2022] Open
Abstract
Functional annotation of protein sequences with high accuracy has become one of the most important issues in modern biomedical studies, and computational approaches that significantly accelerate the analysis and enhance accuracy are greatly desired. Although a variety of methods have been developed to elevate protein annotation accuracy, their ability to control false annotation rates remains either limited or not systematically evaluated. In this study, a protein encoding strategy, together with a deep learning algorithm, was proposed to control the false discovery rate in protein function annotation, and its performance was systematically compared with that of the traditional similarity-based and de novo approaches. Based on a comprehensive assessment from multiple perspectives, the proposed strategy and algorithm were found to perform better in both prediction stability and annotation accuracy compared with other de novo methods. Moreover, an in-depth assessment revealed that it possessed an improved capacity for controlling the false discovery rate compared with traditional methods. All in all, this study not only provided a comprehensive analysis of the performance of the newly proposed strategy but also provided a tool for researchers in the field of protein function annotation.
Affiliation(s)
- Jiajun Hong
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicine of Zhejiang Province, School of Medicine, Hangzhou Normal University, Hangzhou, China
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Yongchao Luo
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Yang Zhang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing, China
- Junbiao Ying
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Weiwei Xue
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing, China
- Tian Xie
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicine of Zhejiang Province, School of Medicine, Hangzhou Normal University, Hangzhou, China
- Lin Tao
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicine of Zhejiang Province, School of Medicine, Hangzhou Normal University, Hangzhou, China
- Feng Zhu
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicine of Zhejiang Province, School of Medicine, Hangzhou Normal University, Hangzhou, China
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
|
10
|
Winter R, Montanari F, Noé F, Clevert DA. Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations. Chem Sci 2019; 10:1692-1701. [PMID: 30842833 PMCID: PMC6368215 DOI: 10.1039/c8sc04175j] [Citation(s) in RCA: 233] [Impact Index Per Article: 46.6] [Received: 09/19/2018] [Accepted: 11/17/2018] [Indexed: 12/23/2022] Open
Abstract
There has been a recent surge of interest in using machine learning across chemical space in order to predict properties of molecules or design molecules and materials with the desired properties. Most of this work relies on defining clever feature representations, in which the chemical graph structure is encoded in a uniform way such that predictions across chemical space can be made. In this work, we propose to exploit the powerful ability of deep neural networks to learn a feature representation from low-level encodings of a huge corpus of chemical structures. Our model borrows ideas from neural machine translation: it translates between two semantically equivalent but syntactically different representations of molecular structures, compressing the meaningful information both representations have in common in a low-dimensional representation vector. Once the model is trained, this representation can be extracted for any new molecule and utilized as a descriptor. In fair benchmarks with respect to various human-engineered molecular fingerprints and graph-convolution models, our method shows competitive performance in modelling quantitative structure-activity relationships in all analysed datasets. Additionally, we show that our descriptor significantly outperforms all baseline molecular fingerprints in two ligand-based virtual screening tasks. Overall, our descriptors show the most consistent performances in all experiments. The continuity of the descriptor space and the existence of the decoder that permits deducing a chemical structure from an embedding vector allow for exploration of the space and open up new opportunities for compound optimization and idea generation.
Affiliation(s)
- Robin Winter
  - Department of Bioinformatics, Bayer AG, Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
- Frank Noé
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
|
11
|
Kazmi SR, Jun R, Yu MS, Jung C, Na D. In silico approaches and tools for the prediction of drug metabolism and fate: A review. Comput Biol Med 2019; 106:54-64. [PMID: 30682640 DOI: 10.1016/j.compbiomed.2019.01.008] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 09/19/2018] [Revised: 01/14/2019] [Accepted: 01/14/2019] [Indexed: 01/08/2023]
Abstract
The fate of administered drugs is largely influenced by their metabolism. For example, endogenous enzyme-catalyzed conversion of drugs may result in therapeutic inactivation or activation or may transform the drugs into toxic chemical compounds. This highlights the importance of drug metabolism in drug discovery and development, and accounts for the wide variety of experimental technologies that provide insights into the fate of drugs. In view of the high cost of traditional drug development, a number of computational approaches have been developed for predicting the metabolic fate of drug candidates, allowing for screening of large numbers of chemical compounds and then identifying a small number of promising candidates. In this review, we introduce in silico approaches and tools that have been developed to predict drug metabolism and fate, and assess their potential to facilitate the virtual discovery of promising drug candidates. We also provide a brief description of various recent models for predicting different aspects of enzyme-drug reactions and provide a list of recent in silico tools used for drug metabolism prediction.
Affiliation(s)
- Sayada Reemsha Kazmi
  - School of Integrative Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
- Ren Jun
  - School of Integrative Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
- Myeong-Sang Yu
  - School of Integrative Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
- Chanjin Jung
  - School of Integrative Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
- Dokyun Na
  - School of Integrative Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
|
12
|
Yu CY, Li XX, Yang H, Li YH, Xue WW, Chen YZ, Tao L, Zhu F. Assessing the Performances of Protein Function Prediction Algorithms from the Perspectives of Identification Accuracy and False Discovery Rate. Int J Mol Sci 2018; 19:E183. [PMID: 29316706 PMCID: PMC5796132 DOI: 10.3390/ijms19010183] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 10/22/2017] [Revised: 12/09/2017] [Accepted: 01/04/2018] [Indexed: 12/27/2022] Open
Abstract
The function of a protein is of great interest in cutting-edge research on biological mechanisms, disease development and drug/target discovery. Besides experimental explorations, a variety of computational methods have been designed to predict protein function. Among these in silico methods, the prediction of BLAST is based on protein sequence similarity, while that of machine learning is also based on the sequence but does not consider sequence similarity. This unique characteristic of machine learning makes it a good complement to BLAST and many other approaches in predicting the function of remotely related proteins and of homologous proteins with distinct functions. However, the identification accuracies of these in silico methods and their false discovery rates have not yet been assessed, which greatly limits the usage of these algorithms. Herein, a comprehensive comparison of the performance of four popular prediction algorithms (BLAST, SVM, PNN and KNN) was conducted. In particular, the performance of these methods was systematically assessed by four standard statistical indexes based on the independent test datasets of 93 functional protein families defined by UniProtKB keywords. Moreover, the false discovery rates of these algorithms were evaluated by scanning the genomes of four representative model organisms (Homo sapiens, Arabidopsis thaliana, Saccharomyces cerevisiae and Mycobacterium tuberculosis). As a result, substantially higher sensitivity was observed for SVM and BLAST than for PNN and KNN. However, the machine learning algorithms (PNN, KNN and SVM) were found capable of substantially reducing the false discovery rate (SVM < PNN < KNN). In sum, this study comprehensively assessed the performance of four popular algorithms applied to protein function prediction, which could facilitate the selection of the most appropriate method in related biomedical research.
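The abstract compares algorithms by sensitivity and false discovery rate without spelling out the formulas. As a minimal sketch, the textbook definitions computed from confusion-matrix counts are shown below; whether the paper's genome-scanning procedure uses exactly these definitions is an assumption, and the counts in the example are invented.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real family members that the classifier recovers."""
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


def false_discovery_rate(true_pos: int, false_pos: int) -> float:
    """Fraction of predicted family members that are actually wrong."""
    predicted = true_pos + false_pos
    return false_pos / predicted if predicted else 0.0


# Invented counts for one protein family: 90 hits, 10 misses, 30 false alarms.
sens = sensitivity(true_pos=90, false_neg=10)
fdr = false_discovery_rate(true_pos=90, false_pos=30)
```

The two numbers can move independently, which is why a method with high sensitivity may still produce a large share of false annotations.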
Affiliation(s)
- Chun Yan Yu
  - Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China
  - Innovative Drug Research and Bioinformatics Group, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Xiao Xu Li
  - Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China
  - Innovative Drug Research and Bioinformatics Group, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Hong Yang
  - Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China
  - Innovative Drug Research and Bioinformatics Group, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Ying Hong Li
  - Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China
  - Innovative Drug Research and Bioinformatics Group, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Wei Wei Xue
  - Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China
- Yu Zong Chen
  - Bioinformatics and Drug Design Group, Department of Pharmacy, and Center for Computational Science and Engineering, National University of Singapore, Singapore 117543, Singapore
- Lin Tao
  - School of Medicine, Hangzhou Normal University, Hangzhou 310012, China
- Feng Zhu
  - Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China
  - Innovative Drug Research and Bioinformatics Group, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
|
13
|
Nature is the best source of anti-inflammatory drugs: indexing natural products for their anti-inflammatory bioactivity. Inflamm Res 2017; 67:67-75. [DOI: 10.1007/s00011-017-1096-5] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 07/11/2017] [Revised: 09/12/2017] [Accepted: 09/23/2017] [Indexed: 02/05/2023] Open
|
14
|
Mohan CG, Gupta S. QSAR Models towards Cholinesterase Inhibitors for the Treatment of Alzheimer's Disease. Oncology 2017. [DOI: 10.4018/978-1-5225-0549-5.ch022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Alzheimer's Disease (AD) is a multifactorial neurological syndrome in which a combination of aging, genetic, and environmental factors triggers the pathological decline. Interestingly, the importance of the acetylcholinesterase (AChE) enzyme has increased due to its involvement in β-amyloid peptide fibril formation during AD pathogenesis. The in silico technique of Quantitative Structure-Activity Relationship (QSAR) modeling has proven its usefulness in pharmaceutical research for the design and optimization of new chemical entities. Further, the QSAR method has advanced the scope of rational drug design and the search for the mechanism of drug action. It is a well-established fact that the chemical and pharmaceutical effects of a compound are closely related to its physicochemical properties, which can be calculated by various methods from the compound structure. This chapter focuses on different QSAR studies carried out for a variety of cholinesterase inhibitors for the treatment of AD. These predictive models could be used to design better and safer drugs against AD.
Affiliation(s)
- C. Gopi Mohan
  - Amrita Institute of Medical Sciences and Research Centre, India
- Shikhar Gupta
  - National Institute of Pharmaceutical Education and Research, India
|
15
|
Yang M, Chen J, Shi X, Xu L, Xi Z, You L, An R, Wang X. Development of in Silico Models for Predicting P-Glycoprotein Inhibitors Based on a Two-Step Approach for Feature Selection and Its Application to Chinese Herbal Medicine Screening. Mol Pharm 2015; 12:3691-713. [PMID: 26376206 DOI: 10.1021/acs.molpharmaceut.5b00465] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 12/27/2022]
Abstract
P-glycoprotein (P-gp) is regarded as an important factor in determining the ADMET (absorption, distribution, metabolism, elimination, and toxicity) characteristics of drugs and drug candidates. Successful prediction of P-gp inhibitors can thus lead to an improved understanding of the underlying mechanisms of both changes in the pharmacokinetics of drugs and drug-drug interactions. Therefore, there has been considerable interest in the development of in silico modeling of P-gp inhibitors in recent years. Considering that a large number of molecular descriptors are used to characterize structurally diverse molecules, efficient feature selection methods are required to extract the most informative predictors. In this work, we constructed an extensive data set of 2428 molecules that includes 1518 P-gp inhibitors and 910 P-gp noninhibitors from multiple resources. Importantly, a two-step feature selection approach based on a genetic algorithm and a greedy forward-searching algorithm was employed to select the minimum set of the most informative descriptors that contribute to the prediction of P-gp inhibitors. To determine the best machine learning algorithm, 18 classifiers coupled with the feature selection method were compared. The top three best-performing models (flexible discriminant analysis, support vector machine, and random forest) and their ensemble model, using only 3, 9, 7, and 14 descriptors, respectively, achieve an overall accuracy of 83.2%-86.7% for the training set containing 1040 compounds, an overall accuracy of 82.3%-85.5% for the test set containing 1039 compounds, and a prediction accuracy of 77.4%-79.9% for the external validation set containing 349 compounds. The models were further extensively validated against the DrugBank database (1890 compounds). The proposed models are competitive with and in some cases better than other published models in terms of prediction accuracy and minimum number of descriptors.
The applicability domain was then addressed by developing an ensemble classification model to obtain more reliable predictions. Finally, we employed these models as a virtual screening tool for identifying potential P-gp inhibitors in the Traditional Chinese Medicine Systems Pharmacology (TCMSP) database containing a total of 13,051 unique compounds from 498 herbs, resulting in 875 potential P-gp inhibitors and 15 inhibitor-rich herbs. These predictions were partly supported by a literature search and are valuable not only to develop novel P-gp inhibitors from TCM in the early stages of drug development, but also to optimize the use of herbal remedies.
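The greedy forward-searching step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the genetic-algorithm pre-filter is omitted, and the scoring function, the per-descriptor gains, and the stopping threshold `min_gain` are all hypothetical placeholders chosen only to make the example runnable.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_forward_select(
    candidates: Sequence[str],
    score: Callable[[Tuple[str, ...]], float],
    min_gain: float = 1e-9,
) -> List[str]:
    """Add one descriptor at a time, keeping the one that raises the score most.

    Stops when no remaining descriptor improves the score by more than min_gain.
    """
    selected: List[str] = []
    best = score(tuple(selected))
    remaining = list(candidates)
    while remaining:
        # Score every remaining descriptor when added to the current subset.
        trials = [(score(tuple(selected + [d])), d) for d in remaining]
        top_score, top_desc = max(trials, key=lambda t: t[0])
        if top_score - best <= min_gain:
            break  # no candidate helps enough; keep the subset minimal
        selected.append(top_desc)
        remaining.remove(top_desc)
        best = top_score
    return selected


# Hypothetical gains: stand-ins for a cross-validated accuracy estimate.
_TOY_GAIN = {"logP": 0.20, "TPSA": 0.10, "nHBDon": 0.05}


def toy_score(subset: Tuple[str, ...]) -> float:
    """Toy objective: baseline 0.5, plus gains, minus 0.01 per descriptor."""
    return 0.5 + sum(_TOY_GAIN.get(d, 0.0) for d in subset) - 0.01 * len(subset)


chosen = greedy_forward_select(["MW", "logP", "TPSA", "nHBDon", "nRing"], toy_score)
```

In practice `score` would wrap a cross-validated classifier; the per-descriptor penalty in the toy objective stands in for the preference for a minimal descriptor set.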
Affiliation(s)
- Ming Yang
  - Department of Chemistry, College of Pharmacy, Shanghai University of Traditional Chinese Medicine, Shanghai 200444, People's Republic of China
  - Department of Pharmacy, Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai 200032, People's Republic of China
- Jialei Chen
  - Department of Pharmacy, Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai 200032, People's Republic of China
- Xiufeng Shi
  - Department of Pharmacy, Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai 200032, People's Republic of China
- Liwen Xu
  - Department of Pharmacy, Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai 200032, People's Republic of China
- Zhijun Xi
  - Department of Pharmacy, Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai 200032, People's Republic of China
- Lisha You
  - Department of Chemistry, College of Pharmacy, Shanghai University of Traditional Chinese Medicine, Shanghai 200444, People's Republic of China
- Rui An
  - Department of Chemistry, College of Pharmacy, Shanghai University of Traditional Chinese Medicine, Shanghai 200444, People's Republic of China
- Xinhong Wang
  - Department of Chemistry, College of Pharmacy, Shanghai University of Traditional Chinese Medicine, Shanghai 200444, People's Republic of China
|
16
|
Ai N, Fan X, Ekins S. In silico methods for predicting drug-drug interactions with cytochrome P-450s, transporters and beyond. Adv Drug Deliv Rev 2015; 86:46-60. [PMID: 25796619 DOI: 10.1016/j.addr.2015.03.006] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 09/19/2014] [Revised: 01/05/2015] [Accepted: 03/11/2015] [Indexed: 12/13/2022]
Abstract
Drug-drug interactions (DDIs) are associated with severe adverse effects that may lead to the patient requiring alternative therapeutics and could ultimately lead to drug withdrawal from the market if they are severe. To prevent the occurrence of DDI in the clinic, experimental systems to evaluate drug interaction have been integrated into the various stages of the drug discovery and development process. A large body of knowledge about DDI has also accumulated through these studies and pharmacovigilance systems. Much of this work to date has focused on the drug metabolizing enzymes such as cytochrome P-450s as well as drug transporters, ion channels and occasionally other proteins. This combined knowledge provides a foundation for a hypothesis-driven in silico approach, using either cheminformatics or physiologically based pharmacokinetics (PK) modeling methods to assess DDI potential. Here we review recent advances in these approaches with emphasis on hypothesis-driven mechanistic models for important protein targets involved in PK-based DDI. Recent efforts with other informatics approaches to detect DDI are highlighted. Besides DDI, we also briefly introduce drug interactions with other substances, such as Traditional Chinese Medicines, to illustrate how in silico modeling can be useful in this domain. We also summarize valuable data sources and web-based tools that are available for DDI prediction. We finally explore the challenges we see faced by in silico approaches for predicting DDI and propose future directions to make these computational models more reliable, accurate, and publicly accessible.
Affiliation(s)
- Ni Ai
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Road, Hangzhou, Zhejiang 310058, PR China
- Xiaohui Fan
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Road, Hangzhou, Zhejiang 310058, PR China
- Sean Ekins
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA
|
17
|
Pappalardo M, Shachaf N, Basile L, Milardi D, Zeidan M, Raiyn J, Guccione S, Rayan A. Sequential application of ligand and structure based modeling approaches to index chemicals for their hH4R antagonism. PLoS One 2014; 9:e109340. [PMID: 25330207 PMCID: PMC4199621 DOI: 10.1371/journal.pone.0109340] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 03/24/2014] [Accepted: 09/10/2014] [Indexed: 02/03/2023] Open
Abstract
The human histamine H4 receptor (hH4R), a member of the G-protein coupled receptors (GPCR) family, is an increasingly attractive drug target. It plays a key role in many cell pathways and many hH4R ligands are studied for the treatment of several inflammatory, allergic and autoimmune disorders, as well as for analgesic activity. Due to the challenging difficulties in the experimental elucidation of hH4R structure, virtual screening campaigns are normally run on homology based models. However, a wealth of information about the chemical properties of GPCR ligands has also accumulated over the last few years and an appropriate combination of this ligand-based knowledge with structure-based molecular modeling studies emerges as a promising strategy for computer-assisted drug design. Here, two chemoinformatics techniques, the Intelligent Learning Engine (ILE) and Iterative Stochastic Elimination (ISE) approach, were used to index chemicals for their hH4R bioactivity. An application of the prediction model on an external test set composed of more than 160 hH4R antagonists picked from the ChEMBL database gave an enrichment factor of 16.4. A virtual high throughput screening on the ZINC database was carried out, picking ∼4000 chemicals highly indexed as H4R antagonist candidates. Next, a series of 3D models of hH4R were generated by molecular modeling and molecular dynamics simulations performed in fully atomistic lipid membranes. The efficacy of the hH4R 3D models in discriminating between actives and non-actives was checked and the 3D model with the best performance was chosen for further docking studies performed on the focused library. The output of these docking studies was a consensus library of 11 highly active scored drug candidates.
Our findings suggest that a sequential combination of ligand-based chemoinformatics approaches with structure-based ones has the potential to improve the success rate in discovering new biologically active GPCR drugs and increase the enrichment factors in a synergistic manner.
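The enrichment factor quoted in this abstract is conventionally the hit rate in the selected subset divided by the hit rate in the whole library. The sketch below uses that standard definition; the function name and the example counts are hypothetical and are not taken from the paper, which may use a variant of the metric.

```python
def enrichment_factor(
    hits_selected: int, n_selected: int, hits_total: int, n_total: int
) -> float:
    """Hit rate among selected compounds divided by hit rate in the full library."""
    if n_selected <= 0 or n_total <= 0 or hits_total <= 0:
        raise ValueError("n_selected, n_total and hits_total must be positive")
    if hits_selected > n_selected or hits_total > n_total:
        raise ValueError("hit counts cannot exceed the corresponding set sizes")
    selected_rate = hits_selected / n_selected
    library_rate = hits_total / n_total
    return selected_rate / library_rate


# Hypothetical example: 40 actives among the top 100 ranked compounds,
# drawn from a 10,000-compound library that contains 200 actives overall.
ef = enrichment_factor(hits_selected=40, n_selected=100, hits_total=200, n_total=10_000)
```

With these made-up counts the selected subset is 40% active against 2% for the library, an enrichment factor of 20. A value of 1 means the ranking does no better than random selection.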
Affiliation(s)
- Matteo Pappalardo
  - Department of Chemical Sciences, University of Catania, Catania, Italy
- Nir Shachaf
  - Drug Discovery Informatics Lab, QRC-Qasemi Research Center, Al-Qasemi Academic College, Baka El-Garbiah, Israel
- Livia Basile
  - Etnalead s.r.l., Scuola Superiore di Catania, University of Catania, Catania, Italy
- Danilo Milardi
  - National Research Council, Institute of Biostructures and Bioimaging, Catania, Italy
- Mouhammed Zeidan
  - Drug Discovery Informatics Lab, QRC-Qasemi Research Center, Al-Qasemi Academic College, Baka El-Garbiah, Israel
- Jamal Raiyn
  - Drug Discovery Informatics Lab, QRC-Qasemi Research Center, Al-Qasemi Academic College, Baka El-Garbiah, Israel
- Salvatore Guccione
  - Etnalead s.r.l., Scuola Superiore di Catania, University of Catania, Catania, Italy
  - Department of Pharmaceutical Sciences, University of Catania, Catania, Italy
- Anwar Rayan
  - Drug Discovery Informatics Lab, QRC-Qasemi Research Center, Al-Qasemi Academic College, Baka El-Garbiah, Israel
|
18
|
Han B, Ma X, Zhao R, Zhang J, Wei X, Liu X, Liu X, Zhang C, Tan C, Jiang Y, Chen Y. Development and experimental test of support vector machines virtual screening method for searching Src inhibitors from large compound libraries. Chem Cent J 2012; 6:139. [PMID: 23173901 PMCID: PMC3538513 DOI: 10.1186/1752-153x-6-139] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/13/2012] [Accepted: 11/07/2012] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND: Src plays various roles in tumour progression, invasion, metastasis, angiogenesis and survival. It is one of the multiple targets of multi-target kinase inhibitors in clinical uses and trials for the treatment of leukemia and other cancers. These successes and appearances of drug resistance in some patients have raised significant interest and efforts in discovering new Src inhibitors. Various in-silico methods have been used in some of these efforts. It is desirable to explore additional in-silico methods, particularly those capable of searching large compound libraries at high yields and reduced false-hit rates. RESULTS: We evaluated support vector machines (SVM) as virtual screening tools for searching Src inhibitors from large compound libraries. SVM trained and tested by 1,703 inhibitors and 63,318 putative non-inhibitors correctly identified 93.53%-95.01% of inhibitors and 99.81%-99.90% of non-inhibitors in 5-fold cross validation studies. SVM trained by 1,703 inhibitors reported before 2011 and 63,318 putative non-inhibitors correctly identified 70.45% of the 44 inhibitors reported since 2011, and predicted as inhibitors 44,843 (0.33%) of 13.56M PubChem, 1,496 (0.89%) of 168K MDDR, and 719 (7.73%) of 9,305 MDDR compounds similar to the known inhibitors. CONCLUSIONS: SVM showed comparable yield and reduced false hit rates in searching large compound libraries compared to the similarity-based and other machine-learning VS methods developed from the same set of training compounds and molecular descriptors. We tested three virtual hits of the same novel scaffold from in-house chemical libraries not reported as Src inhibitors, one of which showed moderate activity. SVM may be potentially explored for searching Src inhibitors from large compound libraries at low false-hit rates.
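The screening percentages in this abstract follow directly from the reported counts, as the short check below shows. The library sizes "13.56M" and "168K" are rounded in the abstract, so they are treated here as approximate values, and the count of 31 correctly identified new inhibitors is inferred from 70.45% of 44 rather than stated in the source.

```python
def percent(part: float, whole: float, digits: int = 2) -> float:
    """Return part/whole as a percentage rounded to the given number of digits."""
    return round(100.0 * part / whole, digits)


# Predicted inhibitors (virtual hits) per library, as reported in the abstract.
pubchem_rate = percent(44_843, 13.56e6)       # 13.56M PubChem compounds (rounded)
mddr_rate = percent(1_496, 168_000)           # 168K MDDR compounds (rounded)
mddr_similar_rate = percent(719, 9_305)       # MDDR compounds similar to known inhibitors

# Inferred count: 31 of the 44 post-2011 inhibitors gives the reported 70.45%.
new_inhibitor_yield = percent(31, 44)
```

The low hit rates on PubChem and MDDR are what the abstract reads as low false-hit rates, on the assumption that nearly all compounds in those libraries are not Src inhibitors.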
Affiliation(s)
- Bucong Han
  - The Key Laboratory of Chemical Biology, Guangdong Province, The Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong, 518055, People’s Republic of China
  - Computation and Systems Biology, Singapore-MIT Alliance, National University of Singapore, E4-04-10, 4 Engineering Drive 3, Singapore, 117576, Singapore
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Xiaohua Ma
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Ruiying Zhao
  - Central Research Institute of China Chemical Science and Technology, 20 Xueyuan Road, Haidian District, Beijing, 100083, People’s Republic of China
- Jingxian Zhang
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Xiaona Wei
  - Computation and Systems Biology, Singapore-MIT Alliance, National University of Singapore, E4-04-10, 4 Engineering Drive 3, Singapore, 117576, Singapore
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Xianghui Liu
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Xin Liu
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Cunlong Zhang
  - The Key Laboratory of Chemical Biology, Guangdong Province, The Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong, 518055, People’s Republic of China
- Chunyan Tan
  - The Key Laboratory of Chemical Biology, Guangdong Province, The Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong, 518055, People’s Republic of China
- Yuyang Jiang
  - The Key Laboratory of Chemical Biology, Guangdong Province, The Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong, 518055, People’s Republic of China
- Yuzong Chen
  - The Key Laboratory of Chemical Biology, Guangdong Province, The Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong, 518055, People’s Republic of China
  - Computation and Systems Biology, Singapore-MIT Alliance, National University of Singapore, E4-04-10, 4 Engineering Drive 3, Singapore, 117576, Singapore
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
|
19
|
Zhang J, Han B, Wei X, Tan C, Chen Y, Jiang Y. A two-step target binding and selectivity support vector machines approach for virtual screening of dopamine receptor subtype-selective ligands. PLoS One 2012; 7:e39076. [PMID: 22720033 PMCID: PMC3376116 DOI: 10.1371/journal.pone.0039076] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 10/25/2011] [Accepted: 05/15/2012] [Indexed: 01/13/2023] Open
Abstract
Target selective drugs, such as dopamine receptor (DR) subtype selective ligands, are developed for enhanced therapeutics and reduced side effects. In silico methods have been explored for searching DR selective ligands, but encountered difficulties associated with high subtype similarity and ligand structural diversity. Machine learning methods have shown promising potential in searching target selective compounds. Their target selective capability can be further enhanced. In this work, we introduced a new two-step support vector machines target-binding and selectivity screening method for searching DR subtype-selective ligands, which was tested together with three previously-used machine learning methods for searching D1, D2, D3 and D4 selective ligands. It correctly identified 50.6%–88.0% of the 21–408 subtype selective and 71.7%–81.0% of the 39–147 multi-subtype ligands. Its subtype selective ligand identification rates are significantly better than, and its multi-subtype ligand identification rates are comparable to the best rates of the previously used methods. Our method produced low false-hit rates in screening 13.56 M PubChem, 168,016 MDDR and 657,736 ChEMBLdb compounds. Molecular features important for subtype selectivity were extracted by using the recursive feature elimination feature selection method. These features are consistent with literature-reported features. Our method showed similar performance in searching estrogen receptor subtype selective ligands. Our study demonstrated the usefulness of the two-step target binding and selectivity screening method in searching subtype selective ligands from large compound libraries.
Affiliation(s)
- Jingxian Zhang
  - The Key Laboratory of Chemical Biology, Guangdong Province, Graduate School at Shenzhen, Tsinghua University, Shenzhen, People's Republic of China
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Singapore, Singapore
- Bucong Han
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Singapore, Singapore
  - Computation and Systems Biology, Singapore-MIT Alliance, National University of Singapore, Singapore, Singapore
- Xiaona Wei
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Singapore, Singapore
  - Computation and Systems Biology, Singapore-MIT Alliance, National University of Singapore, Singapore, Singapore
- Chunyan Tan
  - The Key Laboratory of Chemical Biology, Guangdong Province, Graduate School at Shenzhen, Tsinghua University, Shenzhen, People's Republic of China
- Yuzong Chen
  - The Key Laboratory of Chemical Biology, Guangdong Province, Graduate School at Shenzhen, Tsinghua University, Shenzhen, People's Republic of China
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Singapore, Singapore
  - * E-mail: (YZC); (YYJ)
- Yuyang Jiang
  - The Key Laboratory of Chemical Biology, Guangdong Province, Graduate School at Shenzhen, Tsinghua University, Shenzhen, People's Republic of China
  - * E-mail: (YZC); (YYJ)
|
20
|
Zhang S. Application of Machine Leaning in Drug Discovery and Development. Mach Learn 2012. [DOI: 10.4018/978-1-60960-818-7.ch517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Machine learning techniques have been widely used in drug discovery and development, particularly in the areas of cheminformatics, bioinformatics and other types of pharmaceutical research. It has been demonstrated they are suitable for large high dimensional data, and the models built with these methods can be used for robust external predictions. However, various problems and challenges still exist, and new approaches are in great need. In this Chapter, the authors will review the current development of machine learning techniques, and especially focus on several machine learning techniques they developed as well as their application to model building, lead discovery via virtual screening, integration with molecular docking, and prediction of off-target properties. The authors will suggest some potential different avenues to unify different disciplines, such as cheminformatics, bioinformatics and systems biology, for the purpose of developing integrated in silico drug discovery and development approaches.
Affiliation(s)
- Shuxing Zhang
  - The University of Texas at M.D. Anderson Cancer Center, USA
|
21
|
Liu X, Zhu F, Ma X, Tao L, Zhang J, Yang S, Wei Y, Chen YZ. The Therapeutic Target Database: an internet resource for the primary targets of approved, clinical trial and experimental drugs. Expert Opin Ther Targets 2011; 15:903-12. [PMID: 21619487 DOI: 10.1517/14728222.2011.586635] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 12/28/2022]
Abstract
Increasing numbers of proteins, nucleic acids and other molecular entities have been explored as therapeutic targets. A challenge in drug discovery is to decide which targets to pursue from an increasing pool of potential targets, given the fact that few innovative targets have made it to the approval list each year. Knowledge of existing drug targets (both approved and within clinical trials) is highly useful for facilitating target discovery, selection, exploration and tool development. The Therapeutic Target Database (TTD) has been developed and updated to provide information on 358 successful targets, 251 clinical trial targets and 1254 research targets in addition to 1511 approved drugs, 1118 clinical trials drugs and 2331 experimental drugs linked to their primary targets (3257 drugs with available structure data). This review briefly describes the TTD database and illustrates how its data can be explored for facilitating target and drug searches, the study of the mechanism of multi-target drugs and the development of in silico target discovery tools.
|
22
|
Klon AE. Machine learning algorithms for the prediction of hERG and CYP450 binding in drug development. Expert Opin Drug Metab Toxicol 2011; 6:821-33. [PMID: 20465523 DOI: 10.1517/17425255.2010.489550] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 01/29/2023]
Abstract
IMPORTANCE OF THE FIELD: The cost of developing new drugs is estimated at approximately $1 billion; the withdrawal of a marketed compound due to toxicity can result in serious financial loss for a pharmaceutical company. There has been a greater interest in the development of in silico tools that can identify compounds with metabolic liabilities before they are brought to market. AREAS COVERED IN THIS REVIEW: The two largest classes of machine learning (ML) models, which will be discussed in this review, have been developed to predict binding to the human ether-a-go-go related gene (hERG) ion channel protein and the various CYP isoforms. Being able to identify potentially toxic compounds before they are made would greatly reduce the number of compound failures and the costs associated with drug development. WHAT THE READER WILL GAIN: This review summarizes the state of modeling hERG and CYP binding towards this goal since 2003 using ML algorithms. TAKE HOME MESSAGE: A wide variety of ML algorithms that are comparable in their overall performance are available. These ML methods may be applied regularly in discovery projects to flag compounds with potential metabolic liabilities.
Affiliation(s)
- Anthony E Klon
  - Ansaris, Computational Chemistry, Four Valley Square, 512 East Township Line Road, Blue Bell, PA 19422, USA
|
23
|
Abstract
Computer-aided approaches have been widely used in pharmaceutical research to improve the efficiency of the drug discovery and development pipeline. To identify and design small molecules as clinically effective therapeutics, various computational methods have been evaluated as promising strategies, depending on the purpose and systems of interest. Both ligand and structure-based drug design approaches are powerful technologies, which can be applied to virtual screening for lead identification and optimization. Here, we review the progress in this field and summarize the application of some new technologies we developed. These state-of-the-art tools have been used for the discovery and development of active agents for various diseases, in particular for cancer therapies. The described protocols are appropriate for all drug discovery stages, but expertise is still needed to perform the studies based on the targets of interest.
Affiliation(s)
- Shuxing Zhang
  - Department of Experimental Therapeutics, M.D. Anderson Cancer Center, Houston, TX, USA
|
24
|
Hecht D. Applications of machine learning and computational intelligence to drug discovery and development. Drug Dev Res 2010. [DOI: 10.1002/ddr.20402] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 01/04/2023]
Affiliation(s)
- David Hecht
  - Southwestern College, Chula Vista, California
|
25
|
Sakiyama Y. The use of machine learning and nonlinear statistical tools for ADME prediction. Expert Opin Drug Metab Toxicol 2010; 5:149-69. [PMID: 19239395 DOI: 10.1517/17425250902753261] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 01/31/2023]
Abstract
Absorption, distribution, metabolism and excretion (ADME)-related failure of drug candidates is a major issue for the pharmaceutical industry today. Prediction of ADME by in silico tools has now become an inevitable paradigm to reduce cost and enhance efficiency in pharmaceutical research. Recently, machine learning as well as nonlinear statistical tools has been widely applied to predict routine ADME end points. To achieve accurate and reliable predictions, it would be a prerequisite to understand the concepts, mechanisms and limitations of these tools. Here, we have devised a small synthetic nonlinear data set to help understand the mechanism of machine learning by 2D-visualisation. We applied six new machine learning methods to four different data sets. The methods include Naive Bayes classifier, classification and regression tree, random forest, Gaussian process, support vector machine and k nearest neighbour. The results demonstrated that ensemble learning and kernel machine displayed greater accuracy of prediction than classical methods irrespective of the data set size. The importance of interaction with the engineering field is also addressed. The results described here provide insights into the mechanism of machine learning, which will enable appropriate usage in the future.
Affiliation(s)
- Yojiro Sakiyama
  - Pharmacokinetics Dynamics Metabolism, Pfizer Global Research and Development, Sandwich Laboratories, Kent, UK
|
26
|
Rao H, Li Z, Li X, Ma X, Ung C, Li H, Liu X, Chen Y. Identification of small molecule aggregators from large compound libraries by support vector machines. J Comput Chem 2010; 31:752-63. [PMID: 19569201 DOI: 10.1002/jcc.21347] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 01/04/2023]
Abstract
Small molecule aggregators non-specifically inhibit multiple unrelated proteins, rendering them therapeutically useless. They frequently appear as false hits and thus need to be eliminated in high-throughput screening campaigns. Computational methods have been explored for identifying aggregators, which have not been tested in screening large compound libraries. We used 1319 aggregators and 128,325 non-aggregators to develop a support vector machines (SVM) aggregator identification model, which was tested by four methods. The first is five fold cross-validation, which showed comparable aggregator and significantly improved non-aggregator identification rates against earlier studies. The second is the independent test of 17 aggregators discovered independently from the training aggregators, 71% of which were correctly identified. The third is retrospective screening of 13M PUBCHEM and 168K MDDR compounds, which predicted 97.9% and 98.7% of the PUBCHEM and MDDR compounds as non-aggregators. The fourth is retrospective screening of 5527 MDDR compounds similar to the known aggregators, 1.14% of which were predicted as aggregators. SVM showed slightly better overall performance against two other machine learning methods based on five fold cross-validation studies of the same settings. Molecular features of aggregation, extracted by a feature selection method, are consistent with published profiles. SVM showed substantial capability in identifying aggregators from large libraries at low false-hit rates.
Affiliation(s)
- Hanbing Rao
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
|
27
|
Kortagere S, Ekins S. Troubleshooting computational methods in drug discovery. J Pharmacol Toxicol Methods 2010; 61:67-75. [PMID: 20176118 DOI: 10.1016/j.vascn.2010.02.005] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Received: 02/03/2010] [Accepted: 02/11/2010] [Indexed: 10/19/2022]
Abstract
Computational approaches for drug discovery such as ligand-based and structure-based methods, are increasingly seen as an efficient approach for lead discovery as well as providing insights on absorption, distribution, metabolism, excretion and toxicity (ADME/Tox). What is perhaps less well known and widely described are the limitations of the different technologies. We have therefore presented a troubleshooting approach to QSAR, homology modeling, docking as well as hybrid methods. If such computational or cheminformatics methods are to become more widely used by non-experts it is critical that such limitations are brought to the user's attention and addressed during their workflows. This could improve the quality of the models and results that are obtained.
Affiliation(s)
- Sandhya Kortagere
  - Department of Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, PA 19129, USA
|
28
|
Demel MA, Krämer O, Ettmayer P, Haaksma EEJ, Ecker GF. Predicting ligand interactions with ABC transporters in ADME. Chem Biodivers 2010; 6:1960-9. [PMID: 19937827 DOI: 10.1002/cbdv.200900138] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Indexed: 12/21/2022]
Abstract
ABC-type drug efflux pumps, e.g., ABCB1 (=P-glycoprotein, =MDR1), ABCC1 (=MRP1), and ABCG2 (=MXR, =BCRP), confer a multi-drug resistance (MDR) phenotype to cancer cells. Furthermore, the important contribution of ABC transporters for bioavailability, distribution, elimination, and blood-brain barrier permeation of drug candidates is increasingly recognized. This review presents an overview on the different computational methods and models pursued to predict ABC transporter substrate properties of drug-like compounds. They encompass ligand-based approaches ranging from 'simple rule'-based efforts to sophisticated machine learning methods. Many of these models show excellent performance for the data sets used. However, due to the complex nature of the applied methods, useful interpretation of the models that can be directly translated into chemical structures by the medicinal chemist is rather difficult. Additionally, very recent and promising attempts in the field of structure-based modeling of ABC transporters, which embody homology modeling as well as recently published X-ray structures of murine ABCB1, will be discussed.
Affiliation(s)
- Michael A Demel
  - University of Vienna, Department of Medicinal Chemistry, Emerging Field Pharmacoinformatics, Althanstrasse 14, AT-1090 Vienna
|
29
|
Bayesian Classification of Cytochrome P450 3A4 Substrates/Non-substrates and Color Mapping for Chemical Interpretation. JOURNAL OF COMPUTER AIDED CHEMISTRY 2010. [DOI: 10.2751/jcac.11.19] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022]
|
30
|
Liu XH, Ma XH, Tan CY, Jiang YY, Go ML, Low BC, Chen YZ. Virtual screening of Abl inhibitors from large compound libraries by support vector machines. J Chem Inf Model 2009; 49:2101-10. [PMID: 19689138 DOI: 10.1021/ci900135u] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Indexed: 12/26/2022]
Abstract
Abl promotes cancers by regulating cell morphogenesis, motility, growth, and survival. Successes of several marketed and clinical trial Abl inhibitors against leukemia and other cancers and appearances of reduced efficacies and drug resistances have led to significant interest in and efforts for developing new Abl inhibitors. In silico methods of pharmacophore, fragment, and molecular docking have been used in some of these efforts. It is desirable to explore other in silico methods capable of searching large compound libraries at high yields and reduced false-hit rates. We evaluated support vector machines (SVM) as a virtual screening tool for searching Abl inhibitors from large compound libraries. SVM trained and tested by 708 inhibitors and 65,494 putative noninhibitors correctly identified 84.4 to 92.3% of inhibitors and 99.96 to 99.99% of noninhibitors in 5-fold cross validation studies. SVM trained by 708 pre-2008 inhibitors and 65,494 putative noninhibitors correctly identified 50.5% of the 91 inhibitors reported since 2008 and predicted as inhibitors 29,072 (0.21%) of 13.56M PubChem, 659 (0.39%) of 168K MDDR, and 330 (5.0%) of 6638 MDDR compounds similar to the known inhibitors. SVM showed comparable yields and substantially reduced false-hit rates against two similarity-based VS methods and another machine learning VS method based on the same training and testing data sets and molecular descriptors. These results suggest that SVM is capable of searching Abl inhibitors from large compound libraries at low false-hit rates.
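The screening percentages quoted in this abstract follow directly from the reported counts. As a minimal sketch of that arithmetic (the library sizes 13.56M and 168K are the rounded figures given in the abstract, and the name `flagged_percent` is illustrative, not taken from the paper):

```python
# Counts of compounds predicted as inhibitors and the library sizes, as quoted
# in the abstract. The PubChem and MDDR sizes are rounded there (13.56M, 168K),
# so the recomputed percentages are approximate.
SCREENS = {
    "PubChem": (29_072, 13_560_000),
    "MDDR": (659, 168_000),
    "MDDR similar to known inhibitors": (330, 6_638),
}


def flagged_percent(n_flagged: int, library_size: int) -> float:
    """Percentage of a library that the model predicts as inhibitors."""
    return 100.0 * n_flagged / library_size


if __name__ == "__main__":
    for name, (n_flagged, size) in SCREENS.items():
        pct = flagged_percent(n_flagged, size)
        print(f"{name}: {pct:.2f}% predicted as inhibitors")
```

Because most compounds in a general-purpose library are presumed inactive, a small flagged fraction is what the abstract reads as a low false-hit rate; the higher fraction among compounds similar to known inhibitors is expected.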
Affiliation(s)
- X H Liu
- Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore 117543
|
31
|
Zhu F, Han B, Kumar P, Liu X, Ma X, Wei X, Huang L, Guo Y, Han L, Zheng C, Chen Y. Update of TTD: Therapeutic Target Database. Nucleic Acids Res 2009; 38:D787-91. [PMID: 19933260 PMCID: PMC2808971 DOI: 10.1093/nar/gkp1014] [Citation(s) in RCA: 200] [Impact Index Per Article: 13.3] [Indexed: 01/03/2023]
Abstract
Increasing numbers of proteins, nucleic acids and other molecular entities have been explored as therapeutic targets, hundreds of which are targets of approved and clinical trial drugs. Knowledge of these targets and corresponding drugs, particularly those in clinical use and trials, is highly useful for facilitating drug discovery. Therapeutic Target Database (TTD) has been developed to provide information about therapeutic targets and corresponding drugs. In order to accommodate increasing demand for comprehensive knowledge about the primary targets of the approved, clinical trial and experimental drugs, numerous improvements and updates have been made to TTD. These updates include information about 348 successful, 292 clinical trial and 1254 research targets, 1514 approved, 1212 clinical trial and 2302 experimental drugs linked to their primary targets (3382 small molecule and 649 antisense drugs with available structure and sequence), new ways to access data by drug mode of action, recursive search of related targets or drugs, similarity target and drug searching, customized and whole data download, standardized target ID, and a significant increase in data (1894 targets, 560 diseases and 5028 drugs compared with the 433 targets, 125 diseases and 809 drugs in the original release described in the previous paper). This database can be accessed at http://bidd.nus.edu.sg/group/cjttd/TTD.asp.
Affiliation(s)
- Feng Zhu
- Department of Pharmacy and Computation and Systems Biology, Center for Computational Science and Engineering, Singapore-MIT Alliance, National University of Singapore, Singapore
|
32
|
Liew CY, Ma XH, Liu X, Yap CW. SVM Model for Virtual Screening of Lck Inhibitors. J Chem Inf Model 2009; 49:877-85. [DOI: 10.1021/ci800387z] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.5] [Indexed: 12/16/2022]
Affiliation(s)
- Chin Y. Liew
- Pharmaceutical Data Exploration Laboratory, Department of Pharmacy, National University of Singapore, and Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore
- Xiao H. Ma
- Pharmaceutical Data Exploration Laboratory, Department of Pharmacy, National University of Singapore, and Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore
- Xianghui Liu
- Pharmaceutical Data Exploration Laboratory, Department of Pharmacy, National University of Singapore, and Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore
- Chun W. Yap
- Pharmaceutical Data Exploration Laboratory, Department of Pharmacy, National University of Singapore, and Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore
|
33
|
Abstract
Topological polar surface area (TPSA), which makes use of functional group contributions based on a large database of structures, is a convenient measure of the polar surface area that avoids the need to calculate ligand 3D structure or to decide which is the relevant biological conformation or conformations. We demonstrate the utility of TPSA in 2D-QSAR for 14 sets of diverse pharmacological activity data. Even though a large pool of reports showing the importance of the classic 2D descriptors such as calculated logP (ClogP) and calculated molar refractivity (CMR) exists in the 2D-QSAR literature, this is the first report to demonstrate the value of TPSA as a relevant descriptor applicable to a large, structurally and pharmacologically diverse set of classes of compounds. We also address the limitations of applicability of this descriptor for 2D-QSAR analysis. We observed a negative correlation of TPSA with activity data for anticancer alkaloids, MT1 and MT2 agonists, MAO-B and tumor necrosis factor-alpha inhibitors and a positive correlation with inhibitory activity data for telomerase, PDE-5, GSK-3, DNA-PK, aromatase, malaria, trypanosomatids and CB2 agonists.
Affiliation(s)
- S Prasanna
- Department of Medicinal Chemistry, University of Mississippi, MS 38677-1848, USA
|
34
|
Ma XH, Wang R, Yang SY, Li ZR, Xue Y, Wei YC, Low BC, Chen YZ. Evaluation of virtual screening performance of support vector machines trained by sparsely distributed active compounds. J Chem Inf Model 2008; 48:1227-37. [PMID: 18533644 DOI: 10.1021/ci800022e] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Indexed: 01/04/2023]
Abstract
Virtual screening performance of support vector machines (SVM) depends on the diversity of training active and inactive compounds. While diverse inactive compounds can be routinely generated, the number and diversity of known actives are typically low. We evaluated the performance of SVM trained by sparsely distributed actives in six MDDR biological target classes composed of a high number of known actives (983-1645) of high, intermediate, and low structural diversity (muscarinic M1 receptor agonists, NMDA receptor antagonists, thrombin inhibitors, HIV protease inhibitors, cephalosporins, and renin inhibitors). SVM trained by regularly sparse data sets of 100 actives show improved yields at substantially reduced false-hit rates compared to those of published studies and those of a Tanimoto-based similarity searching method based on the same data sets and molecular descriptors. SVM trained by very sparse data sets of 40 actives (2.4%-4.1% of the known actives) predicted 17.5-39.5%, 23.0-48.1%, and 70.2-92.4% of the remaining 943-1605 actives in the high, intermediate, and low diversity classes, respectively, 13.8-68.7% of which are outside the training compound families. SVM predicted 99.97% and 97.1% of the 9.997M PubChem and 167K remaining MDDR compounds as inactive and 2.6%-8.3% of the 19,495-38,483 MDDR compounds similar to the known actives as active. These results suggest that SVM has substantial capability in identifying novel active compounds from sparse active data sets at low false-hit rates.
Affiliation(s)
- X H Ma
- Centre for Computational Science and Engineering, National University of Singapore, Singapore
|
35
|
Yan L, Sheihk-Bahaei S, Park S, Ropella GEP, Hunt CA. Predictions of Hepatic Disposition Properties Using a Mechanistically Realistic, Physiologically Based Model. Drug Metab Dispos 2008; 36:759-68. [DOI: 10.1124/dmd.107.019067] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 11/22/2022]
|
36
|
Han LY, Ma XH, Lin HH, Jia J, Zhu F, Xue Y, Li ZR, Cao ZW, Ji ZL, Chen YZ. A support vector machines approach for virtual screening of active compounds of single and multiple mechanisms from large libraries at an improved hit-rate and enrichment factor. J Mol Graph Model 2007; 26:1276-86. [PMID: 18218332 DOI: 10.1016/j.jmgm.2007.12.002] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.8] [Received: 04/29/2007] [Revised: 12/05/2007] [Accepted: 12/05/2007] [Indexed: 01/04/2023]
Abstract
Support vector machines (SVM) and other machine-learning (ML) methods have been explored as ligand-based virtual screening (VS) tools for facilitating lead discovery. While exhibiting good hit selection performance, in screening large compound libraries, these methods tend to produce lower hit-rates than those of the best performing VS tools, partly because their training-sets contain a limited spectrum of inactive compounds. We tested whether the performance of SVM can be improved by using training-sets of diverse inactive compounds. In retrospective database screening of active compounds of single mechanism (HIV protease inhibitors, DHFR inhibitors, dopamine antagonists) and multiple mechanisms (CNS active agents) from large libraries of 2.986 million compounds, the yields, hit-rates, and enrichment factors of our SVM models are 52.4-78.0%, 4.7-73.8%, and 214-10,543, respectively, compared to those of 62-95%, 0.65-35%, and 20-1200 by structure-based VS and 55-81%, 0.2-0.7%, and 110-795 by other ligand-based VS tools in screening libraries of ≥1 million compounds. The hit-rates are comparable and the enrichment factors are substantially better than the best results of other VS tools. 24.3-87.6% of the predicted hits are outside the known hit families. SVM appears to be potentially useful for facilitating lead discovery in VS of large compound libraries.
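The abstract reports yield, hit-rate, and enrichment factor without defining them. The sketch below uses the definitions commonly applied in retrospective virtual screening, which is an assumption here rather than something stated in the abstract: yield is the fraction of known actives recovered, hit-rate is the fraction of selected compounds that are known actives, and enrichment factor is the hit-rate divided by the hit-rate expected from random selection. The function name and the example counts are hypothetical.

```python
from typing import Tuple


def screening_metrics(
    n_library: int, n_actives: int, n_selected: int, n_true_hits: int
) -> Tuple[float, float, float]:
    """Return (yield, hit_rate, enrichment_factor) for one screening run.

    n_library   -- total compounds screened
    n_actives   -- known actives hidden in the library
    n_selected  -- compounds the model predicts as active
    n_true_hits -- predicted actives that are known actives
    """
    if min(n_library, n_actives, n_selected) <= 0:
        raise ValueError("library, actives and selected counts must be positive")
    if n_true_hits > min(n_actives, n_selected):
        raise ValueError("true hits cannot exceed actives or selected compounds")

    recovered = n_true_hits / n_actives        # yield: share of actives found
    hit_rate = n_true_hits / n_selected        # precision of the selection
    random_hit_rate = n_actives / n_library    # expected from random picking
    enrichment = hit_rate / random_hit_rate
    return recovered, hit_rate, enrichment


if __name__ == "__main__":
    # Hypothetical counts chosen for illustration only, not taken from the paper.
    y, h, ef = screening_metrics(
        n_library=1_000_000, n_actives=100, n_selected=500, n_true_hits=60
    )
    print(f"yield={y:.1%}  hit-rate={h:.1%}  enrichment factor={ef:.0f}")
```

Under these definitions the enrichment factor grows as the library grows for a fixed hit-rate, which is one reason the abstract's comparison is restricted to libraries of similar size.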
Affiliation(s)
- L Y Han
- Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore 117543, Singapore
|