1
Ekins S, Freundlich JS, Clark AM, Anantpadma M, Davey RA, Madrid P. Machine learning models identify molecules active against the Ebola virus in vitro. F1000Res 2016; 4:1091. [PMID: 26834994 DOI: 10.12688/f1000research.7217.2] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Accepted: 12/23/2015] [Indexed: 12/15/2022] Open
Abstract
The search for small molecule inhibitors of Ebola virus (EBOV) has led to several high throughput screens over the past 3 years. These have identified a range of FDA-approved active pharmaceutical ingredients (APIs) with anti-EBOV activity in vitro, several of which are also active in a mouse infection model. There are millions of additional commercially-available molecules that could be screened for potential activities as anti-EBOV compounds. One way to prioritize compounds for testing is to generate computational models based on the high throughput screening data and then virtually screen compound libraries. In the current study, we have generated Bayesian machine learning models with viral pseudotype entry assay and the EBOV replication assay data. We have validated the models internally and externally. We have also used these models to computationally score the MicroSource library of drugs to select those likely to be potential inhibitors. Three of the highest scoring molecules that were not in the model training sets, quinacrine, pyronaridine and tilorone, were tested in vitro and had EC50 values of 350, 420 and 230 nM, respectively. Pyronaridine is a component of a combination therapy for malaria that was recently approved by the European Medicines Agency, which may make it more readily accessible for clinical testing. Like other known antimalarial drugs active against EBOV, it shares the 4-aminoquinoline scaffold. Tilorone is an investigational antiviral agent that has shown a broad array of biological activities including cell growth inhibition in cancer cells, antifibrotic properties, α7 nicotinic receptor agonist activity, radioprotective activity and activation of hypoxia inducible factor-1. Quinacrine is an antimalarial but also has use as an anthelmintic. Our results suggest data sets with fewer than 1,000 molecules can produce validated machine learning models that can in turn be utilized to identify novel EBOV inhibitors in vitro.
Affiliation(s)
- Sean Ekins
  - Collaborations in Chemistry, Fuquay-Varina, NC, 27526, USA
  - Collaborations Pharmaceuticals Inc, Fuquay-Varina, NC, 27526, USA
  - Collaborative Drug Discovery, Burlingame, CA, 94010, USA
- Joel S Freundlich
  - Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, UMDNJ, New Jersey Medical School, Newark, NJ, 07103, USA
- Alex M Clark
  - Molecular Materials Informatics, Inc., Montreal, 94025, Canada
- Manu Anantpadma
  - Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
- Robert A Davey
  - Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
2
Ekins S, Freundlich JS, Clark AM, Anantpadma M, Davey RA, Madrid P. Machine learning models identify molecules active against the Ebola virus in vitro. F1000Res 2015; 4:1091. [PMID: 26834994 DOI: 10.12688/f1000research.7217.1] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Accepted: 10/15/2015] [Indexed: 12/23/2022] Open
Abstract
The search for small molecule inhibitors of Ebola virus (EBOV) has led to several high throughput screens over the past 3 years. These have identified a range of FDA-approved active pharmaceutical ingredients (APIs) with anti-EBOV activity in vitro, several of which are also active in a mouse infection model. There are millions of additional commercially-available molecules that could be screened for potential activities as anti-EBOV compounds. One way to prioritize compounds for testing is to generate computational models based on the high throughput screening data and then virtually screen compound libraries. In the current study, we have generated Bayesian machine learning models with viral pseudotype entry assay and the EBOV replication assay data. We have validated the models internally and externally. We have also used these models to computationally score the MicroSource library of drugs to select those likely to be potential inhibitors. Three of the highest scoring molecules that were not in the model training sets, quinacrine, pyronaridine and tilorone, were tested in vitro and had EC50 values of 350, 420 and 230 nM, respectively. Pyronaridine is a component of a combination therapy for malaria that was recently approved by the European Medicines Agency, which may make it more readily accessible for clinical testing. Like other known antimalarial drugs active against EBOV, it shares the 4-aminoquinoline scaffold. Tilorone is an investigational antiviral agent that has shown a broad array of biological activities including cell growth inhibition in cancer cells, antifibrotic properties, α7 nicotinic receptor agonist activity, radioprotective activity and activation of hypoxia inducible factor-1. Quinacrine is an antimalarial but also has use as an anthelmintic. Our results suggest data sets with fewer than 1,000 molecules can produce validated machine learning models that can in turn be utilized to identify novel EBOV inhibitors in vitro.
Affiliation(s)
- Sean Ekins
  - Collaborations in Chemistry, Fuquay-Varina, NC, 27526, USA
  - Collaborations Pharmaceuticals Inc, Fuquay-Varina, NC, 27526, USA
  - Collaborative Drug Discovery, Burlingame, CA, 94010, USA
- Joel S Freundlich
  - Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, UMDNJ, New Jersey Medical School, Newark, NJ, 07103, USA
- Alex M Clark
  - Molecular Materials Informatics, Inc., Montreal, 94025, Canada
- Manu Anantpadma
  - Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
- Robert A Davey
  - Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
3
Ekins S, Freundlich JS, Clark AM, Anantpadma M, Davey RA, Madrid P. Machine learning models identify molecules active against the Ebola virus in vitro. F1000Res 2015; 4:1091. [PMID: 26834994 PMCID: PMC4706063 DOI: 10.12688/f1000research.7217.3] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Accepted: 01/17/2017] [Indexed: 12/21/2022] Open
Abstract
The search for small molecule inhibitors of Ebola virus (EBOV) has led to several high throughput screens over the past 3 years. These have identified a range of FDA-approved active pharmaceutical ingredients (APIs) with anti-EBOV activity in vitro, several of which are also active in a mouse infection model. There are millions of additional commercially-available molecules that could be screened for potential activities as anti-EBOV compounds. One way to prioritize compounds for testing is to generate computational models based on the high throughput screening data and then virtually screen compound libraries. In the current study, we have generated Bayesian machine learning models with viral pseudotype entry assay and the EBOV replication assay data. We have validated the models internally and externally. We have also used these models to computationally score the MicroSource library of drugs to select those likely to be potential inhibitors. Three of the highest scoring molecules that were not in the model training sets, quinacrine, pyronaridine and tilorone, were tested in vitro and had EC50 values of 350, 420 and 230 nM, respectively. Pyronaridine is a component of a combination therapy for malaria that was recently approved by the European Medicines Agency, which may make it more readily accessible for clinical testing. Like other known antimalarial drugs active against EBOV, it shares the 4-aminoquinoline scaffold. Tilorone is an investigational antiviral agent that has shown a broad array of biological activities including cell growth inhibition in cancer cells, antifibrotic properties, α7 nicotinic receptor agonist activity, radioprotective activity and activation of hypoxia inducible factor-1. Quinacrine is an antimalarial but also has use as an anthelmintic. Our results suggest data sets with fewer than 1,000 molecules can produce validated machine learning models that can in turn be utilized to identify novel EBOV inhibitors in vitro.
Affiliation(s)
- Sean Ekins
  - Collaborations in Chemistry, Fuquay-Varina, NC, 27526, USA
  - Collaborations Pharmaceuticals Inc, Fuquay-Varina, NC, 27526, USA
  - Collaborative Drug Discovery, Burlingame, CA, 94010, USA
- Joel S Freundlich
  - Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, UMDNJ, New Jersey Medical School, Newark, NJ, 07103, USA
- Alex M Clark
  - Molecular Materials Informatics, Inc., Montreal, 94025, Canada
- Manu Anantpadma
  - Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
- Robert A Davey
  - Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
4
Ekins S, Clark AM, Wright SH. Making Transporter Models for Drug-Drug Interaction Prediction Mobile. Drug Metab Dispos 2015. [PMID: 26199424 DOI: 10.1124/dmd.115.064956] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 11/22/2022] Open
Abstract
The past decade has seen increased numbers of studies publishing ligand-based computational models for drug transporters. Although they generally use small experimental data sets, these models can provide insights into structure-activity relationships for the transporter. In addition, such models have helped to identify new compounds as substrates or inhibitors of transporters of interest. We recently proposed that many transporters are promiscuous and may require profiling of new chemical entities against multiple substrates for a specific transporter. Furthermore, it should be noted that virtually all of the published ligand-based transporter models are only accessible to those involved in creating them and, consequently, are rarely shared effectively. One way to surmount this is to make models shareable or more accessible. The development of mobile apps that can access such models is highlighted here. These apps can be used to predict ligand interactions with transporters using Bayesian algorithms. We used recently published transporter data sets (MATE1, MATE2K, OCT2, OCTN2, ASBT, and NTCP) to build preliminary models in a commercial tool and in open software that can deliver the model in a mobile app. In addition, several transporter data sets extracted from the ChEMBL database were used to illustrate how such public data and models can be shared. Predicting drug-drug interactions for various transporters using computational models is potentially within reach of anyone with an iPhone or iPad. Such tools could help prioritize which substrates should be used for in vivo drug-drug interaction testing and enable open sharing of models.
Affiliation(s)
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., and Collaborations in Chemistry, Fuquay-Varina, North Carolina
  - Collaborative Drug Discovery, Burlingame, California
- Alex M Clark
  - Molecular Materials Informatics, Inc., Montreal, Quebec, Canada
- Stephen H Wright
  - Department of Physiology, University of Arizona, Tucson, Arizona
5
Clark AM, Dole K, Coulon-Spektor A, McNutt A, Grass G, Freundlich JS, Reynolds RC, Ekins S. Open Source Bayesian Models. 1. Application to ADME/Tox and Drug Discovery Datasets. J Chem Inf Model 2015; 55:1231-45. [PMID: 25994950 PMCID: PMC4478615 DOI: 10.1021/acs.jcim.5b00143] [Citation(s) in RCA: 84] [Impact Index Per Article: 9.3] [Indexed: 02/07/2023]
Abstract
On the order of hundreds of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) models have been described in the literature in the past decade which are more often than not inaccessible to anyone but their authors. Public accessibility is also an issue with computational models for bioactivity, and the ability to share such models still remains a major challenge limiting drug discovery. We describe the creation of a reference implementation of a Bayesian model-building software module, which we have released as an open source component that is now included in the Chemistry Development Kit (CDK) project, as well as implemented in the CDD Vault and in several mobile apps. We use this implementation to build an array of Bayesian models for ADME/Tox, in vitro and in vivo bioactivity, and other physicochemical properties. We show that these models possess cross-validation receiver operator curve values comparable to those generated previously in prior publications using alternative tools. We have now described how the implementation of Bayesian models with FCFP6 descriptors generated in the CDD Vault enables the rapid production of robust machine learning models from public data or the user's own datasets. The current study sets the stage for generating models in proprietary software (such as CDD) and exporting these models in a format that could be run in open source software using CDK components. This work also demonstrates that we can enable biocomputation across distributed private or public datasets to enhance drug discovery.
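The "receiver operator curve values" mentioned in this abstract are presumably areas under the ROC curve computed on held-out cross-validation predictions. As a rough sketch of that metric only (not the CDK or CDD Vault code), the area can be computed as the fraction of active/inactive pairs that a model ranks correctly. The function name `roc_auc` and the toy inputs are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (Mann-Whitney form).

    scores: model outputs, higher meaning "more likely active".
    labels: truthy for active, falsy for inactive.
    """
    actives = [s for s, y in zip(scores, labels) if y]
    inactives = [s for s, y in zip(scores, labels) if not y]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    # Count pairs where the active outscores the inactive; ties count half.
    wins = sum(
        1.0 if a > i else 0.5 if a == i else 0.0
        for a in actives
        for i in inactives
    )
    return wins / (len(actives) * len(inactives))
```

A value of 1.0 means every active is ranked above every inactive, and 0.5 corresponds to random ordering. The pairwise loop is quadratic, which is fine for small held-out folds but would be replaced by a rank-based computation for large sets.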
Affiliation(s)
- Alex M Clark
  - Molecular Materials Informatics, Inc., 1900 St. Jacques No. 302, Montreal H3J 2S1, Quebec, Canada
- Krishna Dole
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
- Anna Coulon-Spektor
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
- Andrew McNutt
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
- George Grass
  - G2 Research, Inc., P.O. Box 1242, Tahoe City, California 96145, United States
- Robert C Reynolds
  - Department of Chemistry, College of Arts and Sciences, University of Alabama at Birmingham, 1530 Third Avenue South, Birmingham, Alabama 35294-1240, United States
- Sean Ekins
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
6
Clark AM, Ekins S. Open Source Bayesian Models. 2. Mining a "Big Dataset" To Create and Validate Models with ChEMBL. J Chem Inf Model 2015; 55:1246-60. [PMID: 25995041 DOI: 10.1021/acs.jcim.5b00144] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Indexed: 02/05/2023]
Abstract
In an associated paper, we have described a reference implementation of Laplacian-corrected naïve Bayesian model building using extended connectivity (ECFP)- and molecular function class fingerprints of maximum diameter 6 (FCFP)-type fingerprints. As a follow-up, we have now undertaken a large-scale validation study in order to ensure that the technique generalizes to a broad variety of drug discovery datasets. To achieve this, we have used the ChEMBL (version 20) database and split it into more than 2000 separate datasets, each of which consists of compounds and measurements with the same target and activity measurement. In order to test these datasets with the two-state Bayesian classification, we developed an automated algorithm for detecting a suitable threshold for active/inactive designation, which we applied to all collections. With these datasets, we were able to establish that our Bayesian model implementation is effective for the large majority of cases, and we were able to quantify the impact of fingerprint folding on the receiver operator curve cross-validation metrics. We were also able to study the impact that the choice of training/testing set partitioning has on the resulting recall rates. The datasets have been made publicly available to be downloaded, along with the corresponding model data files, which can be used in conjunction with the CDK and several mobile apps. We have also explored some novel visualization methods which leverage the structural origins of the ECFP/FCFP fingerprints to attribute regions of a molecule responsible for positive and negative contributions to activity. The ability to score molecules across thousands of relevant datasets across organisms also may help to assess desirable and undesirable off-target effects as well as suggest potential targets for compounds derived from phenotypic screens.
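The Laplacian-corrected naïve Bayesian scoring named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the CDK or CDD Vault implementation: each molecule is assumed to be already reduced to a set of integer fingerprint hashes (ECFP/FCFP generation is omitted), the per-feature weight uses the commonly cited Laplacian-corrected estimator log((A + 1) / (T·P + 1)), and folding is assumed to be a simple modulo of the hash. All function names are hypothetical.

```python
import math
from collections import Counter


def fold(fingerprint, size=1024):
    """Fold hashed features into a fixed range (assumed: hash modulo size)."""
    return {bit % size for bit in fingerprint}


def train_laplacian_bayes(fingerprints, labels):
    """Return a per-feature weight table.

    fingerprints: list of sets of integer feature hashes, one set per molecule.
    labels: list of bool, True for active.
    """
    total = len(labels)
    base_rate = sum(labels) / total  # P: overall fraction of actives
    n_total = Counter()   # T: molecules containing each feature
    n_active = Counter()  # A: active molecules containing each feature
    for fp, is_active in zip(fingerprints, labels):
        for bit in fp:
            n_total[bit] += 1
            if is_active:
                n_active[bit] += 1
    # Laplacian correction pulls rarely seen features toward a weight of zero.
    return {
        bit: math.log((n_active[bit] + 1) / (n_total[bit] * base_rate + 1))
        for bit in n_total
    }


def score(model, fingerprint):
    """Sum the weights of the features present; unseen features add nothing."""
    return sum(model.get(bit, 0.0) for bit in fingerprint)
```

The raw score is an unnormalized sum of log ratios, so it ranks molecules rather than giving a probability. Turning it into an active/inactive call needs a cutoff, which is a separate step from the activity threshold detection the abstract describes.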
Affiliation(s)
- Alex M Clark
  - Molecular Materials Informatics, Inc., 1900 St. Jacques No. 302, Montreal H3J 2S1, Quebec, Canada
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
7
Abstract
The recent outbreak of the Ebola virus in West Africa has highlighted the clear shortage of broad-spectrum antiviral drugs for emerging viruses. There are numerous FDA approved drugs and other small molecules described in the literature that could be further evaluated for their potential as antiviral compounds. These molecules are in addition to the few new antivirals that have been tested in Ebola patients but were not originally developed against the Ebola virus, and may play an important role as we await an effective vaccine. The balance between using FDA approved drugs versus novel antivirals with minimal safety and no efficacy data in humans should be considered. We have evaluated 55 molecules from the perspective of an experienced medicinal chemist as well as using simple molecular properties and have highlighted 16 compounds that have desirable qualities as well as those that may be less desirable. In addition we propose that a collaborative database for sharing such published and novel information on small molecules is needed for the research community studying the Ebola virus.
Affiliation(s)
- Nadia Litterman
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA, 94010, USA
- Christopher Lipinski
  - Christopher A. Lipinski, Ph.D., LLC., 10 Connshire Drive, Waterford, CT, 06385-4122, USA
- Sean Ekins
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA, 94010, USA
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay Varina, NC, 27526, USA
8
Korb O, Finn PW, Jones G. The cloud and other new computational methods to improve molecular modelling. Expert Opin Drug Discov 2014; 9:1121-31. [DOI: 10.1517/17460441.2014.941800] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/05/2022]
9
Clark AM, Sarker M, Ekins S. New target prediction and visualization tools incorporating open source molecular fingerprints for TB Mobile 2.0. J Cheminform 2014; 6:38. [PMID: 25302078 PMCID: PMC4190048 DOI: 10.1186/s13321-014-0038-2] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 03/06/2014] [Accepted: 06/30/2014] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND We recently developed a freely available mobile app (TB Mobile) for both iOS and Android platforms that displays Mycobacterium tuberculosis (Mtb) active molecule structures and their targets with links to associated data. The app was developed to make target information available to as large an audience as possible. RESULTS We now report a major update of the iOS version of the app. This includes enhancements that use an implementation of ECFP_6 fingerprints that we have made open source. Using these fingerprints, the user can propose compounds with possible anti-TB activity, and view the compounds within a cluster landscape. Proposed compounds can also be compared to existing target data, using a naïve Bayesian scoring system to rank probable targets. We have curated an additional 60 new compounds and their targets for Mtb and added these to the original set of 745 compounds. We have also curated 20 further compounds (many without targets in TB Mobile) to evaluate this version of the app with 805 compounds and associated targets. CONCLUSIONS TB Mobile can now manage a small collection of compounds that can be imported from external sources, or exported by various means such as email or app-to-app inter-process communication. This means that TB Mobile can be used as a node within a growing ecosystem of mobile apps for cheminformatics. It can also cluster compounds and use internal algorithms to help identify potential targets based on molecular similarity. TB Mobile represents a valuable dataset, data-visualization aid and target prediction tool.
Affiliation(s)
- Alex M Clark
  - Molecular Materials Informatics, 1900 St. Jacques #302, Montreal H3J 2S1, Quebec, Canada
- Malabika Sarker
  - SRI International, 333 Ravenswood Avenue, Menlo Park 94025, CA, USA
- Sean Ekins
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame 94010, CA, USA
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina 27526, NC, USA
10
Ekins S. Progress in computational toxicology. J Pharmacol Toxicol Methods 2013; 69:115-40. [PMID: 24361690 DOI: 10.1016/j.vascn.2013.12.003] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Received: 10/08/2013] [Accepted: 12/08/2013] [Indexed: 01/02/2023]
Abstract
INTRODUCTION Computational methods have been widely applied to toxicology across pharmaceutical, consumer product and environmental fields over the past decade. Progress in computational toxicology is now reviewed. METHODS A literature review was performed on computational models for hepatotoxicity (e.g. for drug-induced liver injury (DILI)), cardiotoxicity, renal toxicity and genotoxicity. In addition various publications have been highlighted that use machine learning methods. Several computational toxicology model datasets from past publications were used to compare Bayesian and Support Vector Machine (SVM) learning methods. RESULTS The increasing amounts of data for defined toxicology endpoints have enabled machine learning models that have been increasingly used for predictions. It is shown that across many different models Bayesian and SVM perform similarly based on cross validation data. DISCUSSION Considerable progress has been made in computational toxicology in a decade in both model development and availability of larger scale or 'big data' models. The future efforts in toxicology data generation will likely provide us with hundreds of thousands of compounds that are readily accessible for machine learning models. These models will cover relevant chemistry space for pharmaceutical, consumer product and environmental applications.
Affiliation(s)
- Sean Ekins
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay Varina, NC 27526, USA
  - Department of Pharmaceutical Sciences, University of Maryland, 20 Penn Street, Baltimore, MD 21201, USA
  - Department of Pharmacology, Rutgers University-Robert Wood Johnson Medical School, 675 Hoes Lane, Piscataway, NJ 08854, USA
  - Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, NC 27599-7355, USA
11
Ekins S, Freundlich JS, Hobrath JV, Lucile White E, Reynolds RC. Combining computational methods for hit to lead optimization in Mycobacterium tuberculosis drug discovery. Pharm Res 2013; 31:414-35. [PMID: 24132686 DOI: 10.1007/s11095-013-1172-7] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 05/04/2013] [Accepted: 07/28/2013] [Indexed: 12/19/2022]
Abstract
PURPOSE Tuberculosis treatments need to be shorter and overcome drug resistance. Our previous large scale phenotypic high-throughput screening against Mycobacterium tuberculosis (Mtb) has identified 737 active compounds and thousands that are inactive. We have used this data for building computational models as an approach to minimize the number of compounds tested. METHODS A cheminformatics clustering approach followed by Bayesian machine learning models (based on publicly available Mtb screening data) was used to illustrate that application of these models for screening set selections can enrich the hit rate. RESULTS In order to explore chemical diversity around active cluster scaffolds of the dose-response hits obtained from our previous Mtb screens, a set of 1924 commercially available molecules was selected and evaluated for antitubercular activity and cytotoxicity using Vero, THP-1 and HepG2 cell lines with 4.3%, 4.2% and 2.7% hit rates, respectively. We demonstrate that models incorporating antitubercular and cytotoxicity data in Vero cells can significantly enrich the selection of non-toxic actives compared to random selection. Across all cell lines, the Molecular Libraries Small Molecule Repository (MLSMR) and cytotoxicity model identified ~10% of the hits in the top 1% screened (>10 fold enrichment). We also showed that seven out of nine Mtb active compounds from different academic published studies and eight out of eleven Mtb active compounds from a pharmaceutical screen (GSK) would have been identified by these Bayesian models. CONCLUSION Combining clustering and Bayesian models represents a useful strategy for compound prioritization and hit-to-lead optimization of antitubercular agents.
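The enrichment figure quoted in the RESULTS (about 10% of the hits found in the top 1% screened, hence roughly 10-fold) follows from a simple ratio: the fraction of all hits recovered in the top slice, divided by the fraction of the library that slice represents. A small sketch of that arithmetic, with a hypothetical function name and made-up counts chosen only to reproduce the ratio:

```python
def enrichment_factor(ranked_is_hit, top_fraction):
    """Fold enrichment of hits in the top-scoring slice versus random picking.

    ranked_is_hit: list of bool ordered from best model score to worst.
    top_fraction: share of the ranked list that is screened, e.g. 0.01.
    """
    n = len(ranked_is_hit)
    n_top = max(1, round(n * top_fraction))
    hits_total = sum(ranked_is_hit)
    if hits_total == 0:
        raise ValueError("no hits in the ranked list")
    hits_top = sum(ranked_is_hit[:n_top])
    # (hits_top / hits_total) / (n_top / n), arranged to limit rounding error
    return (hits_top * n) / (hits_total * n_top)


# Illustrative numbers only: 1000 compounds, 40 hits, 4 of them in the top 10.
ranking = [True] * 4 + [False] * 6 + [True] * 36 + [False] * 954
print(enrichment_factor(ranking, 0.01))  # 10.0
```

Random selection gives an enrichment factor near 1, so any value well above 1 means the model is concentrating hits at the top of the ranking.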
Affiliation(s)
- Sean Ekins
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California, 94010, USA