1
Lane TR, Urbina F, Rank L, Gerlach J, Riabova O, Lepioshkin A, Kazakova E, Vocat A, Tkachenko V, Cole S, Makarov V, Ekins S. Machine Learning Models for Mycobacterium tuberculosis In Vitro Activity: Prediction and Target Visualization. Mol Pharm 2022; 19:674-689. [PMID: 34964633 PMCID: PMC9121329 DOI: 10.1021/acs.molpharmaceut.1c00791] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Tuberculosis (TB) is a major global health challenge, with approximately 1.4 million deaths per year. There is still a need to develop novel treatments for patients infected with Mycobacterium tuberculosis (Mtb). There have been many large-scale phenotypic screens that have led to the identification of thousands of new compounds. Yet, there is very limited investment in TB drug discovery which points to the need for new methods to increase the efficiency of drug discovery against Mtb. We have used machine learning approaches to learn from the public Mtb data, resulting in many data sets and models with robust enrichment and hit rates leading to the discovery of new active compounds. Recently, we have curated predominantly small-molecule Mtb data and developed new machine learning classification models with 18 886 molecules at different activity cutoffs. We now describe the further validation of these Bayesian models using a library of over 1000 molecules synthesized as part of EU-funded New Medicines for TB and More Medicines for TB programs. We highlight molecular features which are enriched in these active compounds. In addition, we provide new regression and classification models that can be used for scoring compound libraries or used to design new molecules. We have also visualized these molecules in the context of known molecular targets and identified clusters in chemical property space, which may aid in future target identification efforts. Finally, we are also making these data sets publicly available, representing a significant increase to the available Mtb inhibition data in the public domain.
Affiliation(s)
- Thomas R. Lane
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Fabio Urbina
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Laura Rank
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Jacob Gerlach
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Olga Riabova
- Research Center of Biotechnology RAS, 119071 Moscow, Russia
- Elena Kazakova
- Research Center of Biotechnology RAS, 119071 Moscow, Russia
- Anthony Vocat
- Global Health Institute, Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- Valery Tkachenko
- Science Data Experts, 14909 Forest Landing Cir, Rockville, MD 20850
- Vadim Makarov
- Research Center of Biotechnology RAS, 119071 Moscow, Russia
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
2
Schmalstig AA, Zorn KM, Murci S, Robinson A, Savina S, Komarova E, Makarov V, Braunstein M, Ekins S. Mycobacterium abscessus drug discovery using machine learning. Tuberculosis (Edinb) 2022; 132:102168. [PMID: 35077930 PMCID: PMC8855326 DOI: 10.1016/j.tube.2022.102168] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/16/2021] [Revised: 10/30/2021] [Accepted: 01/14/2022] [Indexed: 01/22/2023]
Abstract
The prevalence of infections by nontuberculous mycobacteria is increasing, having surpassed tuberculosis in the United States and much of the developed world. Nontuberculous mycobacteria occur naturally in the environment and are a significant problem for patients with underlying lung diseases such as bronchiectasis, chronic obstructive pulmonary disease, and cystic fibrosis. Current treatment regimens are lengthy, complicated, toxic, and often unsuccessful, as seen by disease recurrence. Mycobacterium abscessus is one of the most commonly encountered organisms in nontuberculous mycobacteria disease and it is the most difficult to eradicate. There is currently no systematically proven regimen that is effective for treating M. abscessus infections. Our approach to drug discovery integrates machine learning, medicinal chemistry and in vitro testing and has been previously applied to Mycobacterium tuberculosis. We have now identified several novel 1-(phenylsulfonyl)-1H-benzimidazol-2-amines that have weak activity on M. abscessus in vitro but may represent a starting point for further medicinal chemistry optimization. We also address limitations still to be overcome with the machine learning approach for M. abscessus.
Affiliation(s)
- Alan A. Schmalstig
- Department of Microbiology and Immunology, School of Medicine, University of North Carolina at Chapel Hill, North Carolina, 27599, USA
- Kimberley M. Zorn
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive Lab 3510, Raleigh, North Carolina, 27606, USA
- Sebastian Murci
- Department of Microbiology and Immunology, School of Medicine, University of North Carolina at Chapel Hill, North Carolina, 27599, USA
- Andrew Robinson
- Department of Microbiology and Immunology, School of Medicine, University of North Carolina at Chapel Hill, North Carolina, 27599, USA
- Svetlana Savina
- Research Center of Biotechnology RAS, Moscow, 119071, Russia
- Elena Komarova
- Research Center of Biotechnology RAS, Moscow, 119071, Russia
- Vadim Makarov
- Research Center of Biotechnology RAS, Moscow, 119071, Russia
- Miriam Braunstein
- Department of Microbiology and Immunology, School of Medicine, University of North Carolina at Chapel Hill, North Carolina, 27599, USA
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive Lab 3510, Raleigh, North Carolina, 27606, USA (corresponding author)
3
Winkler DA. Use of Artificial Intelligence and Machine Learning for Discovery of Drugs for Neglected Tropical Diseases. Front Chem 2021; 9:614073. [PMID: 33791277 PMCID: PMC8005575 DOI: 10.3389/fchem.2021.614073] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 10/05/2020] [Accepted: 01/18/2021] [Indexed: 12/11/2022] Open
Abstract
Neglected tropical diseases continue to create high levels of morbidity and mortality in a sizeable fraction of the world’s population, despite ongoing research into new treatments. Some of the most important technological developments that have accelerated drug discovery for diseases of affluent countries have not flowed down to neglected tropical disease drug discovery. Pharmaceutical development business models, cost of developing new drug treatments and subsequent costs to patients, and accessibility of technologies to scientists in most of the affected countries are some of the reasons for this low uptake and slow development relative to that for common diseases in developed countries. Computational methods are starting to make significant inroads into discovery of drugs for neglected tropical diseases due to the increasing availability of large databases that can be used to train ML models, increasing accuracy of these methods, lower entry barrier for researchers, and widespread availability of public domain machine learning codes. Here, the application of artificial intelligence, largely the subset called machine learning, to modelling and prediction of biological activities and discovery of new drugs for neglected tropical diseases is summarized. The pathways for the development of machine learning methods in the short to medium term and the use of other artificial intelligence methods for drug discovery are discussed. The current roadblocks to, and likely impacts of, synergistic new technological developments on the use of ML methods for neglected tropical disease drug discovery in the future are also discussed.
Affiliation(s)
- David A Winkler
- Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, VIC, Australia
- Latrobe Institute for Molecular Science, La Trobe University, Bundoora, VIC, Australia
- School of Pharmacy, University of Nottingham, Nottingham, United Kingdom
- CSIRO Data61, Pullenvale, QLD, Australia
4
Simoben CV, Ntie-Kang F, Robaa D, Sippl W. Case studies on computer-based identification of natural products as lead molecules. Physical Sciences Reviews 2020. [DOI: 10.1515/psr-2018-0119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022]
Abstract
The development and application of computer-aided drug design/discovery (CADD) techniques (such as structure-based virtual screening, ligand-based virtual screening and neural network approaches) are on the point of disintermediation in the pharmaceutical drug discovery processes. The application of these CADD methods is standing out positively as compared to other experimental approaches in the identification of hits. In order to venture into new chemical spaces, research groups are exploring natural products (NPs) for the search and identification of new hits and more efficient leads as well as the repurposing of approved NPs. The chemical space of NPs is continuously increasing as a result of millions of years of evolution of species and these data are mainly stored in the form of databases providing access to scientists around the world to conduct studies using them. Investigation of these NP databases with the help of CADD methodologies in combination with experimental validation techniques is essential to identify and propose new drug molecules. In this chapter, we highlight the importance of the chemical diversity of NPs as a source for potential drugs as well as some of the success stories of NP-derived candidates against important therapeutic targets. The focus is on studies that applied a healthy dose of the emerging CADD methodologies (structure-based, ligand-based and machine learning).
Affiliation(s)
- Conrad V. Simoben
- Department of Medicinal Chemistry (AG Sippl), Institute of Pharmacy, Martin-Luther-Universität Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle (Saale), Germany
- Fidele Ntie-Kang
- Department of Chemistry, University of Buea, P. O. Box 63, Buea, Cameroon
- Department of Medicinal Chemistry (AG Sippl), Institute of Pharmacy, Martin-Luther-Universität Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle (Saale), Germany
- Dina Robaa
- Department of Medicinal Chemistry (AG Sippl), Institute of Pharmacy, Martin-Luther-Universität Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle (Saale), Germany
- Wolfgang Sippl
- Department of Medicinal Chemistry (AG Sippl), Institute of Pharmacy, Martin-Luther-Universität Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle (Saale), Germany
5
Makarov V, Salina E, Reynolds RC, Kyaw Zin PP, Ekins S. Molecule Property Analyses of Active Compounds for Mycobacterium tuberculosis. J Med Chem 2020; 63:8917-8955. [PMID: 32259446 DOI: 10.1021/acs.jmedchem.9b02075] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/12/2022]
Abstract
Tuberculosis (TB) continues to claim the lives of around 1.7 million people per year. Most concerning are the reports of multidrug resistance. Paradoxically, this global health pandemic is demanding new therapies when resources and interest are waning. However, continued tuberculosis drug discovery is critical to address the global health need and burgeoning multidrug resistance. Many diverse classes of antitubercular compounds have been identified with activity in vitro and in vivo. Our analyses of over 100 active leads, which are representative of thousands of active compounds generated over the past decade, suggest that they come from few chemical classes or natural product sources. We are therefore repeatedly identifying compounds that are similar to those that preceded them. Our molecule-centered cheminformatics analyses point to the need to dramatically increase the diversity of chemical libraries tested and get outside of the historic Mtb property space if we are to generate novel improved antitubercular leads.
Affiliation(s)
- Vadim Makarov
- FRC Fundamentals of Biotechnology, Russian Academy of Science, Moscow 119071, Russia
- Elena Salina
- FRC Fundamentals of Biotechnology, Russian Academy of Science, Moscow 119071, Russia
- Robert C Reynolds
- Department of Medicine, Division of Hematology and Oncology, University of Alabama at Birmingham, NP 2540 J, 1720 Second Avenue South, Birmingham, Alabama 35294-3300, United States
- Phyo Phyo Kyaw Zin
- Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27695, United States
- Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27695, United States
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
6
Sunseri J, Koes DR. libmolgrid: Graphics Processing Unit Accelerated Molecular Gridding for Deep Learning Applications. J Chem Inf Model 2020; 60:1079-1084. [PMID: 32049525 DOI: 10.1021/acs.jcim.9b01145] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Indexed: 11/30/2022]
Abstract
We describe libmolgrid, a general-purpose library for representing three-dimensional molecules using multidimensional arrays of voxelized molecular data. libmolgrid provides functionality for sampling batches of data suited to machine learning workflows, and it also supports temporal and spatial recurrences over that data to facilitate work with convolutional and recurrent neural networks. It was designed for seamless integration with popular deep learning frameworks and features optimized performance by leveraging graphics processing units (GPUs). libmolgrid is a free and open source project (GPLv2) that aims to democratize grid-based modeling in computational chemistry.
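To make the idea of "voxelized molecular data" concrete, the following is a minimal pure-Python sketch of gridding atoms into a multichannel 3D array. It does not use libmolgrid or reproduce its API: the function name `voxelize`, the Gaussian density kernel, the atom tuple layout `(x, y, z, channel)`, and all default parameters are assumptions chosen for illustration only.

```python
from math import exp


def voxelize(atoms, dimension=4.0, resolution=1.0, sigma=1.0):
    """Spread atoms onto a cubic grid centred at the origin (illustrative only).

    atoms: list of (x, y, z, channel) tuples; channel is a small integer
           such as an atom-type index (a hypothetical encoding).
    Returns a nested list indexed as grid[channel][i][j][k].
    """
    if not atoms:
        raise ValueError("need at least one atom")
    n = int(round(dimension / resolution)) + 1       # grid points per axis
    n_channels = max(atom[3] for atom in atoms) + 1
    half = dimension / 2.0
    coords = [-half + i * resolution for i in range(n)]
    grid = [[[[0.0] * n for _ in range(n)] for _ in range(n)]
            for _ in range(n_channels)]
    for x, y, z, channel in atoms:
        for i, gx in enumerate(coords):
            for j, gy in enumerate(coords):
                for k, gz in enumerate(coords):
                    d2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
                    # Assumed Gaussian density centred on the atom
                    grid[channel][i][j][k] += exp(-d2 / (2.0 * sigma ** 2))
    return grid


# Two hypothetical atoms in different channels
grid = voxelize([(0.0, 0.0, 0.0, 0), (1.0, 0.0, 0.0, 1)])
```

The resulting array has one 3D grid per channel, which is the layout a convolutional network would typically consume; a GPU library performs the same kind of accumulation in parallel rather than in nested Python loops.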
Affiliation(s)
- Jocelyn Sunseri
- Department of Computational and Systems Biology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, Pennsylvania 15260, United States
- David R Koes
- Department of Computational and Systems Biology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, Pennsylvania 15260, United States
7
Wang X, Perryman AL, Li SG, Paget SD, Stratton TP, Lemenze A, Olson AJ, Ekins S, Kumar P, Freundlich JS. Intrabacterial Metabolism Obscures the Successful Prediction of an InhA Inhibitor of Mycobacterium tuberculosis. ACS Infect Dis 2019; 5:2148-2163. [PMID: 31625383 DOI: 10.1021/acsinfecdis.9b00295] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
Tuberculosis, caused by Mycobacterium tuberculosis (M. tuberculosis), kills 1.6 million people annually. To bridge the gap between structure- and cell-based drug discovery strategies, we are pioneering a computer-aided discovery paradigm that merges structure-based virtual screening with ligand-based, machine learning methods trained with cell-based data. This approach successfully identified N-(3-methoxyphenyl)-7-nitrobenzo[c][1,2,5]oxadiazol-4-amine (JSF-2164) as an inhibitor of purified InhA with whole-cell efficacy versus in vitro cultured M. tuberculosis. When the intrabacterial drug metabolism (IBDM) platform was leveraged, mechanistic studies demonstrated that JSF-2164 underwent a rapid F420H2-dependent biotransformation within M. tuberculosis to afford intrabacterial nitric oxide and two amines, identified as JSF-3616 and JSF-3617. Thus, metabolism of JSF-2164 obscured the InhA inhibition phenotype within cultured M. tuberculosis. This study demonstrates a new docking/Bayesian computational strategy to combine cell- and target-based drug screening and the need to probe intrabacterial metabolism when clarifying the antitubercular mechanism of action.
Affiliation(s)
- Xin Wang
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University−New Jersey Medical School, Medical Sciences Building, 185 South Orange Avenue, Newark, New Jersey 07103, United States
- Alexander L. Perryman
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University−New Jersey Medical School, Medical Sciences Building, 185 South Orange Avenue, Newark, New Jersey 07103, United States
- Shao-Gang Li
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University−New Jersey Medical School, Medical Sciences Building, 185 South Orange Avenue, Newark, New Jersey 07103, United States
- Steve D. Paget
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University−New Jersey Medical School, Medical Sciences Building, 185 South Orange Avenue, Newark, New Jersey 07103, United States
- Thomas P. Stratton
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University−New Jersey Medical School, Medical Sciences Building, 185 South Orange Avenue, Newark, New Jersey 07103, United States
- Alex Lemenze
- Division of Infectious Disease, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Reemerging Pathogens, Rutgers University−New Jersey Medical School, Medical Sciences Building, 185 South Orange Avenue, Newark, New Jersey 07103, United States
- Arthur J. Olson
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, Room MB112/Mail Drop MB5, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
- Sean Ekins
- Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
- Pradeep Kumar
- Division of Infectious Disease, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Reemerging Pathogens, Rutgers University−New Jersey Medical School, Medical Sciences Building, 185 South Orange Avenue, Newark, New Jersey 07103, United States
- Joel S. Freundlich
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University−New Jersey Medical School, Medical Sciences Building, 185 South Orange Avenue, Newark, New Jersey 07103, United States
- Division of Infectious Disease, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Reemerging Pathogens, Rutgers University−New Jersey Medical School, Medical Sciences Building, 185 South Orange Avenue, Newark, New Jersey 07103, United States
8
Cheminformatics techniques in antimalarial drug discovery and development from natural products 1: basic concepts. Physical Sciences Reviews 2019. [DOI: 10.1515/psr-2018-0130] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 02/05/2023]
Abstract
A large number of natural products, especially those used in ethnomedicine of malaria, have shown varying in vitro antiplasmodial activities. Facilitating antimalarial drug development from this wealth of natural products is an imperative and laudable mission to pursue. However, limited manpower, high research cost coupled with high failure rate during preclinical and clinical studies might militate against the pursuit of this mission. These limitations may be overcome with cheminformatic techniques. Cheminformatics involves the organization, integration, curation, standardization, simulation, mining and transformation of pharmacology data (compounds and bioactivity) into knowledge that can drive rational and viable drug development decisions. This chapter will review the application of cheminformatics techniques (including molecular diversity analysis, quantitative-structure activity/property relationships and Machine learning) to natural products with in vitro and in vivo antiplasmodial activities in order to facilitate their development into antimalarial drug candidates and design of new potential antimalarial compounds.
9
A Useful Synthesis of 2-Acylamino-1,3,4-oxadiazoles from Acylthiosemicarbazides Using Potassium Iodate and the Discovery of New Antibacterial Compounds. Molecules 2019; 24:1490. [PMID: 30988267 PMCID: PMC6515089 DOI: 10.3390/molecules24081490] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 03/05/2019] [Revised: 04/13/2019] [Accepted: 04/15/2019] [Indexed: 12/14/2022] Open
Abstract
A useful method for the synthesis of 2-acylamino-1,3,4-oxadiazoles was developed. By using potassium iodate as an oxidant in water at 60 °C, a wide range of 2-acylamino-1,3,4-oxadiazoles were afforded in moderate to excellent yields within two hours. This method could provide a facile shortcut to generate a series of 2-acylamino-1,3,4-oxadiazoles in medicinal chemistry. Interestingly, some highly potent antibiotic compounds were found through this synthetic method, and some of them displayed a significant improvement in activity compared with the corresponding 1,4-diacylthiosemicarbazides. Compound 2n was the most active against Staphylococcus aureus with MIC (minimum inhibitory concentration) of 1.56 mg/mL, and compounds 2m and 2q were the most active against Bacillus subtilis with MIC of 0.78 mg/mL. The preliminary cytotoxic activities of the most potent compounds 2m, 2n, and 2q against the androgen-independent (PC-3) prostate cancer cell line were more than 30 μM (IC50 > 30 μM).
10
Lane T, Russo DP, Zorn KM, Clark AM, Korotcov A, Tkachenko V, Reynolds RC, Perryman AL, Freundlich JS, Ekins S. Comparing and Validating Machine Learning Models for Mycobacterium tuberculosis Drug Discovery. Mol Pharm 2018; 15:4346-4360. [PMID: 29672063 PMCID: PMC6167198 DOI: 10.1021/acs.molpharmaceut.8b00083] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Indexed: 12/31/2022]
Abstract
Tuberculosis is a global health dilemma. In 2016, the WHO reported 10.4 million incidences and 1.7 million deaths. The need to develop new treatments for those infected with Mycobacterium tuberculosis (Mtb) has led to many large-scale phenotypic screens and many thousands of new active compounds identified in vitro. However, with limited funding, efforts to discover new active molecules against Mtb need to be more efficient. Several computational machine learning approaches have been shown to have good enrichment and hit rates. We have curated small molecule Mtb data and developed new models with a total of 18,886 molecules with activity cutoffs of 10 μM, 1 μM, and 100 nM. These data sets were used to evaluate different machine learning methods (including deep learning) and metrics and to generate predictions for additional molecules published in 2017. One Mtb model, a combined in vitro and in vivo data Bayesian model at a 100 nM activity cutoff, yielded the following metrics for 5-fold cross validation: accuracy = 0.88, precision = 0.22, recall = 0.91, specificity = 0.88, kappa = 0.31, and MCC = 0.41. We have also curated an evaluation set (n = 153 compounds) published in 2017, and when used to test our model, it showed comparable statistics (accuracy = 0.83, precision = 0.27, recall = 1.00, specificity = 0.81, kappa = 0.36, and MCC = 0.47). We have also compared these models with additional machine learning algorithms, showing that Bayesian machine learning models constructed with literature Mtb data generated by different laboratories generally were equivalent to or outperformed deep neural networks with external test sets. Finally, we have also compared our training and test sets to show they were suitably diverse and different in order to represent useful evaluation sets. Such Mtb machine learning models could help prioritize compounds for testing in vitro and in vivo.
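The six statistics quoted in this abstract can all be derived from a binary confusion matrix. The sketch below shows the standard definitions in plain Python; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper, and the code assumes every denominator is nonzero.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification statistics from confusion-matrix counts.

    Assumes all denominators are nonzero (both classes are present and
    both labels are predicted at least once).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)      # fraction of predicted actives that are active
    recall = tp / (tp + fn)         # fraction of true actives that are recovered
    specificity = tn / (tn + fp)    # fraction of true inactives that are rejected
    # Cohen's kappa: agreement corrected for the agreement expected by chance
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - p_expected) / (1 - p_expected)
    # Matthews correlation coefficient
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only
metrics = classification_metrics(tp=20, fp=10, tn=60, fn=10)
```

When actives are rare, as at a stringent activity cutoff, a model can combine high recall and accuracy with low precision, which is why chance-corrected statistics such as kappa and MCC are reported alongside the others.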
Affiliation(s)
- Thomas Lane
- Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, NC 27599, USA
- Daniel P. Russo
- Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, 08102, USA
- Kimberley M. Zorn
- Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Alex M. Clark
- Molecular Materials Informatics, Inc., 1900 St. Jacques #302, Montreal H3J 2S1, Quebec, Canada
- Alexandru Korotcov
- Science Data Software, LLC, 14914 Bradwill Court, Rockville, MD 20850, USA
- Valery Tkachenko
- Science Data Software, LLC, 14914 Bradwill Court, Rockville, MD 20850, USA
- Robert C. Reynolds
- Department of Medicine, Division of Hematology and Oncology, University of Alabama at Birmingham, NP 2540 J, 1720 2nd Avenue South, Birmingham, AL 35294-3300, USA
- Alexander L. Perryman
- Department of Pharmacology, Physiology and Neuroscience, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, USA
- Joel S. Freundlich
- Department of Pharmacology, Physiology and Neuroscience, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, USA
- Division of Infectious Diseases, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University–New Jersey Medical School, Newark, New Jersey 07103, USA
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
11
Perryman AL, Patel JS, Russo R, Singleton E, Connell N, Ekins S, Freundlich JS. Naïve Bayesian Models for Vero Cell Cytotoxicity. Pharm Res 2018; 35:170. [PMID: 29959603 DOI: 10.1007/s11095-018-2439-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 01/26/2018] [Accepted: 06/05/2018] [Indexed: 11/30/2022]
Abstract
PURPOSE: To advance translational research of potential therapeutic small molecules against infectious microbes, the compounds must display a relative lack of mammalian cell cytotoxicity. Vero cell cytotoxicity (CC50) is a common initial assay for this metric. We explored the development of naïve Bayesian models that can enhance the probability of identifying non-cytotoxic compounds. METHODS: Vero cell cytotoxicity assays were identified in PubChem, reformatted, and curated to create a training set with 8741 unique small molecules. These data were used to develop Bayesian classifiers, which were assessed with internal cross-validation, external tests with a set of 193 compounds from our laboratory, and independent validation with an additional diverse set of 1609 unique compounds from PubChem. RESULTS: Evaluation with independent, external test and validation sets indicated that cytotoxicity Bayesian models constructed with the ECFP_6 descriptor were more accurate than those that used FCFP_6 fingerprints. The best cytotoxicity Bayesian model displayed predictive power in external evaluations, according to conventional and chance-corrected statistics, as well as enrichment factors. CONCLUSIONS: The results from external tests demonstrate that our novel cytotoxicity Bayesian model displays sufficient predictive power to help guide translational research. To assist the chemical tool and drug discovery communities, our curated training set is being distributed as part of the Supplementary Material. GRAPHICAL ABSTRACT: Naïve Bayesian models have been trained with publicly available data and offer a useful tool for chemical biology and drug discovery to select for small molecules with a high probability of exhibiting acceptably low Vero cell cytotoxicity.
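The enrichment factors mentioned in this abstract compare the hit rate among the top-scored compounds with the hit rate of the whole set. The following is a minimal sketch of one common definition, not the authors' exact protocol; the function name `enrichment_factor`, the scores, and the labels are hypothetical, with a label of 1 standing for the desired class (here, a non-cytotoxic compound).

```python
def enrichment_factor(scores, labels, top_fraction):
    """Hit rate in the top-scored fraction divided by the overall hit rate.

    scores: model scores, higher meaning more likely to be a hit.
    labels: 1 for a hit (desired class), 0 otherwise.
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    total_hits = sum(labels)
    if total_hits == 0:
        raise ValueError("enrichment is undefined without any hits")
    # Rank compounds from highest to lowest score
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    top_hits = sum(label for _, label in ranked[:n_top])
    return (top_hits / n_top) / (total_hits / len(ranked))


# Hypothetical scores and labels for ten compounds
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
ef_top20 = enrichment_factor(scores, labels, top_fraction=0.2)
```

A value above 1 means the model concentrates hits near the top of the ranking better than random selection would; a value of 1 corresponds to no enrichment.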
Affiliation(s)
- Alexander L Perryman
- Department of Pharmacology, Physiology and Neuroscience, and Medicine, Rutgers University-New Jersey Medical School, Medical Sciences Building, I-503, 185 South Orange Ave, Newark, NJ, 07103, USA
- Jimmy S Patel
- Department of Pharmacology, Physiology and Neuroscience, and Medicine, Rutgers University-New Jersey Medical School, Medical Sciences Building, I-503, 185 South Orange Ave, Newark, NJ, 07103, USA
- Riccardo Russo
- Division of Infectious Diseases, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Medical Sciences Building, I-503, 185 South Orange Ave, Newark, NJ, 07103, USA
- Eric Singleton
- Division of Infectious Diseases, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Medical Sciences Building, I-503, 185 South Orange Ave, Newark, NJ, 07103, USA
- Nancy Connell
- Division of Infectious Diseases, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Medical Sciences Building, I-503, 185 South Orange Ave, Newark, NJ, 07103, USA
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., Main Campus Drive Lab 3510, Raleigh, North Carolina, 27606, USA
- Joel S Freundlich
- Department of Pharmacology, Physiology and Neuroscience, and Medicine, Rutgers University-New Jersey Medical School, Medical Sciences Building, I-503, 185 South Orange Ave, Newark, NJ, 07103, USA
- Division of Infectious Diseases, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Medical Sciences Building, I-503, 185 South Orange Ave, Newark, NJ, 07103, USA
12
Ekins S, Clark AM, Dole K, Gregory K, Mcnutt AM, Spektor AC, Weatherall C, Litterman NK, Bunin BA. Data Mining and Computational Modeling of High-Throughput Screening Datasets. Methods Mol Biol 2018; 1755:197-221. [PMID: 29671272 DOI: 10.1007/978-1-4939-7724-6_14] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Abstract
We are now seeing the benefit of investments made over the last decade in high-throughput screening (HTS), which are resulting in large structure activity datasets entering public and open databases such as ChEMBL and PubChem. The growth of academic HTS screening centers and the increasing move to academia for early stage drug discovery suggest a great need for the informatics tools and methods to mine such data and learn from it. Collaborative Drug Discovery, Inc. (CDD) has developed a number of tools for storing, mining, securely and selectively sharing, as well as learning from such HTS data. We present a new web based data mining and visualization module directly within the CDD Vault platform for high-throughput drug discovery data that makes use of a novel technology stack following modern reactive design principles. We also describe CDD Models within the CDD Vault platform that enables researchers to share models, share predictions from models, and create models from distributed, heterogeneous data. Our system is built on top of the Collaborative Drug Discovery Vault Activity and Registration data repository ecosystem which allows users to manipulate and visualize thousands of molecules in real time. This can be performed in any browser on any platform. In this chapter we present examples of its use with public datasets in CDD Vault. Such approaches can complement other cheminformatics tools, whether open source or commercial, in providing approaches for data mining and modeling of HTS data.
Affiliation(s)
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, 27606, USA.
- Alex M Clark
  - Collaborative Drug Discovery, Inc., Burlingame, CA, USA
  - Molecular Materials Informatics, Inc., Montreal, QC, Canada
- Krishna Dole
  - Collaborative Drug Discovery, Inc., Burlingame, CA, USA
- Barry A Bunin
  - Collaborative Drug Discovery, Inc., Burlingame, CA, USA
|
13
|
Zheng Y, Qiu L, Hong K, Dong S, Xu X. Copper- or Thermally Induced Divergent Outcomes: Synthesis of 4-Methyl 2H-Chromenes and Spiro-4H-Pyrazoles. Chemistry 2017; 24:6705-6711. [DOI: 10.1002/chem.201704759] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 10/09/2017] [Indexed: 12/12/2022]
Affiliation(s)
- Yang Zheng
  - Key Laboratory of Organic Synthesis of Jiangsu Province, College of Chemistry, Chemical Engineering and Materials Science; Soochow University; Suzhou 215123 P.R. China
- Lihua Qiu
  - Key Laboratory of Organic Synthesis of Jiangsu Province, College of Chemistry, Chemical Engineering and Materials Science; Soochow University; Suzhou 215123 P.R. China
- Kemiao Hong
  - Key Laboratory of Organic Synthesis of Jiangsu Province, College of Chemistry, Chemical Engineering and Materials Science; Soochow University; Suzhou 215123 P.R. China
- Shanliang Dong
  - Key Laboratory of Organic Synthesis of Jiangsu Province, College of Chemistry, Chemical Engineering and Materials Science; Soochow University; Suzhou 215123 P.R. China
- Xinfang Xu
  - Key Laboratory of Organic Synthesis of Jiangsu Province, College of Chemistry, Chemical Engineering and Materials Science; Soochow University; Suzhou 215123 P.R. China
  - State Key Laboratory of Elemento-organic Chemistry; Nankai University; Tianjin 300071 P.R. China
|
14
|
Stratton TP, Perryman AL, Vilchèze C, Russo R, Li SG, Patel JS, Singleton E, Ekins S, Connell N, Jacobs WR, Freundlich JS. Addressing the Metabolic Stability of Antituberculars through Machine Learning. ACS Med Chem Lett 2017; 8:1099-1104. [PMID: 29057058 DOI: 10.1021/acsmedchemlett.7b00299] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/21/2017] [Accepted: 09/14/2017] [Indexed: 12/26/2022] Open
Abstract
We present the first prospective application of our mouse liver microsomal (MLM) stability Bayesian model. CD117, an antitubercular thienopyrimidine tool compound that suffers from metabolic instability (MLM t1/2 < 1 min), was utilized to assess the predictive power of our new MLM stability model. The S-substituent was removed, a set of commercial reagents was utilized to construct a virtual library of 411 analogues, and our MLM stability model was applied to prioritize 13 analogues for synthesis and biological profiling. In MLM stability assays, all 13 analogues had superior metabolic stability to the parent compound, and six new analogues had acceptable MLM t1/2 values greater than or equal to 60 min. It is noteworthy that whole-cell efficacy and lack of relative mammalian cell cytotoxicity could not be predicted simultaneously. These results support the utility of our new MLM stability model in chemical tool and drug discovery optimization efforts.
Affiliation(s)
- Thomas P. Stratton
  - Department of Pharmacology, Physiology, and Neuroscience, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
- Alexander L. Perryman
  - Department of Pharmacology, Physiology, and Neuroscience, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
- Catherine Vilchèze
  - Howard Hughes Medical Institute, Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, New York 10461, United States
- Riccardo Russo
  - Division of Infectious Disease, Department of Medicine and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
- Shao-Gang Li
  - Department of Pharmacology, Physiology, and Neuroscience, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
- Jimmy S. Patel
  - Department of Pharmacology, Physiology, and Neuroscience, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
- Eric Singleton
  - Division of Infectious Disease, Department of Medicine and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
- Sean Ekins
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Nancy Connell
  - Division of Infectious Disease, Department of Medicine and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
- William R. Jacobs
  - Howard Hughes Medical Institute, Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, New York 10461, United States
- Joel S. Freundlich
  - Department of Pharmacology, Physiology, and Neuroscience, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
  - Division of Infectious Disease, Department of Medicine and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
|
15
|
Ekins S, Godbole AA, Kéri G, Orfi L, Pato J, Bhat RS, Verma R, Bradley EK, Nagaraja V. Machine learning and docking models for Mycobacterium tuberculosis topoisomerase I. Tuberculosis (Edinb) 2017; 103:52-60. [PMID: 28237034 DOI: 10.1016/j.tube.2017.01.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 06/29/2016] [Revised: 01/14/2017] [Accepted: 01/18/2017] [Indexed: 11/30/2022]
Abstract
There is a shortage of compounds that are directed towards new targets apart from those targeted by the FDA approved drugs used against Mycobacterium tuberculosis. Topoisomerase I (Mttopo I) is an essential mycobacterial enzyme and a promising target in this regard. However, it suffers from a shortage of known inhibitors. We have previously used computational approaches such as homology modeling and docking to propose 38 FDA approved drugs for testing and identified several active molecules. To follow on from this, we now describe the in vitro testing of a library of 639 compounds. These data were used to create machine learning models for Mttopo I which were further validated. The combined Mttopo I Bayesian model had a 5 fold cross validation receiver operator characteristic of 0.74 and sensitivity, specificity and concordance values above 0.76 and was used to select commercially available compounds for testing in vitro. The recently described crystal structure of Mttopo I was also compared with the previously described homology model and then used to dock the Mttopo I actives norclomipramine and imipramine. In summary, we describe our efforts to identify small molecule inhibitors of Mttopo I using a combination of machine learning modeling and docking studies in conjunction with screening of the selected molecules for enzyme inhibition. We demonstrate the experimental inhibition of Mttopo I by small molecule inhibitors and show that the enzyme can be readily targeted for lead molecule development.
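As an illustration of the cross-validation statistic quoted in this abstract, the sketch below computes a ROC value as the probability that a randomly chosen active outscores a randomly chosen inactive. It assumes the reported 0.74 is an area under the ROC curve; the function name and the toy scores are hypothetical and are not taken from the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of actives and inactives.

    scores: model scores, higher meaning "more likely active".
    labels: 1 for measured active, 0 for measured inactive.
    """
    actives = [s for s, y in zip(scores, labels) if y]
    inactives = [s for s, y in zip(scores, labels) if not y]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0   # active correctly ranked above inactive
            elif a == i:
                wins += 0.5   # ties count as half
    return wins / (len(actives) * len(inactives))


# Toy data (hypothetical): two actives, two inactives.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
print(roc_auc(toy_scores, toy_labels))  # prints 0.75
```

In a 5-fold setting this value would be computed on each held-out fold and then averaged, which is one common convention rather than a detail stated in the abstract.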
Affiliation(s)
- Sean Ekins
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94403, USA; Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA.
- Adwait Anand Godbole
  - Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore, 560012, India
- György Kéri
  - Vichem Chemie Research Ltd., Herman Ottó u. 15, H-1022, Budapest, Hungary; Semmelweis Univ, Dept Med Chem, MTA SE Pathobiochem Res Grp, H-1092, Budapest, Hungary
- Lászlo Orfi
  - Vichem Chemie Research Ltd., Herman Ottó u. 15, H-1022, Budapest, Hungary; Semmelweis Univ, Dept Med Chem, MTA SE Pathobiochem Res Grp, H-1092, Budapest, Hungary
- János Pato
  - Vichem Chemie Research Ltd., Herman Ottó u. 15, H-1022, Budapest, Hungary
- Rajeshwari Subray Bhat
  - Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore, 560012, India
- Rinkee Verma
  - Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore, 560012, India
- Valakunja Nagaraja
  - Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore, 560012, India; Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore, 560064, India.
|
16
|
Collaborative drug discovery for More Medicines for Tuberculosis (MM4TB). Drug Discov Today 2016; 22:555-565. [PMID: 27884746 DOI: 10.1016/j.drudis.2016.10.009] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 05/13/2016] [Revised: 10/11/2016] [Accepted: 10/21/2016] [Indexed: 01/30/2023]
Abstract
Neglected disease drug discovery is generally poorly funded compared with major diseases and hence there is an increasing focus on collaboration and precompetitive efforts such as public-private partnerships (PPPs). The More Medicines for Tuberculosis (MM4TB) project is one such collaboration funded by the EU with the goal of discovering new drugs for tuberculosis. Collaborative Drug Discovery has provided a commercial web-based platform called CDD Vault which is a hosted collaborative solution for securely sharing diverse chemistry and biology data. Using CDD Vault alongside other commercial and free cheminformatics tools has enabled support of this and other large collaborative projects, aiding drug discovery efforts and fostering collaboration. We will describe CDD's efforts in assisting with the MM4TB project.
|
17
|
A cell-based approach to characterize antimicrobial compounds through kinetic dose response. Bioorg Med Chem 2016; 24:6315-6319. [PMID: 27713016 DOI: 10.1016/j.bmc.2016.09.053] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 08/11/2016] [Revised: 09/19/2016] [Accepted: 09/21/2016] [Indexed: 12/18/2022]
Abstract
The rapid spread of antibiotic resistance has created a pressing need for the development of novel drug screening platforms. Herein, we report on the use of cell-based kinetic dose response curves for small molecule characterization in antibiotic discovery efforts. Kinetically monitoring bacterial growth at sub-inhibitory concentrations of antimicrobial small molecules generates unique dose response profiles. We show that clustering of profiles by growth characteristics can classify antibiotics by mechanism of action. Furthermore, changes in growth kinetics have the potential to offer insight into the mechanistic action of novel molecules and can be used to predict off-target effects generated through structure-activity relationship studies. Kinetic dose response also allows for detection of unstable compounds early in the lead development process. We propose that this kinetic approach is a rapid and cost-effective means to gather critical information on antimicrobial small molecules during the hit selection and lead development pipeline.
|
18
|
Salim KY, Vareki SM, Danter WR, Koropatnick J. COTI-2, a novel small molecule that is active against multiple human cancer cell lines in vitro and in vivo. Oncotarget 2016; 7:41363-41379. [PMID: 27150056 PMCID: PMC5173065 DOI: 10.18632/oncotarget.9133] [Citation(s) in RCA: 115] [Impact Index Per Article: 14.4] [Received: 02/18/2016] [Accepted: 04/16/2016] [Indexed: 12/28/2022] Open
Abstract
Identification of novel anti-cancer compounds with high efficacy and low toxicity is critical in drug development. High-throughput screening and other such strategies are generally resource-intensive. Therefore, in silico computer-aided drug design has gained rapid acceptance and popularity. We employed our proprietary computational platform (CHEMSAS®), which uses a unique combination of traditional and modern pharmacology principles, statistical modeling, medicinal chemistry, and machine-learning technologies to discover and optimize novel compounds that could target various cancers. COTI-2 is a small molecule candidate anti-cancer drug identified using CHEMSAS. This study describes the in vitro and in vivo evaluation of COTI-2. Our data demonstrate that COTI-2 is effective against a diverse group of human cancer cell lines regardless of their tissue of origin or genetic makeup. Most treated cancer cell lines were sensitive to COTI-2 at nanomolar concentrations. When compared to traditional chemotherapy or targeted-therapy agents, COTI-2 showed superior activity against tumor cells, in vitro and in vivo. Despite its potent anti-tumor efficacy, COTI-2 was safe and well-tolerated in vivo. Although the mechanism of action of COTI-2 is still under investigation, preliminary results indicate that it is not a traditional kinase or an Hsp90 inhibitor.
Affiliation(s)
- Saman Maleki Vareki
  - Cancer Research Laboratory Program, Lawson Health Research Institute, London, Ontario, Canada
- James Koropatnick
  - Cancer Research Laboratory Program, Lawson Health Research Institute, London, Ontario, Canada
  - Department of Microbiology and Immunology, Western University, London, Ontario, Canada
  - Department of Pathology, Western University, London, Ontario, Canada
  - Department of Oncology, Western University, London, Ontario, Canada
  - Department of Physiology and Pharmacology, Western University, London, Ontario, Canada
|
19
|
Ekins S, Perryman AL, Clark AM, Reynolds RC, Freundlich JS. Machine Learning Model Analysis and Data Visualization with Small Molecules Tested in a Mouse Model of Mycobacterium tuberculosis Infection (2014-2015). J Chem Inf Model 2016; 56:1332-43. [PMID: 27335215 PMCID: PMC4962118 DOI: 10.1021/acs.jcim.6b00004] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 01/01/2023]
Abstract
The renewed urgency to develop new treatments for Mycobacterium tuberculosis (Mtb) infection has resulted in large-scale phenotypic screening and thousands of new active compounds in vitro. The next challenge is to identify candidates to pursue in a mouse in vivo efficacy model as a step to predicting clinical efficacy. We previously analyzed over 70 years of this mouse in vivo efficacy data, which we used to generate and validate machine learning models. Curation of 60 additional small molecules with in vivo data published in 2014 and 2015 was undertaken to further test these models. This represents a much larger test set than for the previous models. Several computational approaches have now been applied to analyze these molecules and compare their molecular properties beyond those attempted previously. Our previous machine learning models have been updated, and a novel aspect has been added in the form of mouse liver microsomal half-life (MLM t1/2) and in vitro-based Mtb models incorporating cytotoxicity data that were used to predict in vivo activity for comparison. Our best Mtb in vivo models possess fivefold ROC values > 0.7, sensitivity > 80%, and concordance > 60%, while the best specificity value is > 40%. Use of an MLM t1/2 Bayesian model affords comparable results for scoring the 60 compounds tested. Combining MLM stability and in vitro Mtb models in a novel consensus workflow in the best cases has a positive predicted value (hit rate) > 77%. Our results indicate that Bayesian models constructed with literature in vivo Mtb data generated by different laboratories in various mouse models can have predictive value and may be used alongside MLM t1/2 and in vitro-based Mtb models to assist in selecting antitubercular compounds with desirable in vivo efficacy. We demonstrate for the first time that consensus models of any kind can be used to predict in vivo activity for Mtb. In addition, we describe a new clustering method for data visualization and apply this to the in vivo training and test data, ultimately making the method accessible in a mobile app.
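The statistics named in this abstract (sensitivity, specificity, concordance, positive predicted value) and the idea of a consensus between two models can be sketched as follows. This is a minimal illustration under stated assumptions: concordance is taken to mean overall accuracy, the consensus rule is "call a compound active only when both models do", and all function names and toy calls are hypothetical. The abstract does not describe the workflow at this level of detail.

```python
def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_stats(predicted, observed):
    """Summary statistics for binary calls (1 = active, 0 = inactive)."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)
    tn = sum(1 for p, o in pairs if not p and not o)
    fp = sum(1 for p, o in pairs if p and not o)
    fn = sum(1 for p, o in pairs if not p and o)
    return {
        "sensitivity": _ratio(tp, tp + fn),          # actives recovered
        "specificity": _ratio(tn, tn + fp),          # inactives rejected
        "concordance": _ratio(tp + tn, len(pairs)),  # assumed: overall accuracy
        "ppv": _ratio(tp, tp + fp),                  # hit rate among predicted actives
    }


def consensus(calls_a, calls_b):
    """Assumed consensus rule: active only if both models predict active."""
    return [int(bool(a and b)) for a, b in zip(calls_a, calls_b)]


# Hypothetical calls for eight compounds.
observed = [1, 1, 1, 0, 0, 0, 1, 0]
model_a = [1, 1, 0, 1, 0, 0, 1, 1]   # e.g. a stability-style model
model_b = [1, 1, 1, 0, 0, 1, 1, 0]   # e.g. an in vitro activity-style model

print(classification_stats(model_a, observed))
print(classification_stats(consensus(model_a, model_b), observed))
```

On this toy data the consensus trades some sensitivity for a higher hit rate, which is the usual motivation for requiring agreement between models.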
Affiliation(s)
- Sean Ekins
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
- Alexander L Perryman
  - Department of Pharmacology, Physiology and Neuroscience, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
- Alex M Clark
  - Molecular Materials Informatics, Inc., 1900 St. Jacques #302, Montreal, Quebec H3J 2S1, Canada
- Robert C Reynolds
  - Division of Hematology and Oncology, Department of Medicine, and Department of Chemistry, College of Arts and Sciences, University of Alabama at Birmingham, 1530 Third Avenue South, Birmingham, Alabama 35294-1240, United States
- Joel S Freundlich
  - Department of Pharmacology, Physiology and Neuroscience, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
  - Division of Infectious Diseases, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
|
20
|
Predictive modeling targets thymidylate synthase ThyX in Mycobacterium tuberculosis. Sci Rep 2016; 6:27792. [PMID: 27283217 PMCID: PMC4901301 DOI: 10.1038/srep27792] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 12/15/2015] [Accepted: 05/23/2016] [Indexed: 01/26/2023] Open
Abstract
There is an urgent need to identify new treatments for tuberculosis (TB), a major infectious disease caused by Mycobacterium tuberculosis (Mtb), which results in 1.5 million deaths each year. We have targeted two essential enzymes in this organism that are promising for antibacterial therapy and reported to be inhibited by naphthoquinones. ThyX is an essential thymidylate synthase that is mechanistically and structurally unrelated to the human enzyme. DNA gyrase is a DNA topoisomerase present in bacteria and plants but not animals. The current study set out to understand the structure-activity relationships of these targets in Mtb using a combination of cheminformatics and in vitro screening. Here, we report the identification of new Mtb ThyX inhibitors, 2-chloro-3-(4-methanesulfonylpiperazin-1-yl)-1,4-dihydronaphthalene-1,4-dione) and idebenone, which show modest whole-cell activity and appear to act, at least in part, by targeting ThyX in Mtb.
|
21
|
Putative histidine kinase inhibitors with antibacterial effect against multi-drug resistant clinical isolates identified by in vitro and in silico screens. Sci Rep 2016; 6:26085. [PMID: 27173778 PMCID: PMC4865847 DOI: 10.1038/srep26085] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 05/29/2015] [Accepted: 03/15/2016] [Indexed: 01/21/2023] Open
Abstract
Novel antibacterials are urgently needed to address the growing problem of bacterial resistance to conventional antibiotics. Two-component systems (TCS) are widely used by bacteria to regulate gene expression in response to various environmental stimuli and physiological stress and have been previously proposed as promising antibacterial targets. TCS consist of a sensor histidine kinase (HK) and an effector response regulator. The HK component contains a highly conserved ATP-binding site that is considered to be a promising target for broad-spectrum antibacterial drugs. Here, we describe the identification of putative HK autophosphorylation inhibitors following two independent experimental approaches: in vitro fragment-based screen via differential scanning fluorimetry and in silico structure-based screening, each followed up by the exploration of analogue compounds as identified by ligand-based similarity searches. Nine of the tested compounds showed antibacterial effect against multi-drug resistant clinical isolates of bacterial pathogens and include three novel scaffolds, which have not been explored so far in other antibacterial compounds. Overall, putative HK autophosphorylation inhibitors were found that together provide a promising starting point for further optimization as antibacterials.
|
22
|
Perryman AL, Stratton TP, Ekins S, Freundlich JS. Predicting Mouse Liver Microsomal Stability with "Pruned" Machine Learning Models and Public Data. Pharm Res 2016; 33:433-49. [PMID: 26415647 PMCID: PMC4712113 DOI: 10.1007/s11095-015-1800-5] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 06/30/2015] [Accepted: 09/22/2015] [Indexed: 02/07/2023]
Abstract
PURPOSE Mouse efficacy studies are a critical hurdle to advance translational research of potential therapeutic compounds for many diseases. Although mouse liver microsomal (MLM) stability studies are not a perfect surrogate for in vivo studies of metabolic clearance, they are the initial model system used to assess metabolic stability. Consequently, we explored the development of machine learning models that can enhance the probability of identifying compounds possessing MLM stability. METHODS Published assays on MLM half-life values were identified in PubChem, reformatted, and curated to create a training set with 894 unique small molecules. These data were used to construct machine learning models assessed with internal cross-validation, external tests with a published set of antitubercular compounds, and independent validation with an additional diverse set of 571 compounds (PubChem data on percent metabolism). RESULTS "Pruning" out the moderately unstable / moderately stable compounds from the training set produced models with superior predictive power. Bayesian models displayed the best predictive power for identifying compounds with a half-life ≥1 h. CONCLUSIONS Our results suggest the pruning strategy may be of general benefit to improve test set enrichment and provide machine learning models with enhanced predictive value for the MLM stability of small organic molecules. This study represents the most exhaustive study to date of using machine learning approaches with MLM data from public sources.
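The "pruning" idea in this abstract can be illustrated with a short sketch: compounds in the ambiguous middle of the half-life range are dropped before training, so the classifier only sees clearly stable and clearly unstable examples. The 60 min cutoff for "stable" follows the half-life ≥ 1 h criterion mentioned in the abstract; the 30 min cutoff for "unstable", the function name, and the compound records are hypothetical choices made for illustration only.

```python
def prune_training_set(records, unstable_max, stable_min):
    """Keep only clearly unstable or clearly stable compounds.

    records: iterable of (compound_id, half_life_minutes).
    Compounds with unstable_max < half-life < stable_min are discarded.
    Returns (compound_id, half_life_minutes, label) with label 1 = stable.
    """
    pruned = []
    for compound_id, half_life in records:
        if half_life >= stable_min:
            pruned.append((compound_id, half_life, 1))
        elif half_life <= unstable_max:
            pruned.append((compound_id, half_life, 0))
        # otherwise: moderately stable/unstable, left out of training
    return pruned


# Hypothetical half-life data in minutes.
raw = [("cpd1", 5), ("cpd2", 45), ("cpd3", 60), ("cpd4", 120), ("cpd5", 30)]
training = prune_training_set(raw, unstable_max=30, stable_min=60)
print(training)  # cpd2 (45 min) is pruned; the other four are kept
```

The design trade-off is a smaller training set in exchange for cleaner class labels, which is the effect the abstract credits for the improved test set enrichment.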
Affiliation(s)
- Alexander L Perryman
  - Division of Infectious Disease, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Newark, New Jersey, 07103, USA
- Thomas P Stratton
  - Department of Pharmacology & Physiology, Rutgers University-New Jersey Medical School, Medical Sciences Building, I-503, 185 South Orange Ave., Newark, New Jersey, 07103, USA
- Sean Ekins
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC, 27526, USA
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA, 94010, USA
- Joel S Freundlich
  - Division of Infectious Disease, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Newark, New Jersey, 07103, USA.
  - Department of Pharmacology & Physiology, Rutgers University-New Jersey Medical School, Medical Sciences Building, I-503, 185 South Orange Ave., Newark, New Jersey, 07103, USA.
|
23
|
Clark AM, Dole K, Ekins S. Open Source Bayesian Models. 3. Composite Models for Prediction of Binned Responses. J Chem Inf Model 2016; 56:275-85. [PMID: 26750305 PMCID: PMC4764945 DOI: 10.1021/acs.jcim.5b00555] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 02/07/2023]
Abstract
Bayesian models constructed from structure-derived fingerprints have been a popular and useful method for drug discovery research when applied to bioactivity measurements that can be effectively classified as active or inactive. The results can be used to rank candidate structures according to their probability of activity, and this ranking benefits from the high degree of interpretability when structure-based fingerprints are used, making the results chemically intuitive. Besides selecting an activity threshold, building a Bayesian model is fast and requires few or no parameters or user intervention. The method also does not suffer from such acute overtraining problems as quantitative structure–activity relationships or quantitative structure–property relationships (QSAR/QSPR). This makes it an approach highly suitable for automated workflows that are independent of user expertise or prior knowledge of the training data. We now describe a new method for creating a composite group of Bayesian models to extend the method to work with multiple states, rather than just binary. Incoming activities are divided into bins, each covering a mutually exclusive range of activities. For each of these bins, a Bayesian model is created to model whether or not the compound belongs in the bin. Analyzing putative molecules using the composite model involves making a prediction for each bin and examining the relative likelihood for each assignment, for example, highest value wins. The method has been evaluated on a collection of hundreds of data sets extracted from ChEMBL v20 and validated data sets for ADME/Tox and bioactivity.
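The composite scheme described in this abstract (bin the activities, build one in-bin/out-of-bin model per bin, then let the highest value win) can be sketched as below. This is an illustration under assumptions, not the published implementation: fingerprints are replaced by sets of hypothetical feature labels, each per-bin model is a Laplacian-corrected naive Bayes scorer chosen here as a stand-in, and raw scores are compared directly without any calibration step.

```python
import math
from collections import Counter


def assign_bin(value, edges):
    """Index of the mutually exclusive bin holding value (edges sorted ascending)."""
    for index, edge in enumerate(edges):
        if value < edge:
            return index
    return len(edges)


def train_bin_model(samples, target_bin):
    """One-vs-rest model for a single bin.

    samples: list of (feature_set, bin_index).
    Each feature gets a weight log((hits + 1) / (seen * base_rate + 1)),
    a Laplacian-corrected estimate assumed here for illustration.
    """
    base_rate = sum(1 for _, b in samples if b == target_bin) / len(samples)
    seen, hits = Counter(), Counter()
    for features, bin_index in samples:
        for feature in features:
            seen[feature] += 1
            if bin_index == target_bin:
                hits[feature] += 1
    return {f: math.log((hits[f] + 1) / (seen[f] * base_rate + 1)) for f in seen}


def predict_bin(models, features):
    """Score the molecule against every bin model; highest score wins."""
    scores = [sum(m.get(f, 0.0) for f in features) for m in models]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical training data: (feature labels, measured activity).
edges = [5.0, 7.0]  # three bins: <5, 5 to <7, >=7
raw = [({"a", "b"}, 4.0), ({"a"}, 4.5), ({"c"}, 6.0),
       ({"c", "d"}, 6.5), ({"e"}, 8.0), ({"e", "f"}, 7.5)]
samples = [(feats, assign_bin(value, edges)) for feats, value in raw]
models = [train_bin_model(samples, b) for b in range(len(edges) + 1)]

print(predict_bin(models, {"e", "f"}))  # prints 2
```

Because each bin has its own binary model, adding or moving a bin boundary only requires retraining the affected one-vs-rest models, which suits the automated workflows the abstract mentions.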
Affiliation(s)
- Alex M Clark
  - Molecular Materials Informatics, Inc., 1900 St. Jacques #302, Montreal H3J 2S1, Quebec, Canada
- Krishna Dole
  - Collaborative Drug Discovery, Inc., 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
- Sean Ekins
  - Collaborative Drug Discovery, Inc., 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
|
24
|
Ekins S, Freundlich JS, Clark AM, Anantpadma M, Davey RA, Madrid P. Machine learning models identify molecules active against the Ebola virus in vitro. F1000Res 2016; 4:1091. [PMID: 26834994 DOI: 10.12688/f1000research.7217.2] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Accepted: 12/23/2015] [Indexed: 12/15/2022] Open
Abstract
The search for small molecule inhibitors of Ebola virus (EBOV) has led to several high throughput screens over the past 3 years. These have identified a range of FDA-approved active pharmaceutical ingredients (APIs) with anti-EBOV activity in vitro and several of which are also active in a mouse infection model. There are millions of additional commercially-available molecules that could be screened for potential activities as anti-EBOV compounds. One way to prioritize compounds for testing is to generate computational models based on the high throughput screening data and then virtually screen compound libraries. In the current study, we have generated Bayesian machine learning models with viral pseudotype entry assay and the EBOV replication assay data. We have validated the models internally and externally. We have also used these models to computationally score the MicroSource library of drugs to select those likely to be potential inhibitors. Three of the highest scoring molecules that were not in the model training sets, quinacrine, pyronaridine and tilorone, were tested in vitro and had EC50 values of 350, 420 and 230 nM, respectively. Pyronaridine is a component of a combination therapy for malaria that was recently approved by the European Medicines Agency, which may make it more readily accessible for clinical testing. Like other known antimalarial drugs active against EBOV, it shares the 4-aminoquinoline scaffold. Tilorone, is an investigational antiviral agent that has shown a broad array of biological activities including cell growth inhibition in cancer cells, antifibrotic properties, α7 nicotinic receptor agonist activity, radioprotective activity and activation of hypoxia inducible factor-1. Quinacrine is an antimalarial but also has use as an anthelmintic. Our results suggest data sets with less than 1,000 molecules can produce validated machine learning models that can in turn be utilized to identify novel EBOV inhibitors in vitro.
Affiliation(s)
- Sean Ekins
  - Collaborations in Chemistry, Fuquay-Varina, NC, 27526, USA
  - Collaborations Pharmaceuticals Inc, Fuquay-Varina, NC, 27526, USA
  - Collaborative Drug Discovery, Burlingame, CA, 94010, USA
- Joel S Freundlich
  - Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, UMDNJ, New Jersey Medical School, Newark, NJ, 07103, USA
- Alex M Clark
  - Molecular Materials Informatics, Inc., Montreal, 94025, Canada
- Manu Anantpadma
  - Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
- Robert A Davey
  - Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
|
25
|
Ekins S, Madrid PB, Sarker M, Li SG, Mittal N, Kumar P, Wang X, Stratton TP, Zimmerman M, Talcott C, Bourbon P, Travers M, Yadav M, Freundlich JS. Combining Metabolite-Based Pharmacophores with Bayesian Machine Learning Models for Mycobacterium tuberculosis Drug Discovery. PLoS One 2015; 10:e0141076. [PMID: 26517557 PMCID: PMC4627656 DOI: 10.1371/journal.pone.0141076] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 07/30/2015] [Accepted: 10/05/2015] [Indexed: 12/15/2022] Open
Abstract
Integrated computational approaches for Mycobacterium tuberculosis (Mtb) are useful to identify new molecules that could lead to future tuberculosis (TB) drugs. Our approach uses information derived from the TBCyc pathway and genome database, the Collaborative Drug Discovery TB database combined with 3D pharmacophores and dual event Bayesian models of whole-cell activity and lack of cytotoxicity. We have prioritized a large number of molecules that may act as mimics of substrates and metabolites in the TB metabolome. We computationally searched over 200,000 commercial molecules using 66 pharmacophores based on substrates and metabolites from Mtb and further filtering with Bayesian models. We ultimately tested 110 compounds in vitro that resulted in two compounds of interest, BAS 04912643 and BAS 00623753 (MIC of 2.5 and 5 μg/mL, respectively). These molecules were used as a starting point for hit-to-lead optimization. The most promising class proved to be the quinoxaline di-N-oxides, evidenced by transcriptional profiling to induce mRNA level perturbations most closely resembling known protonophores. One of these, SRI58 exhibited an MIC = 1.25 μg/mL versus Mtb and a CC50 in Vero cells of >40 μg/mL, while featuring fair Caco-2 A-B permeability (2.3 x 10−6 cm/s), kinetic solubility (125 μM at pH 7.4 in PBS) and mouse metabolic stability (63.6% remaining after 1 h incubation with mouse liver microsomes). Despite demonstration of how a combined bioinformatics/cheminformatics approach afforded a small molecule with promising in vitro profiles, we found that SRI58 did not exhibit quantifiable blood levels in mice.
Affiliation(s)
- Sean Ekins
- Collaborative Drug Discovery Inc., 1633 Bayshore Highway, Suite 342, Burlingame, CA, 94010, United States of America
- Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC, 27526, United States of America
- * E-mail: (SE); (PBM); (JSF)
| | - Peter B. Madrid
- SRI International, 333 Ravenswood Avenue, Menlo Park, CA, 94025, United States of America
- * E-mail: (SE); (PBM); (JSF)
| | - Malabika Sarker
- SRI International, 333 Ravenswood Avenue, Menlo Park, CA, 94025, United States of America
| | - Shao-Gang Li
- Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, Rutgers University–New Jersey Medical School, 185 South Orange Avenue, Newark, NJ, 07103, United States of America
| | - Nisha Mittal
- Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, Rutgers University–New Jersey Medical School, 185 South Orange Avenue, Newark, NJ, 07103, United States of America
| | - Pradeep Kumar
- Department of Medicine, Center for Emerging and Reemerging Pathogens, Rutgers University–New Jersey Medical School, 185 South Orange Avenue, Newark, NJ, 07103, United States of America
| | - Xin Wang
- Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, Rutgers University–New Jersey Medical School, 185 South Orange Avenue, Newark, NJ, 07103, United States of America
| | - Thomas P. Stratton
- Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, Rutgers University–New Jersey Medical School, 185 South Orange Avenue, Newark, NJ, 07103, United States of America
| | - Matthew Zimmerman
- Public Health Research Institute, Rutgers University–New Jersey Medical School, Newark, NJ, 07103, United States of America
| | - Carolyn Talcott
- SRI International, 333 Ravenswood Avenue, Menlo Park, CA, 94025, United States of America
| | - Pauline Bourbon
- SRI International, 333 Ravenswood Avenue, Menlo Park, CA, 94025, United States of America
| | - Mike Travers
- Collaborative Drug Discovery Inc., 1633 Bayshore Highway, Suite 342, Burlingame, CA, 94010, United States of America
| | - Maneesh Yadav
- SRI International, 333 Ravenswood Avenue, Menlo Park, CA, 94025, United States of America
| | - Joel S. Freundlich
- Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, Rutgers University–New Jersey Medical School, 185 South Orange Avenue, Newark, NJ, 07103, United States of America
- * E-mail: (SE); (PBM); (JSF)
| |
|
26
|
Ekins S, Freundlich JS, Clark AM, Anantpadma M, Davey RA, Madrid P. Machine learning models identify molecules active against the Ebola virus in vitro. F1000Res 2015; 4:1091. [PMID: 26834994 DOI: 10.12688/f1000research.7217.1] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 10/15/2015] [Indexed: 12/23/2022] Open
Abstract
The search for small molecule inhibitors of Ebola virus (EBOV) has led to several high throughput screens over the past 3 years. These have identified a range of FDA-approved active pharmaceutical ingredients (APIs) with anti-EBOV activity in vitro, several of which are also active in a mouse infection model. There are millions of additional commercially-available molecules that could be screened for potential activities as anti-EBOV compounds. One way to prioritize compounds for testing is to generate computational models based on the high throughput screening data and then virtually screen compound libraries. In the current study, we have generated Bayesian machine learning models with viral pseudotype entry assay and the EBOV replication assay data. We have validated the models internally and externally. We have also used these models to computationally score the MicroSource library of drugs to select those likely to be potential inhibitors. Three of the highest scoring molecules that were not in the model training sets, quinacrine, pyronaridine and tilorone, were tested in vitro and had EC50 values of 350, 420 and 230 nM, respectively. Pyronaridine is a component of a combination therapy for malaria that was recently approved by the European Medicines Agency, which may make it more readily accessible for clinical testing. Like other known antimalarial drugs active against EBOV, it shares the 4-aminoquinoline scaffold. Tilorone is an investigational antiviral agent that has shown a broad array of biological activities including cell growth inhibition in cancer cells, antifibrotic properties, α7 nicotinic receptor agonist activity, radioprotective activity and activation of hypoxia inducible factor-1. Quinacrine is an antimalarial but also has use as an anthelmintic. Our results suggest data sets with fewer than 1,000 molecules can produce validated machine learning models that can in turn be utilized to identify novel EBOV inhibitors in vitro.
Affiliation(s)
- Sean Ekins
- Collaborations in Chemistry, Fuquay-Varina, NC, 27526, USA.,Collaborations Pharmaceuticals Inc, Fuquay-Varina, NC, 27526, USA.,Collaborative Drug Discovery, Burlingame, CA, 94010, USA
| | - Joel S Freundlich
- Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, UMDNJ, New Jersey Medical School, Newark, NJ, 07103, USA
| | - Alex M Clark
- Molecular Materials Informatics, Inc., Montreal, 94025, Canada
| | - Manu Anantpadma
- Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
| | - Robert A Davey
- Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
| | | |
|
27
|
Ekins S, Freundlich JS, Clark AM, Anantpadma M, Davey RA, Madrid P. Machine learning models identify molecules active against the Ebola virus in vitro. F1000Res 2015; 4:1091. [PMID: 26834994 PMCID: PMC4706063 DOI: 10.12688/f1000research.7217.3] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 01/17/2017] [Indexed: 12/21/2022] Open
Abstract
The search for small molecule inhibitors of Ebola virus (EBOV) has led to several high throughput screens over the past 3 years. These have identified a range of FDA-approved active pharmaceutical ingredients (APIs) with anti-EBOV activity in vitro, several of which are also active in a mouse infection model. There are millions of additional commercially-available molecules that could be screened for potential activities as anti-EBOV compounds. One way to prioritize compounds for testing is to generate computational models based on the high throughput screening data and then virtually screen compound libraries. In the current study, we have generated Bayesian machine learning models with viral pseudotype entry assay and the EBOV replication assay data. We have validated the models internally and externally. We have also used these models to computationally score the MicroSource library of drugs to select those likely to be potential inhibitors. Three of the highest scoring molecules that were not in the model training sets, quinacrine, pyronaridine and tilorone, were tested in vitro and had EC50 values of 350, 420 and 230 nM, respectively. Pyronaridine is a component of a combination therapy for malaria that was recently approved by the European Medicines Agency, which may make it more readily accessible for clinical testing. Like other known antimalarial drugs active against EBOV, it shares the 4-aminoquinoline scaffold. Tilorone is an investigational antiviral agent that has shown a broad array of biological activities including cell growth inhibition in cancer cells, antifibrotic properties, α7 nicotinic receptor agonist activity, radioprotective activity and activation of hypoxia inducible factor-1. Quinacrine is an antimalarial but also has use as an anthelmintic. Our results suggest data sets with fewer than 1,000 molecules can produce validated machine learning models that can in turn be utilized to identify novel EBOV inhibitors in vitro.
Affiliation(s)
- Sean Ekins
- Collaborations in Chemistry, Fuquay-Varina, NC, 27526, USA.,Collaborations Pharmaceuticals Inc, Fuquay-Varina, NC, 27526, USA.,Collaborative Drug Discovery, Burlingame, CA, 94010, USA
| | - Joel S Freundlich
- Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, UMDNJ, New Jersey Medical School, Newark, NJ, 07103, USA
| | - Alex M Clark
- Molecular Materials Informatics, Inc., Montreal, 94025, Canada
| | - Manu Anantpadma
- Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
| | - Robert A Davey
- Texas Biomedical Research Institute, San Antonio, TX, 78227, USA
| | | |
|
28
|
Ekins S, Litterman NK, Lipinski CA, Bunin BA. Thermodynamic Proxies to Compensate for Biases in Drug Discovery Methods. Pharm Res 2015; 33:194-205. [DOI: 10.1007/s11095-015-1779-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2015] [Accepted: 08/13/2015] [Indexed: 11/24/2022]
|
29
|
Wicht KJ, Combrinck JM, Smith PJ, Egan TJ. Bayesian models trained with HTS data for predicting β-haematin inhibition and in vitro antimalarial activity. Bioorg Med Chem 2015; 23:5210-7. [PMID: 25573118 PMCID: PMC4475507 DOI: 10.1016/j.bmc.2014.12.020] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2014] [Revised: 12/09/2014] [Accepted: 12/11/2014] [Indexed: 11/29/2022]
Abstract
A large quantity of high throughput screening (HTS) data for antimalarial activity has become available in recent years. This includes both phenotypic and target-based activity. Realising the maximum value of these data remains a challenge. In this respect, methods that allow such data to be used for virtual screening maximise efficiency and reduce costs. In this study both in vitro antimalarial activity and inhibitory data for β-haematin formation, largely obtained from publicly available sources, have been used to develop Bayesian models for inhibitors of β-haematin formation and in vitro antimalarial activity. These models were used to screen two in silico compound libraries. In the first, the 1510 U.S. Food and Drug Administration approved drugs available on PubChem were ranked from highest to lowest Bayesian score based on a training set of β-haematin inhibiting compounds active against Plasmodium falciparum that did not include any of the clinical antimalarials or close analogues. The six known clinical antimalarials that inhibit β-haematin formation were ranked in the top 2.1% of compounds. Furthermore, the in vitro antimalarial hit-rate for this prioritised set of compounds was found to be 81% in the case of the subset where activity data are available in PubChem. In the second, a library of about 5000 commercially available compounds (Aldrich(CPR)) was virtually screened for ability to inhibit β-haematin formation and then for in vitro antimalarial activity. A selection of 34 compounds was purchased and tested, of which 24 were predicted to be β-haematin inhibitors. The hit rate for inhibition of β-haematin formation was found to be 25% and a third of these were active against P. falciparum, corresponding to enrichments estimated at about 25- and 140-fold relative to random screening, respectively.
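The enrichment figures quoted in this abstract follow the usual definition of fold-enrichment: the hit rate in the prioritised set divided by the hit rate expected from random screening. The sketch below illustrates that arithmetic only. The abstract does not state the random-screening baseline, so the 1% value here is simply what a 25% hit rate and a 25-fold enrichment would imply, not a figure taken from the paper; the function names are hypothetical.

```python
def enrichment_factor(hit_rate: float, baseline_rate: float) -> float:
    """Fold-enrichment of a prioritised set relative to random screening."""
    if not 0 < baseline_rate <= 1:
        raise ValueError("baseline_rate must be in (0, 1]")
    return hit_rate / baseline_rate


def implied_baseline(hit_rate: float, enrichment: float) -> float:
    """Back-calculate the random hit rate implied by a reported enrichment."""
    if enrichment <= 0:
        raise ValueError("enrichment must be positive")
    return hit_rate / enrichment


# Reported in the abstract: 25% hit rate, about 25-fold enrichment.
# The baseline below is inferred from those two numbers (an assumption).
baseline = implied_baseline(hit_rate=0.25, enrichment=25.0)
print(f"implied random hit rate: {baseline:.2%}")
print(f"enrichment check: {enrichment_factor(0.25, baseline):.0f}-fold")
```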
Affiliation(s)
- Kathryn J Wicht
- Department of Chemistry, University of Cape Town, Rondebosch 7701, South Africa
| | - Jill M Combrinck
- Department of Chemistry, University of Cape Town, Rondebosch 7701, South Africa; Division of Pharmacology, Department of Medicine, Faculty of Health Sciences, University of Cape Town, Observatory 7925, South Africa
| | - Peter J Smith
- Division of Pharmacology, Department of Medicine, Faculty of Health Sciences, University of Cape Town, Observatory 7925, South Africa
| | - Timothy J Egan
- Department of Chemistry, University of Cape Town, Rondebosch 7701, South Africa.
| |
|
30
|
Ekins S, Lage de Siqueira-Neto J, McCall LI, Sarker M, Yadav M, Ponder EL, Kallel EA, Kellar D, Chen S, Arkin M, Bunin BA, McKerrow JH, Talcott C. Machine Learning Models and Pathway Genome Data Base for Trypanosoma cruzi Drug Discovery. PLoS Negl Trop Dis 2015; 9:e0003878. [PMID: 26114876 PMCID: PMC4482694 DOI: 10.1371/journal.pntd.0003878] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2015] [Accepted: 06/05/2015] [Indexed: 12/21/2022] Open
Abstract
Background: Chagas disease is a neglected tropical disease (NTD) caused by the eukaryotic parasite Trypanosoma cruzi. The current clinical and preclinical pipeline for T. cruzi is extremely sparse and lacks drug target diversity. Methodology/Principal Findings: In the present study we developed a computational approach that utilized data from several public whole-cell, phenotypic high throughput screens that have been completed for T. cruzi by the Broad Institute, including a single screen of over 300,000 molecules in the search for chemical probes as part of the NIH Molecular Libraries program. We have also compiled and curated relevant biological and chemical compound screening data including (i) compounds and biological activity data from the literature, (ii) high throughput screening datasets, and (iii) predicted metabolites of T. cruzi metabolic pathways. This information was used to help us identify compounds and their potential targets. We have constructed a Pathway Genome Data Base for T. cruzi. In addition, we have developed Bayesian machine learning models that were used to virtually screen libraries of compounds. Ninety-seven compounds were selected for in vitro testing, and 11 of these were found to have EC50 < 10 μM. We progressed five compounds to an in vivo mouse efficacy model of Chagas disease and validated that the machine learning model could identify in vitro active compounds not in the training set, as well as known positive controls. The antimalarial pyronaridine possessed 85.2% efficacy in the acute Chagas mouse model. We have also proposed potential targets (for future verification) for this compound based on structural similarity to known compounds with targets in T. cruzi. Conclusions/Significance: We have demonstrated how combining chemoinformatics and bioinformatics for T. cruzi drug discovery can bring interesting in vivo active molecules to light that may have been overlooked. The approach we have taken is broadly applicable to other NTDs.
Chagas disease is a neglected tropical disease (NTD) caused by the eukaryotic parasite Trypanosoma cruzi. The disease is endemic to Latin America but is increasingly found in North America and Europe, primarily through immigration, and the spread of this disease is bringing new attention to the need for novel, safe, and effective therapeutics to treat T. cruzi infection. We have used data from a phenotypic screen to build Bayesian models to predict anti-parasitic activity against T. cruzi in vitro. These models were used to score various small libraries of molecules. We selected fewer than 100 compounds for testing and found in vitro actives, some of which were tested in an in vivo efficacy model. We identified the antimalarial pyronaridine as having in vivo efficacy, which provides us with a new starting point for further investigation and optimization.
Affiliation(s)
- Sean Ekins
- Collaborative Drug Discovery, Burlingame, California, United States of America
- Collaborations in Chemistry, Fuquay-Varina, North Carolina, United States of America
| | - Jair Lage de Siqueira-Neto
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, California, United States of America
| | - Laura-Isobel McCall
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, California, United States of America
| | - Malabika Sarker
- SRI International, Menlo Park, California, United States of America
| | - Maneesh Yadav
- SRI International, Menlo Park, California, United States of America
| | - Elizabeth L. Ponder
- Chemistry, Engineering & Medicine for Human Health (ChEM-H), Stanford, California, United States of America
| | - E. Adam Kallel
- Collaborative Drug Discovery, Burlingame, California, United States of America
| | - Danielle Kellar
- Department of Pathology, University of California, San Francisco, San Francisco, California, United States of America
| | - Steven Chen
- Small Molecule Discovery Center and Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, United States of America
| | - Michelle Arkin
- Small Molecule Discovery Center and Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, United States of America
| | - Barry A. Bunin
- Collaborative Drug Discovery, Burlingame, California, United States of America
| | - James H. McKerrow
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, California, United States of America
| | - Carolyn Talcott
- SRI International, Menlo Park, California, United States of America
| |
|
31
|
Mori G, Chiarelli LR, Esposito M, Makarov V, Bellinzoni M, Hartkoorn RC, Degiacomi G, Boldrin F, Ekins S, de Jesus Lopes Ribeiro AL, Marino LB, Centárová I, Svetlíková Z, Blaško J, Kazakova E, Lepioshkin A, Barilone N, Zanoni G, Porta A, Fondi M, Fani R, Baulard AR, Mikušová K, Alzari PM, Manganelli R, de Carvalho LPS, Riccardi G, Cole ST, Pasca MR. Thiophenecarboxamide Derivatives Activated by EthA Kill Mycobacterium tuberculosis by Inhibiting the CTP Synthetase PyrG. Chem Biol 2015; 22:917-27. [PMID: 26097035 PMCID: PMC4521081 DOI: 10.1016/j.chembiol.2015.05.016] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2015] [Revised: 05/25/2015] [Accepted: 05/30/2015] [Indexed: 11/06/2022]
Abstract
To combat the emergence of drug-resistant strains of Mycobacterium tuberculosis, new antitubercular agents and novel drug targets are needed. Phenotypic screening of a library of 594 hit compounds uncovered two leads that were active against M. tuberculosis in its replicating, non-replicating, and intracellular states: compounds 7947882 (5-methyl-N-(4-nitrophenyl)thiophene-2-carboxamide) and 7904688 (3-phenyl-N-[(4-piperidin-1-ylphenyl)carbamothioyl]propanamide). Mutants resistant to both compounds harbored mutations in ethA (rv3854c), the gene encoding the monooxygenase EthA, and/or in pyrG (rv1699) coding for the CTP synthetase, PyrG. Biochemical investigations demonstrated that EthA is responsible for the activation of the compounds, and by mass spectrometry we identified the active metabolite of 7947882, which directly inhibits PyrG activity. Metabolomic studies revealed that pharmacological inhibition of PyrG strongly perturbs DNA and RNA biosynthesis, and other metabolic processes requiring nucleotides. Finally, the crystal structure of PyrG was solved, paving the way for rational drug design with this newly validated drug target.
Highlights: Two compounds activated by EthA kill M. tuberculosis through PyrG inhibition; EthA metabolite is active against PyrG and M. tuberculosis growth; Definition of the mechanism of activation and validation of PyrG as a new drug target.
Affiliation(s)
- Giorgia Mori
- Department of Biology and Biotechnology "Lazzaro Spallanzani", University of Pavia, 27100 Pavia, Italy
| | - Laurent R Chiarelli
- Department of Biology and Biotechnology "Lazzaro Spallanzani", University of Pavia, 27100 Pavia, Italy
| | - Marta Esposito
- Department of Biology and Biotechnology "Lazzaro Spallanzani", University of Pavia, 27100 Pavia, Italy
| | - Vadim Makarov
- A. N. Bakh Institute of Biochemistry, Russian Academy of Science, 119071 Moscow, Russia
| | - Marco Bellinzoni
- Institut Pasteur, Unité de Microbiologie Structurale, CNRS-UMR3528, Université Paris Diderot, Sorbonne Paris Cité, 25 rue du Dr. Roux, 75724 Paris Cedex 15, France
| | - Ruben C Hartkoorn
- Global Health Institute, Ecole Polytechnique Fédérale de Lausanne, Station 19, 1015 Lausanne, Switzerland
| | - Giulia Degiacomi
- Department of Molecular Medicine, University of Padova, 35128 Padua, Italy
| | - Francesca Boldrin
- Department of Molecular Medicine, University of Padova, 35128 Padua, Italy
| | - Sean Ekins
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010, USA
| | | | - Leonardo B Marino
- Francis Crick Institute, Mill Hill Laboratory, The Ridgeway, Mill Hill, London NW7 1AA, UK; Faculty of Pharmaceutical Sciences, UNESP - Univ Estadual Paulista, Araraquara, São Paulo 14801-902, Brazil
| | - Ivana Centárová
- Department of Biochemistry, Faculty of Natural Sciences, Comenius University in Bratislava, Ilkovičova 6, Mlynská dolina, 84215 Bratislava, Slovakia
| | - Zuzana Svetlíková
- Department of Biochemistry, Faculty of Natural Sciences, Comenius University in Bratislava, Ilkovičova 6, Mlynská dolina, 84215 Bratislava, Slovakia
| | - Jaroslav Blaško
- Institute of Chemistry, Faculty of Natural Sciences, Comenius University in Bratislava, Ilkovičova 6, Mlynská dolina, 84215 Bratislava, Slovak Republic
| | - Elena Kazakova
- A. N. Bakh Institute of Biochemistry, Russian Academy of Science, 119071 Moscow, Russia
| | - Alexander Lepioshkin
- A. N. Bakh Institute of Biochemistry, Russian Academy of Science, 119071 Moscow, Russia
| | - Nathalie Barilone
- Institut Pasteur, Unité de Microbiologie Structurale, CNRS-UMR3528, Université Paris Diderot, Sorbonne Paris Cité, 25 rue du Dr. Roux, 75724 Paris Cedex 15, France
| | - Giuseppe Zanoni
- Department of Chemistry, University of Pavia, 27100 Pavia, Italy
| | - Alessio Porta
- Department of Chemistry, University of Pavia, 27100 Pavia, Italy
| | - Marco Fondi
- Department of Biology, University of Florence, Sesto Fiorentino, Florence 50019, Italy
| | - Renato Fani
- Department of Biology, University of Florence, Sesto Fiorentino, Florence 50019, Italy
| | - Alain R Baulard
- Institut Pasteur de Lille, Center for Infection and Immunity, 59019 Lille, France
| | - Katarína Mikušová
- Department of Biochemistry, Faculty of Natural Sciences, Comenius University in Bratislava, Ilkovičova 6, Mlynská dolina, 84215 Bratislava, Slovakia
| | - Pedro M Alzari
- Institut Pasteur, Unité de Microbiologie Structurale, CNRS-UMR3528, Université Paris Diderot, Sorbonne Paris Cité, 25 rue du Dr. Roux, 75724 Paris Cedex 15, France
| | | | | | - Giovanna Riccardi
- Department of Biology and Biotechnology "Lazzaro Spallanzani", University of Pavia, 27100 Pavia, Italy
| | - Stewart T Cole
- Global Health Institute, Ecole Polytechnique Fédérale de Lausanne, Station 19, 1015 Lausanne, Switzerland.
| | - Maria Rosalia Pasca
- Department of Biology and Biotechnology "Lazzaro Spallanzani", University of Pavia, 27100 Pavia, Italy.
| |
|
32
|
Clark AM, Dole K, Coulon-Spektor A, McNutt A, Grass G, Freundlich JS, Reynolds RC, Ekins S. Open Source Bayesian Models. 1. Application to ADME/Tox and Drug Discovery Datasets. J Chem Inf Model 2015; 55:1231-45. [PMID: 25994950 PMCID: PMC4478615 DOI: 10.1021/acs.jcim.5b00143] [Citation(s) in RCA: 84] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
Abstract
On the order of hundreds of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) models have been described in the literature in the past decade which are more often than not inaccessible to anyone but their authors. Public accessibility is also an issue with computational models for bioactivity, and the ability to share such models still remains a major challenge limiting drug discovery. We describe the creation of a reference implementation of a Bayesian model-building software module, which we have released as an open source component that is now included in the Chemistry Development Kit (CDK) project, as well as implemented in the CDD Vault and in several mobile apps. We use this implementation to build an array of Bayesian models for ADME/Tox, in vitro and in vivo bioactivity, and other physicochemical properties. We show that these models possess cross-validation receiver operator curve values comparable to those generated previously in prior publications using alternative tools. We have now described how the implementation of Bayesian models with FCFP6 descriptors generated in the CDD Vault enables the rapid production of robust machine learning models from public data or the user’s own datasets. The current study sets the stage for generating models in proprietary software (such as CDD) and exporting these models in a format that could be run in open source software using CDK components. This work also demonstrates that we can enable biocomputation across distributed private or public datasets to enhance drug discovery.
Affiliation(s)
- Alex M Clark
- †Molecular Materials Informatics, Inc., 1900 St. Jacques No. 302, Montreal H3J 2S1, Quebec, Canada
| | - Krishna Dole
- ‡Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
| | - Anna Coulon-Spektor
- ‡Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
| | - Andrew McNutt
- ‡Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
| | - George Grass
- §G2 Research, Inc., P.O. Box 1242, Tahoe City, California 96145, United States
| | | | - Robert C Reynolds
- #Department of Chemistry, College of Arts and Sciences, University of Alabama at Birmingham, 1530 Third Avenue South, Birmingham, Alabama 35294-1240, United States
| | - Sean Ekins
- ‡Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States.,∇Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
| |
|
33
|
Clark AM, Ekins S. Open Source Bayesian Models. 2. Mining a "Big Dataset" To Create and Validate Models with ChEMBL. J Chem Inf Model 2015; 55:1246-60. [PMID: 25995041 DOI: 10.1021/acs.jcim.5b00144] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
Abstract
In an associated paper, we have described a reference implementation of Laplacian-corrected naïve Bayesian model building using extended connectivity (ECFP) and molecular function class (FCFP) fingerprints of maximum diameter 6. As a follow-up, we have now undertaken a large-scale validation study in order to ensure that the technique generalizes to a broad variety of drug discovery datasets. To achieve this, we have used the ChEMBL (version 20) database and split it into more than 2000 separate datasets, each of which consists of compounds and measurements with the same target and activity measurement. In order to test these datasets with the two-state Bayesian classification, we developed an automated algorithm for detecting a suitable threshold for active/inactive designation, which we applied to all collections. With these datasets, we were able to establish that our Bayesian model implementation is effective for the large majority of cases, and we were able to quantify the impact of fingerprint folding on the receiver operator curve cross-validation metrics. We were also able to study the impact that the choice of training/testing set partitioning has on the resulting recall rates. The datasets have been made publicly available to be downloaded, along with the corresponding model data files, which can be used in conjunction with the CDK and several mobile apps. We have also explored some novel visualization methods which leverage the structural origins of the ECFP/FCFP fingerprints to attribute regions of a molecule responsible for positive and negative contributions to activity. The ability to score molecules across thousands of relevant datasets across organisms also may help to assess desirable and undesirable off-target effects as well as suggest potential targets for compounds derived from phenotypic screens.
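For readers unfamiliar with the technique named in this abstract, a minimal sketch of a Laplacian-corrected naïve Bayesian classifier over fingerprint features follows. It is not the CDK reference implementation: molecules are represented as plain sets of integer feature ids (standing in for hashed ECFP/FCFP features), the modulo-based `fold` helper is only one simple way to fold features into a fixed range, and all function names are hypothetical. Each feature contributes log((A_f + 1) / (T_f · P + 1)), where T_f is the number of training molecules containing the feature, A_f the number of those that are active, and P the overall active fraction.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def fold(features: Iterable[int], n_bits: int = 1024) -> Set[int]:
    """Fold arbitrary hashed feature ids into a fixed index range (illustrative)."""
    return {f % n_bits for f in features}


def train(molecules: List[Tuple[Set[int], bool]]) -> Dict[int, float]:
    """Return a per-feature Laplacian-corrected log contribution."""
    total = len(molecules)
    actives = sum(1 for _, is_active in molecules if is_active)
    base_rate = actives / total  # P: overall fraction of actives

    seen: Counter = Counter()         # T_f: molecules containing feature f
    seen_active: Counter = Counter()  # A_f: active molecules containing f
    for feats, is_active in molecules:
        for f in feats:
            seen[f] += 1
            if is_active:
                seen_active[f] += 1

    return {
        f: math.log((seen_active[f] + 1.0) / (seen[f] * base_rate + 1.0))
        for f in seen
    }


def score(model: Dict[int, float], feats: Set[int]) -> float:
    """Sum contributions of known features; unseen features contribute zero."""
    return sum(model.get(f, 0.0) for f in feats)


# Toy training set with made-up feature ids: two actives, two inactives.
training = [({1, 2}, True), ({1, 3}, True), ({4, 5}, False), ({4, 6}, False)]
model = train(training)
print(score(model, {1, 2}), score(model, {4, 5}))
```

Higher scores indicate features seen more often in actives than the base rate would predict. The raw score is not a calibrated probability, which is why a separate threshold for the active/inactive call is needed.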
Affiliation(s)
- Alex M Clark
- †Molecular Materials Informatics, Inc., 1900 St. Jacques No. 302, Montreal H3J 2S1, Quebec, Canada
| | - Sean Ekins
- ‡Collaborations Pharmaceuticals, Inc., 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States.,§Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States.,∥Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
| |
|
34
|
Abstract
The recent outbreak of the Ebola virus in West Africa has highlighted the clear shortage of broad-spectrum antiviral drugs for emerging viruses. There are numerous FDA approved drugs and other small molecules described in the literature that could be further evaluated for their potential as antiviral compounds. These molecules are in addition to the few new antivirals that have been tested in Ebola patients but were not originally developed against the Ebola virus, and may play an important role as we await an effective vaccine. The balance between using FDA approved drugs versus novel antivirals with minimal safety and no efficacy data in humans should be considered. We have evaluated 55 molecules from the perspective of an experienced medicinal chemist as well as using simple molecular properties and have highlighted 16 compounds that have desirable qualities as well as those that may be less desirable. In addition we propose that a collaborative database for sharing such published and novel information on small molecules is needed for the research community studying the Ebola virus.
Affiliation(s)
- Nadia Litterman
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA, 94010, USA
| | - Christopher Lipinski
- Christopher A. Lipinski, Ph.D., LLC., 10 Connshire Drive, Waterford, CT, 06385-4122, USA
| | - Sean Ekins
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA, 94010, USA ; Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay Varina, NC, 27526, USA
| |
|
35
|
Ekins S, Clark AM, Swamidass SJ, Litterman N, Williams AJ. Bigger data, collaborative tools and the future of predictive drug discovery. J Comput Aided Mol Des 2014; 28:997-1008. [PMID: 24943138 PMCID: PMC4198464 DOI: 10.1007/s10822-014-9762-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2014] [Accepted: 06/09/2014] [Indexed: 12/31/2022]
Abstract
Over the past decade we have seen a growth in the provision of chemistry data and cheminformatics tools as either free websites or software as a service commercial offerings. These have transformed how we find molecule-related data and use such tools in our research. There have also been efforts to improve collaboration between researchers either openly or through secure transactions using commercial tools. A major challenge in the future will be how such databases and software approaches handle larger amounts of data as it accumulates from high throughput screening and enables the user to draw insights, enable predictions and move projects forward. We now discuss how information from some drug discovery datasets can be made more accessible and how privacy of data should not overwhelm the desire to share it at an appropriate time with collaborators. We also discuss additional software tools that could be made available and provide our thoughts on the future of predictive drug discovery in this age of big data. We use some examples from our own research on neglected diseases, collaborations, mobile apps and algorithm development to illustrate these ideas.
Affiliation(s)
- Sean Ekins
- Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC, 27526, USA
|
36
|
Clark AM, Sarker M, Ekins S. New target prediction and visualization tools incorporating open source molecular fingerprints for TB Mobile 2.0. J Cheminform 2014; 6:38. [PMID: 25302078 PMCID: PMC4190048 DOI: 10.1186/s13321-014-0038-2] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 03/06/2014] [Accepted: 06/30/2014] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: We recently developed a freely available mobile app (TB Mobile) for both iOS and Android platforms that displays Mycobacterium tuberculosis (Mtb) active molecule structures and their targets with links to associated data. The app was developed to make target information available to as large an audience as possible. RESULTS: We now report a major update of the iOS version of the app. This includes enhancements that use an implementation of ECFP_6 fingerprints that we have made open source. Using these fingerprints, the user can propose compounds with possible anti-TB activity, and view the compounds within a cluster landscape. Proposed compounds can also be compared to existing target data, using a naïve Bayesian scoring system to rank probable targets. We have curated an additional 60 new compounds and their targets for Mtb and added these to the original set of 745 compounds. We have also curated 20 further compounds (many without targets in TB Mobile) to evaluate this version of the app with 805 compounds and associated targets. CONCLUSIONS: TB Mobile can now manage a small collection of compounds that can be imported from external sources, or exported by various means such as email or app-to-app inter-process communication. This means that TB Mobile can be used as a node within a growing ecosystem of mobile apps for cheminformatics. It can also cluster compounds and use internal algorithms to help identify potential targets based on molecular similarity. TB Mobile represents a valuable dataset, data-visualization aid and target prediction tool.
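To illustrate how a naive Bayesian score over fingerprint bits can rank probable targets, here is a minimal Python sketch. It is not the app's actual implementation: the per-bit estimate `(hits + 1) / (total * prior + 1)` is a commonly used Laplacian-style correction chosen here as an assumption, the integer sets are hand-written stand-ins for ECFP_6 bits, and the labels `target_A` and `target_B` are hypothetical.

```python
import math
from collections import defaultdict

def train(compounds):
    """compounds: list of (set_of_fingerprint_bits, target_label) pairs."""
    n_total = len(compounds)
    target_count = defaultdict(int)                     # compounds per target
    bit_total = defaultdict(int)                        # compounds containing each bit
    bit_target = defaultdict(lambda: defaultdict(int))  # bit -> target -> count
    for bits, target in compounds:
        target_count[target] += 1
        for b in bits:
            bit_total[b] += 1
            bit_target[b][target] += 1
    return n_total, target_count, bit_total, bit_target

def rank_targets(model, query_bits):
    """Return (target, score) pairs sorted from most to least probable."""
    n_total, target_count, bit_total, bit_target = model
    scores = {}
    for target, n_t in target_count.items():
        prior = n_t / n_total
        score = 0.0
        for b in query_bits:
            if b not in bit_total:
                continue  # bits never seen in training carry no information
            # Smoothed ratio of observed to expected hits for this bit (assumed form).
            ratio = (bit_target[b][target] + 1) / (bit_total[b] * prior + 1)
            score += math.log(ratio)
        scores[target] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy training set: hypothetical bit sets and hypothetical target labels.
training = [
    ({1, 2, 3}, "target_A"),
    ({1, 2, 4}, "target_A"),
    ({5, 6, 7}, "target_B"),
    ({5, 6, 8}, "target_B"),
]
model = train(training)
ranked = rank_targets(model, {1, 2, 9})
for target, score in ranked:
    print(f"{target}: {score:+.3f}")
```

A positive score means the query shares bits that are over-represented among compounds annotated to that target; a negative score means the shared bits are under-represented. With real data the bit sets would come from a fingerprinting library rather than being typed by hand.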
Affiliation(s)
- Alex M Clark
- Molecular Materials Informatics, 1900 St. Jacques #302, Montreal H3J 2S1, Quebec, Canada
| | - Malabika Sarker
- SRI International, 333 Ravenswood Avenue, Menlo Park 94025, CA, USA
| | - Sean Ekins
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame 94010, CA, USA
- Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina 27526, NC, USA
|
37
|
Niu X, Yang B, Fang S, Li Y, Zhang Z, Jia J, Ma C. An efficient one-pot synthesis of 1,2,4-triazoloquinoxalines. Tetrahedron 2014. [DOI: 10.1016/j.tet.2014.05.029] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 10/25/2022]
|
38
|
Ekins S, Freundlich JS, Reynolds RC. Are bigger data sets better for machine learning? Fusing single-point and dual-event dose response data for Mycobacterium tuberculosis. J Chem Inf Model 2014; 54:2157-65. [PMID: 24968215 DOI: 10.1021/ci500264r] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 02/07/2023]
Abstract
Tuberculosis is a major, neglected disease for which the quest to find new treatments continues. There is an abundance of data from large phenotypic screens in the public domain against Mycobacterium tuberculosis (Mtb). Since machine learning methods can learn from past data, we were interested in addressing whether more data builds better models. We now describe using Bayesian machine learning to assess whether we can improve our models by combining the large quantities of single-point data with the much smaller (higher quality) dual-event data sets, which use both dose-response data for both whole-cell antitubercular activity and Vero cell cytotoxicity. We have evaluated 12 models ranging from different single-point, dual-event dose-response, single-point and dual-event dose-response as well as combined data sets for three distinct data sets from the same laboratory. We used a fourth data set of active and inactive compounds from the same group as well as a smaller set of 177 active compounds from GlaxoSmithKline as test sets. Our data suggest combining single-point with dual-event dose-response data does not diminish the internal or external predictive ability of the models based on the receiver operator curve (ROC) for these models (internal ROC range 0.83-0.91, external ROC range 0.62-0.83) compared to the orders of magnitude smaller dual-event models (internal ROC range 0.6-0.83 and external ROC 0.54-0.83). In conclusion, models developed with 1200-5000 compounds appear to be as predictive as those generated with 25 000-350 000 molecules. Our results have implications for justifying further high-throughput screening versus focused testing based on model predictions.
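The ROC values quoted in this abstract summarize how well a model ranks actives above inactives. As a hedged illustration of what such a number measures (not the authors' software or data), the sketch below computes the area under the ROC curve directly from its rank interpretation: the fraction of active/inactive pairs in which the active receives the higher score. The labels and scores are invented for the example.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    active (label 1) scores higher than a randomly chosen inactive (label 0).
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Invented example: 3 actives and 3 inactives with model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(f"ROC AUC = {auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the internal (0.83-0.91) and external (0.62-0.83) ranges reported above. The pairwise loop is quadratic, so a sort-based implementation would be preferable for screening-scale data.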
Affiliation(s)
- Sean Ekins
- Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
|
39
|
Dahlin JL, Walters MA. The essential roles of chemistry in high-throughput screening triage. Future Med Chem 2014; 6:1265-90. [PMID: 25163000 PMCID: PMC4465542 DOI: 10.4155/fmc.14.60] [Citation(s) in RCA: 95] [Impact Index Per Article: 9.5] [Indexed: 12/26/2022] Open
Abstract
It is increasingly clear that academic high-throughput screening (HTS) and virtual HTS triage suffers from a lack of scientists trained in the art and science of early drug discovery chemistry. Many recent publications report the discovery of compounds by screening that are most likely artifacts or promiscuous bioactive compounds, and these results are not placed into the context of previous studies. For HTS to be most successful, it is our contention that there must exist an early partnership between biologists and medicinal chemists. Their combined skill sets are necessary to design robust assays and efficient workflows that will weed out assay artifacts, false positives, promiscuous bioactive compounds and intractable screening hits, efforts that ultimately give projects a better chance at identifying truly useful chemical matter. Expertise in medicinal chemistry, cheminformatics and purification sciences (analytical chemistry) can enhance the post-HTS triage process by quickly removing these problematic chemotypes from consideration, while simultaneously prioritizing the more promising chemical matter for follow-up testing. It is only when biologists and chemists collaborate effectively that HTS can manifest its full promise.
Affiliation(s)
- Jayme L Dahlin
- Department of Molecular Pharmacology & Experimental Therapeutics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
- Medical Scientist Training Program, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Michael A Walters
- Institute for Therapeutics Discovery & Development, University of Minnesota, Minneapolis, MN 55414, USA
|
40
|
Ekins S, Nuermberger EL, Freundlich JS. Minding the gaps in tuberculosis research. Drug Discov Today 2014; 19:1279-82. [PMID: 24993157 DOI: 10.1016/j.drudis.2014.06.022] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/23/2014] [Revised: 06/20/2014] [Accepted: 06/23/2014] [Indexed: 10/25/2022]
Affiliation(s)
- Sean Ekins
- Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA; Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010, USA.
| | - Eric L Nuermberger
- Center for Tuberculosis Research, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Joel S Freundlich
- Department of Pharmacology & Physiology, Rutgers University - New Jersey Medical School, 185 South Orange Avenue, Newark, NJ 07103, USA; Department of Medicine, Center for Emerging and Reemerging Pathogens, Rutgers University - New Jersey Medical School, 185 South Orange Avenue, Newark, NJ 07103, USA.
|
41
|
Ekins S, Pottorf R, Reynolds R, Williams AJ, Clark AM, Freundlich JS. Looking back to the future: predicting in vivo efficacy of small molecules versus Mycobacterium tuberculosis. J Chem Inf Model 2014; 54:1070-82. [PMID: 24665947 PMCID: PMC4004261 DOI: 10.1021/ci500077v] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 02/08/2014] [Indexed: 02/07/2023]
Abstract
Selecting and translating in vitro leads for a disease into molecules with in vivo activity in an animal model of the disease is a challenge that takes considerable time and money. As an example, recent years have seen whole-cell phenotypic screens of millions of compounds yielding over 1500 inhibitors of Mycobacterium tuberculosis (Mtb). These must be prioritized for testing in the mouse in vivo assay for Mtb infection, a validated model utilized to select compounds for further testing. We demonstrate learning from in vivo active and inactive compounds using machine learning classification models (Bayesian, support vector machines, and recursive partitioning) consisting of 773 compounds. The Bayesian model predicted 8 out of 11 additional in vivo actives not included in the model as an external test set. Curation of 70 years of Mtb data can therefore provide statistically robust computational models to focus resources on in vivo active small molecule antituberculars. This highlights a cost-effective predictor for in vivo testing elsewhere in other diseases.
Affiliation(s)
- Sean Ekins
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
- Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
| | - Richard Pottorf
- Department of Pharmacology & Physiology, Rutgers University - New Jersey Medical School, 185 South Orange Avenue, Newark, New Jersey 07103, United States
| | - Robert C. Reynolds
- Department of Chemistry, University of Alabama at Birmingham, 1530 Third Avenue South, Birmingham, Alabama 35294-1240, United States
| | - Antony J. Williams
- Royal Society of Chemistry, 904 Tamaras Circle, Wake Forest, North Carolina 27587, United States
| | - Alex M. Clark
- Molecular Materials Informatics, 1900 St. Jacques #302, Montreal, Quebec, Canada H3J 2S1
| | - Joel S. Freundlich
- Department of Pharmacology & Physiology, Rutgers University - New Jersey Medical School, 185 South Orange Avenue, Newark, New Jersey 07103, United States
- Department of Medicine, Center for Emerging and Reemerging Pathogens, Rutgers University - New Jersey Medical School, 185 South Orange Avenue, Newark, New Jersey 07103, United States
|
42
|
Ekins S, Casey AC, Roberts D, Parish T, Bunin BA. Bayesian models for screening and TB Mobile for target inference with Mycobacterium tuberculosis. Tuberculosis (Edinb) 2014; 94:162-9. [PMID: 24440548 PMCID: PMC4394018 DOI: 10.1016/j.tube.2013.12.001] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 08/21/2013] [Revised: 12/04/2013] [Accepted: 12/09/2013] [Indexed: 12/19/2022]
Abstract
The search for compounds active against Mycobacterium tuberculosis is reliant upon high-throughput screening (HTS) in whole cells. We have used Bayesian machine learning models which can predict anti-tubercular activity to filter an internal library of over 150,000 compounds prior to in vitro testing. We used this to select and test 48 compounds in vitro; 11 were active with MIC values ranging from 0.4 μM to 10.2 μM, giving a high hit rate of 22.9%. Among the hits, we identified several compounds belonging to the same series including five quinolones (including ciprofloxacin), three molecules with long aliphatic linkers and three singletons. This approach represents a rapid method to prioritize compounds for testing that can be used alongside medicinal chemistry insight and other filters to identify active molecules. Such models can significantly increase the hit rate of HTS, above the usual 1% or lower rates seen. In addition, the potential targets for the 11 molecules were predicted using TB Mobile and clustering alongside a set of over 740 molecules with known M. tuberculosis target annotations. These predictions may serve as a mechanism for prioritizing compounds for further optimization.
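The hit rate quoted in this abstract follows directly from the counts it reports. The short Python check below reproduces the arithmetic and compares it with the 1% figure the abstract gives as the usual upper bound for HTS; the function name `hit_rate` is just an illustrative helper.

```python
def hit_rate(n_active, n_tested):
    """Fraction of tested compounds that were active."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    return n_active / n_tested

rate = hit_rate(11, 48)   # 11 actives among the 48 compounds tested
baseline = 0.01           # "usual 1% or lower" HTS hit rate from the abstract
fold = rate / baseline    # enrichment relative to that baseline

print(f"hit rate = {rate:.1%}")
print(f"fold enrichment over a 1% baseline = {fold:.1f}x")
```

Because the abstract describes typical HTS hit rates as 1% or lower, the enrichment computed against 1% is best read as a lower bound.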
Affiliation(s)
- Sean Ekins
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010, USA; Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA.
| | - Allen C Casey
- Infectious Disease Research Institute, Seattle, WA, USA
| | - David Roberts
- Infectious Disease Research Institute, Seattle, WA, USA
| | - Tanya Parish
- Infectious Disease Research Institute, Seattle, WA, USA
| | - Barry A Bunin
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010, USA
|
43
|
Ekins S, Freundlich JS, Reynolds RC. Fusing dual-event data sets for Mycobacterium tuberculosis machine learning models and their evaluation. J Chem Inf Model 2013; 53:3054-63. [PMID: 24144044 DOI: 10.1021/ci400480s] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 12/25/2022]
Abstract
The search for new tuberculosis treatments continues as we need to find molecules that can act more quickly, be accommodated in multidrug regimens, and overcome ever increasing levels of drug resistance. Multiple large scale phenotypic high-throughput screens against Mycobacterium tuberculosis (Mtb) have generated dose response data, enabling the generation of machine learning models. These models also incorporated cytotoxicity data and were recently validated with a large external data set. A cheminformatics data-fusion approach followed by Bayesian machine learning, Support Vector Machine, or Recursive Partitioning model development (based on publicly available Mtb screening data) was used to compare individual data sets and subsequent combined models. A set of 1924 commercially available molecules with promising antitubercular activity (and lack of relative cytotoxicity to Vero cells) were used to evaluate the predictive nature of the models. We demonstrate that combining three data sets incorporating antitubercular and cytotoxicity data in Vero cells from our previous screens results in external validation receiver operator curve (ROC) of 0.83 (Bayesian or RP Forest). Models that do not have the highest 5-fold cross-validation ROC scores can outperform other models in a test set dependent manner. We demonstrate with predictions for a recently published set of Mtb leads from GlaxoSmithKline that no single machine learning model may be enough to identify compounds of interest. Data set fusion represents a further useful strategy for machine learning construction as illustrated with Mtb. Coverage of chemistry and Mtb target spaces may also be limiting factors for the whole-cell screening data generated to date.
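The abstract contrasts 5-fold cross-validation ROC scores with performance on external test sets. For readers unfamiliar with the internal procedure, the sketch below shows one simple way to partition a data set into five folds so that every compound is held out exactly once. It is an illustrative assumption, not the authors' protocol: the shuffling, the seed, and the helper name `k_fold_indices` are all choices made for this example.

```python
import random

def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.
    Every index appears in exactly one test fold."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # near-equal fold sizes
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# Example with 23 hypothetical compounds, identified only by index.
splits = list(k_fold_indices(23, k=5))
for fold_no, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(train)} train, {len(test)} test")
```

In practice a model would be fitted on each training subset and scored on the matching held-out fold, and the per-fold ROC values averaged. As the abstract notes, a high cross-validation score of this kind does not guarantee the best result on an external test set.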
Affiliation(s)
- Sean Ekins
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
|