1
Alqudaihi KS, Aslam N, Khan IU, Almuhaideb AM, Alsunaidi SJ, Ibrahim NMAR, Alhaidari FA, Shaikh FS, Alsenbel YM, Alalharith DM, Alharthi HM, Alghamdi WM, Alshahrani MS. Cough Sound Detection and Diagnosis Using Artificial Intelligence Techniques: Challenges and Opportunities. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2021; 9:102327-102344. [PMID: 34786317 PMCID: PMC8545201 DOI: 10.1109/access.2021.3097559] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 07/06/2021] [Accepted: 07/09/2021] [Indexed: 06/02/2023]
Abstract
Coughing is a common symptom of several respiratory diseases. The sound and type of cough are useful features to consider when diagnosing a disease. Respiratory infections pose a significant risk to human lives worldwide and cause significant economic losses, particularly in countries with limited therapeutic resources. In this study, we reviewed the latest technologies proposed to control the impact of respiratory diseases. Artificial Intelligence (AI) is a promising technology that aids in data analysis and prediction of results, thereby ensuring people's well-being. We report that the cough symptom can be reliably used by AI algorithms to detect and diagnose different types of known diseases, including pneumonia, pulmonary edema, asthma, tuberculosis (TB), COVID-19, pertussis, and other respiratory diseases. We also identified the techniques that produced the best results for diagnosing respiratory disease using cough samples. This study presents the most recent challenges, solutions, and opportunities in respiratory disease detection and diagnosis, allowing practitioners and researchers to develop better techniques.
Affiliation(s)
- Kawther S. Alqudaihi
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Nida Aslam
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Irfan Ullah Khan
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Abdullah M. Almuhaideb
- Department of Networks and Communications, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Shikah J. Alsunaidi
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Nehad M. Abdel Rahman Ibrahim
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Fahd A. Alhaidari
- Department of Networks and Communications, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Fatema S. Shaikh
- Department of Computer Information Systems, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Yasmine M. Alsenbel
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Dima M. Alalharith
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Hajar M. Alharthi
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Wejdan M. Alghamdi
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Mohammed S. Alshahrani
- Department of Emergency Medicine, College of Medicine, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
2
Winkler DA. Use of Artificial Intelligence and Machine Learning for Discovery of Drugs for Neglected Tropical Diseases. Front Chem 2021; 9:614073. [PMID: 33791277 PMCID: PMC8005575 DOI: 10.3389/fchem.2021.614073] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 10/05/2020] [Accepted: 01/18/2021] [Indexed: 12/11/2022]
Abstract
Neglected tropical diseases continue to create high levels of morbidity and mortality in a sizeable fraction of the world’s population, despite ongoing research into new treatments. Some of the most important technological developments that have accelerated drug discovery for diseases of affluent countries have not flowed down to neglected tropical disease drug discovery. Pharmaceutical development business models, cost of developing new drug treatments and subsequent costs to patients, and accessibility of technologies to scientists in most of the affected countries are some of the reasons for this low uptake and slow development relative to that for common diseases in developed countries. Computational methods are starting to make significant inroads into discovery of drugs for neglected tropical diseases due to the increasing availability of large databases that can be used to train machine learning (ML) models, increasing accuracy of these methods, lower entry barrier for researchers, and widespread availability of public domain ML codes. Here, the application of artificial intelligence, largely the subset called machine learning, to modelling and prediction of biological activities and discovery of new drugs for neglected tropical diseases is summarized. The pathways for the development of ML methods in the short to medium term and the use of other artificial intelligence methods for drug discovery are discussed. The current roadblocks to, and likely impacts of, synergistic new technological developments on the use of ML methods for neglected tropical disease drug discovery in the future are also discussed.
Affiliation(s)
- David A Winkler
- Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, VIC, Australia; Latrobe Institute for Molecular Science, La Trobe University, Bundoora, VIC, Australia; School of Pharmacy, University of Nottingham, Nottingham, United Kingdom; CSIRO Data61, Pullenvale, QLD, Australia
3
The Natural Product Eugenol Is an Inhibitor of the Ebola Virus In Vitro. Pharm Res 2019; 36:104. [PMID: 31101988 DOI: 10.1007/s11095-019-2629-0] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 03/21/2019] [Accepted: 04/18/2019] [Indexed: 12/30/2022]
Abstract
PURPOSE Since the 2014 Ebola virus (EBOV) outbreak in West Africa there has been considerable effort towards developing drugs to treat Ebola virus disease, and yet to date there is no FDA approved treatment. This is important because, at the time of writing this manuscript, there is an ongoing outbreak in the Democratic Republic of the Congo which has killed over 1000 people. METHODS We have evaluated a small number of natural products, some of which had shown antiviral activity against other pathogens. This is exemplified by eugenol, which is found in high concentrations in multiple essential oils and has shown antiviral activity against feline calicivirus, tomato yellow leaf curl virus, Influenza A virus, Herpes Simplex virus types 1 and 2, and four airborne phages. RESULTS Four compounds possessed EC50 values less than or equal to 11 μM. Of these, eugenol had an EC50 of 1.3 μM against EBOV and is present in several plants including clove, cinnamon, basil and bay. Eugenol is much smaller than, and structurally unlike, any compound that has been previously identified as an inhibitor of EBOV; therefore it may provide new mechanistic insights. CONCLUSION This compound is readily accessible in bulk quantities, is inexpensive, and has a long history of human consumption, which supports its further assessment as an antiviral therapeutic. This work also suggests that a more exhaustive assessment of natural product libraries against EBOV and other viruses is warranted to improve our ability to identify compounds that are so distinct from FDA approved drugs.
4
Zorn KM, Lane TR, Russo DP, Clark AM, Makarov V, Ekins S. Multiple Machine Learning Comparisons of HIV Cell-based and Reverse Transcriptase Data Sets. Mol Pharm 2019; 16:1620-1632. [PMID: 30779585 DOI: 10.1021/acs.molpharmaceut.8b01297] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Indexed: 02/06/2023]
Abstract
The human immunodeficiency virus (HIV) causes over a million deaths every year and has a huge economic impact in many countries. The first class of drugs approved were nucleoside reverse transcriptase inhibitors. A newer generation of reverse transcriptase inhibitors has become susceptible to drug resistant strains of HIV, and hence, alternatives are urgently needed. We have recently pioneered the use of Bayesian machine learning to generate models with public data to identify new compounds for testing against different disease targets. The current study has used the NIAID ChemDB HIV, Opportunistic Infection and Tuberculosis Therapeutics Database for machine learning studies. We curated and cleaned data from HIV-1 wild-type cell-based and reverse transcriptase (RT) DNA polymerase inhibition assays. Compounds from this database with ≤1 μM HIV-1 RT DNA polymerase activity inhibition and cell-based HIV-1 inhibition are correlated (Pearson r = 0.44, n = 1137, p < 0.0001). Models were trained using multiple machine learning approaches (Bernoulli Naive Bayes, AdaBoost Decision Tree, Random Forest, support vector classification, k-Nearest Neighbors, and deep neural networks as well as consensus approaches) and then their predictive abilities were compared. Our comparison of different machine learning methods demonstrated that support vector classification, deep learning, and a consensus were generally comparable and not significantly different from each other using 5-fold cross validation and using 24 training and test set combinations. This study demonstrates findings in line with our previous studies for various targets: training and testing with multiple data sets does not demonstrate a significant difference between support vector machines and deep neural networks.
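The 5-fold cross validation mentioned in this abstract can be illustrated with a minimal sketch in plain Python. It only shows how a data set is partitioned so that every sample is held out exactly once; the function name `k_fold_indices`, the seed, and the sample count of 23 are illustrative assumptions, not details taken from the study.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration; not the authors' code.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Toy usage: 23 samples is an arbitrary size chosen for the example.
folds = list(k_fold_indices(n_samples=23, k=5))
fold_sizes = [len(test) for _, test in folds]
```

Each model under comparison would be trained on `train_idx` and scored on `test_idx` for every fold, and the per-fold scores averaged before the methods are compared.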
Affiliation(s)
- Kimberley M Zorn
- Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Thomas R Lane
- Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Daniel P Russo
- Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States; The Rutgers Center for Computational and Integrative Biology, Camden, New Jersey 08102, United States
- Alex M Clark
- Molecular Materials Informatics, Inc., 2234 Duvernay Street, Montreal, Quebec H3J2Y3, Canada
- Vadim Makarov
- Bach Institute of Biochemistry, Research Center of Biotechnology of the Russian Academy of Sciences, Leninsky Prospekt 33-2, Moscow 119071, Russia
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
5
Anantpadma M, Lane T, Zorn KM, Lingerfelt MA, Clark AM, Freundlich JS, Davey RA, Madrid PB, Ekins S. Ebola Virus Bayesian Machine Learning Models Enable New in Vitro Leads. ACS OMEGA 2019; 4:2353-2361. [PMID: 30729228 PMCID: PMC6356859 DOI: 10.1021/acsomega.8b02948] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 10/25/2018] [Accepted: 01/17/2019] [Indexed: 05/08/2023]
Abstract
We have previously described the first Bayesian machine learning models from FDA-approved drug screens, for identifying compounds active against the Ebola virus (EBOV). These models led to the identification of three active molecules in vitro: tilorone, pyronaridine, and quinacrine. A follow-up study demonstrated that one of these compounds, tilorone, has 100% in vivo efficacy in mice infected with mouse-adapted EBOV at 30 mg/kg/day intraperitoneal. This suggested that we can learn from the published data on EBOV inhibition and use it to select new compounds for testing that are active in vivo. We used these previously built Bayesian machine learning EBOV models alongside our chemical insights for the selection of 12 molecules, absent from the training set, to test for in vitro EBOV inhibition. Nine molecules were directly selected using the model, and eight of these molecules possessed promising in vitro activity (EC50 < 15 μM). Three further compounds were selected for in vitro evaluation because they were antimalarials, and compounds of this class, such as pyronaridine and quinacrine, have previously been shown to inhibit EBOV. We identified the antimalarial drug arterolane (IC50 = 4.53 μM) and the anticancer clinical candidate lucanthone (IC50 = 3.27 μM) as novel compounds that have EBOV inhibitory activity in HeLa cells and generally lack cytotoxicity. This work provides further validation for using machine learning and medicinal chemistry expertise to prioritize compounds for testing in vitro prior to more costly in vivo tests. These studies provide further corroboration of this strategy and suggest that it can likely be applied to other pathogens in the future.
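To make the idea of a Bayesian activity model more concrete, the following is a minimal sketch of a Bernoulli naive Bayes classifier over binary structural fingerprints, written in plain Python. The abstract does not state which Bayesian variant or descriptors were used, so this swapped-in technique, the function names, the smoothing parameter `alpha`, and the 4-bit toy fingerprints are all assumptions made for illustration.

```python
import math


def train_bernoulli_nb(fingerprints, labels, alpha=1.0):
    """Fit per-class bit probabilities with Laplace smoothing (illustrative helper)."""
    n_bits = len(fingerprints[0])
    model = {}
    for cls in (0, 1):
        rows = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        if not rows:
            raise ValueError("each class needs at least one training example")
        # Smoothed probability that each bit is set among compounds of this class.
        bit_prob = [
            (sum(fp[j] for fp in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_bits)
        ]
        log_prior = math.log(len(rows) / len(labels))
        model[cls] = (log_prior, bit_prob)
    return model


def predict_active_probability(model, fp):
    """Return the posterior probability that a fingerprint belongs to class 1."""
    log_scores = {}
    for cls, (log_prior, bit_prob) in model.items():
        score = log_prior
        for bit, p in zip(fp, bit_prob):
            score += math.log(p if bit else 1.0 - p)
        log_scores[cls] = score
    # Normalise in log space to avoid underflow with long fingerprints.
    top = max(log_scores.values())
    weights = {cls: math.exp(s - top) for cls, s in log_scores.items()}
    return weights[1] / (weights[0] + weights[1])


# Made-up 4-bit fingerprints: 1 = active, 0 = inactive.
toy_fps = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 1],
           [0, 0, 1, 1], [0, 0, 1, 0], [0, 1, 1, 0]]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_model = train_bernoulli_nb(toy_fps, toy_labels)
p_active_like = predict_active_probability(toy_model, [1, 1, 0, 0])
p_inactive_like = predict_active_probability(toy_model, [0, 0, 1, 0])
```

In a prioritization workflow of the kind described, untested compounds would be ranked by this posterior probability and the top-scoring ones sent for in vitro testing.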
Affiliation(s)
- Manu Anantpadma
- Department of Virology and Immunology, Texas Biomedical Research Institute, 8715 West Military Drive, San Antonio, Texas 78227, United States
- Thomas Lane
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Kimberley M. Zorn
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Mary A. Lingerfelt
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Alex M. Clark
- Molecular Materials Informatics, Inc., 1900 St. Jacques #302, Montreal H3J 2S1, Quebec, Canada
- Joel S. Freundlich
- Departments of Pharmacology, Physiology, and Neuroscience & Medicine, Center for Emerging and Reemerging Pathogens, Rutgers University—New Jersey Medical School, 185 South Orange Avenue, Newark, New Jersey 07103, United States
- Robert A. Davey
- Department of Virology and Immunology, Texas Biomedical Research Institute, 8715 West Military Drive, San Antonio, Texas 78227, United States
- Peter B. Madrid
- SRI International, 333 Ravenswood Avenue, Menlo Park, California 94025, United States
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
6
Hernandez HW, Soeung M, Zorn KM, Ashoura N, Mottin M, Andrade CH, Caffrey CR, de Siqueira-Neto JL, Ekins S. High Throughput and Computational Repurposing for Neglected Diseases. Pharm Res 2018; 36:27. [PMID: 30560386 PMCID: PMC6792295 DOI: 10.1007/s11095-018-2558-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 10/26/2018] [Accepted: 12/09/2018] [Indexed: 12/21/2022]
Abstract
Purpose Neglected tropical diseases (NTDs) are a heterogeneous group of communicable diseases that are found within the poorest populations of the world. There are 23 NTDs that have been prioritized by the World Health Organization, which are endemic in 149 countries and affect more than 1.4 billion people, costing these developing economies billions of dollars annually. The NTDs result from four different types of causative pathogen: protozoa, bacteria, helminths and viruses. The majority of the diseases lack effective treatments. Therefore, new therapeutics for NTDs are desperately needed. Methods We describe various high throughput screening and computational approaches that have been performed in recent years. We have collated the molecules identified in these studies and calculated molecular properties. Results Numerous global repurposing efforts have yielded some promising compounds for various neglected tropical diseases. When analyzed, these compounds appear drug-like, as one would expect. Several large datasets are also now in the public domain, and this enables machine learning models to be constructed that then facilitate the discovery of new molecules for these pathogens. Conclusions In the space of a few years, many groups have performed either experimental or computational repurposing high throughput screens against neglected diseases. These have identified compounds which in many cases are already approved drugs. Such approaches perhaps offer a more efficient way to develop treatments for diseases which are generally not a focus for global pharmaceutical companies because of the economics or the lack of a viable market. Other diseases could perhaps benefit from these repurposing approaches. Electronic supplementary material The online version of this article (10.1007/s11095-018-2558-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Melinda Soeung
- MD Anderson Cancer Center, University of Texas, Houston, Texas, USA
- Kimberley M Zorn
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina, 27606, USA
- Melina Mottin
- LabMol - Laboratory for Molecular Modeling and Drug Design, Faculdade de Farmacia, Universidade Federal de Goias - UFG, Goiânia, GO, 74605-170, Brazil
- Carolina Horta Andrade
- LabMol - Laboratory for Molecular Modeling and Drug Design, Faculdade de Farmacia, Universidade Federal de Goias - UFG, Goiânia, GO, 74605-170, Brazil
- Conor R Caffrey
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, California, 92093, USA
- Jair Lage de Siqueira-Neto
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, California, 92093, USA
- Sean Ekins
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina, 27606, USA
7
Capuzzi SJ, Sun W, Muratov EN, Martínez-Romero C, He S, Zhu W, Li H, Tawa G, Fisher EG, Xu M, Shinn P, Qiu X, García-Sastre A, Zheng W, Tropsha A. Computer-Aided Discovery and Characterization of Novel Ebola Virus Inhibitors. J Med Chem 2018; 61:3582-3594. [PMID: 29624387 DOI: 10.1021/acs.jmedchem.8b00035] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 01/03/2023]
Abstract
The Ebola virus (EBOV) causes severe human infection that lacks effective treatment. A recent screen identified a series of compounds that block EBOV-like particle entry into human cells. Using data from this screen, quantitative structure-activity relationship models were built and employed for virtual screening of a ∼17 million compound library. Experimental testing of 102 hits yielded 14 compounds with IC50 values under 10 μM, including several sub-micromolar inhibitors, and more than 10-fold selectivity against host cytotoxicity. These confirmed hits include FDA-approved drugs and clinical candidates with non-antiviral indications, as well as compounds with novel scaffolds and no previously known bioactivity. Five selected hits inhibited BSL-4 live-EBOV infection in a dose-dependent manner, including vindesine (0.34 μM). Additional studies of these novel anti-EBOV compounds revealed their mechanisms of action, including the inhibition of NPC1 protein, cathepsin B/L, and lysosomal function. Compounds identified in this study are among the most potent and well-characterized anti-EBOV inhibitors reported to date.
Affiliation(s)
- Stephen J Capuzzi
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Wei Sun
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Eugene N Muratov
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States; Department of Chemical Technology, Odessa National Polytechnic University, Odessa 65000, Ukraine
- Carles Martínez-Romero
- Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States; Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Shihua He
- Special Pathogens Program, National Microbiology Laboratory, Public Health Agency of Canada, 1015 Arlington Street, Winnipeg, Manitoba R3E 3R2, Canada
- Wenjun Zhu
- Special Pathogens Program, National Microbiology Laboratory, Public Health Agency of Canada, 1015 Arlington Street, Winnipeg, Manitoba R3E 3R2, Canada; Department of Medical Microbiology, University of Manitoba, 745 Bannatyne Avenue, Winnipeg, Manitoba R3E 0J9, Canada
- Hao Li
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Gregory Tawa
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Ethan G Fisher
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Miao Xu
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Paul Shinn
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Xiangguo Qiu
- Special Pathogens Program, National Microbiology Laboratory, Public Health Agency of Canada, 1015 Arlington Street, Winnipeg, Manitoba R3E 3R2, Canada; Department of Medical Microbiology, University of Manitoba, 745 Bannatyne Avenue, Winnipeg, Manitoba R3E 0J9, Canada
- Adolfo García-Sastre
- Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States; Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States; Department of Medicine, Division of Infectious Diseases, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Wei Zheng
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Alexander Tropsha
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
8
Ekins S, Clark AM, Dole K, Gregory K, Mcnutt AM, Spektor AC, Weatherall C, Litterman NK, Bunin BA. Data Mining and Computational Modeling of High-Throughput Screening Datasets. Methods Mol Biol 2018; 1755:197-221. [PMID: 29671272 DOI: 10.1007/978-1-4939-7724-6_14] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Abstract
We are now seeing the benefit of investments made over the last decade in high-throughput screening (HTS), which are resulting in large structure activity datasets entering public and open databases such as ChEMBL and PubChem. The growth of academic HTS centers and the increasing move to academia for early stage drug discovery suggest a great need for informatics tools and methods to mine such data and learn from it. Collaborative Drug Discovery, Inc. (CDD) has developed a number of tools for storing, mining, securely and selectively sharing, as well as learning from such HTS data. We present a new web based data mining and visualization module directly within the CDD Vault platform for high-throughput drug discovery data that makes use of a novel technology stack following modern reactive design principles. We also describe CDD Models within the CDD Vault platform, which enables researchers to share models, share predictions from models, and create models from distributed, heterogeneous data. Our system is built on top of the Collaborative Drug Discovery Vault Activity and Registration data repository ecosystem, which allows users to manipulate and visualize thousands of molecules in real time. This can be performed in any browser on any platform. In this chapter we present examples of its use with public datasets in CDD Vault. Such approaches can complement other cheminformatics tools, whether open source or commercial, in providing approaches for data mining and modeling of HTS data.
Affiliation(s)
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, 27606, USA
- Alex M Clark
- Collaborative Drug Discovery, Inc., Burlingame, CA, USA
- Molecular Materials Informatics, Inc., Montreal, QC, Canada
- Krishna Dole
- Collaborative Drug Discovery, Inc., Burlingame, CA, USA
- Barry A Bunin
- Collaborative Drug Discovery, Inc., Burlingame, CA, USA
9
Korotcov A, Tkachenko V, Russo DP, Ekins S. Comparison of Deep Learning With Multiple Machine Learning Methods and Metrics Using Diverse Drug Discovery Data Sets. Mol Pharm 2017; 14:4462-4475. [PMID: 29096442 PMCID: PMC5741413 DOI: 10.1021/acs.molpharmaceut.7b00578] [Citation(s) in RCA: 180] [Impact Index Per Article: 25.7] [Indexed: 11/30/2022]
Abstract
Machine learning methods have been applied to many data sets in pharmaceutical research for several decades. The relative ease and availability of fingerprint type molecular descriptors paired with Bayesian methods resulted in the widespread use of this approach for a diverse array of end points relevant to drug discovery. Deep learning is the latest machine learning algorithm attracting attention for many pharmaceutical applications, from docking to virtual screening. Deep learning is based on an artificial neural network with multiple hidden layers and has found considerable traction for many artificial intelligence applications. We have previously suggested the need for a comparison of different machine learning methods with deep learning across an array of varying data sets that is applicable to pharmaceutical research. End points relevant to pharmaceutical research include absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) properties, as well as activity against pathogens and drug discovery data sets. In this study, we have used data sets for solubility, probe-likeness, hERG, KCNQ1, bubonic plague, Chagas, tuberculosis, and malaria to compare different machine learning methods using FCFP6 fingerprints. These data sets represent whole cell screens, individual proteins, physicochemical properties as well as a data set with a complex end point. Our aim was to assess whether deep learning offered any improvement in testing when assessed using an array of metrics including AUC, F1 score, Cohen's kappa, Matthews correlation coefficient and others. Based on ranked normalized scores for the metrics or data sets, Deep Neural Networks (DNN) ranked higher than SVM, which in turn was ranked higher than all the other machine learning methods. Visualizing these properties for training and test sets using radar type plots indicates when models are inferior or perhaps over trained. These results also suggest the need for assessing deep learning further using multiple metrics with much larger scale comparisons, prospective testing as well as assessment of different fingerprints and DNN architectures beyond those used.
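Three of the metrics named in this abstract (F1 score, Cohen's kappa, and the Matthews correlation coefficient) can all be computed from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def _safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, fn, tn):
    """Compute F1, Cohen's kappa, and MCC from confusion-matrix counts."""
    n = tp + fp + fn + tn
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = _safe_div(tp + tn, n)
    expected = _safe_div((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn), n * n)
    kappa = _safe_div(observed - expected, 1 - expected)
    # Matthews correlation coefficient uses all four cells of the matrix.
    mcc = _safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {"f1": f1, "kappa": kappa, "mcc": mcc}


# Hypothetical counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives.
example = binary_metrics(tp=40, fp=10, fn=20, tn=30)
```

Because each metric weights the four cells differently, a model can look strong on one and weak on another, which is one reason to report several of them side by side.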
Affiliation(s)
- Alexandru Korotcov
- Science Data Software, LLC, 14914 Bradwill Court, Rockville, MD 20850, USA
- Valery Tkachenko
- Science Data Software, LLC, 14914 Bradwill Court, Rockville, MD 20850, USA
- Daniel P Russo
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, 08102, USA
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
10
Williams AJ, Peck L, Ekins S. The new alchemy: Online networking, data sharing and research activity distribution tools for scientists. F1000Res 2017; 6:1315. [PMID: 28928951 PMCID: PMC5580431 DOI: 10.12688/f1000research.12185.1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Accepted: 08/02/2017] [Indexed: 12/11/2022]
Abstract
There is an abundance of free online tools accessible to scientists and others that can be used for online networking, data sharing and measuring research impact. Despite this, few scientists know how these tools can be used, and many fail to take advantage of them as an integrated pipeline to raise awareness of their research outputs. In this article, the authors describe their experiences with these tools and how they can make best use of them to make their scientific research generally more accessible, extending its reach beyond their own direct networks, and communicating their ideas to new audiences. These efforts have the potential to drive science by sparking new collaborations and interdisciplinary research projects that may lead to future publications, funding and commercial opportunities. The intent of this article is to: describe some of these freely accessible networking tools and affiliated products; demonstrate from our own experiences how they can be utilized effectively; and inspire their adoption by new users for the benefit of science.
Affiliation(s)
- Antony J Williams
- National Center for Computational Toxicology, Environmental Protection Agency, Durham, NC, 27711, USA
- Lou Peck
- Lou Peck Consulting, Swansea, SA4 3JQ, UK
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., Raleigh, NC, 27606, USA
11
Collaborative drug discovery for More Medicines for Tuberculosis (MM4TB). Drug Discov Today 2016; 22:555-565. [PMID: 27884746 DOI: 10.1016/j.drudis.2016.10.009] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 05/13/2016] [Revised: 10/11/2016] [Accepted: 10/21/2016] [Indexed: 01/30/2023]
Abstract
Neglected disease drug discovery is generally poorly funded compared with major diseases and hence there is an increasing focus on collaboration and precompetitive efforts such as public-private partnerships (PPPs). The More Medicines for Tuberculosis (MM4TB) project is one such collaboration funded by the EU with the goal of discovering new drugs for tuberculosis. Collaborative Drug Discovery has provided a commercial web-based platform called CDD Vault which is a hosted collaborative solution for securely sharing diverse chemistry and biology data. Using CDD Vault alongside other commercial and free cheminformatics tools has enabled support of this and other large collaborative projects, aiding drug discovery efforts and fostering collaboration. We will describe CDD's efforts in assisting with the MM4TB project.
|
12
|
Ekins S, Perryman AL, Clark AM, Reynolds RC, Freundlich JS. Machine Learning Model Analysis and Data Visualization with Small Molecules Tested in a Mouse Model of Mycobacterium tuberculosis Infection (2014-2015). J Chem Inf Model 2016; 56:1332-43. [PMID: 27335215 PMCID: PMC4962118 DOI: 10.1021/acs.jcim.6b00004] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 01/01/2023]
Abstract
The renewed urgency to develop new treatments for Mycobacterium tuberculosis (Mtb) infection has resulted in large-scale phenotypic screening and thousands of new active compounds in vitro. The next challenge is to identify candidates to pursue in a mouse in vivo efficacy model as a step to predicting clinical efficacy. We previously analyzed over 70 years of this mouse in vivo efficacy data, which we used to generate and validate machine learning models. Curation of 60 additional small molecules with in vivo data published in 2014 and 2015 was undertaken to further test these models. This represents a much larger test set than for the previous models. Several computational approaches have now been applied to analyze these molecules and compare their molecular properties beyond those attempted previously. Our previous machine learning models have been updated, and a novel aspect has been added in the form of mouse liver microsomal half-life (MLM t1/2) and in vitro-based Mtb models incorporating cytotoxicity data that were used to predict in vivo activity for comparison. Our best Mtb in vivo models possess fivefold ROC values > 0.7, sensitivity > 80%, and concordance > 60%, while the best specificity value is > 40%. Use of an MLM t1/2 Bayesian model affords comparable results for scoring the 60 compounds tested. Combining MLM stability and in vitro Mtb models in a novel consensus workflow in the best cases has a positive predicted value (hit rate) > 77%. Our results indicate that Bayesian models constructed with literature in vivo Mtb data generated by different laboratories in various mouse models can have predictive value and may be used alongside MLM t1/2 and in vitro-based Mtb models to assist in selecting antitubercular compounds with desirable in vivo efficacy. We demonstrate for the first time that consensus models of any kind can be used to predict in vivo activity for Mtb. In addition, we describe a new clustering method for data visualization and apply this to the in vivo training and test data, ultimately making the method accessible in a mobile app.
Affiliation(s)
- Sean Ekins
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States; Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
- Alexander L Perryman
- Department of Pharmacology, Physiology and Neuroscience, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
- Alex M Clark
- Molecular Materials Informatics, Inc., 1900 St. Jacques #302, Montreal, Quebec H3J 2S1, Canada
- Robert C Reynolds
- Division of Hematology and Oncology, Department of Medicine, and Department of Chemistry, College of Arts and Sciences, University of Alabama at Birmingham, 1530 Third Avenue South, Birmingham, Alabama 35294-1240, United States
- Joel S Freundlich
- Department of Pharmacology, Physiology and Neuroscience, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States; Division of Infectious Diseases, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Newark, New Jersey 07103, United States
|
13
|
Ekins S, Mietchen D, Coffee M, Stratton TP, Freundlich JS, Freitas-Junior L, Muratov E, Siqueira-Neto J, Williams AJ, Andrade C. Open drug discovery for the Zika virus. F1000Res 2016; 5:150. [PMID: 27134728 PMCID: PMC4841202 DOI: 10.12688/f1000research.8013.1] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Accepted: 02/08/2016] [Indexed: 01/20/2023] Open
Abstract
The Zika virus (ZIKV) outbreak in the Americas has caused global concern that we may be on the brink of a healthcare crisis. The lack of research on ZIKV in the over 60 years that we have known about it has left us with little in the way of starting points for drug discovery. Our response can build on previous efforts with virus outbreaks and lean heavily on work done on other flaviviruses such as dengue virus. We provide some suggestions of what might be possible and propose an open drug discovery effort that mobilizes global science efforts and provides leadership, which thus far has been lacking. We also provide a listing of potential resources and molecules that could be prioritized for testing as in vitro assays for ZIKV are developed. We propose also that in order to incentivize drug discovery, a neglected disease priority review voucher should be available to those who successfully develop an FDA approved treatment. Learning from the response to the ZIKV, the approaches to drug discovery used and the success and failures will be critical for future infectious disease outbreaks.
Affiliation(s)
- Sean Ekins
- Collaborations in Chemistry Inc, Fuquay-Varina, NC, USA; Collaborations Pharmaceuticals Inc., Fuquay-Varina, NC, USA; Collaborative Drug Discovery Inc., Burlingame, CA, USA
- Megan Coffee
- The International Rescue Committee, NY, NY, USA
- Thomas P Stratton
- Department of Pharmacology, Physiology and Neuroscience, Rutgers University-New Jersey Medical School, Newark, NJ, USA
- Joel S Freundlich
- Department of Pharmacology, Physiology and Neuroscience, Rutgers University-New Jersey Medical School, Newark, NJ, USA; Division of Infectious Diseases, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Newark, NJ, USA
- Lucio Freitas-Junior
- Chemical Biology and Screening Platform, Brazilian Laboratory of Biosciences (LNBio), CNPEM, Campinas, Brazil
- Eugene Muratov
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA
- Jair Siqueira-Neto
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Carolina Andrade
- LabMol - Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Federal University of Goias, Goiânia, Brazil
|