Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Riniker S, Wang Y, Jenkins JL, Landrum GA. Using information from historical high-throughput screens to predict active compounds. J Chem Inf Model 2014;54:1880-91. [PMID: 24933016 DOI: 10.1021/ci500190p] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Indexed: 01/30/2023]

Cited by Other Article(s)

Riaz IB, Khan MA, Haddad TC. Potential application of artificial intelligence in cancer therapy. Curr Opin Oncol 2024;36:437-448. [PMID: 39007164 DOI: 10.1097/cco.0000000000001068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]

Fallani A, Medrano Sandonas L, Tkatchenko A. Inverse mapping of quantum properties to structures for chemical space of small organic molecules. Nat Commun 2024;15:6061. [PMID: 39025883 PMCID: PMC11258234 DOI: 10.1038/s41467-024-50401-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 07/01/2024] [Indexed: 07/20/2024] Open

Odje F, Meijer D, von Coburg E, van der Hooft JJJ, Dunst S, Medema MH, Volkamer A. Unleashing the potential of cell painting assays for compound activities and hazards prediction. Frontiers in Toxicology 2024;6:1401036. [PMID: 39086553 PMCID: PMC11288911 DOI: 10.3389/ftox.2024.1401036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 06/14/2024] [Indexed: 08/02/2024] Open

Fredin Haslum J, Lardeau CH, Karlsson J, Turkki R, Leuchowius KJ, Smith K, Müllers E. Cell Painting-based bioactivity prediction boosts high-throughput screening hit-rates and compound diversity. Nat Commun 2024;15:3470. [PMID: 38658534 PMCID: PMC11043326 DOI: 10.1038/s41467-024-47171-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 03/22/2024] [Indexed: 04/26/2024] Open

Thomas JR, Shelton C, Murphy J, Brittain S, Bray MA, Aspesi P, Concannon J, King FJ, Ihry RJ, Ho DJ, Henault M, Hadjikyriacou A, Neri M, Sigoillot FD, Pham HT, Shum M, Barys L, Jones MD, Martin EJ, Blechschmidt A, Rieffel S, Troxler TJ, Mapa FA, Jenkins JL, Jain RK, Kutchukian PS, Schirle M, Renner S. Enhancing the Small-Scale Screenable Biological Space beyond Known Chemogenomics Libraries with Gray Chemical Matter: Compounds with Novel Mechanisms from High-Throughput Screening Profiles. ACS Chem Biol 2024;19:938-952. [PMID: 38565185 PMCID: PMC11040606 DOI: 10.1021/acschembio.3c00737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 02/28/2024] [Accepted: 03/01/2024] [Indexed: 04/04/2024]

Affiliation(s)

Jason R. Thomas Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Claude Shelton Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Jason Murphy Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Scott Brittain Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Mark-Anthony Bray Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Peter Aspesi Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
John Concannon Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Frederick J. King Novartis Biomedical Research, San Diego, California 92121, United States
Robert J. Ihry Novartis Biomedical Research, San Diego, California 92121, United States
Daniel J. Ho Novartis Biomedical Research, San Diego, California 92121, United States
Martin Henault Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Andrea Hadjikyriacou Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Marilisa Neri Novartis Biomedical Research, Basel 4056, Switzerland
Frederic D. Sigoillot Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Helen T. Pham Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Matthew Shum Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Louise Barys Novartis Biomedical Research, Basel 4056, Switzerland
Michael D. Jones Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Eric J. Martin Novartis Biomedical Research, Emeryville, California 94608, United States
Anke Blechschmidt Novartis Biomedical Research, Basel 4056, Switzerland
Sébastien Rieffel Novartis Biomedical Research, Basel 4056, Switzerland
Thomas J. Troxler Novartis Biomedical Research, Basel 4056, Switzerland
Felipa A. Mapa Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Jeremy L. Jenkins Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Rishi K. Jain Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Peter S. Kutchukian Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Markus Schirle Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Steffen Renner Novartis Biomedical Research, Basel 4056, Switzerland

Hassan J, Saeed SM, Deka L, Uddin MJ, Das DB. Applications of Machine Learning (ML) and Mathematical Modeling (MM) in Healthcare with Special Focus on Cancer Prognosis and Anticancer Therapy: Current Status and Challenges. Pharmaceutics 2024;16:260. [PMID: 38399314 PMCID: PMC10892549 DOI: 10.3390/pharmaceutics16020260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 01/29/2024] [Accepted: 02/07/2024] [Indexed: 02/25/2024] Open

Feng D, Liu B, Chen Z, Xu J, Geng M, Duan W, Ai J, Zhang H. Discovery of hematopoietic progenitor kinase 1 inhibitors using machine learning-based screening and free energy perturbation. J Biomol Struct Dyn 2024:1-13. [PMID: 38198294 DOI: 10.1080/07391102.2024.2301754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 12/30/2023] [Indexed: 01/12/2024]

Yu L, He X, Fang X, Liu L, Liu J. Deep Learning with Geometry-Enhanced Molecular Representation for Augmentation of Large-Scale Docking-Based Virtual Screening. J Chem Inf Model 2023;63:6501-6514. [PMID: 37882338 DOI: 10.1021/acs.jcim.3c01371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2023]

Xiaolin X, Xiaozhi L, Guoping H, Hongwei L, Jinkuo G, Xiyun B, Zhen T, Xiaofang M, Yanxia L, Na X, Chunyan Z, Rui G, Kuan W, Cheng Z, Cuancuan W, Mingyong L, Xinping D. Overfit deep neural network for predicting drug-target interactions. iScience 2023;26:107646. [PMID: 37680476 PMCID: PMC10480310 DOI: 10.1016/j.isci.2023.107646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2022] [Revised: 06/28/2023] [Accepted: 08/11/2023] [Indexed: 09/09/2023] Open

Affiliation(s)

Xiao Xiaolin Department of Cardiology, Tianjin Fifth Central Hospital, Tianjin, China; Tianjin Key Laboratory of Epigenetics for Organ Development of Premature Infants, Tianjin Fifth Central Hospital, Tianjin, China; Central Laboratory, Tianjin Fifth Central Hospital, Tianjin, China
Liu Xiaozhi Tianjin Key Laboratory of Epigenetics for Organ Development of Premature Infants, Tianjin Fifth Central Hospital, Tianjin, China; Central Laboratory, Tianjin Fifth Central Hospital, Tianjin, China
He Guoping Geriatrics Department, Traditional Chinese Medicine Hospital of Binhai New Area, Tianjin, China
Liu Hongwei School of Clinical Medicine, North China University of Science and Technology, Tangshan, Hebei, China; Department of Anesthesiology, Tangshan Maternal and Child Health Hospital, Tangshan, Hebei, China
Guo Jinkuo Tianjin Key Laboratory of Epigenetics for Organ Development of Premature Infants, Tianjin Fifth Central Hospital, Tianjin, China; College of Food Science and Engineering, Tianjin University of Science & Technology, Tianjin, China
Bian Xiyun Tianjin Key Laboratory of Epigenetics for Organ Development of Premature Infants, Tianjin Fifth Central Hospital, Tianjin, China; Central Laboratory, Tianjin Fifth Central Hospital, Tianjin, China
Tian Zhen Deepwater Technology Research Institute, China National Offshore Oil Corporation, Tianjin, China
Ma Xiaofang Tianjin Key Laboratory of Epigenetics for Organ Development of Premature Infants, Tianjin Fifth Central Hospital, Tianjin, China; Central Laboratory, Tianjin Fifth Central Hospital, Tianjin, China
Li Yanxia Tianjin Key Laboratory of Epigenetics for Organ Development of Premature Infants, Tianjin Fifth Central Hospital, Tianjin, China; Central Laboratory, Tianjin Fifth Central Hospital, Tianjin, China
Xue Na Tianjin Key Laboratory of Epigenetics for Organ Development of Premature Infants, Tianjin Fifth Central Hospital, Tianjin, China; Central Laboratory, Tianjin Fifth Central Hospital, Tianjin, China
Zhang Chunyan Tianjin Key Laboratory of Epigenetics for Organ Development of Premature Infants, Tianjin Fifth Central Hospital, Tianjin, China; Central Laboratory, Tianjin Fifth Central Hospital, Tianjin, China
Gao Rui Tianjin Key Laboratory of Epigenetics for Organ Development of Premature Infants, Tianjin Fifth Central Hospital, Tianjin, China
Wang Kuan Department of Cardiology, Tianjin Fifth Central Hospital, Tianjin, China
Zhang Cheng Department of Cardiology, Tianjin Fifth Central Hospital, Tianjin, China
Wang Cuancuan Department of Cardiology, Tianjin Fifth Central Hospital, Tianjin, China
Liu Mingyong Tianjin Key Laboratory of Epigenetics for Organ Development of Premature Infants, Tianjin Fifth Central Hospital, Tianjin, China; Department of Urology, Tianjin Fifth Central Hospital, Tianjin, China
Du Xinping Department of Cardiology, Tianjin Fifth Central Hospital, Tianjin, China; Tianjin Key Laboratory of Epigenetics for Organ Development of Premature Infants, Tianjin Fifth Central Hospital, Tianjin, China; College of Food Science and Engineering, Tianjin University of Science & Technology, Tianjin, China

Seifermann M, Reiser P, Friederich P, Levkin PA. High-Throughput Synthesis and Machine Learning Assisted Design of Photodegradable Hydrogels. Small Methods 2023;7:e2300553. [PMID: 37287430 DOI: 10.1002/smtd.202300553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Indexed: 06/09/2023]

Combining metabolome and clinical indicators with machine learning provides some promising diagnostic markers to precisely detect smear-positive/negative pulmonary tuberculosis. BMC Infect Dis 2022;22:707. [PMID: 36008772 PMCID: PMC9403968 DOI: 10.1186/s12879-022-07694-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/07/2022] [Accepted: 08/22/2022] [Indexed: 11/30/2022] Open

Abstract

Background

Tuberculosis (TB) was the leading lethal infectious disease worldwide for several years (2014–2019) until the COVID-19 pandemic, and it remains one of the top 10 causes of death worldwide. One important reason for the large number of TB cases and deaths is the difficulty of diagnosing TB precisely with common detection methods, especially in smear-negative pulmonary tuberculosis (SNPT) cases. The rapid development of metabolomics and machine learning offers a great opportunity for precision diagnosis of TB. However, metabolite biomarkers for the precise diagnosis of smear-positive and smear-negative pulmonary tuberculosis (SPPT/SNPT) remain to be uncovered. In this study, we combined metabolomics and clinical indicators with machine learning to screen for novel diagnostic biomarkers for the precise identification of SPPT and SNPT patients.

Methods

Untargeted plasma metabolomic profiling was performed for 27 SPPT patients, 37 SNPT patients and controls. The orthogonal partial least squares-discriminant analysis (OPLS-DA) was then conducted to screen differential metabolites among the three groups. Metabolite enriched pathways, random forest (RF), support vector machines (SVM) and multilayer perceptron neural network (MLP) were performed using Metaboanalyst 5.0, “caret” R package, “e1071” R package and “Tensorflow” Python package, respectively.

Results

Metabolomic analysis revealed significant enrichment of fatty acid and amino acid metabolites in the plasma of SPPT and SNPT patients, with SPPT samples showing more severe dysfunction in fatty acid and amino acid metabolism. Further RF analysis revealed four optimized diagnostic biomarker combinations comprising ten features (two lipid/lipid-like molecules, seven organic acids/derivatives, and one clinical indicator) that identified SPPT patients, SNPT patients, and controls with high accuracy (83–93%); these were further verified by SVM and MLP. Among the three methods, MLP gave the best performance in classifying all three groups simultaneously (94.74%), suggesting some advantage of MLP over RF and SVM.

Conclusions

Our findings reveal plasma metabolomic characteristics of SPPT and SNPT patients, provide some novel promising diagnostic markers for precision diagnosis of various types of TB, and show the potential of machine learning in screening out biomarkers from big data.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12879-022-07694-8.

He K. Pharmacological affinity fingerprints derived from bioactivity data for the identification of designer drugs. J Cheminform 2022;14:35. [PMID: 35672835 PMCID: PMC9171973 DOI: 10.1186/s13321-022-00607-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/02/2022] [Accepted: 05/05/2022] [Indexed: 12/15/2022] Open

Wet-dry-wet drug screen leads to the synthesis of TS1, a novel compound reversing lung fibrosis through inhibition of myofibroblast differentiation. Cell Death Dis 2021;13:2. [PMID: 34916483 PMCID: PMC8677786 DOI: 10.1038/s41419-021-04439-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2021] [Revised: 11/18/2021] [Accepted: 11/29/2021] [Indexed: 11/09/2022]

Mohanty E, Mohanty A. Role of artificial intelligence in peptide vaccine design against RNA viruses. Informatics in Medicine Unlocked 2021;26:100768. [PMID: 34722851 PMCID: PMC8536498 DOI: 10.1016/j.imu.2021.100768] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/30/2021] [Revised: 10/16/2021] [Accepted: 10/16/2021] [Indexed: 01/18/2023] Open

Aghamiri SS, Amin R, Helikar T. Recent applications of quantitative systems pharmacology and machine learning models across diseases. J Pharmacokinet Pharmacodyn 2021;49:19-37. [PMID: 34671863 PMCID: PMC8528185 DOI: 10.1007/s10928-021-09790-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/25/2021] [Accepted: 10/07/2021] [Indexed: 12/29/2022]

Wilm A, Garcia de Lomana M, Stork C, Mathai N, Hirte S, Norinder U, Kühnl J, Kirchmair J. Predicting the Skin Sensitization Potential of Small Molecules with Machine Learning Models Trained on Biologically Meaningful Descriptors. Pharmaceuticals (Basel) 2021;14:790. [PMID: 34451887 PMCID: PMC8402010 DOI: 10.3390/ph14080790] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/14/2021] [Revised: 08/03/2021] [Accepted: 08/06/2021] [Indexed: 02/06/2023] Open

Garcia de Lomana M, Morger A, Norinder U, Buesen R, Landsiedel R, Volkamer A, Kirchmair J, Mathea M. ChemBioSim: Enhancing Conformal Prediction of In Vivo Toxicity by Use of Predicted Bioactivities. J Chem Inf Model 2021;61:3255-3272. [PMID: 34153183 PMCID: PMC8317154 DOI: 10.1021/acs.jcim.1c00451] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/20/2021] [Indexed: 02/07/2023]

Abstract

Computational methods such as machine learning approaches have a strong track record of success in predicting the outcomes of in vitro assays. In contrast, their ability to predict in vivo endpoints is more limited due to the high number of parameters and processes that may influence the outcome. Recent studies have shown that the combination of chemical and biological data can yield better models for in vivo endpoints. The ChemBioSim approach presented in this work aims to enhance the performance of conformal prediction models for in vivo endpoints by combining chemical information with (predicted) bioactivity assay outcomes. Three in vivo toxicological endpoints, capturing genotoxic (MNT), hepatic (DILI), and cardiological (DICC) issues, were selected for this study due to their high relevance for the registration and authorization of new compounds. Since the sparsity of available biological assay data is challenging for predictive modeling, predicted bioactivity descriptors were introduced instead. Thus, a machine learning model for each of the 373 collected biological assays was trained and applied on the compounds of the in vivo toxicity data sets. Besides the chemical descriptors (molecular fingerprints and physicochemical properties), these predicted bioactivities served as descriptors for the models of the three in vivo endpoints. For this study, a workflow based on a conformal prediction framework (a method for confidence estimation) built on random forest models was developed. Furthermore, the most relevant chemical and bioactivity descriptors for each in vivo endpoint were preselected with lasso models. The incorporation of bioactivity descriptors increased the mean F1 scores of the MNT model from 0.61 to 0.70 and for the DICC model from 0.72 to 0.82 while the mean efficiencies increased by roughly 0.10 for both endpoints. In contrast, for the DILI endpoint, no significant improvement in model performance was observed. 
Besides pure performance improvements, an analysis of the most important bioactivity features allowed detection of novel and less intuitive relationships between the predicted biological assay outcomes used as descriptors and the in vivo endpoints. This study presents how the prediction of in vivo toxicity endpoints can be improved by incorporating biological information (which is not necessarily captured by chemical descriptors) in an automated workflow. No additional experimental workload is needed to generate the bioactivity descriptors, because predicted outcomes of bioactivity assays were used. All bioactivity CP models for deriving the predicted bioactivities, as well as the in vivo toxicity CP models, can be freely downloaded from https://doi.org/10.5281/zenodo.4761225.

Esposito C, Landrum GA, Schneider N, Stiefl N, Riniker S. GHOST: Adjusting the Decision Threshold to Handle Imbalanced Data in Machine Learning. J Chem Inf Model 2021;61:2623-2640. [PMID: 34100609 DOI: 10.1021/acs.jcim.1c00160] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Indexed: 12/21/2022]

Biological activity-based modeling identifies antiviral leads against SARS-CoV-2. Nat Biotechnol 2021;39:747-753. [PMID: 33623157 PMCID: PMC9843700 DOI: 10.1038/s41587-021-00839-1] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 10/19/2020] [Accepted: 01/25/2021] [Indexed: 01/29/2023]

Discovery of Novel eEF2K Inhibitors Using HTS Fingerprint Generated from Predicted Profiling of Compound-Protein Interactions. Medicines 2021;8:23. [PMID: 34065377 PMCID: PMC8161098 DOI: 10.3390/medicines8050023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/26/2021] [Revised: 04/24/2021] [Accepted: 05/18/2021] [Indexed: 11/29/2022]

Kamerzell TJ, Middaugh CR. Prediction Machines: Applied Machine Learning for Therapeutic Protein Design and Development. J Pharm Sci 2020;110:665-681. [PMID: 33278409 DOI: 10.1016/j.xphs.2020.11.034] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/29/2020] [Revised: 11/27/2020] [Accepted: 11/27/2020] [Indexed: 12/11/2022]

Early lung cancer diagnostic biomarker discovery by machine learning methods. Transl Oncol 2020;14:100907. [PMID: 33217646 PMCID: PMC7683339 DOI: 10.1016/j.tranon.2020.100907] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 06/04/2020] [Revised: 08/21/2020] [Accepted: 09/25/2020] [Indexed: 02/07/2023] Open

Abstract

• Early diagnosis could improve lung cancer survival rate.
• The availability of blood-based screening could increase lung cancer patient uptake.
• An interdisciplinary mechanism combines metabolomics and machine learning methods.
• Metabolic biomarkers could be potential screening biomarkers for early detection of lung cancer.
• Naïve Bayes is recommended as an exploitable tool for early lung tumor prediction.

Early diagnosis has been shown to improve the survival rate of lung cancer patients. The availability of blood-based screening could increase early lung cancer patient uptake. The present study sought to identify plasma metabolites in Chinese patients as diagnostic biomarkers for lung cancer. We use an interdisciplinary approach, applied here to lung cancer for the first time, that combines metabolomics and machine learning methods to detect early diagnostic biomarkers. We enrolled 110 lung cancer patients and 43 healthy individuals. Levels of 61 plasma metabolites were obtained from a targeted metabolomic study using LC-MS/MS. A specific combination of six metabolic biomarkers enabled discrimination between stage I lung cancer patients and healthy individuals (AUC = 0.989, sensitivity = 98.1%, specificity = 100.0%). The five metabolic biomarkers ranked highest in relative importance by the FCBF algorithm could also serve as screening biomarkers for early detection of lung cancer. Naïve Bayes is recommended as an exploitable tool for early lung tumor prediction. This research supports the feasibility of blood-based screening and offers a more accurate, rapid, and integrated tool for early lung cancer diagnosis. The proposed interdisciplinary method could be adapted to cancers other than lung cancer.

Hsieh JH, Sedykh A, Mutlu E, Germolec DR, Auerbach SS, Rider CV. Harnessing In Silico, In Vitro, and In Vivo Data to Understand the Toxicity Landscape of Polycyclic Aromatic Compounds (PACs). Chem Res Toxicol 2020;34:268-285. [PMID: 33063992 DOI: 10.1021/acs.chemrestox.0c00213] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/13/2023]

Abstract

Polycyclic aromatic compounds (PACs) are compounds with a minimum of two six-atom aromatic fused rings. PACs arise from incomplete combustion or thermal decomposition of organic matter and are ubiquitous in the environment. Within PACs, carcinogenicity is generally regarded to be the most important public health concern. However, toxicity in other systems (reproductive and developmental toxicity, immunotoxicity) has also been reported. Despite the large number of PACs identified in the environment, research attention to understand exposure and health effects of PACs has focused on a relatively limited subset, namely polycyclic aromatic hydrocarbons (PAHs), the PACs with only carbon and hydrogen atoms. To triage the rest of the vast number of PACs for more resource-intensive testing, we developed a data-driven approach to contextualize hazard characterization of PACs, by leveraging the available data from various data streams (in silico toxicity, in vitro activity, structural fingerprints, and in vivo data availability). The PACs were clustered on the basis of their in silico toxicity profiles containing predictions from 8 different categories (carcinogenicity, cardiotoxicity, developmental toxicity, genotoxicity, hepatotoxicity, neurotoxicity, reproductive toxicity, and urinary toxicity). We found that PACs with the same parent structure (e.g., fluorene) could have diverse in silico toxicity profiles. In contrast, PACs with similar substituted groups (e.g., alkylated-PAHs) or heterocyclics (e.g., N-PACs) with varying ring sizes could have similar in silico toxicity profiles, suggesting that these groups are better candidates for toxicity read-across analysis. The clusters/regions associated with certain in silico toxicity, in vitro activity, and structural fingerprints were identified. 
We found that genotoxicity/carcinogenicity (in silico toxicity) and xenobiotic homeostasis and stress response (in vitro activity), respectively, dominate the toxicity/activity variation seen in the PACs. The "hot spots" with enriched toxicity/activity in conjunction with availability of in vivo carcinogenicity data revealed regions of either data-poor (hydroxylated-PAHs) or data-rich (unsubstituted, parent PAHs) PACs. These regions offer potential targets for prioritization of further in vivo assessment and for chemical read-across efforts. The analysis results are searchable through an interactive web application (https://ntp.niehs.nih.gov/go/pacs_tableau), allowing for alternative hypothesis generation.

Stojanović L, Popović M, Tijanić N, Rakočević G, Kalinić M. Improved Scaffold Hopping in Ligand-Based Virtual Screening Using Neural Representation Learning. J Chem Inf Model 2020;60:4629-4639. [DOI: 10.1021/acs.jcim.0c00622] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]

Raschka S, Kaufman B. Machine learning and AI-based approaches for bioactive ligand discovery and GPCR-ligand recognition. Methods 2020;180:89-110. [PMID: 32645448 PMCID: PMC8457393 DOI: 10.1016/j.ymeth.2020.06.016] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 01/17/2020] [Revised: 06/23/2020] [Accepted: 06/23/2020] [Indexed: 02/06/2023] Open

Škuta C, Cortés-Ciriano I, Dehaen W, Kříž P, van Westen GJP, Tetko IV, Bender A, Svozil D. QSAR-derived affinity fingerprints (part 1): fingerprint construction and modeling performance for similarity searching, bioactivity classification and scaffold hopping. J Cheminform 2020;12:39. [PMID: 33431038 PMCID: PMC7260783 DOI: 10.1186/s13321-020-00443-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 08/07/2019] [Accepted: 05/16/2020] [Indexed: 02/11/2023] Open

Norinder U, Spjuth O, Svensson F. Using Predicted Bioactivity Profiles to Improve Predictive Modeling. J Chem Inf Model 2020;60:2830-2837. [PMID: 32374618 DOI: 10.1021/acs.jcim.0c00250] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/28/2022]

Wang Y, Chen Z, Bian F, Shang L, Zhu K, Zhao Y. Advances of droplet-based microfluidics in drug discovery. Expert Opin Drug Discov 2020;15:969-979. [DOI: 10.1080/17460441.2020.1758663] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 10/24/2022]

Réda C, Kaufmann E, Delahaye-Duriez A. Machine learning applications in drug development. Comput Struct Biotechnol J 2019;18:241-252. [PMID: 33489002 PMCID: PMC7790737 DOI: 10.1016/j.csbj.2019.12.006] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 09/29/2019] [Revised: 12/10/2019] [Accepted: 12/10/2019] [Indexed: 02/07/2023] Open

Hessler G, Grebner C, Matter H. Computational Approaches for Target Inference. ACTA ACUST UNITED AC 2019. [DOI: 10.1002/9783527818242.ch10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/28/2022]

Vamathevan J, Clark D, Czodrowski P, Dunham I, Ferran E, Lee G, Li B, Madabhushi A, Shah P, Spitzer M, Zhao S. Applications of machine learning in drug discovery and development. Nat Rev Drug Discov 2019;18:463-477. [PMID: 30976107 DOI: 10.1038/s41573-019-0024-5] [Citation(s) in RCA: 979] [Impact Index Per Article: 195.8] [Indexed: 02/07/2023]

David L, Arús-Pous J, Karlsson J, Engkvist O, Bjerrum EJ, Kogej T, Kriegl JM, Beck B, Chen H. Applications of Deep-Learning in Exploiting Large-Scale and Heterogeneous Compound Data in Industrial Pharmaceutical Research. Front Pharmacol 2019;10:1303. [PMID: 31749705 PMCID: PMC6848277 DOI: 10.3389/fphar.2019.01303] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2019] [Accepted: 10/14/2019] [Indexed: 12/21/2022] Open

Lu Y, Anand S, Shirley W, Gedeck P, Kelley BP, Skolnik S, Rodde S, Nguyen M, Lindvall M, Jia W. Prediction of pKa Using Machine Learning Methods with Rooted Topological Torsion Fingerprints: Application to Aliphatic Amines. J Chem Inf Model 2019;59:4706-4719. [DOI: 10.1021/acs.jcim.9b00498] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]

Laufkötter O, Sturm N, Bajorath J, Chen H, Engkvist O. Combining structural and bioactivity-based fingerprints improves prediction performance and scaffold hopping capability. J Cheminform 2019;11:54. [PMID: 31396716 PMCID: PMC6686534 DOI: 10.1186/s13321-019-0376-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2019] [Accepted: 07/31/2019] [Indexed: 11/29/2022] Open

Predicting kinase inhibitors using bioactivity matrix derived informer sets. PLoS Comput Biol 2019;15:e1006813. [PMID: 31381559 PMCID: PMC6695194 DOI: 10.1371/journal.pcbi.1006813] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2019] [Revised: 08/15/2019] [Accepted: 07/13/2019] [Indexed: 12/21/2022] Open

Abstract

Prediction of compounds that are active against a desired biological target is a common step in drug discovery efforts. Virtual screening methods seek some active-enriched fraction of a library for experimental testing. Where data are too scarce to train supervised learning models for compound prioritization, initial screening must provide the necessary data. Commonly, such an initial library is selected on the basis of chemical diversity by some pseudo-random process (for example, the first few plates of a larger library) or by selecting an entire smaller library. These approaches may not produce a sufficient number or diversity of actives. An alternative approach is to select an informer set of screening compounds on the basis of chemogenomic information from previous testing of compounds against a large number of targets. We compare different ways of using chemogenomic data to choose a small informer set of compounds based on previously measured bioactivity data. We develop this Informer-Based-Ranking (IBR) approach using the Published Kinase Inhibitor Sets (PKIS) as the chemogenomic data to select the informer sets. We test the informer compounds on a target that is not part of the chemogenomic data, then predict the activity of the remaining compounds based on the experimental informer data and the chemogenomic data. Through new chemical screening experiments, we demonstrate the utility of IBR strategies in a prospective test on three kinase targets not included in the PKIS.

In the early stages of drug discovery efforts, computational models are used to predict activity and prioritize compounds for experimental testing. New targets commonly lack the data necessary to build effective models, and the screening needed to generate that experimental data can be costly. We seek to improve the efficiency of the initial screening phase, and of the process of prioritizing compounds for subsequent screening.

We choose a small informer set of compounds based on publicly available prior screening data on distinct targets. We then collect experimental data on these informer compounds and use that data to predict the activity of other compounds in the set for the target of interest. Computational and statistical tools are needed to identify informer compounds and to prioritize other compounds for subsequent phases of screening. We find that selection of informer compounds on the basis of bioactivity data from previous screening efforts is superior to the traditional approach of selection of a chemically diverse subset of compounds. We demonstrate the success of this approach in retrospective tests on the Published Kinase Inhibitor Sets (PKIS) chemogenomic data and in prospective experimental screens against three additional non-human kinase targets.
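The workflow in this summary (test a few informer compounds on the new target, then rank the rest using prior profiles) can be illustrated with a toy sketch. The scoring rule below, a cosine similarity between prior activity profiles weighted by each informer's measured activity, is an assumption made for illustration and is not the IBR method from the paper; the compound names, profiles, and the function `rank_by_informers` are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length activity profiles."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_by_informers(profiles, informer_results):
    """Rank untested compounds by profile similarity to active informers.

    profiles: compound -> activity profile on previously screened targets.
    informer_results: informer compound -> measured activity on the new target.
    The weighting scheme is an illustrative assumption, not the published method.
    """
    scores = {}
    for name, profile in profiles.items():
        if name in informer_results:
            continue  # informers were already tested experimentally
        scores[name] = sum(
            cosine(profile, profiles[informer]) * activity
            for informer, activity in informer_results.items()
        )
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical prior data: four compounds profiled on three known targets.
profiles = {
    "c1": [1.0, 0.0, 1.0],
    "c2": [1.0, 0.0, 0.9],
    "c3": [0.0, 1.0, 0.0],
    "c4": [0.0, 0.9, 0.1],
}
# Hypothetical informer measurements on the new target.
informer_results = {"c1": 1.0, "c3": 0.0}

ranking = rank_by_informers(profiles, informer_results)
print(ranking)
```

In this toy data, c2 ranks ahead of c4 because its prior profile resembles that of the active informer c1.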


Jansen JM, De Pascale G, Fong S, Lindvall M, Moser HE, Pfister K, Warne B, Wartchow C. Biased Complement Diversity Selection for Effective Exploration of Chemical Space in Hit-Finding Campaigns. J Chem Inf Model 2019;59:1709-1714. [DOI: 10.1021/acs.jcim.9b00048] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]

Sturm N, Sun J, Vandriessche Y, Mayr A, Klambauer G, Carlsson L, Engkvist O, Chen H. Application of Bioactivity Profile-Based Fingerprints for Building Machine Learning Models. J Chem Inf Model 2018;59:962-972. [DOI: 10.1021/acs.jcim.8b00550] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]

Mason DJ, Eastman RT, Lewis RPI, Stott IP, Guha R, Bender A. Using Machine Learning to Predict Synergistic Antimalarial Compound Combinations With Novel Structures. Front Pharmacol 2018;9:1096. [PMID: 30333748 PMCID: PMC6176478 DOI: 10.3389/fphar.2018.01096] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2018] [Accepted: 09/07/2018] [Indexed: 01/28/2023] Open

Abstract

The parasite Plasmodium falciparum is the most lethal species of Plasmodium to cause serious malaria infection in humans, and with resistance developing rapidly, novel treatment modalities are currently being sought, one of which is combinations of existing compounds. The discovery of combinations of antimalarial drugs that act synergistically with one another is hence of great importance; however, an exhaustive experimental screen of a large drug space in a pairwise manner is not an option. In this study we apply our machine learning approach, Combination Synergy Estimation (CoSynE), which can predict novel synergistic drug interactions using only prior experimental combination screening data and knowledge of compound molecular structures, to a dataset of 1,540 antimalarial drug combinations in which 22.2% were synergistic. Cross validation of our model showed that synergistic CoSynE predictions are enriched 2.74× compared to random selection when both compounds in a predicted combination are known from other combinations among the training data, 2.36× when only one compound is known from the training data, and 1.5× for entirely novel combinations. We prospectively validated our model by making predictions for 185 combinations of 23 entirely novel compounds. CoSynE predicted 20 combinations to be synergistic, which was experimentally validated for nine of them (45%), corresponding to an enrichment of 1.70× compared to random selection from this prospective data set. Such enrichment corresponds to a 41% reduction in experimental effort. Interestingly, we found that pairwise screening of the compounds CoSynE individually predicted to be synergistic would result in an enrichment of 1.36× compared to random selection, indicating that synergy among compound combinations is not a random event.
The nine novel and correctly predicted synergistic compound combinations mainly (where sufficient bioactivity information is available) consist of efflux or transporter inhibitors (such as hydroxyzine), combined with compounds exhibiting antimalarial activity alone (such as sorafenib, apicidin, or dihydroergotamine). However, not all compound synergies could be rationalized easily in this way. Overall, this study highlights the potential for predictive modeling to expedite the discovery of novel drug combinations in the fight against antimalarial resistance, while the underlying approach is also generally applicable.
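The prospective figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes the usual definition of enrichment (hit rate among selected items divided by the base rate) and assumes that the 41% effort reduction is obtained as 1 − 1/enrichment; the abstract does not state that formula, but the numbers are consistent with it. The function names are illustrative.

```python
def enrichment(hits, selected, base_rate):
    """Hit rate among selected items divided by the random-selection base rate."""
    return (hits / selected) / base_rate

def effort_reduction(enrichment_factor):
    """Fraction of experiments saved to find the same number of hits
    (assumed reading: 1 - 1/enrichment)."""
    return 1.0 - 1.0 / enrichment_factor

# Figures quoted in the abstract: 9 of 20 predicted combinations confirmed, 1.70x enrichment.
hit_rate = 9 / 20
reported_enrichment = 1.70

# Base rate implied by those two numbers (not stated directly in the abstract).
implied_base_rate = hit_rate / reported_enrichment
reduction = effort_reduction(reported_enrichment)

print(f"hit rate: {hit_rate:.2f}")
print(f"implied base rate: {implied_base_rate:.3f}")
print(f"effort reduction: {reduction:.2f}")
```

Under these assumptions, the implied synergy rate in the prospective set is roughly 26%, and the effort reduction rounds to the 41% stated in the abstract.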


Paricharak S, Méndez-Lucio O, Chavan Ravindranath A, Bender A, IJzerman AP, van Westen GJP. Data-driven approaches used for compound library design, hit triage and bioactivity modeling in high-throughput screening. Brief Bioinform 2018;19:277-285. [PMID: 27789427 PMCID: PMC6018726 DOI: 10.1093/bib/bbw105] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2016] [Revised: 09/26/2016] [Indexed: 12/25/2022] Open

Cortes Cabrera A, Petrone PM. Optimal HTS Fingerprint Definitions by Using a Desirability Function and a Genetic Algorithm. J Chem Inf Model 2018;58:641-646. [DOI: 10.1021/acs.jcim.7b00447] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Hit-to-Lead: Hit Validation and Assessment. Methods Enzymol 2018;610:265-309. [DOI: 10.1016/bs.mie.2018.09.022] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]

Pertusi DA, O’Donnell G, Homsher MF, Solly K, Patel A, Stahler SL, Riley D, Finley MF, Finger EN, Adam GC, Meng J, Bell DJ, Zuck PD, Hudak EM, Weber MJ, Nothstein JE, Locco L, Quinn C, Amoss A, Squadroni B, Hartnett M, Heo MR, White T, May SA, Boots E, Roberts K, Cocchiarella P, Wolicki A, Kreamer A, Kutchukian PS, Wassermann AM, Uebele VN, Glick M, Rusinko A, Culberson JC. Prospective Assessment of Virtual Screening Heuristics Derived Using a Novel Fusion Score. SLAS DISCOVERY 2017;22:995-1006. [DOI: 10.1177/2472555217706058] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Affiliation(s)

Dante A. Pertusi: Modeling and Informatics, Merck & Co., Inc., West Point, PA, USA
Gregory O’Donnell: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Merck & Co., Inc., West Point, PA, USA
Michelle F. Homsher: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Merck & Co., Inc., West Point, PA, USA
Kelli Solly: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Merck & Co., Inc., West Point, PA, USA
Amita Patel: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Merck & Co., Inc., West Point, PA, USA
Shannon L. Stahler: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Merck & Co., Inc., West Point, PA, USA
Daniel Riley: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Merck & Co., Inc., West Point, PA, USA
Michael F. Finley: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Discovery Sciences, Janssen Research and Development LLC, Spring House, PA, USA
Eleftheria N. Finger: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Discovery & Preclinical Development, GlaxoSmithKline, Collegeville, PA, USA
Gregory C. Adam: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Merck & Co., Inc., West Point, PA, USA
Juncai Meng: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA
David J. Bell: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Merck & Co., Inc., North Wales, PA, USA
Paul D. Zuck: Merck & Co., Inc., North Wales, PA, USA; Automation and Engineering, Merck & Co., Inc., North Wales, PA, USA
Edward M. Hudak: Discovery Sample Management, Merck & Co., Inc., North Wales, PA, USA
Michael J. Weber: Automation and Engineering, Merck & Co., Inc., North Wales, PA, USA
Jennifer E. Nothstein: Merck & Co., Inc., West Point, PA, USA; Automation and Engineering, Merck & Co., Inc., North Wales, PA, USA
Louis Locco: Automation and Engineering, Merck & Co., Inc., North Wales, PA, USA
Carissa Quinn: Discovery Sciences, Janssen Research and Development LLC, Spring House, PA, USA; Automation and Engineering, Merck & Co., Inc., North Wales, PA, USA
Adam Amoss: Automation and Engineering, Merck & Co., Inc., North Wales, PA, USA
Brian Squadroni: Merck & Co., Inc., West Point, PA, USA; Automation and Engineering, Merck & Co., Inc., North Wales, PA, USA
Michelle Hartnett: Discovery Sciences, Janssen Research and Development LLC, Spring House, PA, USA; Automation and Engineering, Merck & Co., Inc., North Wales, PA, USA
Mee Ra Heo: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Merck & Co., Inc., North Wales, PA, USA
Tara White: Discovery Sample Management, Merck & Co., Inc., North Wales, PA, USA
S. Alex May: Automation and Engineering, Merck & Co., Inc., North Wales, PA, USA
Evelyn Boots: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA
Kenneth Roberts: Automation and Engineering, Merck & Co., Inc., North Wales, PA, USA
Patrick Cocchiarella: Discovery Sample Management, Merck & Co., Inc., North Wales, PA, USA
Alex Wolicki: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA
Anthony Kreamer: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Merck & Co., Inc., Kenilworth, NJ, USA
Peter S. Kutchukian: Modeling and Informatics, Merck & Co., Inc., Boston, MA, USA
Anne Mai Wassermann: Modeling and Informatics, Merck & Co., Inc., Boston, MA, USA
Victor N. Uebele: Screening and Protein Sciences, Merck & Co., Inc., North Wales, PA, USA; Merck & Co., Inc., North Wales, PA, USA
Meir Glick: Modeling and Informatics, Merck & Co., Inc., Boston, MA, USA
Andrew Rusinko: Modeling and Informatics, Merck & Co., Inc., West Point, PA, USA
J. Christopher Culberson: Modeling and Informatics, Merck & Co., Inc., West Point, PA, USA


Kutchukian PS, Warren L, Magliaro BC, Amoss A, Cassaday JA, O’Donnell G, Squadroni B, Zuck P, Pascarella D, Culberson JC, Cooke AJ, Hurzy D, Schlegel KAS, Thomson F, Johnson EN, Uebele VN, Hermes JD, Parmentier-Batteur S, Finley M. Iterative Focused Screening with Biological Fingerprints Identifies Selective Asc-1 Inhibitors Distinct from Traditional High Throughput Screening. ACS Chem Biol 2017;12:519-527. [PMID: 28032990 DOI: 10.1021/acschembio.6b00913] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]

Affiliation(s)

Peter S. Kutchukian: Modeling and Informatics, Merck & Co., Inc., MRL, Boston, Massachusetts, United States
Lee Warren: Neuroscience, Merck & Co., Inc., MRL, West Point, Pennsylvania, United States
Brian C. Magliaro: Pharmacology, Merck & Co., Inc., MRL, West Point, Pennsylvania, United States
Adam Amoss: Screening and Protein Sciences, Merck & Co., Inc., MRL, North Wales, Pennsylvania, United States
Jason A. Cassaday: Screening and Protein Sciences, Merck & Co., Inc., MRL, North Wales, Pennsylvania, United States
Gregory O’Donnell: Screening and Protein Sciences, Merck & Co., Inc., MRL, North Wales, Pennsylvania, United States
Brian Squadroni: Screening and Protein Sciences, Merck & Co., Inc., MRL, North Wales, Pennsylvania, United States
Paul Zuck: Screening and Protein Sciences, Merck & Co., Inc., MRL, North Wales, Pennsylvania, United States
Danette Pascarella: Pharmacology, Merck & Co., Inc., MRL, West Point, Pennsylvania, United States
J. Chris Culberson: Modeling and Informatics, Merck & Co., Inc., MRL, West Point, Pennsylvania, United States
Andrew J. Cooke: Chemistry, Merck & Co., Inc., MRL, West Point, Pennsylvania, United States
Danielle Hurzy: Chemistry, Merck & Co., Inc., MRL, West Point, Pennsylvania, United States
Kelly-Ann Sondra Schlegel: Chemistry, Merck & Co., Inc., MRL, West Point, Pennsylvania, United States
Fiona Thomson: Neuroscience, Merck & Co., Inc., MRL, West Point, Pennsylvania, United States
Eric N. Johnson: Screening and Protein Sciences, Merck & Co., Inc., MRL, North Wales, Pennsylvania, United States
Victor N. Uebele: Screening and Protein Sciences, Merck & Co., Inc., MRL, North Wales, Pennsylvania, United States
Jeffrey D. Hermes: Screening and Protein Sciences, Merck & Co., Inc., MRL, North Wales, Pennsylvania, United States
Sophie Parmentier-Batteur: Neuroscience, Merck & Co., Inc., MRL, West Point, Pennsylvania, United States
Michael Finley: Screening and Protein Sciences, Merck & Co., Inc., MRL, North Wales, Pennsylvania, United States


Merget B, Turk S, Eid S, Rippmann F, Fulle S. Profiling Prediction of Kinase Inhibitors: Toward the Virtual Assay. J Med Chem 2016;60:474-485. [PMID: 27966949 DOI: 10.1021/acs.jmedchem.6b01611] [Citation(s) in RCA: 73] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]

Cortes Cabrera A, Lucena-Agell D, Redondo-Horcajo M, Barasoain I, Díaz JF, Fasching B, Petrone PM. Aggregated Compound Biological Signatures Facilitate Phenotypic Drug Discovery and Target Elucidation. ACS Chem Biol 2016;11:3024-3034. [PMID: 27564241 DOI: 10.1021/acschembio.6b00358] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Gütlein M, Kramer S. Filtered circular fingerprints improve either prediction or runtime performance while retaining interpretability. J Cheminform 2016;8:60. [PMID: 27853484 PMCID: PMC5088672 DOI: 10.1186/s13321-016-0173-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2016] [Accepted: 10/18/2016] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Even though circular fingerprints were first introduced more than 50 years ago, they are still widely used for building highly predictive, state-of-the-art (Q)SAR models. Historically, these structural fragments were designed to search large molecular databases. Hence, to derive a compact representation, circular fingerprint fragments are often folded to comparatively short bit-strings. However, folding fingerprints introduces bit collisions, and therefore adds noise to the encoded structural information and removes its interpretability. Both representations, folded as well as unprocessed fingerprints, are often used for (Q)SAR modeling.

RESULTS

We show that it can be preferable to build (Q)SAR models with circular fingerprint fragments that have been filtered by supervised feature selection, instead of using folded fingerprints or all fragments. Compared to folded fingerprints, filtered fingerprints significantly increase predictive performance and remain unambiguous and interpretable. Compared to unprocessed fingerprints, filtered fingerprints reduce the computational effort and are a more compact and less redundant feature representation. Depending on the selected learning algorithm, filtering yields about equally predictive (Q)SAR models. We demonstrate the suitability of filtered fingerprints for (Q)SAR modeling by presenting our freely available web service Collision-free Filtered Circular Fingerprints that provides rationales for predictions by highlighting important structural features in the query compound (see http://coffer.informatik.uni-mainz.de).

CONCLUSIONS

Circular fingerprints are potent structural features that yield highly predictive models and encode interpretable structural information. However, to not lose interpretability, circular fingerprints should not be folded when building prediction models. Our experiments show that filtering is a suitable option to reduce the high computational effort when working with all fingerprint fragments. Additionally, our experiments suggest that the area under precision recall curve is a more sensible statistic for validating (Q)SAR models for virtual screening than the area under ROC or other measures for early recognition.
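The bit collisions this abstract refers to are easy to see in a toy example. The sketch below assumes the common folding scheme in which each fragment identifier is mapped to a bit position by taking it modulo the bit-string length; the fragment identifiers are made-up integers and the function names are illustrative, not taken from the paper or from any library.

```python
def fold(fragment_ids, n_bits):
    """Fold integer fragment identifiers into a fixed-length bit list (id mod n_bits)."""
    bits = [0] * n_bits
    for fragment_id in fragment_ids:
        bits[fragment_id % n_bits] = 1
    return bits

def collisions(fragment_ids, n_bits):
    """Return bit positions that are shared by more than one distinct fragment."""
    by_bit = {}
    for fragment_id in set(fragment_ids):
        by_bit.setdefault(fragment_id % n_bits, []).append(fragment_id)
    return {bit: sorted(ids) for bit, ids in by_bit.items() if len(ids) > 1}

# Hypothetical fragment identifiers for one molecule.
fragments = [5, 1029, 77]

folded = fold(fragments, 1024)
clashes = collisions(fragments, 1024)

print(sum(folded))   # number of bits set after folding
print(clashes)       # which fragments ended up on the same bit
```

Here three distinct fragments set only two bits, because fragments 5 and 1029 land on the same position. Once that happens, a model weight on bit 5 can no longer be attributed to a single substructure, which is the loss of interpretability the authors describe.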


O'Hagan S, Kell DB. MetMaxStruct: A Tversky-Similarity-Based Strategy for Analysing the (Sub)Structural Similarities of Drugs and Endogenous Metabolites. Front Pharmacol 2016;7:266. [PMID: 27597830 PMCID: PMC4992690 DOI: 10.3389/fphar.2016.00266] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2015] [Accepted: 08/08/2016] [Indexed: 12/23/2022] Open

Abstract

BACKGROUND

Previous studies compared the molecular similarity of marketed drugs and endogenous human metabolites (endogenites), using a series of fingerprint-type encodings, variously ranked and clustered using the Tanimoto (Jaccard) similarity coefficient (TS). Because this gives equal weight to all parts of the encoding (thence to different substructures in the molecule) it may not be optimal, since in many cases not all parts of the molecule will bind to their macromolecular targets. Unsupervised methods cannot alone uncover this. We here explore the kinds of differences that may be observed when the TS is replaced (in a manner more equivalent to semi-supervised learning) by variants of the asymmetric Tversky (TV) similarity, which includes α and β parameters.

RESULTS

Dramatic differences are observed in (i) the drug-endogenite similarity heatmaps, (ii) the cumulative "greatest similarity" curves, and (iii) the fraction of drugs with a Tversky similarity to a metabolite exceeding a given value when the Tversky α and β parameters are varied from their Tanimoto values. The same is true when the sum of the α and β parameters is varied. A clear trend toward increased endogenite-likeness of marketed drugs is observed when α or β adopt values nearer the extremes of their range, and when their sum is smaller. The kinds of molecules exhibiting the greatest similarity to two interrogating drug molecules (chlorpromazine and clozapine) also vary in both nature and the values of their similarity as α and β are varied. The same is true for the converse, when drugs are interrogated with an endogenite. The fraction of drugs with a Tversky similarity to a molecule in a library exceeding a given value depends on the contents of that library, and α and β may be "tuned" accordingly, in a semi-supervised manner. At some values of α and β drug discovery library candidates or natural products can "look" much more like (i.e., have a numerical similarity much closer to) drugs than do even endogenites.

CONCLUSIONS

Overall, the Tversky similarity metrics provide a more useful range of examples of molecular similarity than does the simpler Tanimoto similarity, and help to draw attention to molecular similarities that would not be recognized if Tanimoto alone were used. Hence, the Tversky similarity metrics are likely to be of significant value in many general problems in cheminformatics.
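The relationship between the Tanimoto and Tversky coefficients discussed in this abstract can be made concrete on feature sets. The sketch below uses the standard set-based definition, T(A, B) = |A ∩ B| / (|A ∩ B| + α|A \ B| + β|B \ A|), for which α = β = 1 reduces to Tanimoto; the feature sets are arbitrary integers chosen for illustration and are not data from the paper.

```python
def tversky(a, b, alpha, beta):
    """Tversky similarity between two feature sets.

    alpha weights features found only in a, beta weights features found only in b.
    alpha = beta = 1 gives the Tanimoto (Jaccard) coefficient.
    """
    a, b = set(a), set(b)
    common = len(a & b)
    denominator = common + alpha * len(a - b) + beta * len(b - a)
    return common / denominator if denominator else 0.0

# Hypothetical fingerprint features for a query molecule and a library molecule.
query = {1, 2, 3, 4}
library = {3, 4, 5}

tanimoto = tversky(query, library, 1.0, 1.0)       # symmetric case
ignore_library_extras = tversky(query, library, 1.0, 0.0)
ignore_query_extras = tversky(query, library, 0.0, 1.0)

print(tanimoto, ignore_library_extras, ignore_query_extras)
```

With α and β at opposite extremes, the same pair of molecules scores 0.5 or about 0.67 instead of the Tanimoto value of 0.4, which illustrates why varying the parameters changes which molecules appear most similar.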


Paricharak S, IJzerman AP, Jenkins JL, Bender A, Nigsch F. Data-Driven Derivation of an "Informer Compound Set" for Improved Selection of Active Compounds in High-Throughput Screening. J Chem Inf Model 2016;56:1622-30. [PMID: 27487177 DOI: 10.1021/acs.jcim.6b00244] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Abstract

Despite the usefulness of high-throughput screening (HTS) in drug discovery, for some systems, low assay throughput or high screening cost can prohibit the screening of large numbers of compounds. In such cases, iterative cycles of screening involving active learning (AL) are employed, creating the need for smaller "informer sets" that can be routinely screened to build predictive models for selecting compounds from the screening collection for follow-up screens. Here, we present a data-driven derivation of an informer compound set with improved predictivity of active compounds in HTS, and we validate its benefit over randomly selected training sets on 46 PubChem assays comprising at least 300,000 compounds and covering a wide range of assay biology. The informer compound set showed improvement in BEDROC(α = 100), PRAUC, and ROCAUC values averaged over all assays of 0.024, 0.014, and 0.016, respectively, compared to randomly selected training sets, all with paired t-test p-values < 10⁻¹⁵. A per-assay assessment showed that the BEDROC(α = 100), which is of particular relevance for early retrieval of actives, improved for 38 out of 46 assays, increasing the success rate of smaller follow-up screens. Overall, we showed that an informer set derived from historical HTS activity data can be employed for routine small-scale exploratory screening in an assay-agnostic fashion. This approach led to a consistent improvement in hit rates in follow-up screens without compromising scaffold retrieval. The informer set is adjustable in size depending on the number of compounds one intends to screen, as performance gains are realized for sets with more than 3,000 compounds, and this set is therefore applicable to a variety of situations. Finally, our results indicate that random sampling may not adequately cover descriptor space, drawing attention to the importance of the composition of the training set for predicting actives.


Raevsky OA, Polianczyk DE, Mukhametov A, Grigorev VY. Assessment of the classification abilities of the CNS multi-parametric optimization approach by the method of logistic regression. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2016;27:629-635. [PMID: 27477321 DOI: 10.1080/1062936x.2016.1212922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/29/2016] [Accepted: 07/11/2016] [Indexed: 06/06/2023]

Wang Y, Cornett A, King FJ, Mao Y, Nigsch F, Paris CG, McAllister G, Jenkins JL. Evidence-Based and Quantitative Prioritization of Tool Compounds in Phenotypic Drug Discovery. Cell Chem Biol 2016;23:862-874. [PMID: 27427232 DOI: 10.1016/j.chembiol.2016.05.016] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2015] [Revised: 04/29/2016] [Accepted: 05/13/2016] [Indexed: 01/07/2023]