1
Shi S, Fu L, Yi J, Yang Z, Zhang X, Deng Y, Wang W, Wu C, Zhao W, Hou T, Zeng X, Lyu A, Cao D. ChemFH: an integrated tool for screening frequent false positives in chemical biology and drug discovery. Nucleic Acids Res 2024; 52:W439-W449. [PMID: 38783035 PMCID: PMC11223804 DOI: 10.1093/nar/gkae424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 04/25/2024] [Accepted: 05/10/2024] [Indexed: 05/25/2024] Open
Abstract
High-throughput screening rapidly tests an extensive array of chemical compounds to identify hit compounds for specific biological targets in drug discovery. However, false-positive results disrupt hit compound screening, leading to wastage of time and resources. To address this, we propose ChemFH, an integrated online platform facilitating rapid virtual evaluation of potential false positives, including colloidal aggregators, spectroscopic interference compounds, firefly luciferase inhibitors, chemical reactive compounds, promiscuous compounds, and other assay interferences. By leveraging a dataset containing 823 391 compounds, we constructed high-quality prediction models using multi-task directed message-passing network (DMPNN) architectures combining uncertainty estimation, yielding an average AUC value of 0.91. Furthermore, ChemFH incorporated 1441 representative alert substructures derived from the collected data and ten commonly used frequent hitter screening rules. ChemFH was validated with an external set of 75 compounds. Subsequently, the virtual screening capability of ChemFH was successfully confirmed through its application to five virtual screening libraries. Furthermore, ChemFH underwent additional validation on two natural products and FDA-approved drugs, yielding reliable and accurate results. ChemFH is a comprehensive, reliable, and computationally efficient screening pipeline that facilitates the identification of true positive results in assays, contributing to enhanced efficiency and success rates in drug discovery. ChemFH is freely available via https://chemfh.scbdd.com/.
Affiliation(s)
- Shaohua Shi
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, P.R. China
  - School of Chinese Medicine, Hong Kong Baptist University, Kowloon, Hong Kong SAR, 999077, P.R. China
- Li Fu
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, P.R. China
- Jiacai Yi
  - School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073, P.R. China
- Ziyi Yang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, P.R. China
- Xiaochen Zhang
  - School of Information Technology, Shangqiu Normal University, Shangqiu, Henan 476000, P.R. China
- Youchao Deng
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, P.R. China
- Wenxuan Wang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, P.R. China
- Chengkun Wu
  - School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073, P.R. China
- Wentao Zhao
  - School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073, P.R. China
- Tingjun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, P.R. China
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan 410082, P.R. China
- Aiping Lyu
  - School of Chinese Medicine, Hong Kong Baptist University, Kowloon, Hong Kong SAR, 999077, P.R. China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, P.R. China
2
Tan L, Hirte S, Palmacci V, Stork C, Kirchmair J. Tackling assay interference associated with small molecules. Nat Rev Chem 2024; 8:319-339. [PMID: 38622244 DOI: 10.1038/s41570-024-00593-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/29/2024] [Indexed: 04/17/2024]
Abstract
Biochemical and cell-based assays are essential to discovering and optimizing efficacious and safe drugs, agrochemicals and cosmetics. However, false assay readouts stemming from colloidal aggregation, chemical reactivity, chelation, light signal attenuation and emission, membrane disruption, and other interference mechanisms remain a considerable challenge in screening synthetic compounds and natural products. To address assay interference, a range of powerful experimental approaches are available and in silico methods are now gaining traction. This Review begins with an overview of the scope and limitations of experimental approaches for tackling assay interference. It then focuses on theoretical methods, discusses strategies for their integration with experimental approaches, and provides recommendations for best practices. The Review closes with a summary of the critical facts and an outlook on potential future developments.
Affiliation(s)
- Lu Tan
  - Drug Discovery Sciences, Boehringer Ingelheim RCV GmbH & Co KG, Vienna, Austria
- Steffen Hirte
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Vienna, Austria
  - Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, Vienna, Austria
- Vincenzo Palmacci
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Vienna, Austria
  - Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, Vienna, Austria
- Conrad Stork
  - Department of Informatics, Center for Bioinformatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, Hamburg, Germany
  - BASF SE, Ludwigshafen am Rhein, Germany
- Johannes Kirchmair
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Vienna, Austria.
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna, Vienna, Austria.
3
Li X, He X, Lin B, Li L, Deng Q, Wang C, Zhang J, Chen Y, Zhao J, Li X, Li Y, Xi Q, Zhang R. Quercetin Limits Tumor Immune Escape through PDK1/CD47 Axis in Melanoma. The American Journal of Chinese Medicine 2024; 52:541-563. [PMID: 38490807 DOI: 10.1142/s0192415x2450023x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2024]
Abstract
Quercetin (3,3′,4′,5,7-pentahydroxyflavone) is a bioactive plant-derived flavonoid, abundant in fruits and vegetables, that can effectively inhibit the growth of many types of tumors without toxicity. Nevertheless, the effect of quercetin on melanoma immunology has yet to be determined. This study aimed to investigate the role and mechanism of the antitumor immunity action of quercetin in melanoma through both in vivo and in vitro methods. Our research revealed that quercetin has the ability to boost antitumor immunity by modulating the tumor immune microenvironment through increasing the percentages of M1 macrophages, CD8+ T lymphocytes, and CD4+ T lymphocytes and promoting the secretion of IL-2 and IFN-γ from CD8+ T cells, consequently suppressing the growth of melanoma. Furthermore, we revealed that quercetin can inhibit cell proliferation and migration of B16 cells in a dose-dependent manner. In addition, down-regulating PDK1 can inhibit the mRNA and protein expression levels of CD47. In the rescue experiment, we overexpressed PDK1 and found that the protein and mRNA expression levels of CD47 increased correspondingly, while the addition of quercetin reversed this effect. Moreover, quercetin could stimulate the proliferation and enhance the function of CD8+ T cells. Therefore, our results identified a novel mechanism through which CD47 is regulated by quercetin to promote phagocytosis, and elucidated the regulation of quercetin on macrophages and CD8+ T cells in the tumor immune microenvironment. The use of quercetin as a therapeutic drug holds potential benefits for immunotherapy, enhancing the efficacy of existing treatments for melanoma.
Affiliation(s)
- Xin Li
  - Laboratory of Immunology and Inflammation, Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, P. R. China
- Xue He
  - Laboratory of Immunology and Inflammation, Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, P. R. China
- Bing Lin
  - Laboratory of Immunology and Inflammation, Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, P. R. China
- Li Li
  - Laboratory of Immunology and Inflammation, Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, P. R. China
- Qifeng Deng
  - Laboratory of Immunology and Inflammation, Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, P. R. China
- Chengzhi Wang
  - Department of Immunology, Key Laboratory of Immune Microenvironment and Diseases of Educational Ministry of China, School of Basic Sciences, Tianjin Medical University, Tianjin 300203, P. R. China
- Jing Zhang
  - Laboratory of Immunology and Inflammation, Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, P. R. China
- Ying Chen
  - Laboratory of Immunology and Inflammation, Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, P. R. China
- Jingyi Zhao
  - Laboratory of Immunology and Inflammation, Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, P. R. China
- Xinrui Li
  - Laboratory of Immunology and Inflammation, Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, P. R. China
- Yan Li
  - Laboratory of Immunology and Inflammation, Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, P. R. China
- Qing Xi
  - Department of Gastroenterology, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou 510062, P. R. China
  - School of Biomedical Sciences and Engineering, South China University of Technology, Guangzhou 510641, P. R. China
- Rongxin Zhang
  - Laboratory of Immunology and Inflammation, Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, P. R. China
4
Draper MR, Waterman A, Dannatt JE, Patel P. Integrating multiscale and machine learning approaches towards the SAMPL9 log P challenge. Phys Chem Chem Phys 2024; 26:7907-7919. [PMID: 38376855 PMCID: PMC10938873 DOI: 10.1039/d3cp04140a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
The partition coefficient (log P) is an important physicochemical property that provides information regarding a molecule's pharmacokinetics, toxicity, and bioavailability. Methods to accurately predict the partition coefficient have the potential to accelerate drug design. In an effort to test current methods and explore new computational techniques, the statistical assessment of the modeling of proteins and ligands (SAMPL) has established a blind prediction challenge. The ninth iteration challenge was to predict the toluene-water partition coefficient (log Ptol/w) of sixteen drug molecules. Herein, three approaches are reported broadly under the categories of quantum mechanics (QM), molecular mechanics (MM), and data-driven machine learning (ML). The three blind submissions yield mean unsigned errors (MUE) ranging from 1.53-2.93 log Ptol/w units. The MUEs were reduced to 1.00 log Ptol/w for the QM methods. While MM and ML methods outperformed DFT approaches for challenge molecules with fewer rotational degrees of freedom, they suffered for the larger molecules in this dataset. Overall, DFT functionals paired with a triple-ζ basis set were the simplest and most effective tool to obtain quantitatively accurate partition coefficients.
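The mean unsigned error (MUE) quoted in this abstract is the average absolute difference between predicted and experimental values. A minimal Python sketch of that metric follows; the function name and the numbers are illustrative assumptions, not SAMPL9 data or the authors' code.

```python
def mean_unsigned_error(predicted, experimental):
    """Return the mean of |predicted - experimental| over paired values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - e) for p, e in zip(predicted, experimental))
    return total / len(predicted)


# Hypothetical log P values for four molecules (not from the challenge).
predicted_logp = [1.0, 2.5, -0.5, 3.0]
experimental_logp = [0.5, 2.0, 0.5, 4.0]

mue = mean_unsigned_error(predicted_logp, experimental_logp)
print(f"MUE = {mue:.2f} log P units")
```

Because the errors are unsigned, over- and under-predictions do not cancel, so the MUE reports the typical size of the error rather than its direction.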
Affiliation(s)
- Michael R Draper
  - Chemistry Department, University of Dallas, Irving, Texas, 75062, USA.
- Asa Waterman
  - Chemistry Department, University of Dallas, Irving, Texas, 75062, USA.
- Prajay Patel
  - Chemistry Department, University of Dallas, Irving, Texas, 75062, USA.
5
Mohr SE, Kim AR, Hu Y, Perrimon N. Finding information about uncharacterized Drosophila melanogaster genes. Genetics 2023; 225:iyad187. [PMID: 37933691 PMCID: PMC10697813 DOI: 10.1093/genetics/iyad187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 10/02/2023] [Indexed: 11/08/2023] Open
Abstract
Genes that have been identified in the genome but remain uncharacterized with regards to function offer an opportunity to uncover novel biological information. Novelty is exciting but can also be a barrier. If nothing is known, how does one start planning and executing experiments? Here, we provide a recommended information-mining workflow and a corresponding guide to accessing information about uncharacterized Drosophila melanogaster genes, such as those assigned only a systematic coding gene identifier. The available information can provide insights into where and when the gene is expressed, what the function of the gene might be, whether there are similar genes in other species, whether there are known relationships to other genes, and whether any other features have already been determined. In addition, available information about relevant reagents can inspire and facilitate experimental studies. Altogether, mining available information can help prioritize genes for further study, as well as provide starting points for experimental assays and other analyses.
Affiliation(s)
- Stephanie E Mohr
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
- Ah-Ram Kim
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
- Yanhui Hu
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
- Norbert Perrimon
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
  - Howard Hughes Medical Institute, Boston, MA 02115, USA
6
Linciano P, Quotadamo A, Luciani R, Santucci M, Zorn KM, Foil DH, Lane TR, Cordeiro da Silva A, Santarem N, B Moraes C, Freitas-Junior L, Wittig U, Mueller W, Tonelli M, Ferrari S, Venturelli A, Gul S, Kuzikov M, Ellinger B, Reinshagen J, Ekins S, Costi MP. High-Throughput Phenotypic Screening and Machine Learning Methods Enabled the Selection of Broad-Spectrum Low-Toxicity Antitrypanosomatidic Agents. J Med Chem 2023; 66:15230-15255. [PMID: 37921561 PMCID: PMC10683024 DOI: 10.1021/acs.jmedchem.3c01322] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/21/2023] [Revised: 10/14/2023] [Accepted: 10/18/2023] [Indexed: 11/04/2023]
Abstract
Broad-spectrum anti-infective chemotherapy agents with activity against Trypanosomes, Leishmania, and Mycobacterium tuberculosis species were identified from a high-throughput phenotypic screening program of the 456 compounds belonging to the Ty-Box, an in-house industry database. Compound characterization using machine learning approaches enabled the identification and synthesis of 44 compounds with broad-spectrum antiparasitic activity and minimal toxicity against Trypanosoma brucei, Leishmania infantum, and Trypanosoma cruzi. In vitro studies confirmed the predictive models and identified compound 40, which emerged as a new lead characterized by an innovative N-(5-pyrimidinyl)benzenesulfonamide scaffold, promising low micromolar activity against two parasites, and low toxicity. Given the volume and complexity of data generated by the diverse high-throughput screening assays performed on the compounds of the Ty-Box library, the chemoinformatic and machine learning tools enabled the selection of compounds eligible for further evaluation of their biological and toxicological activities and aided in the decision-making process toward the design and optimization of the identified lead.
Affiliation(s)
- Pasquale Linciano
  - Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125 Modena, Italy
- Antonio Quotadamo
  - Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125 Modena, Italy
- Rosaria Luciani
  - Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125 Modena, Italy
- Matteo Santucci
  - Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125 Modena, Italy
- Kimberley M. Zorn
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Daniel H. Foil
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Thomas R. Lane
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Anabela Cordeiro da Silva
  - Institute for Molecular and Cell Biology, 4150-180 Porto, Portugal
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto and Institute for Molecular and Cell Biology, 4150-180 Porto, Portugal
- Nuno Santarem
  - Institute for Molecular and Cell Biology, 4150-180 Porto, Portugal
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto and Institute for Molecular and Cell Biology, 4150-180 Porto, Portugal
- Carolina B Moraes
  - Brazilian Biosciences National Laboratory (LNBio), Brazilian Center for Research in Energy and Materials (CNPEM), 13083-970 Campinas, São Paulo, Brazil
- Lucio Freitas-Junior
  - Brazilian Biosciences National Laboratory (LNBio), Brazilian Center for Research in Energy and Materials (CNPEM), 13083-970 Campinas, São Paulo, Brazil
- Ulrike Wittig
  - Scientific Databases and Visualization Group and Molecular and Cellular Modelling Group, Heidelberg Institute for Theoretical Studies (HITS), D-69118 Heidelberg, Germany
- Wolfgang Mueller
  - Scientific Databases and Visualization Group and Molecular and Cellular Modelling Group, Heidelberg Institute for Theoretical Studies (HITS), D-69118 Heidelberg, Germany
- Michele Tonelli
  - Department of Pharmacy, University of Genoa, Viale Benedetto XV n.3, 16132 Genoa, Italy
- Stefania Ferrari
  - Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125 Modena, Italy
- Alberto Venturelli
  - Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125 Modena, Italy
  - TYDOCK PHARMA S.r.l., Strada Gherbella 294/b, 41126 Modena, Italy
- Sheraz Gul
  - Fraunhofer Translational Medicine and Pharmacology, Schnackenburgallee 114, D-22525 Hamburg, Germany
  - Fraunhofer Cluster of Excellence Immune-Mediated Diseases CIMD, Schnackenburgallee 114, D-22525 Hamburg, Germany
- Maria Kuzikov
  - Fraunhofer Translational Medicine and Pharmacology, Schnackenburgallee 114, D-22525 Hamburg, Germany
  - Fraunhofer Cluster of Excellence Immune-Mediated Diseases CIMD, Schnackenburgallee 114, D-22525 Hamburg, Germany
- Bernhard Ellinger
  - Fraunhofer Translational Medicine and Pharmacology, Schnackenburgallee 114, D-22525 Hamburg, Germany
  - Fraunhofer Cluster of Excellence Immune-Mediated Diseases CIMD, Schnackenburgallee 114, D-22525 Hamburg, Germany
- Jeanette Reinshagen
  - Fraunhofer Translational Medicine and Pharmacology, Schnackenburgallee 114, D-22525 Hamburg, Germany
  - Fraunhofer Cluster of Excellence Immune-Mediated Diseases CIMD, Schnackenburgallee 114, D-22525 Hamburg, Germany
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Maria Paola Costi
  - Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125 Modena, Italy
7
Tran-Nguyen VK, Junaid M, Simeon S, Ballester PJ. A practical guide to machine-learning scoring for structure-based virtual screening. Nat Protoc 2023; 18:3460-3511. [PMID: 37845361 DOI: 10.1038/s41596-023-00885-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/08/2022] [Accepted: 07/03/2023] [Indexed: 10/18/2023]
Abstract
Structure-based virtual screening (SBVS) via docking has been used to discover active molecules for a range of therapeutic targets. Chemical and protein data sets that contain integrated bioactivity information have increased both in number and in size. Artificial intelligence and, more concretely, its machine-learning (ML) branch, including deep learning, have effectively exploited these data sets to build scoring functions (SFs) for SBVS against targets with an atomic-resolution 3D model (e.g., generated by X-ray crystallography or predicted by AlphaFold2). Often outperforming their generic and non-ML counterparts, target-specific ML-based SFs represent the state of the art for SBVS. Here, we present a comprehensive and user-friendly protocol to build and rigorously evaluate these new SFs for SBVS. This protocol is organized into four sections: (i) using a public benchmark of a given target to evaluate an existing generic SF; (ii) preparing experimental data for a target from public repositories; (iii) partitioning data into a training set and a test set for subsequent target-specific ML modeling; and (iv) generating and evaluating target-specific ML SFs by using the prepared training-test partitions. All necessary code and input/output data related to three example targets (acetylcholinesterase, HMG-CoA reductase, and peroxisome proliferator-activated receptor-α) are available at https://github.com/vktrannguyen/MLSF-protocol , can be run by using a single computer within 1 week and make use of easily accessible software/programs (e.g., Smina, CNN-Score, RF-Score-VS and DeepCoy) and web resources. Our aim is to provide practical guidance on how to augment training data to enhance SBVS performance, how to identify the most suitable supervised learning algorithm for a data set, and how to build an SF with the highest likelihood of discovering target-active molecules within a given compound library.
Affiliation(s)
- Muhammad Junaid
  - Centre de Recherche en Cancérologie de Marseille, Marseille, France
- Saw Simeon
  - Centre de Recherche en Cancérologie de Marseille, Marseille, France
8
Long TZ, Shi SH, Liu S, Lu AP, Liu ZQ, Li M, Hou TJ, Cao DS. Structural Analysis and Prediction of Hematotoxicity Using Deep Learning Approaches. J Chem Inf Model 2023; 63:111-125. [PMID: 36472475 DOI: 10.1021/acs.jcim.2c01088] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 12/12/2022]
Abstract
Hematotoxicity has become a serious but overlooked toxicity in drug discovery. However, only a few in silico models have been reported for the prediction of hematotoxicity. In this study, we constructed a high-quality dataset comprising 759 hematotoxic compounds and 1623 nonhematotoxic compounds and then established a series of classification models based on a combination of seven machine learning (ML) algorithms and nine molecular representations. The results based on two data partitioning strategies and applicability domain (AD) analysis illustrate that the best prediction model, based on Attentive FP, yielded a balanced accuracy (BA) of 72.6% and an area under the receiver operating characteristic curve (AUC) value of 76.8% for the validation set, and a BA of 69.2% and an AUC of 75.9% for the test set. In addition, compared with existing filtering rules and models, our model achieved the highest BA value of 67.5% for the external validation set. Additionally, the Shapley additive explanations (SHAP) and atom heatmap approaches were utilized to discover the important features and structural fragments related to hematotoxicity, which could offer helpful tips to detect undesired positive substances. Furthermore, matched molecular pair analysis (MMPA) and a representative substructure derivation technique were employed to further characterize and investigate the transformation principles and distinctive structural features of hematotoxic chemicals. We believe that the novel graph-based deep learning algorithms and insightful interpretation presented in this study can be used as a trustworthy and effective tool to assess hematotoxicity in the development of new drugs.
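Balanced accuracy, the headline metric in this abstract, is the mean of sensitivity and specificity, which makes it more informative than plain accuracy when classes are imbalanced (here, 759 hematotoxic vs 1623 nonhematotoxic compounds). The Python sketch below illustrates the metric with made-up labels; the function name and data are assumptions for illustration, not the authors' evaluation code.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # fraction of positives recovered
    specificity = tn / (tn + fp)  # fraction of negatives recovered
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced set: 2 toxic (1) and 8 non-toxic (0) compounds.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

ba = balanced_accuracy(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.2f}, balanced accuracy = {ba:.2f}")
```

In this toy case plain accuracy is 0.90 while balanced accuracy is 0.75, because half of the minority class is missed.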
Affiliation(s)
- Teng-Zhi Long
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- Shao-Hua Shi
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
  - Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, 0000, P. R. China
- Shao Liu
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008, Hunan, P. R. China
- Ai-Ping Lu
  - Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, 0000, P. R. China
- Zhao-Qian Liu
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha 410083, P. R. China
- Ting-Jun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Dong-Sheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
  - Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, 0000, P. R. China
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008, Hunan, P. R. China
9
Urbina F, Ekins S. The Commoditization of AI for Molecule Design. Artificial Intelligence in the Life Sciences 2022; 2:100031. [PMID: 36211981 PMCID: PMC9541920 DOI: 10.1016/j.ailsci.2022.100031] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 02/03/2023]
Abstract
Anyone involved in designing or finding molecules in the life sciences over the past few years has witnessed a dramatic change in how we now work due to the COVID-19 pandemic. Computational technologies like artificial intelligence (AI) seemed to become ubiquitous in 2020 and have been increasingly applied as scientists worked from home and were separated from the laboratory and their colleagues. This shift may be more permanent as the future of molecule design across different industries will increasingly require machine learning models for design and optimization of molecules as they become "designed by AI". AI and machine learning have essentially become a commodity within the pharmaceutical industry. This perspective will briefly describe our personal opinions of how machine learning has evolved and is being applied to model different molecule properties that cross industries in their utility, and ultimately suggests the potential for tight integration of AI into equipment and automated experimental pipelines. It will also describe how many groups have implemented generative models, covering different architectures, for de novo design of molecules. We also highlight some of the companies at the forefront of using AI to demonstrate how machine learning has impacted and influenced our work. Finally, we will peer into the future and suggest some of the areas that represent the most interesting technologies that may shape the future of molecule design, highlighting how we can help increase the efficiency of the design-make-test cycle, which is currently a major focus across industries.
Affiliation(s)
- Fabio Urbina
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
10
Deng Q, Li X, Fang C, Li X, Zhang J, Xi Q, Li Y, Zhang R. Cordycepin enhances anti-tumor immunity in colon cancer by inhibiting phagocytosis immune checkpoint CD47 expression. Int Immunopharmacol 2022; 107:108695. [PMID: 35305385 DOI: 10.1016/j.intimp.2022.108695] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 12/12/2021] [Revised: 03/05/2022] [Accepted: 03/09/2022] [Indexed: 01/01/2023]
Abstract
Cordycepin, also known as 3'-deoxyadenosine, is an extract from Cordyceps militaris, which has been reported as an anti-inflammation and anti-tumor substance without toxicity. However, the pharmacological mechanism of Cordycepin on tumor immunity under its anti-tumor effect has not yet been elucidated. Herein, we investigated Cordycepin's anti-tumor effect on colon cancer both in vitro and in vivo. Our results show that Cordycepin can inhibit the growth and migration and promote the apoptosis of CT26 cells in a dose-dependent manner. Cordycepin suppressed the growth of colon cancer in a mouse subcutaneous tumor model by modulating the tumor immune microenvironment, in which CD4+ T cells, CD8+ T cells, M1-type macrophages, and NK cells were up-regulated. Further investigations revealed that Cordycepin inhibited protein expression of the phagocytosis immune checkpoint CD47 by reducing BNIP3 expression. In addition, Cordycepin also inhibited the expression of TSP1 in tumor cells and Jurkat cells, which may reduce the binding of TSP1 to CD47, thereby reducing T cell apoptosis and allowing more T cells to infiltrate into tumors. In vitro co-culture experiments further showed that Cordycepin could enhance the phagocytosis of CT26 cells by macrophages. These results explained the underlying mechanism of the anti-tumor immunity of Cordycepin. In conclusion, our results identify a novel mechanism by which Cordycepin inhibits the phagocytosis immune checkpoint CD47 in tumor cells to promote the phagocytosis of tumor cells by macrophages. Cordycepin may be able to serve as a more effective immunotherapeutic drug against colon cancer.
Affiliation(s)
- Qifeng Deng
  - Guangdong Provincial Key Laboratory for Biotechnology Drug Candidates, Institute of Basic Medical Sciences and Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou, China
- Xinrui Li
  - Guangdong Provincial Key Laboratory for Biotechnology Drug Candidates, Institute of Basic Medical Sciences and Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou, China
- Chunqiang Fang
  - Guangdong Provincial Key Laboratory for Biotechnology Drug Candidates, Institute of Basic Medical Sciences and Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou, China
- Xin Li
  - Guangdong Provincial Key Laboratory for Biotechnology Drug Candidates, Institute of Basic Medical Sciences and Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou, China
- Jing Zhang
  - Guangdong Provincial Key Laboratory for Biotechnology Drug Candidates, Institute of Basic Medical Sciences and Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou, China
- Qing Xi
  - The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China; School of Biomedical Sciences and Engineering, South China University of Technology, Guangzhou, China
- Yan Li
  - Guangdong Provincial Key Laboratory for Biotechnology Drug Candidates, Institute of Basic Medical Sciences and Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou, China
- Rongxin Zhang
  - Guangdong Provincial Key Laboratory for Biotechnology Drug Candidates, Institute of Basic Medical Sciences and Department of Biotechnology, School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou, China
11
Sicho M, Liu X, Svozil D, van Westen GJP. GenUI: interactive and extensible open source software platform for de novo molecular generation and cheminformatics. J Cheminform 2021; 13:73. [PMID: 34563271 PMCID: PMC8465716 DOI: 10.1186/s13321-021-00550-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/27/2021] [Accepted: 09/05/2021] [Indexed: 03/05/2023] Open
Abstract
Many contemporary cheminformatics methods, including computer-aided de novo drug design, hold promise to significantly accelerate and reduce the cost of drug discovery. Thanks to this attractive outlook, the field has thrived and in the past few years has seen especially significant growth, mainly due to the emergence of novel methods based on deep neural networks. This growth is also apparent in the development of novel de novo drug design methods, with many new generative algorithms now available. However, widespread adoption of new generative techniques in fields like medicinal chemistry or chemical biology is still lagging behind the most recent developments. On closer inspection, this is not surprising: successfully integrating the most recent de novo drug design methods into existing processes and pipelines requires close collaboration between diverse groups of experimental and theoretical scientists. Therefore, to accelerate the adoption of both modern and traditional de novo molecular generators, we developed Generator User Interface (GenUI), a software platform that makes it possible to integrate molecular generators within a feature-rich graphical user interface that is easy to use by experts of diverse backgrounds. GenUI is implemented as a web service and its interfaces offer access to cheminformatics tools for data preprocessing, model building, molecule generation, and interactive chemical space visualization. Moreover, the platform is easy to extend with customizable frontend React.js components and backend Python extensions. GenUI is open source and a recently developed de novo molecular generator, DrugEx, was integrated as a proof of principle. In this work, we present the architecture and implementation details of GenUI and discuss how it can facilitate collaboration in the disparate communities interested in de novo molecular generation and computer-aided drug discovery.
Affiliation(s)
- M. Sicho
  - CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28 Prague, Czech Republic
- X. Liu
  - Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
- D. Svozil
  - CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28 Prague, Czech Republic
  - CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, 142 20 Prague 4, Czech Republic
- G. J. P. van Westen
  - Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
12
Guo F, Jiang C, Xi Y, Wang D, Zhang Y, Xie N, Guan Y, Zhang F, Yang H. Investigation of pharmacological mechanism of natural product using pathway fingerprints similarity based on "drug-target-pathway" heterogenous network. J Cheminform 2021; 13:68. [PMID: 34544480 PMCID: PMC8454151 DOI: 10.1186/s13321-021-00549-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/24/2021] [Accepted: 09/03/2021] [Indexed: 02/05/2023] Open
Abstract
Natural products from traditional medicine inherit bioactivity from their source herbs. However, the pharmacological mechanism of natural products is often unclear and insufficiently studied. Pathway fingerprint similarity based on a "drug-target-pathway" heterogeneous network provides new insight into the Mechanism of Action (MoA) of natural products compared with reference drugs, which are selected approved drugs with similar bioactivity. Natural products with similar pathway fingerprints may have a similar MoA to approved drugs. In our study, XYPI, an andrographolide derivative, had similar anti-inflammatory activity to glucocorticoids (GCs) and non-steroidal anti-inflammatory drugs (NSAIDs), although GCs and NSAIDs have completely different MoA. Based on similarity evaluation, XYPI has pathway fingerprints similar to those of NSAIDs but a target profile similar to that of GCs. The expression pattern of genes in LPS-activated macrophages after XYPI treatment is similar to that after NSAID but not GC treatment, and this experimental result is consistent with the computational prediction based on pathway fingerprints. These results imply that the pathway fingerprints of drugs have potential for drug similarity evaluation. This study used XYPI as an example to propose a new approach for investigating the pharmacological mechanism of natural products using pathway fingerprint similarity based on a "drug-target-pathway" heterogeneous network.
Affiliation(s)
- Feifei Guo
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- Chunhong Jiang
  - Joint Institute of Virology (Shantou University and The University of Hong Kong), Shantou University Medical College, Shantou, China
- Yujie Xi
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
  - Tianjin University of Traditional Chinese Medicine, Tianjin, China
- Dan Wang
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Lifeomics, Beijing, China
- Yi Zhang
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- Ning Xie
  - State Key Laboratory of Innovative Natural Medicine and TCM Injections, Ganzhou, China
- Yi Guan
  - Joint Institute of Virology (Shantou University and The University of Hong Kong), Shantou University Medical College, Shantou, China
- Fangbo Zhang
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- Hongjun Yang
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
  - Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing, China
13
Batra K, Zorn KM, Foil DH, Minerali E, Gawriljuk VO, Lane TR, Ekins S. Quantum Machine Learning Algorithms for Drug Discovery Applications. J Chem Inf Model 2021; 61:2641-2647. [PMID: 34032436 PMCID: PMC8254374 DOI: 10.1021/acs.jcim.1c00166] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/23/2022]
Abstract
The growing quantity of public and private data sets focused on small molecules screened against biological targets or whole organisms provides a wealth of drug discovery relevant data. This is matched by the availability of machine learning algorithms such as Support Vector Machines (SVM) and Deep Neural Networks (DNN) that are computationally expensive to perform on very large data sets with thousands of molecular descriptors. Quantum computer (QC) algorithms have been proposed to offer an approach to accelerate quantum machine learning over classical computer (CC) algorithms, albeit with significant limitations. In the case of cheminformatics, which is widely used in drug discovery, one of the challenges to overcome is the need for compression of large numbers of molecular descriptors for use on a QC. Here, we show how to achieve compression with data sets using hundreds of molecules (SARS-CoV-2) to hundreds of thousands of molecules (whole cell screening data sets for plague and M. tuberculosis) with SVM and the data reuploading classifier (a DNN equivalent algorithm) on a QC benchmarked against CC and hybrid approaches. This study illustrates the steps needed to be "quantum computer ready" so that quantum computing can be applied to drug discovery, and provides a foundation on which to build this field.
Affiliation(s)
- Kushal Batra
  - Computer Science, NC State University, Raleigh, NC 27606, USA
- Kimberley M. Zorn
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Daniel H. Foil
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Eni Minerali
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Victor O. Gawriljuk
  - São Carlos Institute of Physics, University of São Paulo, Av. João Dagnone, 1100 - Santa Angelina, São Carlos - SP, 13563-120, Brazil
- Thomas R. Lane
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
14
Wang Q, Du L, Hong J, Chen Z, Liu H, Li S, Xiao X, Yan S. Molecular mechanism underlying the hypolipidemic effect of Shanmei Capsule based on network pharmacology and molecular docking. Technol Health Care 2021; 29:239-256. [PMID: 33682762 PMCID: PMC8150495 DOI: 10.3233/thc-218023] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/15/2022]
Abstract
BACKGROUND: Shanmei Capsule is a famous preparation in China. However, the mechanism of Shanmei Capsule against hyperlipidemia has yet to be revealed. OBJECTIVE: To elucidate the underlying mechanism of Shanmei Capsule against hyperlipidemia through a network pharmacology approach and molecular docking. METHODS: Active ingredients and targets of Shanmei Capsule, as well as targets for hyperlipidemia, were screened from databases. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed via the Database for Annotation, Visualization, and Integrated Discovery (DAVID) 6.8. The ingredient-target-disease-pathway network was visualized using Cytoscape, and molecular docking was performed with AutoDock Vina. RESULTS: Seventeen active ingredients in Shanmei Capsule were screened out, closely connected with 34 hyperlipidemia-related targets. GO analysis revealed 40 biological processes, 5 cellular components and 29 molecular functions. A total of 15 signaling pathways were enriched by KEGG pathway enrichment analysis. The docking results indicated that the binding activities of key ingredients for PPAR-α are equivalent to that of the positive drug lifibrate. CONCLUSIONS: The possible molecular mechanism mainly involved the PPAR signaling pathway, bile secretion and the TNF signaling pathway, acting on the MAPK8, PPARγ, MMP9, PPARα, FABP4 and NOS2 targets.
Affiliation(s)
- Qian Wang
  - Institute of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, Guangdong 510006, China
- Lijing Du
  - School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China; Institute of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, Guangdong 510006, China
- Jiana Hong
  - Institute of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, Guangdong 510006, China
- Zhenlin Chen
  - Institute of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, Guangdong 510006, China
- Huijian Liu
  - Shanxi Taihang Pharmaceutical Co., Ltd, Changzhi, Shanxi 046000, China
- Shasha Li
  - The Second Clinical College of Guangzhou University of Chinese Medicine, Guangzhou, Guangdong 510006, China
- Xue Xiao
  - Institute of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, Guangdong 510006, China
- Shikai Yan
  - Institute of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, Guangdong 510006, China; School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
15
Vincent F, Loria PM, Weston AD, Steppan CM, Doyonnas R, Wang YM, Rockwell KL, Peakman MC. Hit Triage and Validation in Phenotypic Screening: Considerations and Strategies. Cell Chem Biol 2020; 27:1332-1346. [DOI: 10.1016/j.chembiol.2020.08.009] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/28/2019] [Revised: 05/31/2020] [Accepted: 08/14/2020] [Indexed: 02/06/2023]
16
Goya-Jorge E, Giner RM, Sylla-Iyarreta Veitía M, Gozalbes R, Barigye SJ. Predictive modeling of aryl hydrocarbon receptor (AhR) agonism. Chemosphere 2020; 256:127068. [PMID: 32447110 DOI: 10.1016/j.chemosphere.2020.127068] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/10/2020] [Revised: 05/09/2020] [Accepted: 05/12/2020] [Indexed: 06/11/2023]
Abstract
The aryl hydrocarbon receptor (AhR) plays a key role in the regulation of gene expression in metabolic machinery and detoxification systems. In recent years, this receptor has attracted interest as a therapeutic target for immunological, oncogenic and inflammatory conditions. In the present report, in silico and in vitro approaches were combined to study the activation of the AhR. To this end, a large database of chemical compounds with known AhR agonistic activity was employed to build 5 classifiers based on the Adaboost (AdB), Gradient Boosting (GB), Random Forest (RF), Multilayer Perceptron (MLP) and Support Vector Machine (SVM) algorithms, respectively. The built classifiers were examined following a 10-fold external validation procedure, demonstrating adequate robustness and predictivity. These models were integrated into a majority-vote-based ensemble, subsequently used to screen an in-house library of compounds from which 40 compounds were selected for prospective in vitro experimental validation. The general correspondence between the ensemble predictions and the in vitro results suggests that the constructed ensemble may be useful in predicting AhR agonistic activity, in both toxicological and pharmacological contexts. A preliminary structure-activity analysis of the evaluated compounds revealed that all structures bearing a benzothiazole moiety induced AhR expression, while diverse activity profiles were exhibited by phenolic derivatives.
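The majority-vote step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each of the five classifiers has already produced a binary label per compound; the function names and the example predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most classifiers for one compound.

    `predictions` holds one label per classifier. With an odd number of
    binary classifiers (five in the abstract) a tie cannot occur.
    """
    label, _count = Counter(predictions).most_common(1)[0]
    return label


def ensemble_predict(per_model_predictions):
    """Combine per-model label lists (one list per model), compound by compound."""
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]


# Hypothetical outputs of five classifiers for four compounds
# (1 = predicted AhR agonist, 0 = predicted inactive).
model_outputs = [
    [1, 0, 1, 0],  # AdB
    [1, 0, 0, 0],  # GB
    [1, 1, 0, 0],  # RF
    [0, 0, 1, 0],  # MLP
    [1, 0, 1, 1],  # SVM
]
print(ensemble_predict(model_outputs))  # [1, 0, 1, 0]
```

Hard voting like this ignores each model's confidence; averaging predicted probabilities is a common alternative, but the abstract does not say which variant was used beyond "majority vote".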
Affiliation(s)
- Elizabeth Goya-Jorge
  - ProtoQSAR SL. CEEI (Centro Europeo de Empresas Innovadoras) Parque Tecnológico de Valencia, Av. Benjamin Franklin 12, 46980, Paterna, Valencia, Spain; Departament de Farmacologia, Facultat de Farmàcia, Universitat de València, Av. Vicente Andrés Estellés s/n, 46100, Burjassot, Valencia, Spain
- Rosa M Giner
  - Departament de Farmacologia, Facultat de Farmàcia, Universitat de València, Av. Vicente Andrés Estellés s/n, 46100, Burjassot, Valencia, Spain
- Maité Sylla-Iyarreta Veitía
  - Equipe de Chimie Moléculaire du Laboratoire Génomique, Bioinformatique et Chimie Moléculaire (EA 7528), Conservatoire National des Arts et Métiers (Cnam), 2 Rue Conté, HESAM Université, 75003, Paris, France
- Rafael Gozalbes
  - ProtoQSAR SL. CEEI (Centro Europeo de Empresas Innovadoras) Parque Tecnológico de Valencia, Av. Benjamin Franklin 12, 46980, Paterna, Valencia, Spain
- Stephen J Barigye
  - ProtoQSAR SL. CEEI (Centro Europeo de Empresas Innovadoras) Parque Tecnológico de Valencia, Av. Benjamin Franklin 12, 46980, Paterna, Valencia, Spain
17
Benchmarking Data Sets from PubChem BioAssay Data: Current Scenario and Room for Improvement. Int J Mol Sci 2020; 21:ijms21124380. [PMID: 32575564 PMCID: PMC7352161 DOI: 10.3390/ijms21124380] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/02/2020] [Revised: 06/15/2020] [Accepted: 06/18/2020] [Indexed: 11/17/2022] Open
Abstract
Developing realistic data sets for evaluating virtual screening methods is a task that has been tackled by the cheminformatics community for many years. Numerous artificially constructed data collections were developed, such as DUD, DUD-E, or DEKOIS. However, they all suffer from multiple drawbacks, one of which is the absence of experimental results confirming the inactivity of presumably inactive molecules, leading to possible false negatives in the ligand sets. In light of this problem, the PubChem BioAssay database, an open-access repository providing the bioactivity information of compounds that were already tested on a biological target, is now a recommended source for data set construction. Nevertheless, there exist several issues with the use of such data that need to be properly addressed. In this article, an overview of benchmarking data collections built upon experimental PubChem BioAssay input is provided, along with a thorough discussion of noteworthy issues that one must consider during the design of new ligand sets from this database. The points raised in this review are expected to guide future developments in this regard, in hopes of offering better evaluation tools for novel in silico screening procedures.
18
Minerali E, Foil DH, Zorn KM, Lane TR, Ekins S. Comparing Machine Learning Algorithms for Predicting Drug-Induced Liver Injury (DILI). Mol Pharm 2020; 17:2628-2637. [PMID: 32422053 DOI: 10.1021/acs.molpharmaceut.0c00326] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Indexed: 12/11/2022]
Abstract
Drug-induced liver injury (DILI) is one of the most unpredictable adverse reactions to xenobiotics in humans and the leading cause of postmarketing withdrawals of approved drugs. To date, these drugs have been collated by the FDA to form the DILIRank database, which classifies DILI severity and potential. These classifications have been used by various research groups in generating computational predictions for this type of liver injury. Recently, groups from Pfizer and AstraZeneca have collated DILI in vitro data and physicochemical properties for compounds that can be used along with data from the FDA to build machine learning models for DILI. In this study, we have used these data sets, as well as the Biopharmaceutics Drug Disposition Classification System data set, to generate Bayesian machine learning models with our in-house software, Assay Central. The performance of all machine learning models was assessed through both the internal 5-fold cross-validation metrics and prediction accuracy of an external test set of compounds with known hepatotoxicity. The best-performing Bayesian model was based on the DILI-concern category from the DILIRank database with an ROC of 0.814, a sensitivity of 0.741, a specificity of 0.755, and an accuracy of 0.746. A comparison of alternative machine learning algorithms, such as k-nearest neighbors, support vector classification, AdaBoosted decision trees, and deep learning methods, produced similar statistics to those generated with the Bayesian algorithm in Assay Central. This study demonstrates machine learning models grouped in a tool called MegaTox that can be used to predict early-stage clinical compounds, as well as recent FDA-approved drugs, to identify potential DILI.
Affiliation(s)
- Eni Minerali
  - Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Daniel H Foil
  - Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Kimberley M Zorn
  - Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Thomas R Lane
  - Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Sean Ekins
  - Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
19
Yang ZY, Dong J, Yang ZJ, Lu AP, Hou TJ, Cao DS. Structural Analysis and Identification of False Positive Hits in Luciferase-Based Assays. J Chem Inf Model 2020; 60:2031-2043. [PMID: 32202787 DOI: 10.1021/acs.jcim.9b01188] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 02/07/2023]
Abstract
Luciferase-based bioluminescence detection techniques are highly favored in high-throughput screening (HTS), in which the firefly luciferase (FLuc) is the most commonly used variant. However, FLuc inhibitors can interfere with the activity of luciferase, which may result in false positive signals in HTS assays. In order to reduce the unnecessary cost of time and money, an in silico prediction model for FLuc inhibitors is highly desirable. In this study, we built an extensive data set consisting of 20 888 FLuc inhibitors and 198 608 noninhibitors, and then developed a group of classification models based on the combination of three machine learning (ML) algorithms and four types of molecular representations. The best prediction model based on XGBoost and ECFP4 and MOE2d descriptors yielded a balanced accuracy (BA) of 0.878 and an area under the receiver operating characteristic curve (AUC) value of 0.958 for the validation set, and a BA of 0.886 and an AUC of 0.947 for the test set. Three external validation sets, including set 1 (3231 FLuc inhibitors and 69 783 noninhibitors), set 2 (695 FLuc inhibitors and 75 913 noninhibitors), and set 3 (1138 FLuc inhibitors and 8155 noninhibitors), were used to verify the predictive ability of our models. The BA values for the three external validation sets given by the best model are 0.864, 0.845, and 0.791, respectively. In addition, the important features or structural fragments related to FLuc inhibitors were recognized by the Shapley additive explanations (SHAP) method along with their influences on predictions, which may provide valuable clues to detecting undesirable luciferase inhibitors. Based on the important and explanatory features, 16 rules were proposed for detecting FLuc inhibitors, which can achieve a correction rate of 70% for FLuc inhibitors. Furthermore, a comparison with existing prediction rules and models for FLuc inhibitors used in virtual screening verified the high reliability of the models and rules proposed in this study. We also used the model to screen three curated chemical databases, and almost 10% of the molecules in the evaluated databases were predicted as inhibitors, highlighting the potential risk of false positives in luciferase-based assays. Finally, a public web server called ChemFLuc was developed (http://admet.scbdd.com/chemfluc/index/), and it offers a freely available service to predict potential FLuc inhibitors.
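Balanced accuracy (BA), the metric reported throughout this abstract, is the mean of sensitivity and specificity. The sketch below shows why it is more informative than plain accuracy on data as imbalanced as external set 2 (695 inhibitors, 75 913 noninhibitors). The class sizes come from the abstract; the "predict everything as noninhibitor" classifier is a hypothetical baseline, not a result from the paper.

```python
def sensitivity(tp, fn):
    """Fraction of true inhibitors that are predicted as inhibitors."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true noninhibitors that are predicted as noninhibitors."""
    return tn / (tn + fp)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


def accuracy(tp, fn, tn, fp):
    """Fraction of all compounds classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


# Hypothetical baseline on the class sizes of external set 2:
# every compound is predicted to be a noninhibitor.
counts = dict(tp=0, fn=695, tn=75913, fp=0)
print(round(accuracy(**counts), 3))  # 0.991
print(balanced_accuracy(**counts))   # 0.5
```

The baseline scores about 0.991 on accuracy while finding no inhibitors at all, whereas its BA is 0.5, the value expected from an uninformative classifier.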
Affiliation(s)
- Zi-Yi Yang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410003, P.R. China
- Jie Dong
  - Central South University of Forestry and Technology, Changsha, 410004, P.R. China
- Zhi-Jiang Yang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410003, P.R. China
- Ai-Ping Lu
  - Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, P.R. China
- Ting-Jun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P.R. China
- Dong-Sheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410003, P.R. China; Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, P.R. China
20
Abstract
Suramin is 100 years old and is still being used to treat the first stage of acute human sleeping sickness, caused by Trypanosoma brucei rhodesiense. Suramin is a multifunctional molecule with a wide array of potential applications, from parasitic and viral diseases to cancer, snakebite, and autism. Suramin is also an enigmatic molecule: What are its targets? How does it get into cells in the first place? Here, we provide an overview of the many different candidate targets of suramin and discuss its modes of action and routes of cellular uptake. We reason that, once the polypharmacology of suramin is understood at the molecular level, new, more specific, and less toxic molecules can be identified for the numerous potential applications of suramin.
21
Silvestri I, Lyu H, Fata F, Banta PR, Mattei B, Ippoliti R, Bellelli A, Pitari G, Ardini M, Petukhova V, Thatcher GRJ, Petukhov PA, Williams DL, Angelucci F. Ectopic suicide inhibition of thioredoxin glutathione reductase. Free Radic Biol Med 2020; 147:200-211. [PMID: 31870799 PMCID: PMC7583042 DOI: 10.1016/j.freeradbiomed.2019.12.019] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/05/2019] [Revised: 12/13/2019] [Accepted: 12/17/2019] [Indexed: 02/07/2023]
Abstract
Selective suicide inhibitors represent a seductively attractive approach for inactivation of therapeutically relevant enzymes since they are generally devoid of off-target toxicity in vivo. While most suicide inhibitors are converted to reactive species at enzyme active sites, theoretically bioactivation can also occur in ectopic (secondary) sites that have no known function. Here, we report an example of such an "ectopic suicide inhibition", an unprecedented bioactivation mechanism of a suicide inhibitor carried out by a non-catalytic site of thioredoxin glutathione reductase (TGR). TGR is a promising drug target to treat schistosomiasis, a devastating human parasitic disease. Utilizing hits selected from a high throughput screening campaign, time-resolved X-ray crystallography, molecular dynamics, mass spectrometry, molecular modeling, protein mutagenesis and functional studies, we find that 2-naphtholmethylamino derivatives bound to this novel ectopic site of Schistosoma mansoni (Sm)TGR are transformed to covalent modifiers and react with its mobile selenocysteine-containing C-terminal arm. In particular, one 2-naphtholmethylamino compound is able to specifically induce the pro-oxidant activity in the inhibited enzyme. Since some 2-naphtholmethylamino analogues show worm killing activity and the ectopic site is not conserved in human orthologues, a general approach to development of novel and selective anti-parasitic therapeutics against schistosoma is proposed.
Affiliation(s)
- Ilaria Silvestri
  - Dept. of Life, Health and Environmental Sciences, University of L'Aquila, Italy
- Haining Lyu
  - Dept. of Microbial Pathogens and Immunity, Rush University Medical Center, Chicago, IL, USA
- Francesca Fata
  - Dept. of Life, Health and Environmental Sciences, University of L'Aquila, Italy
- Paul R Banta
  - Dept. of Microbial Pathogens and Immunity, Rush University Medical Center, Chicago, IL, USA
- Benedetta Mattei
  - Dept. of Life, Health and Environmental Sciences, University of L'Aquila, Italy
- Rodolfo Ippoliti
  - Dept. of Life, Health and Environmental Sciences, University of L'Aquila, Italy
- Andrea Bellelli
  - Dept. of Biochemical Sciences, Sapienza University of Rome, Italy
- Giuseppina Pitari
  - Dept. of Life, Health and Environmental Sciences, University of L'Aquila, Italy
- Matteo Ardini
  - Dept. of Life, Health and Environmental Sciences, University of L'Aquila, Italy
- Valentina Petukhova
  - Dept. of Pharmaceutical Sciences, College of Pharmacy, University of Illinois at Chicago, Chicago, IL, USA
- Gregory R J Thatcher
  - Dept. of Pharmaceutical Sciences, College of Pharmacy, University of Illinois at Chicago, Chicago, IL, USA
- Pavel A Petukhov
  - Dept. of Pharmaceutical Sciences, College of Pharmacy, University of Illinois at Chicago, Chicago, IL, USA
- David L Williams
  - Dept. of Microbial Pathogens and Immunity, Rush University Medical Center, Chicago, IL, USA
- Francesco Angelucci
  - Dept. of Life, Health and Environmental Sciences, University of L'Aquila, Italy
22
Vo AH, Van Vleet TR, Gupta RR, Liguori MJ, Rao MS. An Overview of Machine Learning and Big Data for Drug Toxicity Evaluation. Chem Res Toxicol 2019; 33:20-37. [DOI: 10.1021/acs.chemrestox.9b00227] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Indexed: 02/07/2023]
Affiliation(s)
- Andy H. Vo
  - Department of Preclinical Safety, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064, United States
- Terry R. Van Vleet
  - Department of Preclinical Safety, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064, United States
- Rishi R. Gupta
  - Information Research, Research and Development, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064, United States
- Michael J. Liguori
  - Department of Preclinical Safety, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064, United States
- Mohan S. Rao
  - Department of Preclinical Safety, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064, United States
|
23
|
Gaspar HA, Gerring Z, Hübel C, Middeldorp CM, Derks EM, Breen G. Using genetic drug-target networks to develop new drug hypotheses for major depressive disorder. Transl Psychiatry 2019; 9:117. [PMID: 30877270 PMCID: PMC6420656 DOI: 10.1038/s41398-019-0451-4] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 06/05/2018] [Revised: 01/28/2019] [Accepted: 02/12/2019] [Indexed: 12/25/2022] Open
Abstract
The major depressive disorder (MDD) working group of the Psychiatric Genomics Consortium (PGC) has published a genome-wide association study (GWAS) for MDD in 130,664 cases, identifying 44 risk variants. We used these results to investigate potential drug targets and repurposing opportunities. We built easily interpretable bipartite drug-target networks integrating interactions between drugs and their targets, genome-wide association statistics, and genetically predicted expression levels in different tissues, using the online tool Drug Targetor (drugtargetor.com). We also investigated drug-target relationships that could be impacting MDD. MAGMA was used to perform pathway analyses and S-PrediXcan to investigate the directionality of tissue-specific expression levels in patients vs. controls. Outside the major histocompatibility complex (MHC) region, 153 protein-coding genes are significantly associated with MDD in MAGMA after multiple testing correction; among these, five are predicted to be down- or upregulated in brain regions and 24 are known druggable genes. Several drug classes were significantly enriched, including monoamine reuptake inhibitors, sex hormones, antipsychotics, and antihistamines, indicating an effect on MDD and potential repurposing opportunities. These findings not only require validation in model systems and clinical examination, but also show that GWAS may become a rich source of new therapeutic hypotheses for MDD and other psychiatric disorders that need new, and better, treatment options.
Affiliation(s)
- Héléna A Gaspar
  - King's College London, Institute of Psychiatry, Psychology and Neuroscience, Social, Genetic and Developmental Psychiatry (SGDP) Centre, London, SE5 8AF, UK
  - National Institute for Health Research Biomedical Research Centre, South London and Maudsley National Health Service Trust, London, EC1V 2PD, UK
- Zachary Gerring
  - Translational Neurogenomics Laboratory, QIMR Berghofer Institute of Medical Research, Brisbane City, QLD 4006, Australia
- Christopher Hübel
  - King's College London, Institute of Psychiatry, Psychology and Neuroscience, Social, Genetic and Developmental Psychiatry (SGDP) Centre, London, SE5 8AF, UK
  - National Institute for Health Research Biomedical Research Centre, South London and Maudsley National Health Service Trust, London, EC1V 2PD, UK
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Christel M Middeldorp
  - Child Health Research Centre, University of Queensland, South Brisbane, QLD 4072, Australia
  - Child and Youth Mental Health Service, Children's Health Queensland Hospital and Health Service, South Brisbane, QLD 4101, Australia
  - Biological Psychology, Vrije Universiteit Amsterdam, 1081 HV, Amsterdam, Netherlands
- Eske M Derks
  - Translational Neurogenomics Laboratory, QIMR Berghofer Institute of Medical Research, Brisbane City, QLD 4006, Australia
- Gerome Breen
  - King's College London, Institute of Psychiatry, Psychology and Neuroscience, Social, Genetic and Developmental Psychiatry (SGDP) Centre, London, SE5 8AF, UK
  - National Institute for Health Research Biomedical Research Centre, South London and Maudsley National Health Service Trust, London, EC1V 2PD, UK
|
24
|
Tang W, Chen J, Wang Z, Xie H, Hong H. Deep learning for predicting toxicity of chemicals: a mini review. J Environ Sci Health C Environ Carcinog Ecotoxicol Rev 2019; 36:252-271. [PMID: 30821199 DOI: 10.1080/10590501.2018.1537563] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Indexed: 06/09/2023]
Abstract
Humans and wildlife inhabit a world with panoply of natural and synthetic chemicals. Alarmingly, only a limited number of chemicals have undergone comprehensive toxicological evaluation due to limitations of traditional toxicity testing. High-throughput screening assays provide a higher-speed alternative for conventional toxicity testing. Advancement of high-throughput bioassay technology has greatly increased chemical toxicity data volumes in the past decade, pushing toxicology research into a "big data" era. However, traditional data analysis methods fail to effectively process large data volumes, presenting both a challenge and an opportunity for toxicologists. Deep learning, a machine learning method leveraging deep neural networks (DNNs), is a proven useful tool for building quantitative structure-activity relationship (QSAR) models for toxicity prediction utilizing these new large datasets. In this mini review, a brief technical background on DNNs is provided, and the current state of chemical toxicity prediction models built with DNNs is reviewed. In addition, relevant toxicity data sources are summarized, possible limitations are discussed, and perspectives on DNN utilization in chemical toxicity prediction are given.
Affiliation(s)
- Weihao Tang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- Jingwen Chen
  - Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- Zhongyu Wang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- Hongbin Xie
  - Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- Huixiao Hong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
|
25
|
Abstract
Although significant advances in experimental high throughput screening (HTS) have been made for drug lead identification, in silico virtual screening (VS) is indispensable owing to its unique advantages over experimental HTS (it is target-focused, cheap, and efficient), albeit with the disadvantage of producing false positive hits. For both experimental HTS and VS, the quality of screening libraries is crucial and determines the outcome of those studies. In this paper, we first reviewed the recent progress on screening library construction. We realized the urgent need for compiling high-quality screening libraries in drug discovery. Then we compiled a set of screening libraries from about 20 million druglike ZINC molecules by running fingerprint-based similarity searches against known drug molecules. Lastly, the screening libraries were objectively evaluated using 5847 external actives covering more than 2000 drug targets. The result of the assessment is very encouraging. For example, with the Tanimoto coefficient being set to 0.75, 36% of external actives were retrieved and the enrichment factor was 13. Additionally, drug target family specific screening libraries were also constructed and evaluated. The druglike screening libraries are available for download from https://mulan.pharmacy.pitt.edu.
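As an illustration of the similarity-threshold selection and enrichment factor mentioned in this abstract, the following is a minimal sketch and not the authors' pipeline. It assumes fingerprints are represented as sets of on-bit indices, and it assumes a common definition of the enrichment factor (hit rate in the selected subset divided by hit rate in the whole library), which the abstract does not spell out. All function names and the toy data are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def select_by_similarity(
    library: Dict[str, Set[int]],
    references: Iterable[Set[int]],
    threshold: float = 0.75,
) -> List[str]:
    """Keep library molecules whose best similarity to any reference meets the threshold."""
    refs = list(references)
    return [
        name
        for name, fp in library.items()
        if any(tanimoto(fp, ref) >= threshold for ref in refs)
    ]


def enrichment_factor(
    selected: Iterable[str], actives: Iterable[str], library_size: int
) -> float:
    """Assumed definition: (actives in selection / selection size) / (actives / library size)."""
    selected_set, active_set = set(selected), set(actives)
    if not selected_set or not active_set or library_size <= 0:
        return 0.0
    hit_rate_selected = len(selected_set & active_set) / len(selected_set)
    hit_rate_library = len(active_set) / library_size
    return hit_rate_selected / hit_rate_library


if __name__ == "__main__":
    # Hypothetical toy data: one reference drug and a few library fingerprints.
    reference_drugs = [{1, 2, 3, 4}]
    toy_library = {
        "m1": {1, 2, 3, 4},
        "m2": {1, 2, 3, 4, 5},
        "m3": {1, 2, 9, 10},
        "m4": {20, 21},
        "m5": {1, 2, 3},
    }
    picked = select_by_similarity(toy_library, reference_drugs, threshold=0.75)
    print(picked)
    # Pretend the full library holds 10 molecules, of which m1 and m2 are active.
    print(enrichment_factor(picked, {"m1", "m2"}, library_size=10))
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets; the set-based form is used here only to keep the sketch dependency-free.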
Affiliation(s)
- Junmei Wang
  - Department of Pharmaceutical Sciences, The University of Pittsburgh, 3501 Terrace Street, Pittsburgh, Pennsylvania 15261, United States
- Yubin Ge
  - Department of Pharmaceutical Sciences, The University of Pittsburgh, 3501 Terrace Street, Pittsburgh, Pennsylvania 15261, United States
- Xiang-Qun Xie
  - Department of Pharmaceutical Sciences, The University of Pittsburgh, 3501 Terrace Street, Pittsburgh, Pennsylvania 15261, United States
|
26
|
Southan C, Sharman JL, Faccenda E, Pawson AJ, Harding SD, Davies JA. Challenges of Connecting Chemistry to Pharmacology: Perspectives from Curating the IUPHAR/BPS Guide to PHARMACOLOGY. ACS Omega 2018; 3:8408-8420. [PMID: 30087946 PMCID: PMC6070956 DOI: 10.1021/acsomega.8b00884] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/02/2018] [Accepted: 07/12/2018] [Indexed: 06/08/2023]
Abstract
Connecting chemistry to pharmacology has been an objective of Guide to PHARMACOLOGY (GtoPdb) and its precursor the International Union of Basic and Clinical Pharmacology Database (IUPHAR-DB) since 2003. This has been achieved by populating our database with expert-curated relationships between documents, assays, quantitative results, chemical structures, their locations within the documents, and the protein targets in the assays (D-A-R-C-P). A wide range of challenges associated with this are described in this perspective, using illustrative examples from GtoPdb entries. Our selection process begins with judgments of pharmacological relevance and scientific quality. Even though we have a stringent focus for our small-data extraction, we note that assessing the quality of papers has become more difficult over the last 15 years. We discuss ambiguity issues with the resolution of authors' descriptions of A-R-C-P entities to standardized identifiers. We also describe developments that have made this somewhat easier over the same period both in the publication ecosystem and recent enhancements of our internal processes. This perspective concludes with a look at challenges for the future, including the wider capture of mechanistic nuances and possible impacts of text mining on automated entity extraction.
|
27
|
Rodríguez-Pérez R, Miyao T, Jasial S, Vogt M, Bajorath J. Prediction of Compound Profiling Matrices Using Machine Learning. ACS Omega 2018; 3:4713-4723. [PMID: 30023899 PMCID: PMC6045364 DOI: 10.1021/acsomega.8b00462] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 03/12/2018] [Accepted: 04/20/2018] [Indexed: 05/25/2023]
Abstract
Screening of compound libraries against panels of targets yields profiling matrices. Such matrices typically contain structurally diverse screening compounds, large numbers of inactives, and small numbers of hits per assay. As such, they represent interesting and challenging test cases for computational screening and activity predictions. In this work, modeling of large compound profiling matrices extracted from publicly available screening data was attempted. Different machine learning methods including deep learning were compared and different prediction strategies explored. Prediction accuracy varied for assays with different numbers of active compounds, and alternative machine learning approaches often produced comparable results. Deep learning did not further increase the prediction accuracy of standard methods such as random forests or support vector machines. Target-based random forest models were prioritized and yielded successful predictions of active compounds for many assays.
|
28
|
Southan C. Caveat Usor: Assessing Differences between Major Chemistry Databases. ChemMedChem 2018; 13:470-481. [PMID: 29451740 PMCID: PMC5900829 DOI: 10.1002/cmdc.201700724] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/20/2017] [Revised: 02/07/2018] [Indexed: 12/24/2022]
Abstract
The three databases of PubChem, ChemSpider, and UniChem capture the majority of open chemical structure records with February 2018 totals of 95, 63, and 154 million, respectively. Collectively, they constitute a massively enabling resource for cheminformatics, chemical biology, and drug discovery. As meta-portals, they subsume and link out to the major proportion of public bioactivity data extracted from the literature and screening center assay results. Therefore, they not only present three different entry points, but the many subsumed independent resources present a fourth entry point in the form of standalone databases. Because this creates a complex picture it is important for users to have at least some appreciation of differential content to enable utility judgments for the tasks at hand. This turns out to be challenging. By comparing the three resources in detail, this review assesses their differences, some of which are not obvious. This includes the fact that coverage is significantly different between the 587, 282, and 38 contributing sources, respectively. This not only presents the "who-has-what" question, but also the reason "why" any particular inclusion is considered valuable is rarely made explicit. Also confusing is that sources nominally in common (i.e., having the same submitter name) can have significantly different structure counts, not only in each of the three but also from their standalone instantiations. Assessing a series of examples indicates that differences in loading dates and structural standardization are the main causes of this inter-portal discordance.
Affiliation(s)
- Christopher Southan
  - IUPHAR/BPS Guide to PHARMACOLOGY, Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh, EH8 9XD, UK
|