1
|
Cankara F, Senyuz S, Sayin AZ, Gursoy A, Keskin O. DiPPI: A Curated Data Set for Drug-like Molecules in Protein-Protein Interfaces. J Chem Inf Model 2024; 64:5041-5051. [PMID: 38907989 DOI: 10.1021/acs.jcim.3c01905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2024]
Abstract
Proteins interact through their interfaces, and dysfunction of protein-protein interactions (PPIs) has been associated with various diseases. Therefore, investigating the properties of the drug-modulated PPIs and interface-targeting drugs is critical. Here, we present a curated large data set for drug-like molecules in protein interfaces. We further introduce DiPPI (Drugs in Protein-Protein Interfaces), a two-module web site to facilitate the search for such molecules and their properties by exploiting our data set in drug repurposing studies. In the interface module of the web site, we present several properties of interfaces, such as amino acid properties, hotspots, evolutionary conservation of drug-binding amino acids, and post-translational modifications of these residues. On the drug-like molecule side, we list drug-like small molecules and FDA-approved drugs from various databases and highlight those that bind to the interfaces. We further clustered the drugs based on their molecular fingerprints to confine the search for an alternative drug to a smaller space. Drug properties, including Lipinski's rules and various molecular descriptors, are also calculated and made available on the web site to guide the selection of drug molecules. Our data set contains 534,203 interfaces for 98,632 protein structures, of which 55,135 are detected to bind to a drug-like molecule. A total of 2,214 drug-like molecules are deposited on our web site, among which 335 are FDA-approved. DiPPI provides users with an easy-to-follow scheme for drug repurposing studies through its well-curated and clustered interface and drug data and is freely available at http://interactome.ku.edu.tr:8501.
Affiliation(s)
- Fatma Cankara
  - Graduate School of Sciences and Engineering, Koç University, İstanbul 34450, Turkey
- Simge Senyuz
  - Graduate School of Sciences and Engineering, Koç University, İstanbul 34450, Turkey
- Ahenk Zeynep Sayin
  - Department of Chemical and Biological Engineering, Koç University, İstanbul 34450, Turkey
- Attila Gursoy
  - Department of Computer Engineering, Koç University, İstanbul 34450, Turkey
- Ozlem Keskin
  - Department of Chemical and Biological Engineering, Koç University, İstanbul 34450, Turkey
|
2
|
Latosińska M, Latosińska JN. The Chameleon Strategy-A Recipe for Effective Ligand Screening for Viral Targets Based on Four Novel Structure-Binding Strength Indices. Viruses 2024; 16:1073. [PMID: 39066235 PMCID: PMC11281727 DOI: 10.3390/v16071073] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/14/2024] [Revised: 06/28/2024] [Accepted: 06/30/2024] [Indexed: 07/28/2024] Open
Abstract
The RNA viruses SARS-CoV, SARS-CoV-2 and MERS-CoV encode the non-structural Nsp16 (2'-O-methyltransferase) that catalyzes the transfer of a methyl group from S-adenosylmethionine (SAM) to the first ribonucleotide in mRNA. Recently, it has been found that breaking the bond between Nsp16 and SAM substrate results in the cessation of mRNA virus replication. To date, only a limited number of such inhibitors have been identified, which can be attributed to a lack of an effective "recipe". The aim of our study was to propose and verify a rapid and effective screening protocol dedicated to such purposes. Four new indices describing structure-binding strength (structure-binding affinity, structure-hydrogen bonding, structure-steric and structure-protein-ligand indices) were proposed, then applied and shown to be extremely helpful in determining the degree of increase or decrease in binding affinity in response to a relatively small change in the ligand structure. After initial pre-selection, based on similarity to SAM, we limited the study to 967 compounds, so-called molecular chameleons. They were then docked in the Nsp16 protein pocket, and 10 candidate ligands were selected using the novel structure-binding affinity index. Subsequently, the selected 10 candidate ligands and 8 known inhibitors were docked to Nsp16 pockets from SARS-CoV-2, MERS-CoV and SARS-CoV. Based on the four new indices, the best ligands were selected and a new one was designed by tuning them. Finally, ADMET profiling and molecular dynamics simulations were performed for the best ligands. The new structure-binding strength indices can be successfully applied not only to screen and tune ligands, but also to determine the effectiveness of the ligand in response to changes in the target viral entity, which is particularly useful for assessing drug effectiveness in the case of alterations in viral proteins.
The developed approach, the so-called chameleon strategy, has the capacity to introduce a novel universal paradigm to the field of drug design, including RNA antivirals.
|
3
|
Das S, Merz KM. Molecular Gas-Phase Conformational Ensembles. J Chem Inf Model 2024; 64:749-760. [PMID: 38206321 DOI: 10.1021/acs.jcim.3c01309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2024]
Abstract
Accurately determining the global minima of a molecular structure is important in diverse scientific fields, including drug design, materials science, and chemical synthesis. Conformational search engines serve as valuable tools for exploring the extensive conformational space of molecules and for identifying energetically favorable conformations. In this study, we present a comparison of Auto3D, CREST, Balloon, and ETKDG (from RDKit), which are freely available conformational search engines, to evaluate their effectiveness in locating global minima. These engines employ distinct methodologies, including machine learning (ML) potential-based, semiempirical, and force field-based approaches. To validate these methods, we propose the use of collisional cross-section (CCS) values obtained from ion mobility-mass spectrometry studies. We hypothesize that experimental gas-phase CCS values can provide experimental evidence that we likely have the global minimum for a given molecule. To facilitate this effort, we used our gas-phase conformation library (GPCL) which currently consists of the full ensembles of 20 small molecules and can be used by the community to validate any conformational search engine. Further members of the GPCL can be readily created for any molecule of interest using our standard workflow used to compute CCS values, expanding the ability of the GPCL in validation exercises. These innovative validation techniques enhance our understanding of the conformational landscape and provide valuable insights into the performance of conformational generation engines. Our findings shed light on the strengths and limitations of each search engine, enabling informed decisions for their utilization in various scientific fields, where accurate molecular structure determination is crucial for understanding biological activity and designing targeted interventions. 
By facilitating the identification of reliable conformations, this study significantly contributes to enhancing the efficiency and accuracy of molecular structure determination, with particular focus on metabolite structure elucidation. The findings of this research also provide valuable insights for developing effective workflows for predicting the structures of unknown compounds with high precision.
Affiliation(s)
- Susanta Das
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
- Kenneth M Merz
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
|
4
|
Carracedo-Reboredo P, Aranzamendi E, He S, Arrasate S, Munteanu CR, Fernandez-Lozano C, Sotomayor N, Lete E, González-Díaz H. MATEO: intermolecular α-amidoalkylation theoretical enantioselectivity optimization. Online tool for selection and design of chiral catalysts and products. J Cheminform 2024; 16:9. [PMID: 38254200 PMCID: PMC10804835 DOI: 10.1186/s13321-024-00802-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 01/11/2024] [Indexed: 01/24/2024] Open
Abstract
The enantioselective Brønsted acid-catalyzed α-amidoalkylation reaction is a useful procedure for the production of new drugs and natural products. In this context, Chiral Phosphoric Acid (CPA) catalysts are versatile catalysts for this type of reaction. The selection and design of new CPA catalysts for different enantioselective reactions has a dual interest because new CPA catalysts (tools) and chiral drugs or materials (products) can be obtained. However, this process is difficult and time consuming if approached from an experimental trial and error perspective. In this work, a Heuristic Perturbation-Theory and Machine Learning (HPTML) algorithm was used to seek a predictive model for CPA catalyst performance in terms of enantioselectivity in α-amidoalkylation reactions with R² = 0.96 overall for training and validation series. It involved a Monte Carlo sampling of >100,000 pairs of query and reference reactions. In addition, the computational and experimental investigation of a new set of intermolecular α-amidoalkylation reactions using BINOL-derived N-triflylphosphoramides as CPA catalysts is reported as a case study. The model was implemented in a web server called MATEO: InterMolecular Amidoalkylation Theoretical Enantioselectivity Optimization, available online at: https://cptmltool.rnasa-imedir.com/CPTMLTools-Web/mateo. This new user-friendly online computational tool would enable sustainable optimization of reaction conditions that could lead to the design of new CPA catalysts along with new organic synthesis products.
Affiliation(s)
- Paula Carracedo-Reboredo
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, CITIC-Research Center of Information and Communication Technologies, University of A Coruña, Campus Elviña s/n, 15071, A Coruña, Spain
- Eider Aranzamendi
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain
- Shan He
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain
  - IKERDATA S.L., ZITEK, University of Basque Country UPVEHU, Rectorate Building, 48940, Leioa, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain
- Cristian R Munteanu
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, CITIC-Research Center of Information and Communication Technologies, University of A Coruña, Campus Elviña s/n, 15071, A Coruña, Spain
- Carlos Fernandez-Lozano
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, CITIC-Research Center of Information and Communication Technologies, University of A Coruña, Campus Elviña s/n, 15071, A Coruña, Spain
- Nuria Sotomayor
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain
- Esther Lete
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Spain
|
5
|
Xia S, Chen E, Zhang Y. Integrated Molecular Modeling and Machine Learning for Drug Design. J Chem Theory Comput 2023; 19:7478-7495. [PMID: 37883810 PMCID: PMC10653122 DOI: 10.1021/acs.jctc.3c00814] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/26/2023] [Revised: 10/10/2023] [Accepted: 10/11/2023] [Indexed: 10/28/2023]
Abstract
Modern therapeutic development often involves several stages that are interconnected, and multiple iterations are usually required to bring a new drug to the market. Computational approaches have increasingly become an indispensable part of helping reduce the time and cost of the research and development of new drugs. In this Perspective, we summarize our recent efforts on integrating molecular modeling and machine learning to develop computational tools for modulator design, including a pocket-guided rational design approach based on AlphaSpace to target protein-protein interactions, delta machine learning scoring functions for protein-ligand docking as well as virtual screening, and state-of-the-art deep learning models to predict calculated and experimental molecular properties based on molecular mechanics optimized geometries. Meanwhile, we discuss remaining challenges and promising directions for further development and use a retrospective example of FDA approved kinase inhibitor Erlotinib to demonstrate the use of these newly developed computational tools.
Affiliation(s)
- Song Xia
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Eric Chen
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Yingkai Zhang
  - Department of Chemistry, New York University, New York, New York 10003, United States
  - Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
|
6
|
Devaraji V, Sivaraman J. Exploring the potential of machine learning to design antidiabetic molecules: a comprehensive study with experimental validation. J Biomol Struct Dyn 2023:1-22. [PMID: 37938122 DOI: 10.1080/07391102.2023.2275176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 10/20/2023] [Indexed: 11/09/2023]
Abstract
Recent advances in hardware and software algorithms have led to the rise of data-driven approaches for designing therapeutic modalities. One of the major causes of human mortality is diabetes. Thus, there is a tremendous opportunity for research into effective antidiabetic designs. Therefore, in this study, we used machine learning-based small molecule design. We used various chemoinformatic and binary fingerprint techniques on small molecules to construct multiple models for alpha-amylase inhibitors. Among these models, the top models were used for ensemble-based machine learning predictions on libraries of organic molecules supplemented with synthetic scaffolds that could be used as antidiabetic agents. Further, the study involved identifying 10 promising molecules from computational studies and determining their inhibitory effects on alpha-amylase. These molecules were synthesised and thoroughly analysed to assess their biological inhibitory properties. Then, thermodynamic simulations were conducted to determine the stability and affinity of experimentally active molecules. The results showed that the top 10 ML models achieved an average model score of 0.8216, a Pearson-r value of 0.827, and an external validation Q² value of 0.835, supporting their reliability and accuracy. Ten derivatives of benzothiophene dioxolane were the prime research focus on the basis of computational predictions. The biological inhibitory assay of synthesised molecules showed that small molecules with ID ALC5 and ALC6 exhibited inhibitory efficiencies (IC50) of 2.1 ± 0.14 µM and 5.71 ± 0.02 µM against alpha-amylase enzyme, whereas other molecules showed moderate inhibition. In conclusion, the positive results of the experiment indicate that researchers should explore machine learning-driven design. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Vinod Devaraji
  - Computational Drug Design Lab, Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- Jayanthi Sivaraman
  - Computational Drug Design Lab, Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
|
7
|
Venkatraman V. FP-MAP: an extensive library of fingerprint-based molecular activity prediction tools. Front Chem 2023; 11:1239467. [PMID: 37649967 PMCID: PMC10462816 DOI: 10.3389/fchem.2023.1239467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 07/31/2023] [Indexed: 09/01/2023] Open
Abstract
Discovering new drugs for disease treatment is challenging, requiring a multidisciplinary effort as well as time and resources. With a view to improving hit discovery and lead compound identification, machine learning (ML) approaches are being increasingly used in the decision-making process. Although a number of ML-based studies have been published, most studies only report fragments of the wider range of bioactivities wherein each model typically focuses on a particular disease. This study introduces FP-MAP, an extensive atlas of fingerprint-based prediction models that covers a diverse range of activities including neglected tropical diseases (caused by viral, bacterial and parasitic pathogens) as well as other targets implicated in diseases such as Alzheimer's. To arrive at the best predictive models, the performance of ≈4,000 classification/regression models was evaluated on different bioactivity data sets using 12 different molecular fingerprints. The best performing models that achieved test set AUC values of 0.62-0.99 have been integrated into an easy-to-use graphical user interface that can be downloaded from https://gitlab.com/vishsoft/fpmap.
Affiliation(s)
- Vishwesh Venkatraman
  - Department of Chemistry, Norwegian University of Science and Technology, Trondheim, Norway
|
8
|
Stoyanova R, Katzberger PM, Komissarov L, Khadhraoui A, Sach-Peltason L, Groebke Zbinden K, Schindler T, Manevski N. Computational Predictions of Nonclinical Pharmacokinetics at the Drug Design Stage. J Chem Inf Model 2023; 63:442-458. [PMID: 36595708 DOI: 10.1021/acs.jcim.2c01134] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 01/05/2023]
Abstract
Although computational predictions of pharmacokinetics (PK) are desirable at the drug design stage, existing approaches are often limited by prediction accuracy and human interpretability. Using a discovery data set of mouse and rat PK studies at Roche (9,685 unique compounds), we performed a proof-of-concept study to predict key PK properties from chemical structure alone, including plasma clearance (CLp), volume of distribution at steady-state (Vss), and oral bioavailability (F). Ten machine learning (ML) models were evaluated, including Single-Task, Multi-Task, and transfer learning approaches (i.e., pretraining with in vitro data). In addition to prediction accuracy, we emphasized human interpretability of outcomes, especially the quantification of uncertainty, applicability domains, and explanations of predictions in terms of molecular features. Results show that intravenous (IV) PK properties (CLp and Vss) can be predicted with good precision (average absolute fold error, AAFE of 1.96-2.84 depending on data split) and low bias (average fold error, AFE of 0.98-1.36), with AutoGluon, Gaussian Process Regressor (GP), and ChemProp displaying the best performance. Driven by higher complexity of oral PK studies, predictions of F were more challenging, with the best AAFE values of 2.35-2.60 and higher overprediction bias (AFE of 1.45-1.62). Multi-Task approaches and pretraining of ChemProp neural networks with in vitro data showed similar precision to Single-Task models but helped reduce the bias and increase correlations between observations and predictions. A combination of GP-computed prediction variance, molecular clustering, and dimensionality reduction provided valuable quantitative insights into prediction uncertainty and applicability domains. SHapley Additive exPlanations (SHAP) highlighted molecular features contributing to prediction outcomes of Vss, providing explanations that could aid drug design.
Combined results show that computational predictions of PK are feasible at the drug design stage, with several ML technologies converging to successfully leverage historical PK data sets. Further studies are needed to unlock the full potential of this approach, especially with respect to data set sizes and quality, transfer learning between in vitro and in vivo data sets, model-independent quantification of uncertainty, and explainability of predictions.
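The fold-error metrics quoted in this abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions, AFE = 10^mean(log10(predicted/observed)) and AAFE = 10^mean(|log10(predicted/observed)|); the abstract itself does not spell out the formulas, and the function name and sample values below are hypothetical.

```python
import math


def fold_errors(predicted, observed):
    """Return (AFE, AAFE) for paired positive predictions and observations.

    Assumed definitions (not stated in the abstract):
      AFE  = 10 ** mean(log10(pred / obs))     -> bias (1.0 means unbiased)
      AAFE = 10 ** mean(|log10(pred / obs)|)   -> precision (1.0 is perfect)
    """
    logs = [math.log10(p / o) for p, o in zip(predicted, observed)]
    afe = 10 ** (sum(logs) / len(logs))
    aafe = 10 ** (sum(abs(x) for x in logs) / len(logs))
    return afe, aafe


# Hypothetical clearance values: one 2-fold overprediction,
# one 2-fold underprediction, and one exact prediction.
predicted = [2.0, 5.0, 40.0]
observed = [1.0, 10.0, 40.0]
afe, aafe = fold_errors(predicted, observed)
print(f"AFE = {afe:.2f}, AAFE = {aafe:.2f}")
```

In this toy case the over- and underprediction cancel in the AFE, so the bias metric sits at 1 while the AAFE still reflects the spread, which is why the two numbers are usually reported together.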
Affiliation(s)
- Raya Stoyanova
  - Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Paul Maximilian Katzberger
  - Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Leonid Komissarov
  - Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Aous Khadhraoui
  - Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Lisa Sach-Peltason
  - Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Katrin Groebke Zbinden
  - Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Torsten Schindler
  - Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Nenad Manevski
  - Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
|
9
|
The use of machine learning modeling, virtual screening, molecular docking, and molecular dynamics simulations to identify potential VEGFR2 kinase inhibitors. Sci Rep 2022; 12:18825. [PMID: 36335233 PMCID: PMC9637137 DOI: 10.1038/s41598-022-22992-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2022] [Accepted: 10/21/2022] [Indexed: 11/08/2022] Open
Abstract
Targeting the signaling pathway of the vascular endothelial growth factor receptor-2 (VEGFR2) is a promising approach that has drawn attention in the quest to develop novel anti-cancer drugs and cardiovascular disease treatments. We construct a screening pipeline using machine learning classification integrated with similarity checks of approved drugs to find new inhibitors. The statistical metrics reveal that the random forest approach has slightly better performance. By further similarity screening against several approved drugs, two candidates are selected. Analysis of absorption, distribution, metabolism, excretion, and toxicity, along with molecular docking and dynamics, is performed for the two candidates with regorafenib as a reference. The binding energies of molecule1, molecule2, and regorafenib are -89.1, -95.3, and -87.4 kJ/mol, respectively, which suggests that the candidate compounds bind strongly to the target. Meanwhile, the median lethal dose and maximum tolerated dose for regorafenib, molecule1, and molecule2 are predicted to be 800, 1600, and 393 mg/kg, and 0.257, 0.527, and 0.428 log mg/kg/day, respectively. Also, the inhibitory activity of these compounds is predicted to be 7.23 and 7.31, which is comparable with the activity of the drugs pazopanib and sorafenib. In light of these findings, the two compounds could be further investigated as potential candidates for anti-angiogenesis therapy.
|
10
|
Integrating cell morphology with gene expression and chemical structure to aid mitochondrial toxicity detection. Commun Biol 2022; 5:858. [PMID: 35999457 PMCID: PMC9399120 DOI: 10.1038/s42003-022-03763-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 01/24/2022] [Accepted: 07/25/2022] [Indexed: 12/05/2022] Open
Abstract
Mitochondrial toxicity is an important safety endpoint in drug discovery. Models based solely on chemical structure for predicting mitochondrial toxicity are currently limited in accuracy and applicability domain to the chemical space of the training compounds. In this work, we aimed to utilize both -omics and chemical data to push beyond the state-of-the-art. We combined Cell Painting and Gene Expression data with chemical structural information from Morgan fingerprints for 382 chemical perturbants tested in the Tox21 mitochondrial membrane depolarization assay. We observed that mitochondrial toxicants differ from non-toxic compounds in morphological space and identified compound clusters having similar mechanisms of mitochondrial toxicity, thereby indicating that morphological space provides biological insights related to mechanisms of action of this endpoint. We further showed that models combining Cell Painting, Gene Expression features and Morgan fingerprints improved model performance on an external test set of 244 compounds by 60% (in terms of F1 score) and improved extrapolation to new chemical space. The performance of our combined models was comparable with dedicated in vitro assays for mitochondrial toxicity. Our results suggest that combining chemical descriptors with biological readouts enhances the detection of mitochondrial toxicants, with practical implications in drug discovery. Cell Painting, gene expression, and chemical structural data are used to examine the differences between mitochondrial toxicants and non-toxicants and enhance the detection of mitotoxic compounds for future drug discovery.
|
11
|
Comparative Analysis of Binary Similarity Measures for Compound Identification in Mass Spectrometry-Based Metabolomics. Metabolites 2022; 12:694. [PMID: 35893261 PMCID: PMC9394311 DOI: 10.3390/metabo12080694] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/09/2022] [Revised: 07/22/2022] [Accepted: 07/26/2022] [Indexed: 02/01/2023] Open
Abstract
Compound identification is a critical step in untargeted metabolomics. Its most important procedure is to calculate the similarity between experimental mass spectra and either predicted mass spectra or mass spectra in a mass spectral library. In contrast to continuous similarity measures, no study has assessed the performance of binary similarity measures in compound identification, even though the well-known Jaccard similarity measure has been widely used without proper evaluation. The objective of this study is thus to evaluate the performance of binary similarity measures for compound identification in untargeted metabolomics. Fifteen binary similarity measures, including the well-known Jaccard, Dice, Sokal–Sneath, Cosine, and Simpson measures, were selected to assess their performance in compound identification using both electron ionization (EI) and electrospray ionization (ESI) mass spectra. Our theoretical evaluations show that the accuracy of the compound identification was exactly the same between the Jaccard, Dice, 3W-Jaccard, Sokal–Sneath, and Kulczynski measures, between the Cosine and Hellinger measures, and between the McConnaughey and Driver–Kroeber measures, which was confirmed in practice using mass spectral libraries. From the mass spectrum-based evaluation, we observed that the best performing similarity measures were the McConnaughey and Driver–Kroeber measures for EI mass spectra and the Cosine and Hellinger measures for ESI mass spectra. The most robust similarity measure was the Fager–McGowan measure, the second-best performing similarity measure in both EI and ESI mass spectra.
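As a rough illustration of why some binary measures identify compounds identically, the sketch below computes Jaccard, Dice, and Cosine similarities on presence/absence peak vectors using their standard textbook definitions. The spectra, library names, and helper functions are hypothetical and are not taken from the paper. Because Dice equals 2J/(1+J), a monotonic function of the Jaccard value J, both measures rank library candidates in the same order.

```python
import math


def counts(x, y):
    """Count shared peaks (a) and peaks unique to x (b) or to y (c)."""
    a = sum(1 for i, j in zip(x, y) if i and j)
    b = sum(1 for i, j in zip(x, y) if i and not j)
    c = sum(1 for i, j in zip(x, y) if not i and j)
    return a, b, c


def jaccard(x, y):
    a, b, c = counts(x, y)
    return a / (a + b + c) if (a + b + c) else 0.0


def dice(x, y):
    a, b, c = counts(x, y)
    return 2 * a / (2 * a + b + c) if (2 * a + b + c) else 0.0


def cosine(x, y):
    a, b, c = counts(x, y)
    denom = math.sqrt((a + b) * (a + c))
    return a / denom if denom else 0.0


# Hypothetical binary spectra: 1 means a peak is present in that m/z bin.
query = [1, 1, 0, 1, 0, 1]
library = {
    "A": [1, 1, 0, 1, 0, 0],
    "B": [1, 0, 1, 0, 0, 1],
    "C": [0, 0, 1, 0, 1, 0],
}

rank_jaccard = sorted(library, key=lambda k: jaccard(query, library[k]), reverse=True)
rank_dice = sorted(library, key=lambda k: dice(query, library[k]), reverse=True)
print(rank_jaccard, rank_dice)
```

The scores themselves differ (for candidate A, Jaccard is 3/4 and Dice is 6/7), but the top hit and the full ranking agree, so identification accuracy is unchanged.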
|
12
|
Multi-Step In Silico Discovery of Natural Drugs against COVID-19 Targeting Main Protease. Int J Mol Sci 2022; 23:6912. [PMID: 35805916 PMCID: PMC9266348 DOI: 10.3390/ijms23136912] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 04/28/2022] [Revised: 06/15/2022] [Accepted: 06/15/2022] [Indexed: 02/04/2023] Open
Abstract
In continuation of our previous work against COVID-19, three natural compounds, namely, Luteoside C (130), Kahalalide E (184), and Streptovaricin B (278), were determined to be the most promising SARS-CoV-2 main protease (Mpro) inhibitors among 310 naturally originated antiviral compounds. This was performed via a multi-step in silico method. At first, a molecular structure similarity study was done with PRD_002214, the co-crystallized ligand of Mpro (PDB ID: 6LU7), and favored thirty compounds. Subsequently, the fingerprint study performed with respect to PRD_002214 resulted in the selection of sixteen compounds (7, 128, 130, 156, 157, 158, 180, 184, 203, 204, 210, 237, 264, 276, 277, and 278). Then, results of molecular docking versus Mpro PDB ID: 6LU7 favored eight compounds (128, 130, 156, 180, 184, 203, 204, and 278) based on their binding affinities. Then, in silico toxicity studies were performed for the promising compounds and revealed that all of them have good toxicity profiles. Finally, molecular dynamics (MD) simulation experiments were carried out for compounds 130, 184, and 278, which exhibited the best binding modes against Mpro. MD tests revealed that luteoside C (130) has the greatest potential to inhibit SARS-CoV-2 main protease.
|
13
|
Lenci E, Trabocchi A. Diversity‐Oriented Synthesis and Chemoinformatics: A Fruitful Synergy towards Better Chemical Libraries. European J Org Chem 2022. [DOI: 10.1002/ejoc.202200575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Elena Lenci
  - Università degli Studi di Firenze, Department of Chemistry, Via della Lastruccia 13, 50019 Sesto Fiorentino, Italy
- Andrea Trabocchi
  - University of Florence (Università degli Studi di Firenze), Department of Chemistry "Ugo Schiff", Italy
|
14
|
Naga D, Muster W, Musvasva E, Ecker GF. Off-targetP ML: an open source machine learning framework for off-target panel safety assessment of small molecules. J Cheminform 2022; 14:27. [PMID: 35525988 PMCID: PMC9077900 DOI: 10.1186/s13321-022-00603-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/29/2021] [Accepted: 03/26/2022] [Indexed: 11/10/2022]
Abstract
Unpredicted drug safety issues constitute the majority of failures in the pharmaceutical industry according to several studies. Some of these preclinical safety issues could be attributed to the non-selective binding of compounds to targets other than their intended therapeutic target, causing undesired adverse events. Consequently, pharmaceutical companies routinely run in vitro safety screens to detect off-target activities prior to preclinical and clinical studies. Here we present an open source machine learning framework aiming at the prediction of our in-house 50 off-target panel activities for ~4000 compounds, directly from their structure. This framework is intended to guide chemists in the drug design process prior to synthesis and to accelerate drug discovery. We also present a set of ML approaches that require minimum programming experience for deployment. The workflow incorporates different ML approaches such as deep learning and automated machine learning. It also accommodates common issues faced in bioactivity prediction, such as data imbalance, inter-target duplicated measurements and duplicated public compound identifiers. Throughout the workflow development, we explore and compare the capability of Neural Networks and AutoML in constructing prediction models for fifty off-targets of different protein classes, different dataset sizes, and high class imbalance. Outcomes from different methods are compared in terms of efficiency and efficacy. The most important challenges and factors impacting model construction and performance, in addition to suggestions on how to overcome such challenges, are also discussed.
Affiliation(s)
- Doha Naga
  - Roche Pharma Research & Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
  - Department of Pharmaceutical Sciences, University of Vienna, Vienna, Austria
- Wolfgang Muster
  - Roche Pharma Research & Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Eunice Musvasva
  - Roche Pharma Research & Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Gerhard F Ecker
  - Department of Pharmaceutical Sciences, University of Vienna, Vienna, Austria
|
15
|
Zagidullin B, Wang Z, Guan Y, Pitkänen E, Tang J. Comparative analysis of molecular fingerprints in prediction of drug combination effects. Brief Bioinform 2021; 22:bbab291. [PMID: 34401895 PMCID: PMC8574997 DOI: 10.1093/bib/bbab291] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 04/12/2021] [Revised: 06/01/2021] [Accepted: 07/07/2021] [Indexed: 12/18/2022]
Abstract
Application of machine and deep learning methods in drug discovery and cancer research has gained a considerable amount of attention in the past years. As the field grows, it becomes crucial to systematically evaluate the performance of novel computational solutions in relation to established techniques. To this end, we compare rule-based and data-driven molecular representations in prediction of drug combination sensitivity and drug synergy scores using standardized results of 14 high-throughput screening studies, comprising 64 200 unique combinations of 4153 molecules tested in 112 cancer cell lines. We evaluate the clustering performance of molecular representations and quantify their similarity by adapting the Centered Kernel Alignment metric. Our work demonstrates that to identify an optimal molecular representation type, it is necessary to supplement quantitative benchmark results with qualitative considerations, such as model interpretability and robustness, which may vary between and throughout preclinical drug development projects.
Affiliation(s)
- B Zagidullin
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Finland
- Z Wang
  - Department of Electrical Engineering & Computer Science, University of Michigan, Ann Arbor, USA
- Y Guan
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, USA
- E Pitkänen
  - Institute for Molecular Medicine Finland (FIMM) & Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Finland
- J Tang
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Finland
|
16
|
Huber F, van der Burg S, van der Hooft JJJ, Ridder L. MS2DeepScore: a novel deep learning similarity measure to compare tandem mass spectra. J Cheminform 2021; 13:84. [PMID: 34715914 PMCID: PMC8556919 DOI: 10.1186/s13321-021-00558-4] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 07/09/2021] [Accepted: 09/25/2021] [Indexed: 11/18/2022]
Abstract
Mass spectrometry data is one of the key sources of information in many workflows in medicine and across the life sciences. Mass fragmentation spectra are generally considered to be characteristic signatures of the chemical compound they originate from, yet the chemical structure itself usually cannot be easily deduced from the spectrum. Often, spectral similarity measures are used as a proxy for structural similarity but this approach is strongly limited by a generally poor correlation between both metrics. Here, we propose MS2DeepScore: a novel Siamese neural network to predict the structural similarity between two chemical structures solely based on their MS/MS fragmentation spectra. Using a cleaned dataset of > 100,000 mass spectra of about 15,000 unique known compounds, we trained MS2DeepScore to predict structural similarity scores for spectrum pairs with high accuracy. In addition, sampling different model varieties through Monte-Carlo Dropout is used to further improve the predictions and assess the model's prediction uncertainty. On 3600 spectra of 500 unseen compounds, MS2DeepScore is able to identify highly-reliable structural matches and to predict Tanimoto scores for pairs of molecules based on their fragment spectra with a root mean squared error of about 0.15. Furthermore, the prediction uncertainty estimate can be used to select a subset of predictions with a root mean squared error of about 0.1. Furthermore, we demonstrate that MS2DeepScore outperforms classical spectral similarity measures in retrieving chemically related compound pairs from large mass spectral datasets, thereby illustrating its potential for spectral library matching. Finally, MS2DeepScore can also be used to create chemically meaningful mass spectral embeddings that could be used to cluster large numbers of spectra. 
Added to the recently introduced unsupervised Spec2Vec metric, we believe that machine learning-supported mass spectral similarity measures have great potential for a range of metabolomics data processing pipelines.
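The uncertainty-based selection described in this abstract can be illustrated with a minimal Python sketch. It assumes that each spectrum pair has several stochastic predictions (as Monte-Carlo Dropout would produce) and uses the population standard deviation as the uncertainty measure; that choice, the function names, and the toy numbers are all assumptions made for illustration and are not taken from MS2DeepScore itself.

```python
import math
import statistics


def rmse(predicted, reference):
    """Root mean squared error between two equal-length score lists."""
    if len(predicted) != len(reference):
        raise ValueError("score lists must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def summarize_passes(passes):
    """Collapse repeated stochastic predictions for one pair into (mean, spread)."""
    return statistics.mean(passes), statistics.pstdev(passes)


def filter_by_uncertainty(all_passes, reference, max_spread):
    """Keep only the pairs whose prediction spread is at most max_spread."""
    kept_pred, kept_ref = [], []
    for passes, ref in zip(all_passes, reference):
        mean, spread = summarize_passes(passes)
        if spread <= max_spread:
            kept_pred.append(mean)
            kept_ref.append(ref)
    return kept_pred, kept_ref


# Toy data, invented for illustration: three spectrum pairs, three passes each.
all_passes = [
    [0.80, 0.82, 0.78],  # consistent predictions
    [0.20, 0.60, 0.40],  # widely scattered predictions
    [0.50, 0.52, 0.51],  # consistent predictions
]
reference = [0.85, 0.10, 0.50]  # hypothetical true Tanimoto scores

mean_pred = [summarize_passes(p)[0] for p in all_passes]
rmse_all = rmse(mean_pred, reference)
kept_pred, kept_ref = filter_by_uncertainty(all_passes, reference, max_spread=0.05)
rmse_kept = rmse(kept_pred, kept_ref)
print(f"RMSE on all pairs: {rmse_all:.3f}; on low-uncertainty pairs: {rmse_kept:.3f}")
```

In this toy case the scattered pair is discarded and the error on the remaining subset drops, which mirrors the trade-off between coverage and accuracy that the abstract reports.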
Affiliation(s)
- Florian Huber
  - Netherlands eScience Center, 1098 XG, Amsterdam, The Netherlands
- Lars Ridder
  - Netherlands eScience Center, 1098 XG, Amsterdam, The Netherlands
|
17
|
Shaker B, Ahmad S, Lee J, Jung C, Na D. In silico methods and tools for drug discovery. Comput Biol Med 2021; 137:104851. [PMID: 34520990 DOI: 10.1016/j.compbiomed.2021.104851] [Citation(s) in RCA: 127] [Impact Index Per Article: 42.3] [Received: 07/06/2021] [Revised: 09/05/2021] [Accepted: 09/05/2021] [Indexed: 12/28/2022]
Abstract
In the past, conventional drug discovery strategies have been successfully employed to develop new drugs, but the process from lead identification to clinical trials takes more than 12 years and costs approximately $1.8 billion USD on average. Recently, in silico approaches have been attracting considerable interest because of their potential to accelerate drug discovery in terms of time, labor, and costs. Many new drug compounds have been successfully developed using computational methods. In this review, we briefly introduce computational drug discovery strategies and outline up-to-date tools to perform the strategies as well as available knowledge bases for those who develop their own computational models. Finally, we introduce successful examples of anti-bacterial, anti-viral, and anti-cancer drug discoveries that were made using computational methods.
Affiliation(s)
- Bilal Shaker
  - Department of Biomedical Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
- Sajjad Ahmad
  - Department of Health and Biological Sciences, Abasyn University, Peshawar, 25000, Pakistan
- Jingyu Lee
  - Department of Biomedical Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
- Chanjin Jung
  - Department of Biomedical Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
- Dokyun Na
  - Department of Biomedical Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
|
18
|
Manelfi C, Gemei M, Talarico C, Cerchia C, Fava A, Lunghini F, Beccari AR. "Molecular Anatomy": a new multi-dimensional hierarchical scaffold analysis tool. J Cheminform 2021; 13:54. [PMID: 34301327 PMCID: PMC8299179 DOI: 10.1186/s13321-021-00526-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/30/2020] [Accepted: 06/13/2021] [Indexed: 11/10/2022]
Abstract
The scaffold representation is widely employed to classify bioactive compounds on the basis of common core structures or to correlate compound classes with specific biological activities. In this paper, we present a novel approach called "Molecular Anatomy" as a flexible and unbiased molecular scaffold-based metric to cluster large sets of compounds. We introduce a set of nine molecular representations at different abstraction levels, combined with fragmentation rules, to define a multi-dimensional network of hierarchically interconnected molecular frameworks. We demonstrate that the introduction of a flexible scaffold definition and multiple pruning rules is an effective method to identify relevant chemical moieties. This approach makes it possible to cluster together active molecules belonging to different molecular classes, capturing most of the structure-activity information, in particular when libraries containing a huge number of singletons are analyzed. We also propose a procedure to derive a network visualization that allows a full graphical representation of compound datasets, permitting efficient navigation in scaffold space and contributing significantly to high-quality SAR analysis. The protocol is freely available as a web interface at https://ma.exscalate.eu.
Affiliation(s)
- Candida Manelfi
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Marica Gemei
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Carmine Talarico
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Carmen Cerchia
  - Department of Pharmacy, University of Naples "Federico II", 80131, Napoli, Italy
- Anna Fava
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Filippo Lunghini
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
|
19
|
Ekaney LYE, Eni DB, Ntie-Kang F. Chemical similarity methods for analyzing secondary metabolite structures. Physical Sciences Reviews 2021. [DOI: 10.1515/psr-2018-0129] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
The relation that exists between the structure of a compound and its function is an integral part of chemoinformatics. The similarity principle states that “structurally similar molecules tend to have similar properties and similar molecules exert similar biological activities”. The similarity of molecules can be studied either at the structure level or at the descriptor (property) level. Generally, the objective of chemical similarity measures is to enhance prediction of the biological activities of molecules. This article provides an overview of various methods used to compare the similarity between metabolite structures, including two-dimensional (2D) and three-dimensional (3D) approaches. The focus is on describing the methods, e.g. fingerprint-based similarity, in which the molecules under study are first fragmented and their fingerprints are computed; 2D structural similarity, by comparing Tanimoto coefficients and Euclidean distances; and similarity methods based on physicochemical property descriptors. The similarity between molecules can also be measured using data mining (clustering) techniques, e.g. virtual screening (VS)-based similarity methods. In this approach, molecules with the desired descriptors and/or structures are screened from large databases. Lastly, SMILES-based chemical similarity search is an important method for exact structure search, substructure search and descriptor similarity. The choice of method depends on the requirements of the researcher.
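The two 2D measures named in this abstract, the Tanimoto coefficient and the Euclidean distance, can be written out in a short Python sketch. The fingerprints are represented simply as sets of "on" bit positions and the descriptor vectors as plain lists; the bit positions and descriptor values below are invented for illustration, and no particular fingerprinting scheme from the article is assumed.

```python
import math


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as collections of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / union


def euclidean(desc_a, desc_b):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(desc_a) != len(desc_b):
        raise ValueError("descriptor vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(desc_a, desc_b)))


# Hypothetical fingerprints: indices of the bits that are set.
fp_a = {1, 4, 7, 9}
fp_b = {1, 4, 8, 9, 12}
print(tanimoto(fp_a, fp_b))  # 3 shared bits out of 6 distinct bits

# Hypothetical descriptor vectors for two molecules.
desc_a = [3.0, 0.0, 1.0]
desc_b = [0.0, 4.0, 1.0]
print(euclidean(desc_a, desc_b))
```

Note that the two measures point in opposite directions: a larger Tanimoto coefficient means more similar, whereas a larger Euclidean distance means less similar. Descriptors on very different scales would normally be standardized before a distance is computed.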
Affiliation(s)
- Lena Y. E. Ekaney
  - Faculty of Science, Department of Chemistry, University of Buea, P.O. Box 63, Buea, Cameroon
- Donatus B. Eni
  - Faculty of Science, Department of Chemistry, University of Buea, P.O. Box 63, Buea, Cameroon
  - Department of Inorganic Chemistry, Faculty of Science, University of Yaoundé I, Yaoundé, Cameroon
- Fidele Ntie-Kang
  - Faculty of Science, Department of Chemistry, University of Buea, P.O. Box 63, Buea, Cameroon
  - Department of Pharmaceutical Chemistry, Martin-Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3, Halle (Saale), 06120 Germany
  - Department of Informatics and Chemistry, University of Chemistry and Technology Prague, Technická 5 Prague 6, Dejvice, 166 28 Czech Republic
|
20
|
Bajusz D, Miranda-Quintana RA, Rácz A, Héberger K. Extended many-item similarity indices for sets of nucleotide and protein sequences. Comput Struct Biotechnol J 2021; 19:3628-3639. [PMID: 34257841 PMCID: PMC8253954 DOI: 10.1016/j.csbj.2021.06.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/11/2021] [Revised: 06/07/2021] [Accepted: 06/14/2021] [Indexed: 12/16/2022]
Abstract
Quantification of similarities between protein sequences or DNA/RNA strands is a (sub-)task that is ubiquitously present in bioinformatics workflows, and is usually accomplished by pairwise comparisons of sequences, utilizing simple (e.g. percent identity) or more intricate concepts (e.g. substitution scoring matrices). Complex tasks (such as clustering) rely on a large number of pairwise comparisons under the hood, instead of a direct quantification of set similarities. Based on our recently introduced framework that enables multiple comparisons of binary molecular fingerprints (i.e., direct calculation of the similarity of fingerprint sets), here we introduce novel symmetric similarity indices for analogous calculations on sets of character sequences with more than two (t) possible items (e.g. DNA/RNA sequences with t = 4, or protein sequences with t = 20). The features of these new indices are studied in detail with analysis of variance (ANOVA), and demonstrated with three case studies of protein/DNA sequences with varying degrees of similarity (or evolutionary proximity). The Python code for the extended many-item similarity indices is publicly available at: https://github.com/ramirandaq/tn_Comparisons.
Affiliation(s)
- Dávid Bajusz
  - Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117 Budapest, Hungary
- Anita Rácz
  - Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117 Budapest, Hungary
- Károly Héberger
  - Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117 Budapest, Hungary
|
21
|
Zhu J, Teng G, Li D, Hou R, Xia Y. Synthesis and antibacterial activity of novel Schiff bases of thiosemicarbazone derivatives with adamantane moiety. Med Chem Res 2021. [DOI: 10.1007/s00044-021-02759-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/18/2022]
|
22
|
Medeiros AR, Ferreira LLG, de Souza ML, de Oliveira Rezende Junior C, Espinoza-Chávez RM, Dias LC, Andricopulo AD. Chemoinformatics Studies on a Series of Imidazoles as Cruzain Inhibitors. Biomolecules 2021; 11:biom11040579. [PMID: 33920961 PMCID: PMC8071344 DOI: 10.3390/biom11040579] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/15/2021] [Revised: 04/05/2021] [Accepted: 04/13/2021] [Indexed: 11/16/2022]
Abstract
Natural products based on imidazole scaffolds have inspired the discovery of a wide variety of bioactive compounds. Herein, a series of imidazoles that act as competitive and potent cruzain inhibitors was investigated using a combination of ligand- and structure-based drug design strategies. Quantitative structure-activity relationships (QSARs) were generated along with the investigation of enzyme-inhibitor molecular interactions. Predictive hologram QSAR (HQSAR, r2pred = 0.80) and AutoQSAR (q2 = 0.90) models were built, and key structural properties that underpin cruzain inhibition were identified. Moreover, comparative molecular field analysis (CoMFA, r2pred = 0.81) and comparative molecular similarity indices analysis (CoMSIA, r2pred = 0.73) revealed 3D molecular features that strongly affect the activity of the inhibitors. These findings were examined along with molecular docking studies and were highly compatible with the intermolecular contacts that take place between cruzain and the inhibitors. The results gathered herein revealed the main factors that determine the activity of the imidazoles studied and provide novel knowledge for the design of improved cruzain inhibitors.
Affiliation(s)
- Alex R. Medeiros
  - Laboratório de Química Medicinal e Computacional, Centro de Pesquisa e Inovação em Biodiversidade e Fármacos, Instituto de Física de São Carlos, Universidade de São Paulo, Av. João Dagnone 1100, São Carlos, SP 13563-120, Brazil
- Leonardo L. G. Ferreira
  - Laboratório de Química Medicinal e Computacional, Centro de Pesquisa e Inovação em Biodiversidade e Fármacos, Instituto de Física de São Carlos, Universidade de São Paulo, Av. João Dagnone 1100, São Carlos, SP 13563-120, Brazil
- Mariana L. de Souza
  - Laboratório de Química Medicinal e Computacional, Centro de Pesquisa e Inovação em Biodiversidade e Fármacos, Instituto de Física de São Carlos, Universidade de São Paulo, Av. João Dagnone 1100, São Carlos, SP 13563-120, Brazil
- Rocío Marisol Espinoza-Chávez
  - Instituto de Química, Universidade Estadual de Campinas, Campinas, SP 13084-971, Brazil
- Luiz Carlos Dias
  - Instituto de Química, Universidade Estadual de Campinas, Campinas, SP 13084-971, Brazil
- Adriano D. Andricopulo
  - Laboratório de Química Medicinal e Computacional, Centro de Pesquisa e Inovação em Biodiversidade e Fármacos, Instituto de Física de São Carlos, Universidade de São Paulo, Av. João Dagnone 1100, São Carlos, SP 13563-120, Brazil
  - Correspondence: Tel.: +55-16-33739844
|
23
|
Abdo A, Pupin M. LINGO-DL: a text-based approach for molecular similarity searching. J Comput Aided Mol Des 2021; 35:657-665. [PMID: 33797669 DOI: 10.1007/s10822-021-00383-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2020] [Accepted: 03/26/2021] [Indexed: 11/24/2022]
Abstract
The line notations of chemical structures are more compact than those of graphs and connection tables, so they can be useful for storing and transferring a large number of molecular structures. The simplified molecular input line system (SMILES) representation is the most extensively used, as it is much easier to utilise and comprehend than others, and it can be generated automatically from connection tables. A SMILES represents and encodes the molecule structure. It has been used by an existing method, LINGO, to calculate the molecular similarities and predict the structure-related properties. The LINGO method decomposes a canonical SMILES into a set of substrings of four characters referred to as LINGOs. The purpose of LINGO method is to measure the similarity between a pair of molecules by comparing the LINGOs that occur in each molecule. This paper aims to introduce an alternative version of the LINGO method using LINGOs of different lengths, called LINGO-DL. LINGO-DL is based on the fragmentation of canonical SMILES into substrings of three different lengths rather than one in LINGO method. Retrospective virtual screening experiments with MDDR, DUD, and MUV datasets show that the LINGO-DL outperforms the LINGO method, especially when the active molecules being sought have a high degree of structural heterogeneity.
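The substring decomposition described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input strings are used as given (no canonicalization or other SMILES preprocessing is performed), the comparison is a plain multiset Tanimoto ratio rather than the exact scoring formula of the published LINGO method, and the three lengths passed in the multi-length example are hypothetical, since the abstract does not say which lengths LINGO-DL uses.

```python
from collections import Counter


def lingos(smiles, q=4):
    """Return the multiset of all overlapping substrings of length q."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def multiset_tanimoto(counts_a, counts_b):
    """Shared substring count divided by combined substring count."""
    shared = sum((counts_a & counts_b).values())  # element-wise minimum
    total = sum((counts_a | counts_b).values())   # element-wise maximum
    return shared / total if total else 0.0


def lingo_similarity(smiles_a, smiles_b, lengths=(4,)):
    """Compare two SMILES strings using substrings of one or more lengths."""
    a, b = Counter(), Counter()
    for q in lengths:
        a.update(lingos(smiles_a, q))
        b.update(lingos(smiles_b, q))
    return multiset_tanimoto(a, b)


# Two small example SMILES strings that differ only in the last atom.
ester = "CCOC(=O)C"
carbamate = "CCOC(=O)N"

print(lingo_similarity(ester, carbamate))                     # four-character substrings only
print(lingo_similarity(ester, carbamate, lengths=(3, 4, 5)))  # hypothetical multi-length variant
```

Because the comparison works on raw text, two different SMILES spellings of the same molecule would score below 1, which is why the abstract specifies canonical SMILES as input.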
Affiliation(s)
- Ammar Abdo
  - Universite de Lille, Villeneuve d'Ascq cedex, France
- Maude Pupin
  - Universite de Lille, Villeneuve d'Ascq cedex, France
|
24
|
Huang DZ, Baber JC, Bahmanyar SS. The challenges of generalizability in artificial intelligence for ADME/Tox endpoint and activity prediction. Expert Opin Drug Discov 2021; 16:1045-1056. [PMID: 33739897 DOI: 10.1080/17460441.2021.1901685] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023]
Abstract
INTRODUCTION: Artificial intelligence (AI) has seen a massive resurgence in recent years with wide successes in computer vision, natural language processing, and games. The similar creation of robust and accurate AI models for ADME/Tox endpoint and activity prediction would be revolutionary to drug discovery pipelines. There have been numerous demonstrations of successful applications, but a key challenge remains: how generalizable are these predictive models? AREAS COVERED: The authors present a summary of current promising components of AI models in the context of early drug discovery where ADME/Tox endpoint and activity prediction is the main driver of the iterative drug design process. Following that is a review of applicability domains and dataset construction considerations which determine generalizability bottlenecks for AI deployment. Further reviewed is the role of promising learning frameworks - multitask, transfer, and meta learning - which leverage auxiliary data to overcome issues of generalizability. EXPERT OPINION: The authors conclude that the most promising direction toward integrating reliable and informative AI models into the drug discovery pipeline is a conjunction of learned feature representations, deep learning, and novel learning frameworks. Such a solution would address the sparse and incomplete datasets that are available for key endpoints related to drug discovery.
Affiliation(s)
- J Christian Baber
  - Scientific Informatics, Global Head of Scientific Informatics, Scientific Informatics, Takeda Pharmaceuticals, Cambridge, MA, USA
- Sogole Sami Bahmanyar
  - Computational Chemistry, Director of Computational Sciences, Computational Chemistry, Takeda Pharmaceuticals, San Diego, USA
|
25
|
Design, Synthesis, and Evaluation of Novel 3-Carboranyl-1,8-Naphthalimide Derivatives as Potential Anticancer Agents. Int J Mol Sci 2021; 22:ijms22052772. [PMID: 33803403 PMCID: PMC7967199 DOI: 10.3390/ijms22052772] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/18/2021] [Revised: 03/04/2021] [Accepted: 03/06/2021] [Indexed: 12/11/2022]
Abstract
We synthesized a series of novel 3-carboranyl-1,8-naphthalimide derivatives, mitonafide and pinafide analogs, using click chemistry, reductive amination and amidation reactions and investigated their in vitro effects on cytotoxicity, cell death, cell cycle, and the production of reactive oxygen species in a HepG2 cancer cell line. The analyses showed that modified naphthalic anhydrides and naphthalimides bearing ortho- or meta-carboranes exhibited diversified activity. Naphthalimides were more cytotoxic than naphthalic anhydrides, with the highest IC50 value determined for compound 9 (3.10 µM). These compounds were capable of inducing cell cycle arrest at G0/G1 or G2M phase and promoting apoptosis, autophagy or ferroptosis. The most promising conjugate 35 caused strong apoptosis and induced ROS production, which was proven by the increased level of 2′-deoxy-8-oxoguanosine in DNA. The tested conjugates were found to be weak topoisomerase II inhibitors and classical DNA intercalators. Compounds 33, 34, and 36 fluorescently stained lysosomes in HepG2 cells. Additionally, we performed a similarity-based assessment of the property profile of the conjugates using the principal component analysis. The creation of an inhibitory profile and descriptor-based plane allowed forming a structure–activity landscape. Finally, a ligand-based comparative molecular field analysis was carried out to specify the (un)favorable structural modifications (pharmacophoric pattern) that are potentially important for the quantitative structure–activity relationship modeling of the carborane–naphthalimide conjugates.
|
26
|
Dey D, Paul PK, Al Azad S, Al Mazid MF, Khan AM, Sharif MA, Rahman MH. Molecular optimization, docking, and dynamic simulation profiling of selective aromatic phytochemical ligands in blocking the SARS-CoV-2 S protein attachment to ACE2 receptor: an in silico approach of targeted drug designing. J Adv Vet Anim Res 2021; 8:24-35. [PMID: 33860009 PMCID: PMC8043340 DOI: 10.5455/javar.2021.h481] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 10/07/2020] [Revised: 12/11/2020] [Accepted: 12/24/2020] [Indexed: 12/16/2022]
Abstract
OBJECTIVES: The comprehensive in silico study aims to figure out the most effective aromatic phytochemical ligands among a number from a library, considering their pharmacokinetic efficacies in blocking "angiotensin-converting enzyme 2 (ACE2) receptor-severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) S protein" complex formation as part of a target-specific drug designing. MATERIALS AND METHODS: A library of 57 aromatic pharmacophore phytochemical ligands was prepared from where the top five ligands depending on Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) and quantitative structure-activity relationship (QSAR)-based pharmacokinetic properties were considered. The selected ligands were optimized for commencing molecular docking and dynamic simulation as a complex with the ACE2 receptor to compare their blocking efficacy with the control drug. The ligand-receptor complexes' accuracy in preventing the Spike (S) protein of SARS-CoV-2 penetration inside the host cells has been analyzed through hydrogen-hydrophobic bond interactions, principal component analysis (PCA), root mean square deviation (RMSD), root mean square fluctuation (RMSF), and B-Factor. Advanced in silico programming language and bioanalytical software were used for high throughput and authentic results. RESULTS: ADMET and QSAR revealed Rhamnetin, Lactupicrin, Rhinacanthin D, Flemiflavanone D, and Exiguaflavanone A as the ligands of our interest to be compared with the control Cassiarin D. According to the molecular docking binding affinity to block ACE2 receptor, the efficiency mountings were Rhinacanthin D > Flemiflavanone D > Lactupicrin > Exiguaflavanone A > Rhamnetin. The binding affinity of the Cassiarin D-ACE2 complex was (-10.2 KJ/mol) found inferior to the Rhinacanthin D-ACE2 complex (-10.8 KJ/mol), referring to Rhinacanthin D as a more stable candidate to use as drugs.
The RMSD values of protein-ligand complexes evaluated according to their structural conformation and stable binding pose ranged between 0.1 and 2.1 Å. The B-factor showed that very few loops were present in the protein structure. The RMSF peak fluctuation regions ranged 5-250, predicting efficient ligand-receptor interactions. CONCLUSION: The experiment sequentially measures all the parameters required in referring to any pharmacophore as a drug, considering which all aromatic components analyzed in the study can strongly be predicted as target-specific medication against the novel coronavirus 2019 infection.
Affiliation(s)
- Dipta Dey
  - Biochemistry and Molecular Biology Department, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj, Bangladesh
- Parag Kumar Paul
  - Centre for Energy Research, Department of Electrical and Electronic Engineering, United International University, Dhaka, Bangladesh
- Salauddin Al Azad
  - Fermentation Engineering Major, School of Biotechnology, Jiangnan University, Wuxi, China
- Mohammad Faysal Al Mazid
  - Department of Biomedical Science, Korea Institute of Science and Technology, Seongbuk-gu, Seoul-02792, Republic of Korea
  - University of Science and Technology, Daejeon, Republic of Korea
- Arman Mahmud Khan
  - Biotechnology and Genetic Engineering Discipline, Life Science School, Khulna University, Khulna, Bangladesh
- Md. Arman Sharif
  - Biotechnology and Genetic Engineering Discipline, Life Science School, Khulna University, Khulna, Bangladesh
- Md. Hafijur Rahman
  - Biotechnology and Genetic Engineering Discipline, Life Science School, Khulna University, Khulna, Bangladesh
|
27
|
Seal S, Yang H, Vollmers L, Bender A. Comparison of Cellular Morphological Descriptors and Molecular Fingerprints for the Prediction of Cytotoxicity- and Proliferation-Related Assays. Chem Res Toxicol 2021; 34:422-437. [PMID: 33522793 DOI: 10.1021/acs.chemrestox.0c00303] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 12/22/2022]
Abstract
Cell morphology features, such as those from the Cell Painting assay, can be generated at relatively low costs and represent versatile biological descriptors of a system and thereby compound response. In this study, we explored cell morphology descriptors and molecular fingerprints, separately and in combination, for the prediction of cytotoxicity- and proliferation-related in vitro assay endpoints. We selected 135 compounds from the MoleculeNet ToxCast benchmark data set which were annotated with Cell Painting readouts, where the relatively small size of the data set is due to the overlap of required annotations. We trained Random Forest classification models using nested cross-validation and Cell Painting descriptors, Morgan and ErG fingerprints, and their combinations. While using leave-one-cluster-out cross-validation (with clusters based on physicochemical descriptors), models using Cell Painting descriptors achieved higher average performance over all assays (Balanced Accuracy of 0.65, Matthews Correlation Coefficient of 0.28, and AUC-ROC of 0.71) compared to models using ErG fingerprints (BA 0.55, MCC 0.09, and AUC-ROC 0.60) and Morgan fingerprints alone (BA 0.54, MCC 0.06, and AUC-ROC 0.56). While using random shuffle splits, the combination of Cell Painting descriptors with ErG and Morgan fingerprints further improved balanced accuracy on average by 8.9% (in 9 out of 12 assays) and 23.4% (in 8 out of 12 assays) compared to using only ErG and Morgan fingerprints, respectively. Regarding feature importance, Cell Painting descriptors related to nuclei texture, granularity of cells, and cytoplasm as well as cell neighbors and radial distributions were identified to be most contributing, which is plausible given the endpoint considered. 
We conclude that cell morphological descriptors contain complementary information to molecular fingerprints which can be used to improve the performance of predictive cytotoxicity models, in particular in areas of novel structural space.
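The two headline metrics quoted in this abstract, balanced accuracy and the Matthews correlation coefficient, can both be derived from a binary confusion matrix. The following is a minimal sketch under stated assumptions: the labels and predictions are invented toy values rather than data from the study, and the function names are illustrative and not taken from the authors' code.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 means active/toxic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; both classes must be present."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must occur in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def matthews_corrcoef(y_true, y_pred):
    """MCC in [-1, 1]; returns 0.0 when the denominator vanishes."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical imbalanced toy assay: 2 actives among 8 compounds.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
ba = balanced_accuracy(y_true, y_pred)
mcc = matthews_corrcoef(y_true, y_pred)
```

Both metrics are less sensitive to class imbalance than plain accuracy, which is presumably why they are reported alongside AUC-ROC for small, skewed assay sets.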
Affiliation(s)
- Srijit Seal
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Hongbin Yang
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Luis Vollmers
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| |
|
28
|
Wang C, Kurgan L. Survey of Similarity-Based Prediction of Drug-Protein Interactions. Curr Med Chem 2021; 27:5856-5886. [PMID: 31393241 DOI: 10.2174/0929867326666190808154841] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2017] [Revised: 04/16/2018] [Accepted: 10/23/2018] [Indexed: 12/20/2022]
Abstract
Therapeutic activity of a significant majority of drugs is determined by their interactions with proteins. Databases of drug-protein interactions (DPIs) primarily focus on the therapeutic protein targets while the knowledge of the off-targets is fragmented and partial. One way to bridge this knowledge gap is to employ computational methods to predict protein targets for a given drug molecule, or interacting drugs for given protein targets. We survey a comprehensive set of 35 methods that were published in high-impact venues and that predict DPIs based on similarity between drugs and similarity between protein targets. We analyze the internal databases of known DPIs that these methods utilize to compute similarities, and investigate how they are linked to the 12 publicly available source databases. We discuss contents, impact and relationships between these internal and source databases, as well as the timeline of their releases and publications. The 35 predictors exploit and often combine three types of similarities that consider drug structures, drug profiles, and target sequences. We review the predictive architectures of these methods, their impact, and we explain how their internal DPI databases are linked to the source databases. We also include a detailed timeline of the development of these predictors and discuss the underlying limitations of the current resources and predictive tools. Finally, we provide several recommendations concerning the future development of the related databases and methods.
Affiliation(s)
- Chen Wang
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
| |
|
29
|
Sedykh AY, Shah RR, Kleinstreuer NC, Auerbach SS, Gombar VK. Saagar-A New, Extensible Set of Molecular Substructures for QSAR/QSPR and Read-Across Predictions. Chem Res Toxicol 2020; 34:634-640. [PMID: 33356152 DOI: 10.1021/acs.chemrestox.0c00464] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
Abstract
Molecular structure-based predictive models provide a proven alternative to costly and inefficient animal testing. However, due to a lack of interpretability of predictive models built with abstract molecular descriptors they have earned the notoriety of being black boxes. Interpretable models require interpretable descriptors to provide chemistry-backed predictive reasoning and facilitate intelligent molecular design. We developed a novel set of extensible chemistry-aware substructures, Saagar, to support interpretable predictive models and read-across protocols. Performance of Saagar in chemical characterization and search for structurally similar actives for read-across applications was compared with four publicly available fingerprint sets (MACCS (166), PubChem (881), ECFP4 (1024), ToxPrint (729)) in three benchmark sets (MUV, ULS, and Tox21) spanning ∼145 000 compounds and 78 molecular targets at 1%, 2%, 5%, and 10% false discovery rates. In 18 of the 20 comparisons, interpretable Saagar features performed better than the publicly available, but less interpretable and fixed-bit length, fingerprints. Examples are provided to show the enhanced capability of Saagar in extracting compounds with higher scaffold similarity. Saagar features are interpretable and efficiently characterize diverse chemical collections, thus making them a better choice for building interpretable predictive in silico models and read-across protocols.
Affiliation(s)
| | - Ruchir R Shah
- Sciome LLC, Research Triangle Park, North Carolina 27709, United States
| | - Nicole C Kleinstreuer
- National Institute of Environmental Health Sciences (NIEHS), National Toxicology Program (NTP), Research Triangle Park, North Carolina 27709, United States
| | - Scott S Auerbach
- National Institute of Environmental Health Sciences (NIEHS), National Toxicology Program (NTP), Research Triangle Park, North Carolina 27709, United States
| | - Vijay K Gombar
- Sciome LLC, Research Triangle Park, North Carolina 27709, United States
| |
|
30
|
Lu H, Qi Y, Zhao Y, Jin N. Effects of Hydroxyl Group on the Interaction of Carboxylated Flavonoid Derivatives with S. Cerevisiae α-Glucosidase. Curr Comput Aided Drug Des 2020; 16:31-44. [PMID: 30345924 PMCID: PMC6967131 DOI: 10.2174/1573409914666181022142553] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2018] [Revised: 08/23/2018] [Accepted: 10/17/2018] [Indexed: 02/07/2023]
Abstract
Introduction Carboxyalkyl flavonoid derivatives are considered effective inhibitors for reducing post-prandial hyperglycaemia. Methods Molecular docking and charge density analysis, combined with Density Functional Theory (DFT) and the theory of Atoms in Molecules (AIM), are carried out to understand the molecular flexibility, charge density distribution and electrostatic properties of these carboxyalkyl derivatives. Results The electron density of the chemical bond C14-O17 on the B ring of molecule II increases while that of O17-H18 decreases at the active site, suggesting the existence of weak non-covalent interactions, the most prominent of which are H-bonding and electrostatic interaction. When hydroxyl groups are introduced, the highest positive electrostatic potentials are distributed near the B ring hydroxyl hydrogen atom and the carboxyl hydrogen atom on the A ring. Quercetin has been reported to have considerable inhibitory activity against S. cerevisiae α-glucosidase. The binding affinities suggest that the position and number of hydroxyl groups on the B and C rings are also pivotal to the hypoglycemic activity when the long carboxyalkyl group is introduced into the A ring. Conclusion Three well-defined zones in the structure are necessary: the hydrophobic alkyl, the hydrophilic carboxyl, and the hydroxyl groups.
Affiliation(s)
- Huining Lu
- Department of Life Sciences and Biological Engineering, Northwest Minzu University, Lanzhou 730124, China
| | - Yanjiao Qi
- Department of Chemical Engineering, Northwest Minzu University, Lanzhou 730124, China.,Key Laboratory for Utility of Environment-Friendly Composite Materials and Biomass in Universities of Gansu Province, Lanzhou, China
| | - Yaming Zhao
- Department of Chemical Engineering, Northwest Minzu University, Lanzhou 730124, China
| | - Nengzhi Jin
- Gansu Province Computing Center, Lanzhou 730000, China
| |
|
31
|
Cheirdaris DG. Artificial Neural Networks in Computer-Aided Drug Design: An Overview of Recent Advances. Adv Exp Med Biol 2020; 1194:115-125. [PMID: 32468528 DOI: 10.1007/978-3-030-32622-7_10] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
Computer-aided drug design (CADD) is the framework in which the huge amount of data accumulated by high-throughput experimental methods used in drug design is quantitatively studied. Its objectives include pattern recognition, biomarker identification and/or classification, etc. In order to achieve these objectives, machine learning algorithms and especially artificial neural networks (ANNs) have been used over ADMET factor testing and QSAR modeling evaluation. This paper provides an overview of the current trends in CADD-applied ANNs, since their use was re-boosted over a decade ago.
|
32
|
Kos J, Bak A, Kozik V, Jankech T, Strharsky T, Swietlicka A, Michnova H, Hosek J, Smolinski A, Oravec M, Devinsky F, Hutta M, Jampilek J. Biological Activities and ADMET-Related Properties of Novel Set of Cinnamanilides. Molecules 2020; 25:4121. [PMID: 32916979 PMCID: PMC7570544 DOI: 10.3390/molecules25184121] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2020] [Revised: 09/03/2020] [Accepted: 09/08/2020] [Indexed: 12/12/2022] Open
Abstract
A series of nineteen novel ring-substituted N-arylcinnamanilides was synthesized and characterized. All investigated compounds were tested against Staphylococcus aureus as the reference strain, two clinical isolates of methicillin-resistant S. aureus (MRSA), and Mycobacterium tuberculosis. (2E)-N-[3-Fluoro-4-(trifluoromethyl)phenyl]-3-phenylprop-2-enamide showed even better activity (minimum inhibitory concentration (MIC) 25.9 and 12.9 µM) against MRSA isolates than the commonly used ampicillin (MIC 45.8 µM). The screening of the cell viability was performed using THP1-Blue™ NF-κB cells and, except for (2E)-N-(4-bromo-3-chlorophenyl)-3-phenylprop-2-enamide (IC50 6.5 µM), none of the discussed compounds showed any significant cytotoxic effect up to 20 μM. Moreover, all compounds were tested for their anti-inflammatory potential; several compounds attenuated the lipopolysaccharide-induced NF-κB activation and were more potent than the parental cinnamic acid. The lipophilicity values were specified experimentally as well. In addition, in silico approximation of the lipophilicity values was performed employing a set of free/commercial clogP estimators, corrected afterwards by the corresponding pKa calculated at physiological pH and subsequently cross-compared with the experimental parameters. The similarity-driven property space evaluation of structural analogs was carried out using the principal component analysis, Tanimoto metrics, and Kohonen mapping.
Affiliation(s)
- Jiri Kos
- Regional Centre of Advanced Technologies and Materials, Faculty of Science, Palacky University, Slechtitelu 27, 78371 Olomouc, Czech Republic; (J.K.); (T.S.); (H.M.); (J.H.)
| | - Andrzej Bak
- Department of Chemistry, University of Silesia, Szkolna 9, 40007 Katowice, Poland; (V.K.); (A.S.)
- Correspondence: (A.B.); (J.J.)
| | - Violetta Kozik
- Department of Chemistry, University of Silesia, Szkolna 9, 40007 Katowice, Poland; (V.K.); (A.S.)
| | - Timotej Jankech
- Department of Analytical Chemistry, Faculty of Natural Sciences, Comenius University, Ilkovicova 6, 84215 Bratislava, Slovakia; (T.J.); (M.H.)
| | - Tomas Strharsky
- Regional Centre of Advanced Technologies and Materials, Faculty of Science, Palacky University, Slechtitelu 27, 78371 Olomouc, Czech Republic; (J.K.); (T.S.); (H.M.); (J.H.)
| | - Aleksandra Swietlicka
- Department of Chemistry, University of Silesia, Szkolna 9, 40007 Katowice, Poland; (V.K.); (A.S.)
| | - Hana Michnova
- Regional Centre of Advanced Technologies and Materials, Faculty of Science, Palacky University, Slechtitelu 27, 78371 Olomouc, Czech Republic; (J.K.); (T.S.); (H.M.); (J.H.)
| | - Jan Hosek
- Regional Centre of Advanced Technologies and Materials, Faculty of Science, Palacky University, Slechtitelu 27, 78371 Olomouc, Czech Republic; (J.K.); (T.S.); (H.M.); (J.H.)
| | - Adam Smolinski
- Central Mining Institute, Pl. Gwarkow 1, 40166 Katowice, Poland;
| | - Michal Oravec
- Global Change Research Institute CAS, Belidla 986/4a, 60300 Brno, Czech Republic;
| | - Ferdinand Devinsky
- Faculty of Pharmacy, Comenius University, Odbojarov 10, 83232 Bratislava, Slovakia;
| | - Milan Hutta
- Department of Analytical Chemistry, Faculty of Natural Sciences, Comenius University, Ilkovicova 6, 84215 Bratislava, Slovakia; (T.J.); (M.H.)
| | - Josef Jampilek
- Department of Analytical Chemistry, Faculty of Natural Sciences, Comenius University, Ilkovicova 6, 84215 Bratislava, Slovakia; (T.J.); (M.H.)
- Correspondence: (A.B.); (J.J.)
| |
|
33
|
Rajda K, Podlewska S. Similar, or dissimilar, that is the question. How different are methods for comparison of compounds similarity? Comput Biol Chem 2020; 88:107367. [PMID: 32956952 DOI: 10.1016/j.compbiolchem.2020.107367] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2020] [Revised: 08/13/2020] [Accepted: 08/24/2020] [Indexed: 10/23/2022]
Abstract
Comparison of compound similarity is one of the main strategies of virtual screening protocols. Both similarity and dissimilarity concepts are of great importance during the search for new active compounds. Similarity is important due to the assumption that underlies the process of searching for new drug candidates: structurally similar compounds should induce a similar biological response. On the other hand, we are also interested in dissimilarity, as we usually aim to find structurally novel ligands. In the study, we compared several approaches to evaluating compound similarity. Various representations and metrics were applied, and we indicated the rate of variation of the results that can occur when shifting from one strategy to another. We compared the general similarity of datasets using different approaches, and examined the changes in the set of nearest neighbors when changing one compound representation into another, as well as the influence of representation/metric settings on the clustering outcome. We hope that the study will be of great help during the preparation of virtual screening experiments, stressing the need for careful selection of the way compound similarity is assessed. The differences in the results obtained with a particular strategy can significantly influence the outcome of comparison studies; therefore, its settings should be carefully selected before running the comparison.
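The dependence of nearest neighbors on the chosen representation can be illustrated with the Tanimoto coefficient on sets of "on" fingerprint bits. This is a minimal sketch, assuming set-based binary fingerprints; the compound names, bit indices, and the two representations `repr_1` and `repr_2` are hypothetical and are not taken from the study, which compares several representations and metrics.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def nearest_neighbor(query_bits, library):
    """Name of the library compound most similar to the query."""
    return max(library, key=lambda name: tanimoto(query_bits, library[name]))


# Two hypothetical representations of the same query and library compounds.
repr_1 = {
    "query": {1, 2, 3, 4},
    "library": {"cpd_A": {1, 2, 3, 5}, "cpd_B": {1, 2, 7, 8, 9}},
}
repr_2 = {
    "query": {10, 11},
    "library": {"cpd_A": {12, 13}, "cpd_B": {10, 11, 12}},
}

nn_1 = nearest_neighbor(repr_1["query"], repr_1["library"])
nn_2 = nearest_neighbor(repr_2["query"], repr_2["library"])
```

In this toy case the nearest neighbor switches from `cpd_A` to `cpd_B` when only the representation changes, which is the kind of variation the abstract warns about.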
Affiliation(s)
- Krzysztof Rajda
- Wroclaw University of Science and Technology, Faculty of Computer Science and Management, 50-371 Wrocław, I. Łukasiewicza Street 5, Poland
| | - Sabina Podlewska
- Jagiellonian University Medical College, Department of Technology and Biotechnology of Drugs, 30-688 Kraków, 9 Medyczna Street, Poland; Maj Institute of Pharmacology, Polish Academy of Sciences, Department of Medicinal Chemistry, 31-343 Kraków, Smętna Street 12, Poland.
| |
|
34
|
Rasaeifar B, Gomez-Gutierrez P, Perez JJ. New Insights into the Stereochemical Requirements of the Bombesin BB1 Receptor Antagonists Binding. Pharmaceuticals (Basel) 2020; 13:197. [PMID: 32824403 PMCID: PMC7463749 DOI: 10.3390/ph13080197] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2020] [Revised: 08/10/2020] [Accepted: 08/12/2020] [Indexed: 12/24/2022] Open
Abstract
Members of the family of bombesin-like peptides exert a wide range of biological activities both in the central nervous system and in peripheral tissues through at least three G-Protein Coupled Receptors: BB1, BB2 and BB3. Despite the number of peptide ligands already described, only a few small molecule binders have been disclosed so far, hampering a deeper understanding of their pharmacology. To better understand the stereochemical features characterizing binding to the BB1 receptor, we performed a molecular modeling study consisting of the construction of a 3D model of the receptor by homology modeling, followed by a docking study of the peptoids PD168368 and PD176252 onto it. Analysis of the complexes permitted us to propose prospective bound conformations of the compounds, consistent with the experimental information available. Subsequently, we defined a pharmacophore describing minimal stereochemical requirements for binding to the BB1 receptor that was used for in silico screening. This exercise yielded a set of small molecules that were purchased and tested, showing affinity to the BB1 but not to the BB2 receptor. These molecules exhibit scaffolds of diverse chemical families that can be used as a starting point for the development of novel BB1 antagonists.
|
35
|
Cortés-Ciriano I, Škuta C, Bender A, Svozil D. QSAR-derived affinity fingerprints (part 2): modeling performance for potency prediction. J Cheminform 2020; 12:41. [PMID: 33431016 PMCID: PMC7339533 DOI: 10.1186/s13321-020-00444-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2019] [Accepted: 05/16/2020] [Indexed: 01/22/2023] Open
Abstract
Affinity fingerprints report the activity of small molecules across a set of assays. They thus make it possible to gather information about the bioactivities of structurally dissimilar compounds, where models based on chemical structure alone are often limited, and to model complex biological endpoints, such as human toxicity and in vitro cancer cell line sensitivity. Here, we propose to model in vitro compound activity using computationally predicted bioactivity profiles as compound descriptors. To this aim, we apply and validate a framework for the calculation of QSAR-derived affinity fingerprints (QAFFP) using a set of 1360 QSAR models generated using Ki, Kd, IC50 and EC50 data from the ChEMBL database. QAFFP thus represent a method to encode and relate compounds on the basis of their similarity in bioactivity space. To benchmark the predictive power of QAFFP we assembled IC50 data from the ChEMBL database for 18 diverse cancer cell lines widely used in preclinical drug discovery, and 25 diverse protein target data sets. This study complements part 1, where the performance of QAFFP in similarity searching, scaffold hopping, and bioactivity classification is evaluated. We show that, despite their inherent noise, using QAFFP as descriptors leads to errors in prediction on the test set in the ~0.65-0.95 pIC50 units range, which are comparable to the estimated uncertainty of bioactivity data in ChEMBL (0.76-1.00 pIC50 units). We find that the predictive power of QAFFP is slightly worse than that of Morgan2 fingerprints and 1D and 2D physicochemical descriptors, with an effect size in the 0.02-0.08 pIC50 units range. Including QSAR models with low predictive power in the generation of QAFFP does not lead to improved predictive power. Given that the QSAR models we used to compute the QAFFP were selected on the basis of data availability alone, we anticipate better modeling results for QAFFP generated using more diverse and biologically meaningful targets.
Data sets and Python code are publicly available at https://github.com/isidroc/QAFFP_regression.
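Since the errors in this abstract are quoted in pIC50 units, a short worked example of the unit conversion may help. The sketch below assumes the standard definition pIC50 = -log10(IC50 in mol/L) and uses root-mean-square error as the error measure; the abstract does not state which error metric was used, so RMSE is an assumption, and the IC50 values and function names are invented for illustration.

```python
import math


def pic50_from_nanomolar(ic50_nm):
    """pIC50 = -log10(IC50 in mol/L); for nanomolar input this is 9 - log10(IC50)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def rmse(observed, predicted):
    """Root-mean-square error between two equally long sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical measurements of 100 nM, 1 uM and 10 uM (pIC50 of 7, 6 and 5).
observed = [pic50_from_nanomolar(x) for x in (100.0, 1000.0, 10000.0)]
predicted = [6.5, 6.5, 5.0]  # hypothetical model outputs in pIC50 units
error = rmse(observed, predicted)
```

Because the scale is logarithmic, an error of one pIC50 unit corresponds to a tenfold difference in IC50.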
Affiliation(s)
- Isidro Cortés-Ciriano
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.,European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, CB10 1SD, UK.
| | - Ctibor Škuta
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, 142 20, Prague, Czech Republic
| | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
| | - Daniel Svozil
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, 142 20, Prague, Czech Republic.,CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague, Czech Republic
| |
|
36
|
Škuta C, Cortés-Ciriano I, Dehaen W, Kříž P, van Westen GJP, Tetko IV, Bender A, Svozil D. QSAR-derived affinity fingerprints (part 1): fingerprint construction and modeling performance for similarity searching, bioactivity classification and scaffold hopping. J Cheminform 2020; 12:39. [PMID: 33431038 PMCID: PMC7260783 DOI: 10.1186/s13321-020-00443-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2019] [Accepted: 05/16/2020] [Indexed: 02/11/2023] Open
Abstract
An affinity fingerprint is a vector consisting of a compound's affinity or potency against a reference panel of protein targets. Here, we present the QAFFP fingerprint, a 440-element in silico QSAR-based affinity fingerprint whose components are predicted by Random Forest regression models trained on bioactivity data from the ChEMBL database. Both real-valued (rv-QAFFP) and binary (b-QAFFP) versions of the QAFFP fingerprint were implemented, and their performance in similarity searching, biological activity classification and scaffold hopping was assessed and compared to that of the 1024-bit Morgan2 fingerprint (the RDKit implementation of the ECFP4 fingerprint). In both similarity searching and biological activity classification, the QAFFP fingerprint yields retrieval rates, measured by AUC (~0.65 and ~0.70 for similarity searching depending on data sets, and ~0.85 for classification) and EF5 (~4.67 and ~5.82 for similarity searching depending on data sets, and ~2.10 for classification), comparable to those of the Morgan2 fingerprint (similarity searching AUC of ~0.57 and ~0.66, and EF5 of ~4.09 and ~6.41, depending on data sets, classification AUC of ~0.87, and EF5 of ~2.16). However, the QAFFP fingerprint outperforms the Morgan2 fingerprint in scaffold hopping, as it is able to retrieve 1146 of the 1749 existing scaffolds, while the Morgan2 fingerprint retrieves only 864 scaffolds.
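The EF5 values quoted above can be read with the help of a small calculation. The sketch below assumes the common definition of the enrichment factor at 5%, namely the fraction of actives among the top-ranked 5% of compounds divided by the fraction of actives in the whole set; the abstract does not spell out its exact formula, and the scores, labels, and function name here are hypothetical.

```python
def enrichment_factor(scores, labels, top_percent=5):
    """Enrichment factor at top_percent: hit rate in the top slice / overall hit rate.

    scores: higher means more likely active; labels: 1 for active, 0 for inactive.
    """
    n = len(scores)
    actives_total = sum(labels)
    if n == 0 or actives_total == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, round(n * top_percent / 100))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    actives_top = sum(label for _, label in ranked[:n_top])
    return (actives_top / n_top) / (actives_total / n)


# Hypothetical screen: 40 compounds, 4 actives, one of them ranked first.
scores = [40 - i for i in range(40)]
labels = [1 if i in (0, 10, 20, 30) else 0 for i in range(40)]
ef5 = enrichment_factor(scores, labels)  # top 5% of 40 compounds = 2 compounds
```

Under this definition a random ranking gives a value near 1, so the reported EF5 values of roughly 4 to 6 would indicate several-fold enrichment of actives at the top of the list.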
Affiliation(s)
- C Škuta
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, 142 20, Prague 4, Czech Republic
| | - I Cortés-Ciriano
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
| | - W Dehaen
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, 142 20, Prague 4, Czech Republic.,CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague, Czech Republic
| | - P Kříž
- Department of Mathematics, Faculty of Chemical Engineering, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague, Czech Republic
| | - G J P van Westen
- Computational Drug Discovery, Drug Discovery and Safety, LACDR, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
| | - I V Tetko
- Helmholtz Zentrum Muenchen - German Research Center for Environmental Health (GmbH) and BIGCHEM GmbH, Ingolstaedter Landstrasse 1, 85764, Neuherberg, Germany
| | - A Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
| | - D Svozil
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, 142 20, Prague 4, Czech Republic.,CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague, Czech Republic.
| |
|
37
|
Peng Y, Zhang Z, Jiang Q, Guan J, Zhou S. TOP: A deep mixture representation learning method for boosting molecular toxicity prediction. Methods 2020; 179:55-64. [PMID: 32446957 DOI: 10.1016/j.ymeth.2020.05.013] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2020] [Revised: 05/12/2020] [Accepted: 05/13/2020] [Indexed: 01/18/2023] Open
Abstract
At the early stages of drug discovery, molecular toxicity prediction is crucial for excluding drug candidates that are likely to fail in clinical trials. In this paper, we present a novel molecular representation method and develop a corresponding deep learning-based framework called TOP (the abbreviation of TOxicity Prediction). TOP integrates specifically designed data preprocessing methods, an RNN based on the bidirectional gated recurrent unit (BiGRU), and fully connected neural networks for end-to-end molecular representation learning and chemical toxicity prediction. TOP can automatically learn a mixed molecular representation not only from SMILES contextual information that describes the molecule structure, but also from physicochemical properties. Therefore, TOP can overcome the drawbacks of existing methods that use only one of them, and thus greatly improves toxicity prediction accuracy. We conducted extensive experiments over 14 classic toxicity prediction tasks on three different benchmark datasets, including balanced and imbalanced ones. The results show that, with the help of the novel molecular representation method, TOP significantly outperforms not only three baseline machine learning methods, but also five state-of-the-art methods.
Affiliation(s)
- Yuzhong Peng
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China; Key Lab of Scientific Computing and Intelligent Information Processing in Universities of Guangxi, Nanning Normal University, Nanning 530001, China.
| | - Ziqiao Zhang
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China.
| | - Qizhi Jiang
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China.
| | - Jihong Guan
- Department of Computer Science and Technology, Tongji University, Shanghai 201804, China.
| | - Shuigeng Zhou
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China.
| |
|
38
|
Ball N, Madden J, Paini A, Mathea M, Palmer AD, Sperber S, Hartung T, van Ravenzwaay B. Key read across framework components and biology based improvements. Mutat Res Genet Toxicol Environ Mutagen 2020; 853:503172. [DOI: 10.1016/j.mrgentox.2020.503172] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/20/2020] [Revised: 03/09/2020] [Accepted: 03/11/2020] [Indexed: 12/18/2022]
|
39
|
Drakakis G, Cortés-Ciriano I, Alexander-Dann B, Bender A. Elucidating Compound Mechanism of Action and Predicting Cytotoxicity Using Machine Learning Approaches, Taking Prediction Confidence into Account. Curr Protoc Chem Biol 2020; 11:e73. [PMID: 31483099 DOI: 10.1002/cpch.73] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
The modes of action (MoAs) of drugs frequently are unknown, because many are small molecules initially identified from phenotypic screens, giving rise to the need to elucidate their MoAs. In addition, the high attrition rate for candidate drugs in preclinical studies due to intolerable toxicity has motivated the development of computational approaches to predict drug candidate (cyto)toxicity as early as possible in the drug-discovery process. Here, we provide detailed instructions for capitalizing on bioactivity predictions to elucidate the MoAs of small molecules and infer their underlying phenotypic effects. We illustrate how these predictions can be used to infer the underlying antidepressive effects of marketed drugs. We also provide the necessary functionalities to model cytotoxicity data using single and ensemble machine-learning algorithms. Finally, we give detailed instructions on how to calculate confidence intervals for individual predictions using the conformal prediction framework. © 2019 by John Wiley & Sons, Inc.
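The idea of attaching a confidence interval to each individual prediction can be sketched with split (inductive) conformal prediction for regression. This is one simple variant, shown under stated assumptions: absolute residuals on a held-out calibration set serve as the nonconformity score, the calibration values are invented, and the function names are illustrative. It is not claimed to be the protocol's own implementation.

```python
import math


def conformal_half_width(calib_true, calib_pred, confidence=0.8):
    """Half-width of the prediction interval from calibration residuals.

    Uses the ceil((n + 1) * confidence)-th smallest absolute residual, the
    usual finite-sample quantile in split conformal prediction.
    """
    residuals = sorted(abs(t - p) for t, p in zip(calib_true, calib_pred))
    n = len(residuals)
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        raise ValueError("too few calibration points for this confidence level")
    return residuals[rank - 1]


def predict_interval(point_prediction, half_width):
    """Symmetric interval around a point prediction."""
    return point_prediction - half_width, point_prediction + half_width


# Hypothetical calibration set of measured vs. predicted activity values.
calib_true = [5.0, 6.0, 7.0, 4.5, 6.5, 5.5, 7.5, 6.25, 5.75]
calib_pred = [5.25, 5.5, 7.0, 5.5, 6.0, 5.25, 6.0, 6.5, 5.0]

half_width = conformal_half_width(calib_true, calib_pred, confidence=0.8)
interval = predict_interval(6.0, half_width)
```

With nine calibration points and 80% confidence, the eighth smallest residual sets the interval width. Raising the confidence level widens the interval, and small calibration sets cannot support very high confidence levels, which is why the function raises an error in that case.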
Affiliation(s)
- Georgios Drakakis
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - Isidro Cortés-Ciriano
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - Ben Alexander-Dann
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| |
|
40
|
Playe B, Stoven V. Evaluation of deep and shallow learning methods in chemogenomics for the prediction of drugs specificity. J Cheminform 2020; 12:11. [PMID: 33431042 PMCID: PMC7011501 DOI: 10.1186/s13321-020-0413-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 06/07/2019] [Accepted: 01/27/2020] [Indexed: 01/09/2023] [Open Access]
Abstract
Chemogenomics, also called proteochemometrics, covers a range of computational methods that can be used to predict protein–ligand interactions at large scales in the protein and chemical spaces. They differ from more classical ligand-based methods (also called QSAR) that predict ligands for a given protein receptor. In the context of the drug discovery process, chemogenomics makes it possible to tackle the question of predicting off-target proteins for drug candidates, one of the main causes of undesirable side-effects and failure within drug development processes. The present study compares shallow and deep machine-learning approaches for chemogenomics, and explores data augmentation techniques for deep learning algorithms in chemogenomics. Shallow machine-learning algorithms rely on expert-based chemical and protein descriptors, while recent developments in deep learning algorithms make it possible to learn abstract numerical representations of molecular graphs and protein sequences, in order to optimise the performance of the prediction task. We first propose a formulation of chemogenomics with deep learning, called the chemogenomic neural network (CN), as a feed-forward neural network taking as input the combination of molecule and protein representations learnt by molecular graph and protein sequence encoders. We show that, on large datasets, the deep learning CN model outperforms state-of-the-art shallow methods, and competes with deep methods with expert-based descriptors. However, on small datasets, shallow methods present better prediction performance than deep learning methods. Then, we evaluate data augmentation techniques, namely multi-view and transfer learning, to improve the prediction performance of the chemogenomic neural network. We conclude that a promising research direction is to integrate heterogeneous sources of data such as auxiliary tasks for which large datasets are available, or independently, multiple molecule and protein attribute views.
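The feed-forward formulation described in this abstract can be pictured with a toy forward pass. This is only a sketch, not the authors' model: it assumes the "combination" of the two representations is plain concatenation, uses a single hidden layer, and takes hand-written weights in place of learnt encoders. All names (`predict_interaction`, `dense`, `relu`) are hypothetical.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def predict_interaction(mol_vec, prot_vec, w1, b1, w2, b2):
    """Score a (molecule, protein) pair with a tiny feed-forward network.

    mol_vec and prot_vec stand in for the outputs of a molecular-graph encoder
    and a protein-sequence encoder; here they are just given as lists.
    """
    pair = mol_vec + prot_vec            # assumed combination: concatenation
    hidden = relu(dense(pair, w1, b1))   # hidden layer
    logit = dense(hidden, [w2], [b2])[0] # single output unit
    return 1.0 / (1.0 + math.exp(-logit))  # probability-like score in (0, 1)


# Hand-written toy weights (illustrative values, not trained).
w1 = [[1.0, 0.0, 0.0, 1.0], [-1.0, 0.0, 0.0, -1.0]]
b1 = [0.0, 0.0]
score = predict_interaction([1.0, 0.0], [0.0, 1.0], w1, b1, w2=[1.0, 0.0], b2=0.0)
```

In a real chemogenomic model the two input vectors and all weights would be learnt jointly from interaction data rather than fixed by hand.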
Affiliation(s)
- Benoit Playe
- Center for Computational Biology, Mines ParisTech, PSL Research University, 60 Bd Saint-Michel, 75006, Paris, France; Institut Curie, 75248, Paris, France; INSERM U900, 75248, Paris, France
| | - Veronique Stoven
- Center for Computational Biology, Mines ParisTech, PSL Research University, 60 Bd Saint-Michel, 75006, Paris, France; Institut Curie, 75248, Paris, France; INSERM U900, 75248, Paris, France
| |
|
41
|
Huggins DJ, Hardwick BS, Sharma P, Emery A, Laraia L, Zhang F, Narvaez AJ, Roberts-Thomson M, Crooks AT, Boyle RG, Boyce R, Walker DW, Mateu N, McKenzie GJ, Spring DR, Venkitaraman AR. Development of a Novel Cell-Permeable Protein-Protein Interaction Inhibitor for the Polo-box Domain of Polo-like Kinase 1. ACS Omega 2020; 5:822-831. [PMID: 31956833 PMCID: PMC6964520 DOI: 10.1021/acsomega.9b03626] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/28/2019] [Accepted: 12/13/2019] [Indexed: 05/10/2023]
Abstract
Polo-like kinase 1 (PLK1) is a key regulator of mitosis and a recognized drug target for cancer therapy. Inhibiting the polo-box domain of PLK1 offers potential advantages of increased selectivity and subsequently reduced toxicity compared with targeting the kinase domain. However, many if not all existing polo-box domain inhibitors have been shown to be unsuitable for further development. In this paper, we describe a novel compound series, which inhibits the protein-protein interactions of PLK1 via the polo-box domain. We combine high throughput screening with molecular modeling and computer-aided design, synthetic chemistry, and cell biology to address some of the common problems with protein-protein interaction inhibitors, such as solubility and potency. We use molecular modeling to improve the solubility of a hit series with initially poor physicochemical properties, enabling biophysical and biochemical characterization. We isolate and characterize enantiomers to improve potency and demonstrate on-target activity in both cell-free and cell-based assays, entirely consistent with the proposed binding model. The resulting compound series represents a promising starting point for further progression along the drug discovery pipeline and a new tool compound to study kinase-independent PLK functions.
Affiliation(s)
- David J. Huggins
- Medical Research Council Cancer Cell Unit, Hutchison/MRC Research Centre, University of Cambridge, Hills Road, Cambridge CB2 2XZ, United Kingdom
- TCM Group, Cavendish Laboratory, University of Cambridge, 19 JJ Thomson Avenue, Cambridge CB3 0HE, United Kingdom
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Bryn S. Hardwick
- Medical Research Council Cancer Cell Unit, Hutchison/MRC Research Centre, University of Cambridge, Hills Road, Cambridge CB2 2XZ, United Kingdom
| | - Pooja Sharma
- Medical Research Council Cancer Cell Unit, Hutchison/MRC Research Centre, University of Cambridge, Hills Road, Cambridge CB2 2XZ, United Kingdom
| | - Amy Emery
- Medical Research Council Cancer Cell Unit, Hutchison/MRC Research Centre, University of Cambridge, Hills Road, Cambridge CB2 2XZ, United Kingdom
| | - Luca Laraia
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Fengzhi Zhang
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Ana J. Narvaez
- Medical Research Council Cancer Cell Unit, Hutchison/MRC Research Centre, University of Cambridge, Hills Road, Cambridge CB2 2XZ, United Kingdom
| | - Meredith Roberts-Thomson
- Medical Research Council Cancer Cell Unit, Hutchison/MRC Research Centre, University of Cambridge, Hills Road, Cambridge CB2 2XZ, United Kingdom
| | - Alex T. Crooks
- Medical Research Council Cancer Cell Unit, Hutchison/MRC Research Centre, University of Cambridge, Hills Road, Cambridge CB2 2XZ, United Kingdom
| | - Robert G. Boyle
- Sentinel Oncology Ltd., Cambridge Science Park, Milton Road, Cambridge CB4 0EY, United Kingdom
| | - Richard Boyce
- Sentinel Oncology Ltd., Cambridge Science Park, Milton Road, Cambridge CB4 0EY, United Kingdom
| | - David W. Walker
- Sentinel Oncology Ltd., Cambridge Science Park, Milton Road, Cambridge CB4 0EY, United Kingdom
| | - Natalia Mateu
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Grahame J. McKenzie
- Medical Research Council Cancer Cell Unit, Hutchison/MRC Research Centre, University of Cambridge, Hills Road, Cambridge CB2 2XZ, United Kingdom
| | - David R. Spring
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Ashok R. Venkitaraman
- Medical Research Council Cancer Cell Unit, Hutchison/MRC Research Centre, University of Cambridge, Hills Road, Cambridge CB2 2XZ, United Kingdom
| |
|
42
|
Antibacterial activity of griseofulvin analogues as an example of drug repurposing. Int J Antimicrob Agents 2020; 55:105884. [PMID: 31931149 DOI: 10.1016/j.ijantimicag.2020.105884] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/13/2019] [Revised: 12/19/2019] [Accepted: 12/28/2019] [Indexed: 01/30/2023]
Abstract
Griseofulvin is a well-known antifungal drug that was launched in 1962 by Merck & Co. for the treatment of dermatophyte infections. However, according to predictions using the Way2Drug computational drug repurposing platform, it may also have antibacterial activity. As no confirmation of this prediction was found in the published literature, this study estimated in-silico antibacterial activity for 42 griseofulvin derivatives. Antibacterial activity was predicted for 33 of the 42 compounds, which led to the conclusion that this activity might be considered as typical for this chemical series. Therefore, experimental testing of antibacterial activity was performed on a panel of Gram-positive and Gram-negative micro-organisms. Antibacterial activity was evaluated using the microdilution method detecting the minimal inhibitory concentration (MIC) and the minimal bactericidal concentration (MBC). The tested compounds exhibited potent antibacterial activity against all the studied bacteria, with MIC and MBC values ranging from 0.0037 to 0.04 mg/mL and from 0.01 to 0.16 mg/mL, respectively. Activity was 2.5-12 times greater than that of ampicillin and 2-8 times greater than that of streptomycin, which were used as the reference drugs. Similarity analysis for all 42 compounds with the (approximately) 470,000 drug-like compounds indexed in the Clarivate Analytics Integrity database confirmed the significant novelty of the antibacterial activity for the compounds from this chemical class. Therefore, this study demonstrated that by using computer-aided prediction of biological activity spectra for a particular chemical series, it is possible to identify typical biological activities which may be used for discovery of new applications (e.g. drug repurposing).
|
43
|
Nouleho Ilemo S, Barth D, David O, Quessette F, Weisser MA, Watel D. Improving graphs of cycles approach to structural similarity of molecules. PLoS One 2019; 14:e0226680. [PMID: 31881046 PMCID: PMC6934298 DOI: 10.1371/journal.pone.0226680] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/11/2019] [Accepted: 12/03/2019] [Indexed: 12/02/2022] [Open Access]
Abstract
This paper focuses on determining the structural similarity of two molecules, i.e., the similarity of the interconnection of all the elementary cycles in the corresponding molecular graphs. In this paper, we propose and analyze an algorithmic approach based on the resolution of the Maximum Common Edge Subgraph (MCES) problem with graphs representing the interaction of cycles in molecules. Using the ChEBI database, we compare the effectiveness of this approach in terms of structural similarity and computation time with two calculations of similarity of molecular graphs, one based on the MCES, the other on the use of different fingerprints (Daylight, ECFP4, ECFP6, FCFP4, FCFP6) to measure the Tanimoto coefficient. We also analyze the obtained structural similarity results for a selected subset of molecules.
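For readers unfamiliar with the fingerprint baseline mentioned here, the Tanimoto coefficient of two fingerprints is the number of shared "on" bits divided by the number of bits set in either. The sketch below is a generic illustration, not the paper's code: fingerprints are modelled as sets of on-bit indices, the bit values are invented, and returning 0.0 for two empty fingerprints is a convention chosen here (other tools may define it differently).

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention chosen for this sketch
    return len(a & b) / len(a | b)


# Invented on-bit indices standing in for, e.g., two ECFP4 fingerprints.
fp_mol_1 = {1, 2, 3}
fp_mol_2 = {2, 3, 4}
similarity = tanimoto(fp_mol_1, fp_mol_2)  # 2 shared bits / 4 distinct bits
```

A value of 1.0 means the two fingerprints set exactly the same bits; it does not by itself imply that the molecules are identical.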
Affiliation(s)
- Stefi Nouleho Ilemo
- DAVID, Department of Computer Science, University of Versailles Saint Quentin, Versailles, France
| | - Dominique Barth
- DAVID, Department of Computer Science, University of Versailles Saint Quentin, Versailles, France
| | - Olivier David
- ILV, Department of Chemistry, University of Versailles, Versailles, France
| | - Franck Quessette
- DAVID, Department of Computer Science, University of Versailles Saint Quentin, Versailles, France
| | | | - Dimitri Watel
- ENSIIE, Evry, France
- SAMOVAR, Telecom SudParis, Evry, France
| |
|
44
|
Karthikeyan BS, Ravichandran J, Mohanraj K, Vivek-Ananth RP, Samal A. A curated knowledgebase on endocrine disrupting chemicals and their biological systems-level perturbations. The Science of the Total Environment 2019; 692:281-296. [PMID: 31349169 DOI: 10.1016/j.scitotenv.2019.07.225] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 04/30/2019] [Revised: 06/28/2019] [Accepted: 07/14/2019] [Indexed: 05/22/2023]
Abstract
Human well-being can be affected by exposure to several chemicals in the environment. One such group is endocrine disrupting chemicals (EDCs) that can perturb the hormonal homeostasis leading to adverse health effects. In this work, we have developed a detailed workflow to identify EDCs with supporting evidence of endocrine disruption in published experiments in humans or rodents. Thereafter, this workflow was used to manually evaluate more than 16,000 published research articles and identify 686 potential EDCs with published evidence in humans or rodents. Importantly, we have compiled the observed adverse effects or endocrine-specific perturbations along with the dosage information for the potential EDCs from their supporting published experiments. Subsequently, the potential EDCs were classified based on the type of supporting evidence, their environmental source and their chemical properties. Additional compiled information for potential EDCs include their chemical structure, physicochemical properties, predicted ADMET properties and target genes. In order to enable future research based on this compiled information on potential EDCs, we have built an online knowledgebase, Database of Endocrine Disrupting Chemicals and their Toxicity profiles (DEDuCT), accessible at: https://cb.imsc.res.in/deduct/. After building this comprehensive resource, we have performed a network-centric analysis of the chemical space and the associated biological space of target genes of EDCs. Specifically, we have constructed two networks of EDCs using our resource based on similarity of chemical structures or target genes. Ensuing analysis revealed a lack of correlation between chemical structure and target genes of EDCs. 
Though our detailed results highlight potential challenges in developing predictive models for EDCs, the compiled information in our resource will undoubtedly enable future research in the field, especially, those focussed towards mechanistic understanding of the systems-level perturbations caused by EDCs.
Affiliation(s)
| | - Janani Ravichandran
- The Institute of Mathematical Sciences (IMSc), Homi Bhabha National Institute (HBNI), Chennai 600113, India.
| | - Karthikeyan Mohanraj
- The Institute of Mathematical Sciences (IMSc), Homi Bhabha National Institute (HBNI), Chennai 600113, India
| | - R P Vivek-Ananth
- The Institute of Mathematical Sciences (IMSc), Homi Bhabha National Institute (HBNI), Chennai 600113, India
| | - Areejit Samal
- The Institute of Mathematical Sciences (IMSc), Homi Bhabha National Institute (HBNI), Chennai 600113, India.
| |
|
45
|
Bruno A, Costantino G, Sartori L, Radi M. The In Silico Drug Discovery Toolbox: Applications in Lead Discovery and Optimization. Curr Med Chem 2019; 26:3838-3873. [PMID: 29110597 DOI: 10.2174/0929867324666171107101035] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 07/07/2017] [Revised: 09/27/2017] [Accepted: 09/28/2017] [Indexed: 01/04/2023]
Abstract
BACKGROUND Discovery and development of a new drug is a long and expensive journey that takes around 20 years from the initial idea to approval and marketing of a new medication. Although R&D expenditures have been constantly increasing in the last few years, the number of new drugs introduced into the market has been steadily declining. This is mainly due to preclinical and clinical safety issues, which still account for about 40% of drug discontinuations. To cope with this issue, a number of in silico techniques are currently being used for early-stage evaluation and prediction of potential safety issues, making it possible to increase the drug-discovery success rate and reduce the costs associated with the development of a new drug. METHODS In the present review, we analyse the early steps of the drug-discovery pipeline, describing the sequence of steps from disease selection to lead optimization and focusing on the most common in silico tools used to assess attrition risks and build a mitigation plan. RESULTS A comprehensive list of widely used in silico tools, databases, and public initiatives that can be effectively implemented and used in the drug discovery pipeline is provided. A few examples of how these tools can solve problems and how they may increase the success rate of a drug discovery and development program are also provided. Finally, selected examples are given where the application of in silico tools has effectively contributed to the development of marketed drugs or clinical candidates. CONCLUSION The in silico toolbox finds wide application in every step of early drug discovery: (i) target identification and validation; (ii) hit identification; (iii) hit-to-lead; and (iv) lead optimization. Each of these steps is described in detail, providing a useful overview of the role played by in silico tools in the decision-making process to speed up the discovery of new drugs.
Affiliation(s)
- Agostino Bruno
- Experimental Therapeutics Unit, IFOM - The FIRC Institute for Molecular Oncology Foundation, Via Adamello 16 - 20139 Milano, Italy
| | - Gabriele Costantino
- Dipartimento di Scienze degli Alimenti e del Farmaco, Universita degli Studi di Parma, Viale delle Scienze, 27/A, 43124 Parma, Italy
| | - Luca Sartori
- Experimental Therapeutics Unit, IFOM - The FIRC Institute for Molecular Oncology Foundation, Via Adamello 16 - 20139 Milano, Italy
| | - Marco Radi
- Dipartimento di Scienze degli Alimenti e del Farmaco, Universita degli Studi di Parma, Viale delle Scienze, 27/A, 43124 Parma, Italy
| |
|
46
|
Rifaioglu AS, Atas H, Martin MJ, Cetin-Atalay R, Atalay V, Doğan T. Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Brief Bioinform 2019; 20:1878-1912. [PMID: 30084866 PMCID: PMC6917215 DOI: 10.1093/bib/bby061] [Citation(s) in RCA: 223] [Impact Index Per Article: 44.6] [Received: 01/25/2018] [Revised: 05/25/2018] [Indexed: 01/16/2023] [Open Access]
Abstract
The identification of interactions between drugs/compounds and their targets is crucial for the development of new drugs. In vitro screening experiments (i.e. bioassays) are frequently used for this purpose; however, experimental approaches are insufficient to explore novel drug-target interactions, mainly because of feasibility problems, as they are labour intensive, costly and time consuming. A computational field known as 'virtual screening' (VS) has emerged in the past decades to aid experimental drug discovery studies by statistically estimating unknown bio-interactions between compounds and biological targets. These methods use the physico-chemical and structural properties of compounds and/or target proteins along with the experimentally verified bio-interaction information to generate predictive models. Lately, sophisticated machine learning techniques are applied in VS to elevate the predictive performance. The objective of this study is to examine and discuss the recent applications of machine learning techniques in VS, including deep learning, which became highly popular after giving rise to epochal developments in the fields of computer vision and natural language processing. The past 3 years have witnessed an unprecedented amount of research studies considering the application of deep learning in biomedicine, including computational drug discovery. In this review, we first describe the main instruments of VS methods, including compound and protein features (i.e. representations and descriptors), frequently used libraries and toolkits for VS, bioactivity databases and gold-standard data sets for system training and benchmarking. We subsequently review recent VS studies with a strong emphasis on deep learning applications. Finally, we discuss the present state of the field, including the current challenges and suggest future directions. 
We believe that this survey will provide insight to the researchers working in the field of computational drug discovery in terms of comprehending and developing novel bio-prediction methods.
Affiliation(s)
- Ahmet Sureyya Rifaioglu
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
- Department of Computer Engineering, İskenderun Technical University, Hatay, Turkey
| | - Heval Atas
- Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Maria Jesus Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK
| | - Rengul Cetin-Atalay
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
| | - Volkan Atalay
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
| | - Tunca Doğan
- Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey and European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK
| |
|
47
|
Kanakaveti V, Rathinasamy S, Rayala SK, Gromiha M. Forging New Scaffolds from Old: Combining Scaffold Hopping and Hierarchical Virtual Screening for Identifying Novel Bcl-2 Inhibitors. Curr Top Med Chem 2019; 19:1162-1172. [DOI: 10.2174/1568026619666190618142432] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/03/2018] [Revised: 02/21/2019] [Accepted: 04/12/2019] [Indexed: 11/22/2022]
Abstract
Background: Though virtual screening methods have proven to be potent in various instances, the technique is practically incomplete to quench the need of drug discovery process. Thus, the quest for novel designing approaches and chemotypes for improved efficacy of lead compounds has been intensified and logistic approaches such as scaffold hopping and hierarchical virtual screening methods were evolved. Till now, in all the previous attempts these two approaches were applied separately.
Objective: In the current work, we made a novel attempt in terms of blending scaffold hopping and hierarchical virtual screening. The prime objective is to assess the hybrid method for its efficacy in identifying active lead molecules for emerging PPI target Bcl-2 (B-cell Lymphoma 2).
Method: We designed novel scaffolds from the reported cores and screened a set of 8270 compounds using both scaffold hopping and hierarchical virtual screening for Bcl-2 protein. Also, we enumerated the libraries using clustering, PAINS filtering, physicochemical characterization and SAR matching.
Results: We generated a focused library of compounds towards Bcl-2 interface, screened the 8270 compounds and identified top hits for seven families upon fine filtering with PAINS algorithm, features, SAR mapping, synthetic accessibility and similarity search. Our approach retrieved a set of 50 lead compounds.
Conclusions: Finding rational approach meeting the needs of drug discovery process for PPI targets is the need of the hour which can be fulfilled by an extended scaffold hopping approach resulting in focused PPI targeting by providing novel leads with better potency.
Affiliation(s)
- Vishnupriya Kanakaveti
- Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai – 600036, Tamil Nadu, India
| | - Sakthivel Rathinasamy
- Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai – 600036, Tamil Nadu, India
| | - Suresh K. Rayala
- Molecular Oncology Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai – 600036, Tamil Nadu, India
| | - Michael Gromiha
- Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai – 600036, Tamil Nadu, India
| |
|
48
|
Lee M, Kim H, Joe H, Kim HG. Multi-channel PINN: investigating scalable and transferable neural networks for drug discovery. J Cheminform 2019; 11:46. [PMID: 31289963 PMCID: PMC6617572 DOI: 10.1186/s13321-019-0368-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 10/02/2018] [Accepted: 07/02/2019] [Indexed: 12/19/2022] [Open Access]
Abstract
Analysis of compound–protein interactions (CPIs) has become a crucial prerequisite for drug discovery and drug repositioning. In vitro experiments are commonly used in identifying CPIs, but it is not feasible to discover the molecular and proteomic space only through experimental approaches. Machine learning’s advances in predicting CPIs have made significant contributions to drug discovery. Deep neural networks (DNNs), which have recently been applied to predict CPIs, performed better than other shallow classifiers. However, such techniques commonly require a considerable volume of dense data for each training target. Although the number of publicly available CPI data has grown rapidly, public data is still sparse and has a large number of measurement errors. In this paper, we propose a novel method, Multi-channel PINN, to fully utilize sparse data in terms of representation learning. With representation learning, Multi-channel PINN can utilize three approaches of DNNs which are a classifier, a feature extractor, and an end-to-end learner. Multi-channel PINN can be fed with both low and high levels of representations and incorporates each of them by utilizing all approaches within a single model. To fully utilize sparse public data, we additionally explore the potential of transferring representations from training tasks to test tasks. As a proof of concept, Multi-channel PINN was evaluated on fifteen combinations of feature pairs to investigate how they affect the performance in terms of highest performance, initial performance, and convergence speed. The experimental results obtained indicate that the multi-channel models using protein features performed better than single-channel models or multi-channel models using compound features. Therefore, Multi-channel PINN can be advantageous when used with appropriate representations. 
Additionally, we pretrained models on a training task then finetuned them on a test task to figure out whether Multi-channel PINN can capture general representations for compounds and proteins. We found that there were significant differences in performance between pretrained models and non-pretrained models.
Affiliation(s)
- Munhwan Lee
- Biomedical Knowledge Engineering Laboratory, Seoul National University, 1 Gwanak-ro, Seoul, Republic of Korea
| | - Hyeyeon Kim
- Biomedical Knowledge Engineering Laboratory, Seoul National University, 1 Gwanak-ro, Seoul, Republic of Korea
| | - Hyunwhan Joe
- Biomedical Knowledge Engineering Laboratory, Seoul National University, 1 Gwanak-ro, Seoul, Republic of Korea
| | - Hong-Gee Kim
- Biomedical Knowledge Engineering Laboratory, Seoul National University, 1 Gwanak-ro, Seoul, Republic of Korea.
| |
|
49
|
Taylor R, Wood PA. A Million Crystal Structures: The Whole Is Greater than the Sum of Its Parts. Chem Rev 2019; 119:9427-9477. [PMID: 31244003 DOI: 10.1021/acs.chemrev.9b00155] [Citation(s) in RCA: 127] [Impact Index Per Article: 25.4] [Indexed: 12/16/2022]
Abstract
The founding in 1965 of what is now called the Cambridge Structural Database (CSD) has reaped dividends in numerous and diverse areas of chemical research. Each of the million or so crystal structures in the database was solved for its own particular reason, but collected together, the structures can be reused to address a multitude of new problems. In this Review, which is focused mainly on the last 10 years, we chronicle the contribution of the CSD to research into molecular geometries, molecular interactions, and molecular assemblies and demonstrate its value in the design of biologically active molecules and the solid forms in which they are delivered. Its potential in other commercially relevant areas is described, including gas storage and delivery, thin films, and (opto)electronics. The CSD also aids the solution of new crystal structures. Because no scientific instrument is without shortcomings, the limitations of CSD research are assessed. We emphasize the importance of maintaining database quality: notwithstanding the arrival of big data and machine learning, it remains perilous to ignore the principle of garbage in, garbage out. Finally, we explain why the CSD must evolve with the world around it to ensure it remains fit for purpose in the years ahead.
Affiliation(s)
- Robin Taylor
- Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, United Kingdom
| | - Peter A Wood
- Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, United Kingdom
| |
|
50
|
Nadaraia NS, Amiranashvili LS, Merlani M, Kakhabrishvili ML, Barbakadze NN, Geronikaki A, Petrou A, Poroikov V, Ciric A, Glamoclija J, Sokovic M. Novel antimicrobial agents' discovery among the steroid derivatives. Steroids 2019; 144:52-65. [PMID: 30776376 DOI: 10.1016/j.steroids.2019.02.012] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/19/2018] [Accepted: 02/12/2019] [Indexed: 02/07/2023]
Abstract
Fourteen steroid compounds were evaluated in silico as antimicrobial agents using the computer program PASS. Experimental evaluation revealed that all compounds have good antibacterial activity, with MIC in the range 0.003-0.96 mg/mL and MBC in the range 0.06-1.92 mg/mL. Almost all compounds, except compound 4 (3β-acetoxy-1′-p-chlorophenyl-3′-methyl-5α-androstano[17,16-d]pyrazoline), were more potent than ampicillin, and they were equipotent with or more potent than streptomycin. All compounds exhibited good antifungal activity, with MIC at 0.003-0.96 mg/mL and MFC at 0.006-1.92 mg/mL, although sensitivity differed among the fungi tested. According to docking studies, 14-alpha demethylase inhibition may be responsible for the antifungal activity. Prediction of toxicity by PROTOX and GUSAR revealed that the compounds have low toxicity and can be considered potential lead compounds for further studies.
Affiliation(s)
- Nanuli Sh Nadaraia
- TSMU I.Kutateladze Institute of Pharmacochemistry, Tbilisi 0159, Georgia
| | | | - Maia Merlani
- TSMU I.Kutateladze Institute of Pharmacochemistry, Tbilisi 0159, Georgia
| | | | - Nana N Barbakadze
- TSMU I.Kutateladze Institute of Pharmacochemistry, Tbilisi 0159, Georgia
| | - Athina Geronikaki
- Aristotle University, School of Pharmacy, Thessaloniki 54124, Greece.
| | - Anthi Petrou
- Aristotle University, School of Pharmacy, Thessaloniki 54124, Greece
| | | | - Ana Ciric
- Mycological Laboratory, Department of Plant Physiology, Institute for Biological Research, Siniša Stanković, University of Belgrade, Bulevar Despota Stefana, Serbia
| | - Jarmila Glamoclija
- Mycological Laboratory, Department of Plant Physiology, Institute for Biological Research, Siniša Stanković, University of Belgrade, Bulevar Despota Stefana, Serbia
| | - Marina Sokovic
- Mycological Laboratory, Department of Plant Physiology, Institute for Biological Research, Siniša Stanković, University of Belgrade, Bulevar Despota Stefana, Serbia
| |
|