1
|
Siragusa L, Luciani R, Borsari C, Ferrari S, Costi MP, Cruciani G, Spyrakis F. Comparing Drug Images and Repurposing Drugs with BioGPS and FLAPdock: The Thymidylate Synthase Case. ChemMedChem 2016; 11:1653-66. [PMID: 27404817 DOI: 10.1002/cmdc.201600121] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 02/29/2016] [Revised: 06/08/2016] [Indexed: 12/14/2022]
Abstract
Repurposing and repositioning drugs has become a frequently pursued and successful strategy in the current era, as new chemical entities are increasingly difficult to find and get approved. Herein we report an integrated BioGPS/FLAPdock pipeline for rapid and effective off-target identification and drug repurposing. Our method is based on the structural and chemical properties of protein binding sites, that is, the ligand image, encoded in the GRID molecular interaction fields (MIFs). Protein similarity is disclosed through the BioGPS algorithm by measuring the pockets' overlap according to which pockets are clustered. Co-crystallized and known ligands can be cross-docked among similar targets, selected for subsequent in vitro binding experiments, and possibly improved for inhibitory potency. We used human thymidylate synthase (TS) as a test case and searched the entire RCSB Protein Data Bank (PDB) for similar target pockets. We chose casein kinase IIα as a control and tested a series of its inhibitors against the TS template. Ellagic acid and apigenin were identified as TS inhibitors, and various flavonoids were selected and synthesized in a second-round selection. The compounds were demonstrated to be active in the low-micromolar range.
Affiliation(s)
- Lydia Siragusa: Molecular Discovery Limited, 215 Marsh Road, Pinner Middlesex, London, HA5 5NE, UK
- Rosaria Luciani: Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125, Modena, Italy
- Chiara Borsari: Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125, Modena, Italy
- Stefania Ferrari: Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125, Modena, Italy
- Maria Paola Costi: Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125, Modena, Italy
- Gabriele Cruciani: Department of Chemistry, Biology and Biotechnology, University of Perugia, Via Elce di Sotto 8, 06123, Perugia, Italy
- Francesca Spyrakis: Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125, Modena, Italy; Department of Food Science, University of Parma, Viale delle Scienze 17A, 43124, Parma, Italy
|
2
|
Parca L, Ferré F, Ausiello G, Helmer-Citterich M. Nucleos: a web server for the identification of nucleotide-binding sites in protein structures. Nucleic Acids Res 2013; 41:W281-5. [PMID: 23703207 PMCID: PMC3692072 DOI: 10.1093/nar/gkt390] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022] [Open Access]
Abstract
Nucleos is a web server for the identification of nucleotide-binding sites in protein structures. Nucleos compares the structure of a query protein against a set of known template 3D binding sites representing nucleotide modules, namely the nucleobase, carbohydrate and phosphate. Structural features, clustering and conservation are used to filter and score the predictions. The predicted nucleotide modules are then joined to build whole nucleotide-binding sites, which are ranked by their score. The server takes as input either the PDB code of the query protein structure or a user-submitted structure in PDB format. The output of Nucleos is composed of ranked lists of predicted nucleotide-binding sites divided by nucleotide type (e.g. ATP-like). For each ranked prediction, Nucleos provides detailed information about the score, the template structure and the structural match for each nucleotide module composing the nucleotide-binding site. The predictions on the query structure and the template-binding sites can be viewed directly on the web through a graphical applet. In 98% of the cases, the modules composing correct predictions belong to proteins with no homology relationship between each other, meaning that the identification of brand-new nucleotide-binding sites is possible using information from non-homologous proteins. Nucleos is available at http://nucleos.bio.uniroma2.it/nucleos/.
Affiliation(s)
- Luca Parca: Department of Biology, Centre for Molecular Bioinformatics, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
|
3
|
Macchiarulo A, Carotti A, Cellanetti M, Sardella R, Gioiello A. Navigations of chemical space to further the understanding of polypharmacology in human nuclear receptors. MedChemComm 2013. [DOI: 10.1039/c2md20157g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
The article analyses properties featuring the binding site of human nuclear receptors and cognate ligands, investigating aspects of polypharmacology.
Affiliation(s)
- Antonio Macchiarulo: Dipartimento di Chimica e Tecnologia del Farmaco, Università di Perugia, 06123 Perugia, Italy
- Andrea Carotti: Dipartimento di Chimica e Tecnologia del Farmaco, Università di Perugia, 06123 Perugia, Italy
- Marco Cellanetti: Dipartimento di Chimica e Tecnologia del Farmaco, Università di Perugia, 06123 Perugia, Italy
- Roccaldo Sardella: Dipartimento di Chimica e Tecnologia del Farmaco, Università di Perugia, 06123 Perugia, Italy
- Antimo Gioiello: Dipartimento di Chimica e Tecnologia del Farmaco, Università di Perugia, 06123 Perugia, Italy
|
4
|
Parca L, Gherardini PF, Truglio M, Mangone I, Ferrè F, Helmer-Citterich M, Ausiello G. Identification of nucleotide-binding sites in protein structures: a novel approach based on nucleotide modularity. PLoS One 2012; 7:e50240. [PMID: 23209685 PMCID: PMC3507729 DOI: 10.1371/journal.pone.0050240] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 07/04/2012] [Accepted: 10/22/2012] [Indexed: 01/30/2023] [Open Access]
Abstract
Nucleotides are involved in several cellular processes, ranging from the transmission of genetic information to energy transfer and storage. Both sequence- and structure-based methods have been developed to predict the location of nucleotide-binding sites in proteins. Here we propose a novel methodology that leverages the observation that nucleotide-binding sites have a modular structure. Nucleotides are composed of identifiable fragments, i.e. the phosphate, the nucleobase and the carbohydrate moieties. These fragments are bound by specific structural motifs that recur in proteins of different folds. Moreover, these motifs behave as modules and are found in different combinations across fold space. Our method predicts binding sites for each nucleotide fragment by comparing a query protein with a database of templates extracted from proteins of known structure. Whenever a similarity is found, the fragment bound by the template is transferred onto the query protein, thus identifying a putative binding site. Predictions falling inside the surface of the protein are discarded, and the remaining ones are scored using clustering and conservation. The method ranks a correct prediction first in 48%, 48% and 68% of the analyzed proteins for the nucleobase, carbohydrate and phosphate respectively; when the first five predictions are considered, these figures rise to 71%, 65% and 86% respectively. Furthermore, we attempted to reconstruct the full structure of the binding site, starting from the predicted positions of the fragments. We calculated that in 59% of the analyzed proteins the method ranks as first a reconstructed binding site or a part of it. Finally, we tested the reliability of our method in a real-world case in which it has to predict nucleotide-binding sites in unbound proteins. We analyzed proteins whose structure has been solved with and without the nucleotide and observed only small variations in the method's performance.
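The top-1 and top-5 figures quoted in this abstract are rank-based success rates. A minimal sketch of how such a rate could be computed is given below; the function name `top_k_success` and the toy data are illustrative assumptions, not part of the published method.

```python
from typing import List


def top_k_success(ranked_hits: List[List[bool]], k: int) -> float:
    """Fraction of proteins with at least one correct prediction in the top k.

    ranked_hits holds one list per protein; each inner list flags, in rank
    order, whether the prediction at that rank is correct (hypothetical input
    format chosen for this sketch).
    """
    if not ranked_hits:
        return 0.0
    successes = sum(1 for flags in ranked_hits if any(flags[:k]))
    return successes / len(ranked_hits)


# Toy data for four proteins (invented for illustration only).
toy = [
    [True, False],
    [False, False, True],
    [False] * 6,
    [False, True],
]
print(top_k_success(toy, 1))
print(top_k_success(toy, 5))
```

Counting a protein as a success when any of its first k predictions is correct mirrors the "rank as first" and "first five predictions" wording of the abstract.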
Affiliation(s)
- Luca Parca: Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Mauro Truglio: Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Iolanda Mangone: Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Fabrizio Ferrè: Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Gabriele Ausiello: Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
|
5
|
Wu CY, Hwa YH, Chen YC, Lim C. Hidden relationship between conserved residues and locally conserved phosphate-binding structures in NAD(P)-binding proteins. J Phys Chem B 2012; 116:5644-52. [PMID: 22530587 DOI: 10.1021/jp3014332] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 02/06/2023]
Abstract
A one-dimensional (1D) motif usually comprises conserved essential residues involved in catalysis, ligand binding, or maintaining a specific structure. However, it cannot be easily detected in proteins with low sequence identity because it is difficult to (1) identify protein sequences suspected to contain the motif, and (2) align sequences with little sequence identity to spot the conserved residues. Here, we present a strategy for discovering phosphate-binding 1D motifs in NAD(P)-binding proteins sharing low sequence identity that overcomes these two hurdles by determining all distinct locally conserved pyrophosphate-binding structures and aligning the same-length sequences comprising each of these structures to identify the conserved residues. We show that the sequence motifs derived from the distinct pyrophosphate-binding structures yield different numbers/spacing of conserved Gly residues. We also show that they depend on the side chain orientations and cofactor type (NAD or NADP). Thus, sequence motifs derived from local similarity of backbone structures without consideration of the cofactor type and/or side chain orientations would reduce their reliability in annotating protein function from sequence alone. The three-dimensional (3D) and 1D motifs comprising conserved residues in nonredundant proteins reveal hidden relationships between the protein structure/function and sequence as well as protein-cofactor interactions.
Affiliation(s)
- Chih Yuan Wu: Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
|
6
|
Bianchi V, Gherardini PF, Helmer-Citterich M, Ausiello G. Identification of binding pockets in protein structures using a knowledge-based potential derived from local structural similarities. BMC Bioinformatics 2012; 13 Suppl 4:S17. [PMID: 22536963 PMCID: PMC3434446 DOI: 10.1186/1471-2105-13-s4-s17] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 02/04/2023] [Open Access]
Abstract
Background: The identification of ligand binding sites is a key task in the annotation of proteins with known structure but uncharacterized function. Here we describe a knowledge-based method exploiting the observation that unrelated binding sites share small structural motifs that bind the same chemical fragments irrespective of the nature of the ligand as a whole.
Results: PDBinder compares a query protein against a library of binding and non-binding protein surface regions derived from the PDB. The results of the comparison are used to derive a propensity value for each residue which is correlated with the likelihood that the residue is part of a ligand binding site. The method was applied to two different problems: i) the prediction of ligand binding residues and ii) the identification of which surface cleft harbours the binding site. In both cases PDBinder performed consistently better than existing methods. PDBinder has been trained on a non-redundant set of 1356 high-quality protein-ligand complexes and tested on a set of 239 holo and apo complex pairs. We obtained an MCC of 0.313 on the holo set with a PPV of 0.413 while on the apo set we achieved an MCC of 0.271 and a PPV of 0.372.
Conclusions: We show that PDBinder performs better than existing methods. The good performance on the unbound proteins is extremely important for real-world applications where the location of the binding site is unknown. Moreover, since our approach is orthogonal to those used in other programs, the PDBinder propensity value can be integrated in other algorithms further increasing the final performance.
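The MCC and PPV values reported in this abstract are standard confusion-matrix statistics. The sketch below shows how they could be computed for residue-level binding-site predictions; the helper names and the ten-residue toy protein are assumptions made for illustration and do not reproduce PDBinder itself.

```python
import math
from typing import Iterable, Tuple


def residue_confusion(
    predicted: Iterable[int], true_site: Iterable[int], all_residues: Iterable[int]
) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) for predicted vs. true binding residues."""
    predicted, true_site, all_residues = set(predicted), set(true_site), set(all_residues)
    tp = len(predicted & true_site)
    fp = len(predicted - true_site)
    fn = len(true_site - predicted)
    tn = len(all_residues - predicted - true_site)
    return tp, tn, fp, fn


def ppv(tp: int, fp: int) -> float:
    """Positive predictive value (precision): TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy protein with residues 1..10 (invented numbers, not data from the paper).
tp, tn, fp, fn = residue_confusion({3, 4, 5}, {1, 2, 3, 4}, range(1, 11))
print(ppv(tp, fp), mcc(tp, tn, fp, fn))
```

MCC is often preferred over accuracy for this kind of task because binding residues are usually a small minority of all residues, so a predictor that labels nothing as binding would still score a high accuracy.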
Affiliation(s)
- Valerio Bianchi: Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, Rome 00133, Italy
|
7
|
Parca L, Mangone I, Gherardini PF, Ausiello G, Helmer-Citterich M. Phosfinder: a web server for the identification of phosphate-binding sites on protein structures. Nucleic Acids Res 2011; 39:W278-82. [PMID: 21622655 PMCID: PMC3125782 DOI: 10.1093/nar/gkr389] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022] [Open Access]
Abstract
Phosfinder is a web server for the identification of phosphate binding sites in protein structures. Phosfinder uses a structural comparison algorithm to scan a query structure against a set of known 3D phosphate binding motifs. Whenever a structural similarity between the query protein and a phosphate binding motif is detected, the phosphate bound by the known motif is added to the protein structure thus representing a putative phosphate binding site. Predicted binding sites are then evaluated according to (i) their position with respect to the query protein solvent-excluded surface and (ii) the conservation of the binding residues in the protein family. The server accepts as input either the PDB code of the protein to be analyzed or a user-submitted structure in PDB format. All the search parameters are user modifiable. Phosfinder outputs a list of predicted binding sites with detailed information about their structural similarity with known phosphate binding motifs, and the conservation of the residues involved. A graphical applet allows the user to visualize the predicted binding sites on the query protein structure. The results on a set of 52 apo/holo structure pairs show that the performance of our method is largely unaffected by ligand-induced conformational changes. Phosfinder is available at http://phosfinder.bio.uniroma2.it.
Affiliation(s)
- Luca Parca: Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
|
8
|
Parca L, Gherardini PF, Helmer-Citterich M, Ausiello G. Phosphate binding sites identification in protein structures. Nucleic Acids Res 2010; 39:1231-42. [PMID: 20974634 PMCID: PMC3045618 DOI: 10.1093/nar/gkq987] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 01/17/2023] [Open Access]
Abstract
Nearly half of known protein structures interact with phosphate-containing ligands, such as nucleotides and other cofactors. Many methods have been developed for the identification of metal-ion binding sites and some for larger ligands such as carbohydrates, but none is yet available for the prediction of phosphate-binding sites. Here we describe Pfinder, a method that predicts binding sites for phosphate groups, either in the form of ions or as parts of other non-peptide ligands, in proteins of known structure. Pfinder uses the Query3D local structural comparison algorithm to scan a protein structure for the presence of a number of structural motifs identified for their ability to bind the phosphate chemical group. Pfinder has been tested on a data set of 52 proteins for which both the apo and holo forms were available. We obtained at least one correct prediction in 63% of the holo structures and in 62% of the apo structures. The ability of Pfinder to recognize a phosphate-binding site in unbound protein structures makes it an ideal tool for functional annotation and for complementing docking and drug design methods. The Pfinder program is available at http://pdbfun.uniroma2.it/pfinder.
Affiliation(s)
- Luca Parca: Department of Biology, Centre for Molecular Bioinformatics, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
|
9
|
Tendulkar AV, Krallinger M, de la Torre V, López G, Wangikar PP, Valencia A. FragKB: structural and literature annotation resource of conserved peptide fragments and residues. PLoS One 2010; 5:e9679. [PMID: 20305778 PMCID: PMC2841175 DOI: 10.1371/journal.pone.0009679] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/01/2009] [Accepted: 02/12/2010] [Indexed: 01/21/2023] [Open Access]
Abstract
Background: FragKB (Fragment Knowledgebase) is a repository of clusters of structurally similar fragments from proteins. Fragments are annotated with information at the level of sequence, structure and function, integrating biological descriptions derived from multiple existing resources and text mining.
Methodology: FragKB contains approximately 400,000 conserved fragments from 4,800 representative proteins from PDB. Literature annotations are extracted from more than 1,700 articles and are available for over 12,000 fragments. The underlying systematic annotation workflow of FragKB ensures efficient update and maintenance of this database. The information in FragKB can be accessed through a web interface that facilitates sequence and structural visualization of fragments together with known literature information on the consequences of specific residue mutations and functional annotations of proteins and fragment clusters. FragKB is accessible online at http://ubio.bioinfo.cnio.es/biotools/fragkb/.
Significance: The information presented in FragKB can be used for modeling protein structures, for designing novel proteins and for functional characterization of related fragments. The current release is focused on functional characterization of proteins through inspection of conservation of the fragments.
Affiliation(s)
- Ashish V Tendulkar: Structural Biology and Biocomputing Programme, Spanish National Cancer Center, Madrid, Spain
|
10
|
Gherardini PF, Ausiello G, Russell RB, Helmer-Citterich M. Modular architecture of nucleotide-binding pockets. Nucleic Acids Res 2010; 38:3809-16. [PMID: 20185567 PMCID: PMC2887960 DOI: 10.1093/nar/gkq090] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Indexed: 12/21/2022] [Open Access]
Abstract
Recently, modularity has emerged as a general attribute of complex biological systems. This is probably because modular systems lend themselves readily to optimization via random mutation followed by natural selection. Although they are not traditionally considered to evolve by this process, biological ligands are also modular, being composed of recurring chemical fragments, and moreover they exhibit similarities reminiscent of mutations (e.g. the few atoms differentiating adenine and guanine). Many ligands are also promiscuous in the sense that they bind to many different protein folds. Here, we investigated whether ligand chemical modularity is reflected in an underlying modularity of binding sites across unrelated proteins. We chose nucleotides as paradigmatic ligands, because they can be described as composed of well-defined fragments (nucleobase, ribose and phosphates) and are quite abundant both in nature and in protein structure databases. We found that nucleotide-binding sites do indeed show a modular organization and are composed of fragment-specific protein structural motifs, which parallel the modular structure of their ligands. Through an analysis of the distribution of these motifs in different proteins and in different folds, we discuss the evolutionary implications of these findings and argue that the structural features we observed can arise both as a result of divergence from a common ancestor or convergent evolution.
Affiliation(s)
- Pier Federico Gherardini: Centre for Molecular Bioinformatics, Department of Biology, University of Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
|
11
|
Mining protein loops using a structural alphabet and statistical exceptionality. BMC Bioinformatics 2010; 11:75. [PMID: 20132552 PMCID: PMC2833150 DOI: 10.1186/1471-2105-11-75] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 07/07/2009] [Accepted: 02/04/2010] [Indexed: 12/21/2022] [Open Access]
Abstract
Background: Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding sites and catalytic pockets. However, describing protein loops with conventional tools is a difficult task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied.
Results: We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called a structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named a structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words (out of 28274 observed words). These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but, interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of the recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions, and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented, and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints.
Conclusions: We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, this is the first time that pattern mining has helped to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might reduce the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/.
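The word-extraction step described in this abstract can be sketched as a sliding window over strings of structural letters. The code below is an illustrative sketch only: it assumes that consecutive four-residue letters overlap by three residues, so that a seven-residue fragment corresponds to a four-letter word, and it uses made-up letter strings rather than real HMM-SA encodings. The function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List

LETTER_SPAN = 4     # residues per structural letter (from the abstract)
FRAGMENT_LEN = 7    # residues per fragment (from the abstract)
# Assumption: letters overlap by three residues, giving 7 - 4 + 1 = 4 letters.
WORD_LEN = FRAGMENT_LEN - LETTER_SPAN + 1


def structural_words(letters: str, word_len: int = WORD_LEN) -> List[str]:
    """All overlapping words of word_len letters in one encoded loop."""
    return [letters[i:i + word_len] for i in range(len(letters) - word_len + 1)]


def recurrent_words(loops: Iterable[str], min_count: int = 30) -> Dict[str, int]:
    """Words observed more than min_count times across all loops."""
    counts = Counter(w for loop in loops for w in structural_words(loop))
    return {w: c for w, c in counts.items() if c > min_count}


# Toy encoded loops (invented strings, not real structural-letter sequences).
print(structural_words("ABCDE"))
print(recurrent_words(["ABCDE"] * 31))
```

Working on fixed-length words rather than whole loops is what lets loops of any length be compared, since every loop with at least four letters contributes words to the same count table.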
|