451
Maietta P, Lopez G, Carro A, Pingilley BJ, Leon LG, Valencia A, Tress ML. FireDB: a compendium of biological and pharmacologically relevant ligands. Nucleic Acids Res 2013; 42:D267-72. [PMID: 24243844 PMCID: PMC3965074 DOI: 10.1093/nar/gkt1127] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022]
Abstract
FireDB (http://firedb.bioinfo.cnio.es) is a curated inventory of catalytic and biologically relevant small ligand-binding residues culled from the protein structures in the Protein Data Bank. Here we present the important new additions since the publication of FireDB in 2007. The database now contains an extensive list of manually curated biologically relevant compounds. Biologically relevant compounds are informative because of their role in protein function, but they are only a small fraction of the entire ligand set. For the remaining ligands, FireDB provides cross-references to the annotations from publicly available biological, chemical and pharmacological compound databases. FireDB now has external references for 95% of contacting small ligands, making it a more complete database and giving the scientific community easy access to the pharmacological annotations of PDB ligands. In addition to the manual curation of ligands, FireDB also provides insights into the biological relevance of individual binding sites. Here, biological relevance is calculated from the multiple sequence alignments of related binding sites that are generated from an all-against-all comparison of each FireDB binding site. The database can be accessed through RESTful web services and is available for download as a MySQL database.
Affiliation(s)
- Paolo Maietta
- Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre, Madrid, 28029, Spain and Spanish National Bioinformatics Institute (INB-ISCIII)
452
Mitra P, Shultis D, Brender JR, Czajka J, Marsh D, Gray F, Cierpicki T, Zhang Y. An evolution-based approach to De Novo protein design and case study on Mycobacterium tuberculosis. PLoS Comput Biol 2013; 9:e1003298. [PMID: 24204234 PMCID: PMC3812052 DOI: 10.1371/journal.pcbi.1003298] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 12/21/2012] [Accepted: 09/09/2013] [Indexed: 01/31/2023]
Abstract
Computational protein design is a reverse procedure of protein folding and structure prediction, where constructing structures from evolutionarily related proteins has been demonstrated to be the most reliable method for protein 3-dimensional structure prediction. Following this spirit, we developed a novel method to design new protein sequences based on evolutionarily related protein families. For a given target structure, a set of proteins with a similar fold is identified from the PDB library by structural alignments. A structural profile is then constructed from the protein templates and used to guide the conformational search of amino acid sequence space, where physicochemical packing is accommodated by single-sequence based solvation, torsion angle, and secondary structure predictions. The method was tested on a computational folding experiment based on a large set of 87 protein structures covering different fold classes, which showed that the evolution-based design significantly enhances the foldability and biological functionality of the designed sequences compared to the traditional physics-based force field methods. Without using homologous proteins, the designed sequences can be folded with an average root-mean-square-deviation of 2.1 Å to the target. As a case study, the method is extended to redesign all 243 structurally resolved proteins in the pathogenic bacterium Mycobacterium tuberculosis, which is the second leading cause of death from infectious disease. On a smaller scale, five sequences were randomly selected from the design pool and subjected to experimental validation. The results showed that all the designed proteins are soluble with distinct secondary structure and three have well ordered tertiary structure, as demonstrated by circular dichroism and NMR spectroscopy. Together, these results demonstrate a new avenue in computational protein design that uses knowledge of evolutionary conservation from protein structural families to engineer new protein molecules of improved fold stability and biological functionality.
The goal of computational protein design is to create new protein sequences of desirable structure and biological function. Most protein design methods are developed to search for sequences with the lowest free energy based on physics-based force fields, following Anfinsen's thermodynamic hypothesis. A major obstacle of such approaches is the inaccuracy of the force-field design, which cannot accurately describe atomic interactions or correctly recognize protein folds. We propose a novel method which uses evolutionary information, in the form of sequence profiles from structure families, to guide the sequence design. Since sequence profiles are generally more accurate than physics-based potentials in protein fold recognition, a unique advantage of the method is that it targets the design procedure to a family of protein sequence profiles to enhance the robustness of designed sequences. The method was tested on 87 proteins and the designed sequences can be folded by I-TASSER to models with an average RMSD of 2.1 Å. As a case study of large-scale application, the method is extended to redesign all structurally resolved proteins in the human pathogenic bacterium Mycobacterium tuberculosis. Five sequences varying in fold and size were characterized by circular dichroism and NMR spectroscopy experiments and three were shown to have ordered tertiary structure.
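The idea of a profile built from a family of related proteins can be illustrated with a small sketch. This is not the authors' method, which derives its profile from structural alignments and combines it with other terms during the sequence search. The code below only assumes a set of already aligned, equal-length template sequences and shows how per-position amino-acid frequencies could be tabulated and used to score a candidate sequence. The function names `build_profile` and `profile_score` and the pseudocount value are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_seqs, pseudocount=1.0):
    """Return per-column amino-acid frequencies for equal-length aligned sequences.

    Gap characters and other non-standard symbols are ignored when counting.
    The pseudocount (an illustrative choice) keeps every frequency above zero.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return profile


def profile_score(sequence, profile):
    """Sum of log-frequencies of each residue of `sequence` under the profile."""
    if len(sequence) != len(profile):
        raise ValueError("sequence and profile must have the same length")
    return sum(math.log(profile[i][aa]) for i, aa in enumerate(sequence))


# Toy alignment of three hypothetical template sequences.
templates = ["ACD", "ACE", "ACD"]
profile = build_profile(templates)
family_like = profile_score("ACD", profile)
unrelated = profile_score("WWW", profile)
```

In this toy case the family-like sequence scores higher than the unrelated one, which is the property a profile-guided search would exploit.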
Affiliation(s)
- Pralay Mitra
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- David Shultis
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Jeffrey R. Brender
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Jeff Czajka
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- David Marsh
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Felicia Gray
- Department of Pathology, University of Michigan, Ann Arbor, Michigan, United States of America
- Tomasz Cierpicki
- Department of Pathology, University of Michigan, Ann Arbor, Michigan, United States of America
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, United States of America
453
Ucisik MN, Chakravorty DK, Merz KM. Structure and dynamics of the N-terminal domain of the Cu(I) binding protein CusB. Biochemistry 2013; 52:6911-23. [PMID: 23988152 DOI: 10.1021/bi400606b] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022]
Abstract
CusCFBA is one of the metal efflux systems in Escherichia coli that is highly specific for its substrates, Cu(I) and Ag(I). It serves to protect the bacteria in environments that have lethal concentrations of these metals. The membrane fusion protein CusB is the periplasmic piece of CusCFBA, which has not been fully characterized by crystallography because of its extremely disordered N-terminal region. This region has both structural and functional importance because it has been experimentally proven to transfer the metal by itself from the metallochaperone CusF and to induce a structural change in the rest of CusB to increase Cu(I)/Ag(I) resistance. Understanding metal uptake from the periplasm is critical to gain insight into the mechanism of the whole CusCFBA pump, which makes resolving a structure for the N-terminal region necessary because it contains the metal binding site. We ran extensive molecular dynamics simulations to reveal the structural and dynamic properties of both the apo and Cu(I)-bound versions of the CusB N-terminal region. In contrast to its functional companion CusF, Cu(I) binding to the N-terminus of CusB causes only a slight, local stabilization around the metal site. The trajectories were analyzed in detail, revealing extensive structural disorder in both the apo and holo forms of the protein. CusB was further analyzed by breaking the protein up into three subdomains according to the extent of the observed disorder: the N- and C-terminal tails, the central beta strand motif, and the M21-M36 loop connecting the two metal-coordinating methionine residues. Most of the observed disorder was traced back to the tail regions, leading us to hypothesize that the latter two subdomains (residues 13-45) may form a functionally competent metal-binding domain because the tail regions appear to play no role in metal binding.
Affiliation(s)
- Melek N Ucisik
- Department of Chemistry and Quantum Theory Project, University of Florida , 2328 New Physics Building, P.O. Box 118435, Gainesville, Florida 32611-8435, United States
454
Yang J, Roy A, Zhang Y. Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment. Bioinformatics 2013; 29:2588-95. [PMID: 23975762 DOI: 10.1093/bioinformatics/btt447] [Citation(s) in RCA: 610] [Impact Index Per Article: 55.5] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Identification of protein-ligand binding sites is critical to protein function annotation and drug discovery. However, no single method generates optimal binding site predictions for all protein types. Combining complementary predictions is probably the most reliable solution to the problem. RESULTS: We develop two new methods, one based on binding-specific substructure comparison (TM-SITE) and another on sequence profile alignment (S-SITE), for complementary binding site predictions. The methods are tested on a set of 500 non-redundant proteins harboring 814 natural, drug-like and metal ion molecules. Starting from low-resolution protein structure predictions, the methods successfully recognize >51% of binding residues with an average Matthews correlation coefficient (MCC) significantly higher (P-value < 10^-9 in Student's t-test) than other state-of-the-art methods, including COFACTOR, FINDSITE and ConCavity. When TM-SITE and S-SITE are combined with other structure-based programs, a consensus approach (COACH) can increase MCC by 15% over the best individual predictions. COACH was examined in the recent community-wide CAMEO experiment and consistently ranked as the best method in the last 22 individual datasets, with an Area Under the Curve score 22.5% higher than the second best method. These data demonstrate a new robust approach to protein-ligand binding site recognition, which is ready for genome-wide structure-based function annotations. AVAILABILITY: http://zhanglab.ccmb.med.umich.edu/COACH/
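Because the abstract reports performance as a Matthews correlation coefficient over binding residues, a minimal sketch of how that metric is commonly computed may help. It assumes predictions and annotations are given as sets of residue indices for a protein of known length. The function name `binding_site_mcc` and the convention of returning 0.0 when the denominator vanishes are choices made for this illustration, not details taken from the paper.

```python
import math


def binding_site_mcc(predicted, observed, n_residues):
    """Matthews correlation coefficient for per-residue binding predictions.

    `predicted` and `observed` are collections of residue indices labelled
    as binding; every other residue of the protein counts as non-binding.
    """
    predicted, observed = set(predicted), set(observed)
    tp = len(predicted & observed)   # binding residues correctly predicted
    fp = len(predicted - observed)   # predicted binding, actually non-binding
    fn = len(observed - predicted)   # binding residues that were missed
    tn = n_residues - tp - fp - fn   # non-binding residues correctly left out
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention assumed here for degenerate cases
    return (tp * tn - fp * fn) / denom


# Toy example: 10-residue protein, residues 1-3 bind, one residue mispredicted.
score = binding_site_mcc({1, 2, 4}, {1, 2, 3}, n_residues=10)
```

The MCC is often preferred over plain accuracy in this setting because binding residues are typically a small minority of all residues.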
Affiliation(s)
- Jianyi Yang
- Department of Computational Medicine and Bioinformatics and Department of Biological Chemistry, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218, USA
455
Cereto-Massagué A, Ojeda MJ, Joosten RP, Valls C, Mulero M, Salvado MJ, Arola-Arnal A, Arola L, Garcia-Vallvé S, Pujadas G. The good, the bad and the dubious: VHELIBS, a validation helper for ligands and binding sites. J Cheminform 2013; 5:36. [PMID: 23895374 PMCID: PMC3733808 DOI: 10.1186/1758-2946-5-36] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 05/15/2013] [Accepted: 07/18/2013] [Indexed: 11/17/2022]
Abstract
BACKGROUND: Many Protein Data Bank (PDB) users assume that the deposited structural models are of high quality but forget that these models are derived from the interpretation of experimental data. The accuracy of atom coordinates is not homogeneous between models or throughout the same model. To avoid basing a research project on a flawed model, we present a tool for assessing the quality of ligands and binding sites in crystallographic models from the PDB. RESULTS: The Validation HElper for LIgands and Binding Sites (VHELIBS) is software that aims to ease the validation of binding site and ligand coordinates for non-crystallographers (i.e., users with little or no crystallography knowledge). Using a convenient graphical user interface, it allows one to check how ligand and binding site coordinates fit to the electron density map. VHELIBS can use models from either the PDB or the PDB_REDO databank of re-refined and re-built crystallographic models. The user can specify threshold values for a series of properties related to the fit of coordinates to electron density (Real Space R, Real Space Correlation Coefficient and average occupancy are used by default). VHELIBS will automatically classify residues and ligands as Good, Dubious or Bad based on the specified limits. The user is also able to visually check the quality of the fit of residues and ligands to the electron density map and reclassify them if needed. CONCLUSIONS: VHELIBS allows inexperienced users to examine the binding site and the ligand coordinates in relation to the experimental data. This is an important step to evaluate models for their fitness for drug discovery purposes such as structure-based pharmacophore development and protein-ligand docking experiments.
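A threshold-based classification of this kind might be sketched as follows. The abstract names the three default properties (Real Space R, Real Space Correlation Coefficient, average occupancy) but gives neither the default limits nor the exact decision rule, so everything else here is an assumption: the limit values are placeholders, and the rule "no failed criterion is Good, one is Dubious, two or more is Bad" is a hypothetical stand-in rather than the logic VHELIBS actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitThresholds:
    """User-adjustable limits; the default values are illustrative placeholders."""
    max_rsr: float = 0.4         # Real Space R: lower means a better fit
    min_rscc: float = 0.8        # Real Space Correlation Coefficient: higher is better
    min_occupancy: float = 0.9   # average occupancy: higher is better


def classify_fit(rsr, rscc, occupancy, limits=FitThresholds()):
    """Label a residue or ligand as 'Good', 'Dubious' or 'Bad'.

    Hypothetical rule: count how many criteria fall outside the limits.
    """
    failures = sum([
        rsr > limits.max_rsr,
        rscc < limits.min_rscc,
        occupancy < limits.min_occupancy,
    ])
    if failures == 0:
        return "Good"
    if failures == 1:
        return "Dubious"
    return "Bad"


# A ligand with a poor Real Space R but otherwise acceptable values.
label = classify_fit(rsr=0.5, rscc=0.95, occupancy=1.0)
```

Keeping the limits in a separate object mirrors the abstract's point that the user can specify their own thresholds before the automatic classification runs.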
Affiliation(s)
- Adrià Cereto-Massagué
- Grup de Recerca en Nutrigenòmica, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, Campus de Sescelades, C/ Marceŀlí Domingo s/n, Tarragona, Catalonia 43007, Spain
- María José Ojeda
- Grup de Recerca en Nutrigenòmica, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, Campus de Sescelades, C/ Marceŀlí Domingo s/n, Tarragona, Catalonia 43007, Spain
- Robbie P Joosten
- Department of Biochemistry, Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam 1066 CX, The Netherlands
- Cristina Valls
- Grup de Recerca en Nutrigenòmica, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, Campus de Sescelades, C/ Marceŀlí Domingo s/n, Tarragona, Catalonia 43007, Spain
- Miquel Mulero
- Grup de Recerca en Nutrigenòmica, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, Campus de Sescelades, C/ Marceŀlí Domingo s/n, Tarragona, Catalonia 43007, Spain
- M Josepa Salvado
- Grup de Recerca en Nutrigenòmica, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, Campus de Sescelades, C/ Marceŀlí Domingo s/n, Tarragona, Catalonia 43007, Spain
- Anna Arola-Arnal
- Grup de Recerca en Nutrigenòmica, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, Campus de Sescelades, C/ Marceŀlí Domingo s/n, Tarragona, Catalonia 43007, Spain
- Lluís Arola
- Grup de Recerca en Nutrigenòmica, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, Campus de Sescelades, C/ Marceŀlí Domingo s/n, Tarragona, Catalonia 43007, Spain
- Centre Tecnològic de Nutrició i Salut (CTNS), TECNIO, CEICS, Avinguda Universitat 1, Reus, Catalonia 43204, Spain
- Santiago Garcia-Vallvé
- Grup de Recerca en Nutrigenòmica, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, Campus de Sescelades, C/ Marceŀlí Domingo s/n, Tarragona, Catalonia 43007, Spain
- Centre Tecnològic de Nutrició i Salut (CTNS), TECNIO, CEICS, Avinguda Universitat 1, Reus, Catalonia 43204, Spain
- Gerard Pujadas
- Grup de Recerca en Nutrigenòmica, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, Campus de Sescelades, C/ Marceŀlí Domingo s/n, Tarragona, Catalonia 43007, Spain
- Centre Tecnològic de Nutrició i Salut (CTNS), TECNIO, CEICS, Avinguda Universitat 1, Reus, Catalonia 43204, Spain
456
Fu J, Ling S, Liu Y, Yang J, Naveh S, Hannah M, Gilon C, Zhang Y, Holoshitz J. A small shared epitope-mimetic compound potently accelerates osteoclast-mediated bone damage in autoimmune arthritis. J Immunol 2013; 191:2096-103. [PMID: 23885107 DOI: 10.4049/jimmunol.1203231] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 12/15/2022]
Abstract
We have recently proposed that the shared epitope (SE) may contribute to rheumatoid arthritis pathogenesis by acting as a ligand that activates proarthritogenic signal transduction events. To examine this hypothesis, in this study we characterized a novel small SE-mimetic compound, c(HS4-4), containing the SE primary sequence motif QKRAA, which was synthesized using a backbone cyclization method. The SE-mimetic c(HS4-4) compound interacted strongly with the SE receptor calreticulin (CRT), potently activated NO and reactive oxygen species production, and markedly facilitated osteoclast differentiation and function in vitro. The pro-osteoclastogenic potency of c(HS4-4) was 100,000- to 1,000,000-fold higher than the potency of a recently described linear SE peptidic ligand. When administered in vivo at nanogram doses, c(HS4-4) enhanced Th17 expansion, and in mice with collagen-induced arthritis it facilitated disease onset, increased disease incidence and severity, enhanced osteoclast abundance in synovial tissues and osteoclastogenic propensities of bone marrow-derived cells, and augmented bone destruction. In conclusion, c(HS4-4), a highly potent small SE-mimetic compound, enhances bone damage and disease severity in inflammatory arthritis. These findings support the hypothesis that the SE acts as a signal transduction ligand that activates a CRT-mediated proarthritogenic pathway.
Affiliation(s)
- Jiaqi Fu
- Department of Internal Medicine, University of Michigan School of Medicine, Ann Arbor, MI 48109-5680, USA
457
Yu DJ, Hu J, Yang J, Shen HB, Tang J, Yang JY. Designing template-free predictor for targeting protein-ligand binding sites with classifier ensemble and spatial clustering. IEEE/ACM Trans Comput Biol Bioinform 2013; 10:994-1008. [PMID: 24334392 DOI: 10.1109/tcbb.2013.104] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Indexed: 06/03/2023]
Abstract
Accurately identifying protein-ligand binding sites or pockets is of significant importance for both protein function analysis and drug design. Although much progress has been made, challenges remain, especially when the 3D structures of target proteins are not available or no homology templates can be found in the library, in which case template-based methods are hard to apply. In this paper, we report a new ligand-specific template-free predictor called TargetS for targeting protein-ligand binding sites from primary sequences. TargetS first predicts the binding residues along the sequence with a ligand-specific strategy and then further identifies the binding sites from the predicted binding residues through a recursive spatial clustering algorithm. Protein evolutionary information, predicted protein secondary structure, and ligand-specific binding propensities of residues are combined to construct discriminative features; an improved AdaBoost classifier ensemble scheme based on random undersampling is proposed to deal with the serious imbalance between positive (binding) and negative (nonbinding) samples. Experimental results demonstrate that TargetS achieves high performance and outperforms many existing predictors. The TargetS web server and data sets are freely available for academic use at: http://www.csbio.sjtu.edu.cn/bioinf/TargetS/
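The random-undersampling step that feeds a classifier ensemble could look roughly like the sketch below. It is only an illustration of the general technique under stated assumptions, not the TargetS implementation: each balanced subset keeps all positive (binding) samples and draws an equal number of negatives at random, and each subset would then train one base classifier (AdaBoost in the paper, omitted here). The function name `undersampled_subsets` and its parameters are hypothetical.

```python
import random


def undersampled_subsets(positives, negatives, n_subsets, seed=0):
    """Build balanced training subsets by randomly undersampling the negatives.

    Assumes the negatives are the majority class. Every subset pairs all
    positives with an equally sized random draw (without replacement) of
    negatives; different subsets may share negatives.
    """
    if len(negatives) < len(positives):
        raise ValueError("expected at least as many negatives as positives")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    subsets = []
    for _ in range(n_subsets):
        sampled_negatives = rng.sample(negatives, len(positives))
        subsets.append((list(positives), sampled_negatives))
    return subsets


# Toy data: 5 binding residues versus 100 non-binding residues.
binding = list(range(5))
nonbinding = list(range(100, 200))
subsets = undersampled_subsets(binding, nonbinding, n_subsets=3)
```

Training one classifier per subset and combining their outputs lets the ensemble see many of the negatives overall while each individual classifier learns from balanced data.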
Affiliation(s)
- Dong-Jun Yu
- Nanjing University of Science and Technology, Nanjing
- Jun Hu
- Nanjing University of Science and Technology, Nanjing
- Jing Yang
- Shanghai Jiao Tong University, Shanghai and Ministry of Education of China, Shanghai
- Hong-Bin Shen
- Shanghai Jiao Tong University, Shanghai and Ministry of Education of China, Shanghai
- Jinhui Tang
- Nanjing University of Science and Technology, Nanjing
- Jing-Yu Yang
- Nanjing University of Science and Technology, Nanjing
458
Omenn GS, Menon R, Zhang Y. Innovations in proteomic profiling of cancers: alternative splice variants as a new class of cancer biomarker candidates and bridging of proteomics with structural biology. J Proteomics 2013; 90:28-37. [PMID: 23603631 DOI: 10.1016/j.jprot.2013.04.007] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 02/11/2013] [Revised: 04/05/2013] [Accepted: 04/07/2013] [Indexed: 01/05/2023]
Abstract
Alternative splicing allows a single gene to generate multiple RNA transcripts which can be translated into functionally diverse protein isoforms. Current knowledge of splicing is derived mainly from RNA transcripts, with very little known about the expression level, 3D structures, and functional differences of the proteins. Splicing is a remarkable phenomenon of molecular and biological evolution. Studies which simply report up-regulation or down-regulation of protein or mRNA expression are confounded by the effects of mixtures of these isoforms. Besides understanding the net biological effects of the mixtures, we may be able to develop biomarker tests based on the observable differential expression of particular splice variants or combinations of splice variants in specific disease states. Here we review our work on differential expression of splice variant proteins in cancers and the feasibility of integrating proteomic analysis with structure-based conformational predictions of the differences between such isoforms.
Affiliation(s)
- Gilbert S Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA.