1
|
Identification of sequence determinants for the ABHD14 enzymes. Proteins 2023. [PMID: 37974539 DOI: 10.1002/prot.26632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 10/14/2023] [Accepted: 10/24/2023] [Indexed: 11/19/2023]
Abstract
Over the course of evolution, enzymes have developed remarkable functional diversity in catalyzing important chemical reactions across various organisms, and understanding how new enzyme functions might have evolved remains an important question in modern enzymology. To systematically annotate functions, enzymes with similar catalytic mechanisms have been clustered into enzyme superfamilies based on their protein sequences and available biochemical studies. Typically, enzymes within a superfamily have similar overall three-dimensional structures and conserved catalytic residues, but large variations in substrate recognition sites and residues to accommodate the diverse biochemical reactions that are catalyzed within the superfamily. The serine hydrolases are an excellent example of such an enzyme superfamily. Based on known enzymatic activities and protein sequences, they are split almost equally into the serine proteases and metabolic serine hydrolases. Within the metabolic serine hydrolases, there are two outlying members, ABHD14A and ABHD14B, that have high sequence similarity, but their biological functions remained cryptic until recently. While ABHD14A still lacks any functional annotation to date, we recently showed that ABHD14B functions as a lysine deacetylase in mammals. Given their high sequence similarity, automated databases often wrongly assign ABHD14A and ABHD14B as the same enzyme, and therefore, assigning functions to them in various organisms has been problematic. In this article, we present a bioinformatics study coupled with biochemical experiments, which identifies key sequence determinants for both ABHD14A and ABHD14B and enables better classification of them. In addition, we map these enzymes on an evolutionary timescale and provide a much-wanted resource for studying these interesting enzymes in different organisms.
|
2
|
Observation of an Unusually Large IR Red-Shift in an Unconventional S-H···S Hydrogen-Bond. J Phys Chem Lett 2021; 12:1228-1235. [PMID: 33492971 DOI: 10.1021/acs.jpclett.0c03183] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 06/12/2023]
Abstract
The S-H···S non-covalent interaction is generally known as an extremely unconventional weak hydrogen-bond in the literature. The present gas-phase spectroscopic investigation shows that the S-H···S hydrogen-bond can be as strong as any conventional hydrogen-bond in terms of the IR red-shift in the stretching frequency of the hydrogen-bond donor group. Herein, the strength of the S-H···S hydrogen-bond has been determined by measuring the red-shift (∼150 cm⁻¹) of the S-H stretching frequency in a model complex of 2-chlorothiophenol and dimethyl sulfide using isolated gas-phase IR spectroscopy coupled with quantum chemistry calculations. The observation of an unusually large IR red-shift in the S-H···S hydrogen-bond is explained in terms of the presence of a significant amount of charge-transfer interactions in addition to the usual electrostatic interactions. The existence of ∼750 S-H···S interactions between the cysteine and methionine residues in 642 protein structures determined from an extensive Protein Data Bank analysis also indicates that this interaction is important for the structures of proteins.
|
3
|
Protein Interaction Z Score Assessment (PIZSA): an empirical scoring scheme for evaluation of protein-protein interactions. Nucleic Acids Res 2020; 47:W331-W337. [PMID: 31114890 PMCID: PMC6602501 DOI: 10.1093/nar/gkz368] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 03/06/2019] [Revised: 04/24/2019] [Accepted: 05/15/2019] [Indexed: 11/24/2022] Open
Abstract
Our web server, PIZSA (http://cospi.iiserpune.ac.in/pizsa), assesses the likelihood of protein–protein interactions by assigning a Z Score computed from interface residue contacts. Our score takes into account the optimal number of atoms that mediate the interaction between pairs of residues and whether these contacts emanate from the main chain or side chain. We tested the score on 174 native interactions for which 100 decoys each were constructed using ZDOCK. The native structure scored better than any of the decoys in 146 cases and ranked within the 95th percentile in 162 cases. This easily outperforms a competing method, CIPS. We also benchmarked our scoring scheme on 15 targets from the CAPRI dataset and found that our method had results comparable to those of CIPS. Further, our method is able to analyse higher order protein complexes without the need to explicitly identify chains as receptors or ligands. The PIZSA server is easy to use and can score any input three-dimensional structure and provide a residue pair-wise breakdown of the results. Attractively, our server offers a platform for users to upload their own potentials and could serve as an ideal testing ground for this class of scoring schemes.
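The native-versus-decoy test described in this abstract can be illustrated with a small sketch. It assumes each structure has already been reduced to one raw score where higher is better; the function names and toy numbers are hypothetical and are not part of PIZSA, whose Z Score is computed from interface residue contacts.

```python
from statistics import mean, pstdev


def z_score(value, background):
    """Standard score of `value` relative to a background distribution."""
    mu = mean(background)
    sigma = pstdev(background)  # population standard deviation
    if sigma == 0:
        raise ValueError("background scores have zero variance")
    return (value - mu) / sigma


def native_rank_percentile(native, decoys):
    """Percentage of decoys that the native structure scores strictly better than."""
    beaten = sum(1 for d in decoys if native > d)
    return 100.0 * beaten / len(decoys)


# Toy numbers (hypothetical): one native score and five decoy scores.
decoy_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
native_score = 6.0
print(z_score(native_score, decoy_scores))
print(native_rank_percentile(native_score, decoy_scores))
```

In a benchmark of this kind, a native structure would count as "better than any decoy" when the percentile is 100 and as "within the 95th percentile" when it is at least 95.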
|
4
|
PDBe-KB: a community-driven resource for structural and functional annotations. Nucleic Acids Res 2020; 48:D344-D353. [PMID: 31584092 PMCID: PMC6943075 DOI: 10.1093/nar/gkz853] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Received: 08/14/2019] [Revised: 09/11/2019] [Accepted: 10/01/2019] [Indexed: 11/23/2022] Open
Abstract
The Protein Data Bank in Europe-Knowledge Base (PDBe-KB, https://pdbe-kb.org) is a community-driven, collaborative resource for literature-derived, manually curated and computationally predicted structural and functional annotations of macromolecular structure data, contained in the Protein Data Bank (PDB). The goal of PDBe-KB is two-fold: (i) to increase the visibility and reduce the fragmentation of annotations contributed by specialist data resources, and to make these data more findable, accessible, interoperable and reusable (FAIR) and (ii) to place macromolecular structure data in their biological context, thus facilitating their use by the broader scientific community in fundamental and applied research. Here, we describe the guidelines of this collaborative effort, the current status of contributed data, and the PDBe-KB infrastructure, which includes the data exchange format, the deposition system for added value annotations, the distributable database containing the assembled data, and programmatic access endpoints. We also describe a series of novel web pages, the PDBe-KB aggregated views of structure data, which combine information on macromolecular structures from many PDB entries. We have recently released the first set of pages in this series, which provide an overview of available structural and functional information for a protein of interest, referenced by a UniProtKB accession.
|
5
|
Predicting and designing therapeutics against the Nipah virus. PLoS Negl Trop Dis 2019; 13:e0007419. [PMID: 31830030 PMCID: PMC6907750 DOI: 10.1371/journal.pntd.0007419] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 04/26/2019] [Accepted: 11/04/2019] [Indexed: 11/28/2022] Open
Abstract
Despite Nipah virus outbreaks having high mortality rates (>70% in Southeast Asia), there are no licensed drugs against it. In this study, we have considered all 9 Nipah proteins as potential therapeutic targets and computationally identified 4 putative peptide inhibitors (against G, F and M proteins) and 146 small molecule inhibitors (against F, G, M, N, and P proteins). The computations include extensive homology/ab initio modeling, peptide design and small molecule docking. An important contribution of this study is an increase of approximately 90% in the structural characterization of Nipah proteins relative to what is deposited in the PDB. In addition, we have carried out molecular dynamics simulations on all the designed protein-peptide complexes and on 13 of the top shortlisted small molecule ligands to check for stability and to estimate binding strengths. Details, including atomic coordinates of all the proteins and their ligand-bound complexes, can be accessed at http://cospi.iiserpune.ac.in/Nipah. Our strategy was to tackle the development of therapeutics on a proteome-wide scale, and the lead compounds identified could be attractive starting points for drug development. To counter the threat of drug resistance, we have analysed the sequences of the viral strains from different outbreaks to check whether they would be sensitive to the binding of the proposed inhibitors.
|
6
|
Water-Mediated Selenium Hydrogen-Bonding in Proteins: PDB Analysis and Gas-Phase Spectroscopy of Model Complexes. J Phys Chem A 2019; 123:5995-6002. [PMID: 31268326 DOI: 10.1021/acs.jpca.9b04159] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 11/30/2022]
Abstract
High-resolution X-ray crystallography and two-dimensional NMR studies demonstrate that water-mediated conventional hydrogen-bonding interactions (N-H···N, O-H···N, etc.) bridging two or more amino acid residues contribute to the stability of proteins and protein-ligand complexes. In this work, we have investigated single water-mediated selenium hydrogen-bonding interactions (unconventional hydrogen-bonding) between amino acid residues in proteins through extensive Protein Data Bank (PDB) analysis coupled with gas-phase spectroscopy and quantum chemical calculations of a model complex consisting of indole, dimethyl selenide, and water. Here, indole and dimethyl selenide represent the amino acid residues tryptophan and selenomethionine, respectively. The current investigation demonstrates that the most stable structure of the model complex observed in the IR spectroscopy mimics single water-mediated selenium hydrogen-bonded structural motifs present in the crystal structures of proteins. The present work establishes that water-mediated Se hydrogen-bonding interactions are ubiquitous in proteins and that the number of these interactions observed in the PDB exceeds the number of direct Se hydrogen-bonds present there.
|
7
|
Discovering Putative Protein Targets of Small Molecules: A Study of the p53 Activator Nutlin. J Chem Inf Model 2019; 59:1529-1546. [DOI: 10.1021/acs.jcim.8b00762] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
|
8
|
Unraveling the structural basis for the unusually rich association of human leukocyte antigen DQ2.5 with class-II-associated invariant chain peptides. J Biol Chem 2017; 292:9218-9228. [PMID: 28364043 DOI: 10.1074/jbc.m117.785139] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 03/07/2017] [Revised: 03/28/2017] [Indexed: 11/06/2022] Open
Abstract
Human leukocyte antigen (HLA)-DQ2.5 (DQA1*05/DQB1*02) is a class-II major histocompatibility complex protein associated with both type 1 diabetes and celiac disease. One unusual feature of DQ2.5 is its high class-II-associated invariant chain peptide (CLIP) content. Moreover, HLA-DQ2.5 preferentially binds the non-canonical CLIP2 over the canonical CLIP1. To better understand the structural basis of HLA-DQ2.5's unusual CLIP association characteristics, better insight into the HLA-DQ2.5·CLIP complex structures is required. To this end, we determined the X-ray crystal structures of the HLA-DQ2.5·CLIP1 and HLA-DQ2.5·CLIP2 complexes at 2.73 and 2.20 Å, respectively. We found that HLA-DQ2.5 has an unusually large P4 pocket and a positively charged peptide-binding groove that together promote preferential binding of CLIP2 over CLIP1. An α9-α22-α24-α31-β86-β90 hydrogen bond network located at the bottom of the peptide-binding groove, spanning from the P1 to P4 pockets, renders the residues in this region relatively immobile. This hydrogen bond network, along with a deletion mutation at α53, may lead to HLA-DM insensitivity in HLA-DQ2.5. A molecular dynamics simulation experiment reported here and recent biochemical studies by others support this hypothesis. The diminished HLA-DM sensitivity is the likely reason for the CLIP-rich phenotype of HLA-DQ2.5.
|
9
|
Topology independent comparison of RNA 3D structures using the CLICK algorithm. Nucleic Acids Res 2016; 45:e5. [PMID: 27634929 PMCID: PMC5741206 DOI: 10.1093/nar/gkw819] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 01/21/2015] [Revised: 09/01/2016] [Accepted: 09/02/2016] [Indexed: 01/15/2023] Open
Abstract
RNA molecules are attractive therapeutic targets because non-coding RNA molecules have increasingly been found to play key regulatory roles in the cell. Comparing and classifying RNA 3D structures yields unique insights into RNA evolution and function. With the rapid increase in the number of atomic-resolution RNA structures, it is crucial to have effective tools to classify RNA structures and to investigate them for structural similarities at different resolutions. We previously developed the algorithm CLICK to superimpose a pair of protein 3D structures by clique matching and 3D least squares fitting. In this study, we extend and optimize the CLICK algorithm to superimpose pairs of RNA 3D structures and RNA-protein complexes, independent of the associated topologies. Benchmarking Rclick on four different datasets showed that it is either comparable to or better than other structural alignment methods in terms of the extent of structural overlaps. Rclick also recognizes conformational changes between RNA structures and produces complementary alignments to maximize the extent of detectable similarity. Applying Rclick to study Ribonuclease III protein correctly aligned the RNA binding sites of RNAse III with its substrate. Rclick can be further extended to identify ligand-binding pockets in RNA. A web server is developed at http://mspc.bii.a-star.edu.sg/minhn/rclick.html.
|
10
|
Molecular Mechanism Underlying ATP-Induced Conformational Changes in the Nucleoprotein Filament of Mycobacterium smegmatis RecA. Biochemistry 2016; 55:1850-62. [PMID: 26915388 DOI: 10.1021/acs.biochem.5b01383] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 02/05/2023]
Abstract
RecA plays a central role in bacterial DNA repair, homologous recombination, and restoration of stalled replication forks by virtue of its active extended nucleoprotein filament. Binding of ATP and its subsequent recognition by the carboxamide group of a highly conserved glutamine (Gln196 in MsRecA) have been implicated in the formation of active RecA nucleoprotein filaments. Although the mechanism of ATP-dependent structural transitions in RecA has been proposed on the basis of low-resolution electron microscopic reconstructions, the precise sequence of events that constitute these transitions is poorly understood. On the basis of biochemical and crystallographic analyses of MsRecA variants carrying mutations in highly conserved Gln196 and Arg198 residues, we propose that the disposition of the interprotomer interface is the structural basis of allosteric activation of RecA. Furthermore, this study accounts for the contributions of several conserved amino acids to ATP hydrolysis and to the transition from collapsed to extended filament forms in Mycobacterium smegmatis RecA (MsRecA). In addition to their role in the inactive compressed state, the study reveals a role for Gln196 and Arg198 along with Phe219 in ATP hydrolysis in the active extended nucleoprotein filament. Finally, our data suggest that the primary, but not secondary, nucleotide binding site in MsRecA isomerizes into the ATP binding site present in the extended nucleoprotein filament.
|
11
|
Depth: a web server to compute depth, cavity sizes, detect potential small-molecule ligand-binding cavities and predict the pKa of ionizable residues in proteins. Nucleic Acids Res 2013; 41:W314-21. [PMID: 23766289 PMCID: PMC3692129 DOI: 10.1093/nar/gkt503] [Citation(s) in RCA: 125] [Impact Index Per Article: 11.4] [Indexed: 01/30/2023] Open
Abstract
Residue depth accurately measures burial and parameterizes local protein environment. Depth is the distance of any atom/residue to the closest bulk water. We consider the non-bulk waters to occupy cavities, whose volumes are determined using a Voronoi procedure. Our estimation of cavity sizes is statistically superior to estimates made by CASTp and VOIDOO, and on par with McVol over a data set of 40 cavities. Our calculated cavity volumes correlated best with the experimentally determined destabilization of 34 mutants from five proteins. Some of the cavities identified are capable of binding small molecule ligands. In this study, we have enhanced our depth-based predictions of binding sites by including evolutionary information. We have demonstrated that on a database (LigASite) of ∼200 proteins, we perform on par with ConCavity and better than MetaPocket 2.0. Our predictions, while less sensitive, are more specific and precise. Finally, we use depth (and other features) to predict pKas of GLU, ASP, LYS and HIS residues. Our results produce an average error of less than 1 pH unit over 60 predictions. Our simple empirical method is statistically on par with two other methods, superior to three, and inferior to only one. The DEPTH server (http://mspc.bii.a-star.edu.sg/depth/) is an ideal tool for rapid yet accurate structural analyses of protein structures.
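The definition of depth given in this abstract (distance to the closest bulk water) can be sketched in a few lines. This is only an illustration under stated assumptions: the bulk-water coordinates are taken as already identified, the function names are hypothetical, and averaging atom depths to obtain a residue depth is an assumption made here, not a description of how the DEPTH server computes it.

```python
import math


def atom_depth(atom, bulk_waters):
    """Distance from one atom to the closest bulk-water position."""
    return min(math.dist(atom, water) for water in bulk_waters)


def residue_depth(atoms, bulk_waters):
    """Mean atom depth over a residue (an assumed convention for this sketch)."""
    depths = [atom_depth(atom, bulk_waters) for atom in atoms]
    return sum(depths) / len(depths)


# Toy coordinates in angstroms (hypothetical).
waters = [(3.0, 4.0, 0.0), (0.0, 0.0, 10.0)]
residue_atoms = [(0.0, 0.0, 0.0), (0.0, 0.0, 4.0)]
print(atom_depth(residue_atoms[0], waters))
print(residue_depth(residue_atoms, waters))
```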
|
12
|
Transplantation of a hydrogen bonding network from West Nile virus protease onto Dengue-2 protease improves catalytic efficiency and sheds light on substrate specificity. Protein Eng Des Sel 2012; 25:843-50. [DOI: 10.1093/protein/gzs049] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022] Open
|
13
|
Abstract
Summary: Accurate alignment of protein sequences and/or structures is crucial for many biological analyses, including functional annotation of proteins, classifying protein sequences into families, and comparative protein structure modeling. Described here is a web interface to SALIGN, the versatile protein multiple sequence/structure alignment module of MODELLER. The web server automatically determines the best alignment procedure based on the inputs, while allowing the user to override default parameter values. Multiple alignments are guided by a dendrogram computed from a matrix of all pairwise alignment scores. When aligning sequences to structures, SALIGN uses structural environment information to place gaps optimally. If two multiple sequence alignments of related proteins are input to the server, a profile-profile alignment is performed. All features of the server have been previously optimized for accuracy, especially in the contexts of comparative modeling and identification of interacting protein partners.
Availability: The SALIGN web server is freely accessible to the academic community at http://salilab.org/salign. SALIGN is a module of the MODELLER software, also freely available to academic users (http://salilab.org/modeller).
Contact: sali@salilab.org; madhusudhan@bii.a-star.edu.sg.
|
14
|
CLICK--topology-independent comparison of biomolecular 3D structures. Nucleic Acids Res 2011; 39:W24-8. [PMID: 21602266 PMCID: PMC3125785 DOI: 10.1093/nar/gkr393] [Citation(s) in RCA: 100] [Impact Index Per Article: 7.7] [Received: 02/26/2011] [Revised: 04/19/2011] [Accepted: 05/03/2011] [Indexed: 01/28/2023] Open
Abstract
Our server, CLICK: http://mspc.bii.a-star.edu.sg/click, is capable of superimposing the 3D structures of any pair of biomolecules (proteins, DNA, RNA, etc.). The server makes use of the Cartesian coordinates of the molecules with the option of using other structural features such as secondary structure, solvent accessible surface area and residue depth to guide the alignment. CLICK first looks for cliques of points (3-7 residues) that are structurally similar in the pair of structures to be aligned. Using these local similarities, a one-to-one equivalence is charted between the residues of the two structures. A least square fit then superimposes the two structures. Our method is especially powerful in establishing protein relationships by detecting similarities in structural subdomains, domains and topological variants. CLICK has been extensively benchmarked and compared with other popular methods for protein and RNA structural alignments. In most cases, CLICK alignments were statistically significantly better in terms of structure overlap. The method also recognizes conformational changes that may have occurred in structural domains or subdomains in one structure with respect to the other. For this purpose, the server produces complementary alignments to maximize the extent of detectable similarity. Various examples showcase the utility of our web server.
|
15
|
Abstract
Comparing and classifying the three-dimensional (3D) structures of proteins is of crucial importance to molecular biology, from helping to determine the function of a protein to determining its evolutionary relationships. Traditionally, 3D structures are classified into groups of families that closely resemble the grouping according to their primary sequence. However, significant structural similarities exist at multiple levels between proteins that belong to these different structural families. In this study, we propose a new algorithm, CLICK, to capture such similarities. The method optimally superimposes a pair of protein structures independent of topology. Amino acid residues are represented by the Cartesian coordinates of a representative point (usually the Cα atom), side chain solvent accessibility, and secondary structure. Structural comparison is effected by matching cliques of points. CLICK was extensively benchmarked for alignment accuracy on four different sets: (i) 9537 pair-wise alignments between two structures with the same topology; (ii) 64 alignments from set (i) that were considered to constitute difficult alignment cases; (iii) 199 pair-wise alignments between proteins with similar structure but different topology; and (iv) 1275 pair-wise alignments of RNA structures. The accuracy of CLICK alignments was measured by the average structure overlap score and compared with other alignment methods, including HOMSTRAD, MUSTANG, Geometric Hashing, SALIGN, DALI, GANGSTA+, FATCAT, ARTS and SARA. On average, CLICK produces pair-wise alignments that are either comparable to or statistically significantly more accurate than those of all these other methods. We have used CLICK to uncover relationships between (previously) unrelated proteins. These new biological insights include: (i) detecting hinge regions in proteins where domains or sub-domains show flexibility; (ii) discovering similar small molecule binding sites from proteins of different folds and (iii) discovering topological variants of known structural/sequence motifs. Our method can generally be applied to compare any pair of molecular structures represented in Cartesian coordinates as exemplified by the RNA structure superimposition benchmark.
|
16
|
DEPTH: a web server to compute depth and predict small-molecule binding cavities in proteins. Nucleic Acids Res 2011; 39:W242-8. [PMID: 21576233 PMCID: PMC3125764 DOI: 10.1093/nar/gkr356] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.2] [Indexed: 12/03/2022] Open
Abstract
Depth measures the extent of atom/residue burial within a protein. It correlates with properties such as protein stability, hydrogen exchange rate, protein–protein interaction hot spots, post-translational modification sites and sequence variability. Our server, DEPTH, accurately computes depth and solvent-accessible surface area (SASA) values. We show that depth can be used to predict small molecule ligand binding cavities in proteins. Often, some of the residues lining a ligand binding cavity are both deep and solvent exposed. Using the depth-SASA pair values for a residue, its likelihood to form part of a small molecule binding cavity is estimated. The parameters of the method were calibrated over a training set of 900 high-resolution X-ray crystal structures of single-domain proteins bound to small molecules (molecular weight <1.5 kDa). The prediction accuracy of DEPTH is comparable to that of other geometry-based prediction methods including LIGSITE, SURFNET and Pocket-Finder (all with Matthews correlation coefficient of ∼0.4) over a testing set of 225 single and multi-chain protein structures. Users have the option of tuning several parameters to detect cavities of different sizes, for example, geometrically flat binding sites. The input to the server is a protein 3D structure in PDB format. The users have the option of tuning the values of four parameters associated with the computation of residue depth and the prediction of binding cavities. The computed depths, SASA and binding cavity predictions are displayed in 2D plots and mapped onto 3D representations of the protein structure using Jmol. Links are provided to download the outputs. Our server is useful for all structural analysis based on residue depth and SASA, such as guiding site-directed mutagenesis experiments and small molecule docking exercises, in the context of protein functional annotation and drug discovery.
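For readers unfamiliar with the Matthews correlation coefficient quoted in this abstract, the standard formula is MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below applies it to a hypothetical per-residue confusion matrix; the counts are invented for illustration and are not taken from the DEPTH benchmark.

```python
import math


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts: residues predicted as cavity-lining vs. observed.
print(matthews_cc(tp=6, tn=80, fp=4, fn=10))
```

With these toy counts the coefficient comes out close to 0.4, the level the abstract reports for the geometry-based methods.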
|
17
|
Structure-guided fragment-based in silico drug design of dengue protease inhibitors. J Comput Aided Mol Des 2011; 25:263-74. [DOI: 10.1007/s10822-011-9418-0] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Received: 11/23/2010] [Accepted: 02/07/2011] [Indexed: 11/24/2022]
|
18
|
Alignment of multiple protein structures based on sequence and structure features. Protein Eng Des Sel 2009; 22:569-74. [PMID: 19587024 DOI: 10.1093/protein/gzp040] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.9] [Indexed: 11/14/2022] Open
Abstract
Comparing the structures of proteins is crucial to gaining insight into protein evolution and function. Here, we align the sequences of multiple protein structures by a dynamic programming optimization of a scoring function that is a sum of an affine gap penalty and terms dependent on various sequence and structure features (SALIGN). The features include amino acid residue type, residue position, residue accessible surface area, residue secondary structure state and the conformation of a short segment centered on the residue. The multiple alignment is built by following the 'guide' tree constructed from the matrix of all pairwise protein alignment scores. Importantly, the method does not depend on the exact values of various parameters, such as feature weights and gap penalties, because the optimal alignment across a range of parameter values is found. Using multiple structure alignments in the HOMSTRAD database, SALIGN was benchmarked against MUSTANG for multiple alignments as well as against TM-align and CE for pairwise alignments. On average, SALIGN produces a 15% improvement in structural overlap over HOMSTRAD and 14% over MUSTANG, and yields more equivalent structural positions than TM-align and CE in 90% and 95% of cases, respectively. The utility of accurate multiple structure alignment is illustrated by its application to comparative protein structure modeling.
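The phrase "dynamic programming optimization ... of an affine gap penalty" can be made concrete with a minimal pairwise sketch. It is a generic three-matrix (Gotoh-style) global alignment score, not SALIGN's implementation: the function name and all score values are assumptions chosen for illustration, and the substitution term here uses residue identity only, whereas the abstract describes a sum over several sequence and structure features.

```python
NEG_INF = float("-inf")


def affine_global_score(a, b, match=2.0, mismatch=-1.0,
                        gap_open=-3.0, gap_extend=-1.0):
    """Best global alignment score; a gap of length k costs
    gap_open + (k - 1) * gap_extend."""
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] against a gap; Y: b[j-1] against a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Open a new gap from M or Y, or extend an existing gap in X.
            X[i][j] = max(M[i - 1][j] + gap_open,
                          Y[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend)
            # Open a new gap from M or X, or extend an existing gap in Y.
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          X[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


print(affine_global_score("ACGT", "AGT"))
```

Keeping separate matrices for "in a gap" and "not in a gap" is what lets opening a gap cost more than extending one, which is the defining property of an affine penalty.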
|
19
|
Abstract
Functional characterization of a protein sequence is a common goal in biology, and is usually facilitated by having an accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software are also described.
|
20
|
Abstract
Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by an accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software are also described.
|
21
|
Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol 2008; 3:e241. [PMID: 18069886 PMCID: PMC2134964 DOI: 10.1371/journal.pcbi.0030241] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Received: 03/30/2007] [Accepted: 10/19/2007] [Indexed: 11/17/2022] Open
Abstract
When incorporated into a polypeptide chain, proline (Pro) differs from all other naturally occurring amino acid residues in two important respects. The φ dihedral angle of Pro is constrained to values close to −65° and Pro lacks an amide hydrogen. Consequently, mutations which result in introduction of Pro can significantly affect protein stability. In the present work, we describe a procedure to accurately predict the effect of Pro introduction on protein thermodynamic stability. Seventy-seven of the 97 non-Pro amino acid residues in the model protein, CcdB, were individually mutated to Pro, and the in vivo activity of each mutant was characterized. A decision tree to classify the mutation as perturbing or nonperturbing was created by correlating stereochemical properties of mutants to activity data. The stereochemical properties including main chain dihedral angle φ and main chain amide H-bonds (hydrogen bonds) were determined from 3D models of the mutant proteins built using MODELLER. We assessed the performance of the decision tree on a large dataset of 163 single-site Pro mutations of T4 lysozyme, 74 nsSNPs, and 52 other Pro substitutions from the literature. The overall accuracy of this algorithm was found to be 81% in the case of CcdB, 77% in the case of lysozyme, 76% in the case of nsSNPs, and 71% in the case of other Pro substitution data. The accuracy of Pro scanning mutagenesis for secondary structure assignment was also assessed and found to be at best 69%. Our prediction procedure will be useful in annotating uncharacterized nsSNPs of disease-associated proteins and for protein engineering and design.
Unlike the other amino acids that constitute proteins, proline is missing a vital hydrogen atom and also bestows local structural rigidity on the three-dimensional (3D) structure of proteins. In some locations, proline can be introduced with little or no detrimental effect on protein function, while at others it is destabilizing and can result in significant degradation or aggregation of the protein. To determine the features of protein 3D structure that tolerate the introduction of prolines, each of the 101 amino acid residues of the protein CcdB was replaced with proline, and the functional consequences of the mutations were observed. On correlating these data to features of protein 3D structure, a decision tree was generated to predict the functional consequences of proline mutations in proteins of known (or accurately modeled) 3D structure. The performance of the tree was assessed on three different datasets that contained a total of 289 proline mutants in 37 different proteins. The average accuracy of prediction was 75%. The decision tree will be useful in predicting if known but uncharacterized proline mutations in disease-related proteins are likely to have adverse effects. It will also be useful in engineering and designing new proteins and peptides.
|
22
|
Abstract
The DBAli tools use a comprehensive set of structural alignments in the DBAli database to leverage the structural information deposited in the Protein Data Bank (PDB). These tools include (i) the DBAlit program that allows users to input the 3D coordinates of a protein structure for comparison by MAMMOTH against all chains in the PDB; (ii) the AnnoLite and AnnoLyze programs that annotate a target structure based on its stored relationships to other structures; (iii) the ModClus program that clusters structures by sequence and structure similarities; (iv) the ModDom program that identifies domains as recurrent structural fragments and (v) an implementation of the COMPARER method in the SALIGN command in MODELLER that creates a multiple structure alignment for a set of related protein structures. Thus, the DBAli tools, which are freely accessible via the World Wide Web at http://salilab.org/DBAli/, allow users to mine the protein structure space by establishing relationships between protein structures and their functions.
|
23
|
Gene expression profiling of the human maternal-fetal interface reveals dramatic changes between midgestation and term. Endocrinology 2007; 148:1059-79. [PMID: 17170095 DOI: 10.1210/en.2006-0683] [Citation(s) in RCA: 142] [Impact Index Per Article: 8.4] [Indexed: 11/19/2022]
Abstract
Human placentation entails the remarkable integration of fetal and maternal cells into a single functional unit. In the basal plate region (the maternal-fetal interface) of the placenta, fetal cytotrophoblasts from the placenta invade the uterus and remodel the resident vasculature and avoid maternal immune rejection. Knowing the molecular bases for these unique cell-cell interactions is important for understanding how this specialized region functions during normal pregnancy with implications for tumor biology and transplantation immunology. Therefore, we undertook a global analysis of the gene expression profiles at the maternal-fetal interface. Basal plate biopsy specimens were obtained from 36 placentas (14-40 wk) at the conclusion of normal pregnancies. RNA was isolated, processed, and hybridized to HG-U133A&B Affymetrix GeneChips. Surprisingly, there was little change in gene expression during the 14- to 24-wk interval. In contrast, 418 genes were differentially expressed at term (37-40 wk) as compared with midgestation (14-24 wk). Subsequent analyses using quantitative PCR and immunolocalization approaches validated a portion of these results. Many of the differentially expressed genes are known in other contexts to be involved in differentiation, motility, transcription, immunity, angiogenesis, extracellular matrix dissolution, or lipid metabolism. One sixth were nonannotated or encoded hypothetical proteins. Modeling based on structural homology revealed potential functions for 31 of these proteins. These data provide a reference set for understanding the molecular components of the dialogue taking place between maternal and fetal cells in the basal plate as well as for future comparisons of alterations in this region that occur in obstetric complications.
|
24
|
Abstract
MODBASE is a database of annotated comparative protein structure models for all available protein sequences that can be matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on MODELLER for fold assignment, sequence–structure alignment, model building and model assessment. MODBASE is updated regularly to reflect the growth in protein sequence and structure databases, and improvements in the software for calculating the models. MODBASE currently contains 3 094 524 reliable models for domains in 1 094 750 out of 1 817 889 unique protein sequences in the UniProt database (July 5, 2005); only models based on statistically significant alignments and models assessed to have the correct fold despite insignificant alignments are included. MODBASE also allows users to generate comparative models for proteins of interest with the automated modeling server MODWEB. Our other resources integrated with MODBASE include comprehensive databases of multiple protein structure alignments (DBAli), structurally defined ligand binding sites and structurally defined binary domain interfaces (PIBASE) as well as predictions of ligand binding sites, interactions between yeast proteins, and functional consequences of human nsSNPs (LS-SNP).
|
25
|
Abstract
The penalty for inserting gaps into an alignment between two protein sequences is a major determinant of the alignment accuracy. Here, we present an algorithm for finding a globally optimal alignment by dynamic programming that can use a variable gap penalty (VGP) function of any form. We also describe a specific function that depends on the structural context of an insertion or deletion. It penalizes gaps that are introduced within regions of regular secondary structure, buried regions, straight segments and also between two spatially distant residues. The parameters of the penalty function were optimized on a set of 240 sequence pairs of known structure, spanning the sequence identity range of 20-40%. We then tested the algorithm on another set of 238 sequence pairs of known structures. The use of the VGP function increases the number of correctly aligned residues from 81.0 to 84.5% in comparison with the optimized affine gap penalty function; this difference is statistically significant according to Student's t-test. We estimate that the new algorithm allows us to produce comparative models with an additional approximately 7 million accurately modeled residues in the approximately 1.1 million proteins that are detectably related to a known structure.
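The dynamic programming idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm or the MODELLER implementation: it is the textbook general-gap recursion (cubic time), and the names `global_align_score`, `identity_score`, `uniform_gap`, and `context_gap`, as well as the use of a plain sequence position as a stand-in for structural context, are assumptions made for illustration only.

```python
from typing import Callable, Sequence


def global_align_score(
    a: Sequence[str],
    b: Sequence[str],
    match_score: Callable[[str, str], float],
    gap_penalty: Callable[[int, int], float],
) -> float:
    """Best global alignment score with an arbitrary gap penalty function.

    gap_penalty(length, position) returns the cost of a gap of `length`
    residues opened after `position` residues of the gapped sequence.
    """
    n, m = len(a), len(b)
    neg_inf = float("-inf")
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = neg_inf
            if i > 0 and j > 0:
                # Align a[i-1] with b[j-1].
                best = score[i - 1][j - 1] + match_score(a[i - 1], b[j - 1])
            # Gap in b: the last k residues of a[:i] face a gap after b[:j].
            for k in range(1, i + 1):
                best = max(best, score[i - k][j] - gap_penalty(k, j))
            # Gap in a: the last k residues of b[:j] face a gap after a[:i].
            for k in range(1, j + 1):
                best = max(best, score[i][j - k] - gap_penalty(k, i))
            score[i][j] = best
    return score[n][m]


def identity_score(x: str, y: str) -> float:
    """Toy substitution score: +1 for a match, -1 for a mismatch."""
    return 1.0 if x == y else -1.0


def uniform_gap(length: int, position: int) -> float:
    """Affine penalty that ignores context."""
    return 3.0 + 0.5 * (length - 1)


def context_gap(length: int, position: int) -> float:
    """Hypothetical context-dependent penalty: gaps opened after
    position 1 (imagine a loop region) are cheaper to open."""
    opening = 1.0 if position == 1 else 3.0
    return opening + 0.5 * (length - 1)
```

Calling `global_align_score("ACGT", "AGT", identity_score, context_gap)` scores the same pair of sequences differently from the `uniform_gap` call, because the only required gap falls where the hypothetical context makes it cheap.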
|
26
|
Abstract
The accuracy of an alignment between two protein sequences can be improved by including other detectably related sequences in the comparison. We optimize and benchmark such an approach that relies on aligning two multiple sequence alignments, each one including one of the two protein sequences. Thirteen different protocols for creating and comparing profiles corresponding to the multiple sequence alignments are implemented in the SALIGN command of MODELLER. A test set of 200 pairwise, structure-based alignments with sequence identities below 40% is used to benchmark the 13 protocols as well as a number of previously described sequence alignment methods, including heuristic pairwise sequence alignment by BLAST, pairwise sequence alignment by global dynamic programming with an affine gap penalty function by the ALIGN command of MODELLER, sequence-profile alignment by PSI-BLAST, Hidden Markov Model methods implemented in SAM and LOBSTER, pairwise sequence alignment relying on predicted local structure by SEA, and multiple sequence alignment by CLUSTALW and COMPASS. The alignment accuracies of the best new protocols were significantly better than those of the other tested methods. For example, the fraction of the correctly aligned residues relative to the structure-based alignment by the best protocol is 56%, which can be compared with the accuracies of 26%, 42%, 43%, 48%, 50%, 49%, 43%, and 43% for the other methods, respectively. The new method is currently applied to large-scale comparative protein structure modeling of all known sequences.
|
27
|
MODBASE, a database of annotated comparative protein structure models, and associated resources. Nucleic Acids Res 2004; 32:D217-22. [PMID: 14681398 PMCID: PMC308829 DOI: 10.1093/nar/gkh095] [Citation(s) in RCA: 220] [Impact Index Per Article: 11.0] [Indexed: 11/12/2022]
Abstract
MODBASE (http://salilab.org/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on the MODELLER package for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE uses the MySQL relational database management system for flexible querying and CHIMERA for viewing the sequences and structures (http://www.cgl.ucsf.edu/chimera/). MODBASE is updated regularly to reflect the growth in protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different data sets. The largest data set contains 1,26,629 models for domains in 659,495 out of 1,182,126 unique protein sequences in the complete Swiss-Prot/TrEMBL database (August 25, 2003); only models based on alignments with significant similarity scores and models assessed to have the correct fold despite insignificant alignments are included. Another model data set supports target selection and structure-based annotation by the New York Structural Genomics Research Consortium; e.g. the 53 new structures produced by the consortium allowed us to characterize structurally 24,113 sequences. MODBASE also contains binding site predictions for small ligands and a set of predicted interactions between pairs of modeled sequences from the same genome. Our other resources associated with MODBASE include a comprehensive database of multiple protein structure alignments (DBALI, http://salilab.org/dbali) as well as web servers for automated comparative modeling with MODPIPE (MODWEB, http://salilab.org/modweb), modeling of loops in protein structures (MODLOOP, http://salilab.org/modloop) and predicting functional consequences of single nucleotide polymorphisms (SNPWEB, http://salilab.org/snpweb).
|
28
|
Abstract
The following resources for comparative protein structure modeling and analysis are described (http://salilab.org): MODELLER, a program for comparative modeling by satisfaction of spatial restraints; MODWEB, a web server for automated comparative modeling that relies on PSI-BLAST, IMPALA and MODELLER; MODLOOP, a web server for automated loop modeling that relies on MODELLER; MOULDER, a CPU intensive protocol of MODWEB for building comparative models based on distant known structures; MODBASE, a comprehensive database of annotated comparative models for all sequences detectably related to a known structure; MODVIEW, a Netscape plugin for Linux that integrates viewing of multiple sequences and structures; and SNPWEB, a web server for structure-based prediction of the functional impact of a single amino acid substitution.
|
29
|
RasGRP4, a new mast cell-restricted Ras guanine nucleotide-releasing protein with calcium- and diacylglycerol-binding motifs. Identification of defective variants of this signaling protein in asthma, mastocytosis, and mast cell leukemia patients and demonstration of the importance of RasGRP4 in mast cell development and function. J Biol Chem 2002; 277:25756-74. [PMID: 11956218 DOI: 10.1074/jbc.m202575200] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.0] [Indexed: 11/06/2022]
Abstract
A cDNA was isolated from interleukin 3-developed, mouse bone marrow-derived mast cells (MCs) that contained an insert (designated mRasGRP4) that had not been identified in any species at the gene, mRNA, or protein level. By using a homology-based cloning approach, the approximately 2.6-kb hRasGRP4 transcript was also isolated from the mononuclear progenitors residing in the peripheral blood of normal individuals. This transcript information was then used to locate the RasGRP4 gene in the mouse and human genomes, to deduce its exon/intron organization, and then to identify 10 single nucleotide polymorphisms in the human gene that result in 5 amino acid differences. The >15-kb hRasGRP4 gene consists of 18 exons and resides on a region of chromosome 19q13.1 that had not been sequenced by the Human Genome Project. Human and mouse MCs and their progenitors selectively express RasGRP4, and this new intracellular protein contains all of the domains present in the RasGRP family of guanine nucleotide exchange factors even though it is <50% identical to its closest homolog. Recombinant RasGRP4 can activate H-Ras in a cation-dependent manner. Transfection experiments also suggest that RasGRP4 is a diacylglycerol/phorbol ester receptor. Transcript analysis of an asthma patient, a mastocytosis patient, and the HMC-1 cell line derived from a MC leukemia patient revealed the presence of substantial amounts of non-functional forms of hRasGRP4 due to an inability to remove intron 5 in the precursor transcript. Because only abnormal forms of hRasGRP4 were identified in the HMC-1 cell line, this immature MC progenitor was used to address the function of RasGRP4 in MCs. HMC-1 leukemia cells differentiated and underwent granule maturation when induced to express a normal form of RasGRP4. Thus, RasGRP4 plays an important role in the final stages of MC development.
|
30
|
Abstract
The reliability of ranking of protein structure modeling methods is assessed. The assessment is based on the parametric Student's t test and the nonparametric Wilcoxon signed rank test of statistical significance of the difference between paired samples. The approach is applied to the ranking of the comparative modeling methods tested at the fourth meeting on Critical Assessment of Techniques for Protein Structure Prediction (CASP). It is shown that the 14 CASP4 test sequences may not be sufficient to reliably distinguish between the top eight methods, given the model quality differences and their standard deviations. We suggest that CASP needs to be supplemented by an assessment of protein structure prediction methods that is automated, continuous in time, based on several criteria applied to a large number of models, and with quantitative statistical reliability assigned to each characterization.
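The two paired-sample tests named in this abstract can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' evaluation code: the function names and the idea of feeding in per-target quality scores for two methods are assumptions, and only the test statistics are computed (no p-values).

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t_statistic(x: Sequence[float], y: Sequence[float]) -> float:
    """Student's t statistic for paired samples (n - 1 degrees of freedom).

    Raises ZeroDivisionError if all paired differences are identical.
    """
    d = [xi - yi for xi, yi in zip(x, y)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))


def wilcoxon_signed_rank(
    x: Sequence[float], y: Sequence[float]
) -> Tuple[float, float]:
    """Return (W+, W-): rank sums of positive and negative differences.

    Zero differences are dropped; tied absolute differences share the
    average of the ranks they span.
    """
    d = sorted((xi - yi for xi, yi in zip(x, y) if xi != yi), key=abs)
    ranks = [0.0] * len(d)
    i = 0
    while i < len(d):
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < len(d) and abs(d[j + 1]) == abs(d[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    w_minus = sum(r for r, v in zip(ranks, d) if v < 0)
    return w_plus, w_minus
```

With few paired targets, both statistics rest on very few differences, which is the situation the abstract describes for the 14 CASP4 test sequences.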
|
31
|
Human tryptase epsilon (PRSS22), a new member of the chromosome 16p13.3 family of human serine proteases expressed in airway epithelial cells. J Biol Chem 2001; 276:49169-82. [PMID: 11602603 DOI: 10.1074/jbc.m108677200] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.0] [Indexed: 11/06/2022]
Abstract
Probing of the GenBank expressed sequence tag (EST) data base with varied human tryptase cDNAs identified two truncated ESTs that subsequently were found to encode overlapping portions of a novel human serine protease (designated tryptase epsilon or protease, serine S1 family member 22 (PRSS22)). The tryptase epsilon gene resides on chromosome 16p13.3 within a 2.5-Mb complex of serine protease genes. Although at least 7 of the 14 genes in this complex encode enzymatically active proteases, only one tryptase epsilon-like gene was identified. The trachea and esophagus were found to contain the highest steady-state levels of the tryptase epsilon transcript in adult humans. Although the tryptase epsilon transcript was scarce in adult human lung, it was present in abundance in fetal lung. Thus, the tryptase epsilon gene is expressed in the airways in a developmentally regulated manner that is different from that of other human tryptase genes. At the cellular level, tryptase epsilon is a major product of normal pulmonary epithelial cells, as well as varied transformed epithelial cell lines. Enzymatically active tryptase epsilon is also constitutively secreted from these cells. The amino acid sequence of human tryptase epsilon is 38-44% identical to those of human tryptase alpha, tryptase beta I, tryptase beta II, tryptase beta III, transmembrane tryptase/tryptase gamma, marapsin, and Esp-1/testisin. Nevertheless, comparative protein structure modeling and functional studies using recombinant material revealed that tryptase epsilon has a substrate preference distinct from that of its other family members. These data indicate that the products of the chromosome 16p13.3 complex of tryptase genes evolved to carry out varied functions in humans.
|
32
|
Abstract
Evaluation of protein structure prediction methods is difficult and time-consuming. Here, we describe EVA, a web server for assessing protein structure prediction methods, in an automated, continuous and large-scale fashion. Currently, EVA evaluates the performance of a variety of prediction methods available through the internet. Every week, the sequences of the latest experimentally determined protein structures are sent to prediction servers, results are collected, performance is evaluated, and a summary is published on the web. EVA has so far collected data for more than 3000 protein chains. These results may provide valuable insight to both developers and users of prediction methods. Availability: http://cubic.bioc.columbia.edu/eva. Contact: eva@cubic.bioc.columbia.edu
|
33
|
Computer modeling and molecular dynamics simulations of ligand bound complexes of bovine angiogenin: dinucleotide topology at the active site of RNase A family proteins. Proteins 2001; 45:30-9. [PMID: 11536357 DOI: 10.1002/prot.1120] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Abstract
We have undertaken the modeling of substrate-bound structures of angiogenin. In our recent study, we modeled the dinucleotide ligand binding to human angiogenin. In the present study, the substrates CpG, UpG, and CpA were docked onto bovine angiogenin. This was achieved by overcoming the problem of an obstruction to the B1 site by the C-terminus and identifying residues that bind to the second base. The modeled complexes retain biochemically important interactions. The docked models were subjected to 1 ns of molecular dynamics, and structures from the simulation were refined by using simulated annealing. Our models explained the enzyme's specificity for both B1 and B2 bases as observed experimentally. The nature of binding of the dinucleotide substrate was compared with that of the mononucleotide product. The models of these complexes were also compared with those obtained earlier with human angiogenin. On the basis of the simulations and annealed structures, we came up with a consensus topology of dinucleotide ligands that binds to human and bovine angiogenins. This dinucleotide conformation can serve as a starting model for ligand-bound complex structures for RNase A family of proteins. We demonstrated this capability by generating the complex structure of CpA bound to eosinophil-derived neurotoxin (EDN) by fitting the consensus topology of CpA to the crystal structure of native EDN.
|
34
|
Abstract
Invariant water molecules that are of structural or functional importance to proteins are detected from their presence in the same location in different crystal structures of the same protein or closely related proteins. In this study we have investigated the location of invariant water molecules from MD simulations of ribonuclease A, HIV-1 protease, and hen egg white lysozyme. Snapshots of MD trajectories represent the structure of a dynamic protein molecule in a solvated environment as opposed to the static picture provided by crystallography. The MD results are compared to an analysis of crystal structures. A good correlation is observed between the two methods, with more than half the hydration sites identified as invariant from crystal structures also featuring as invariant in the MD simulations, which include most of the functionally or structurally important residues. It is also seen that the propensities of occupying the various hydration sites on a protein for structures obtained from MD and crystallographic studies are different. In general, MD simulations can be used to predict invariant hydration sites when there is a paucity of crystallographic data or to complement crystallographic results.
|
35
|
Abstract
Genomic blot analysis raised the possibility that uncharacterized tryptase genes reside on chromosome 17 at the complex containing the three genes that encode mouse mast cell protease (mMCP) 6, mMCP-7, and transmembrane tryptase (mTMT). Probing of GenBank's expressed sequence tag data base with these three tryptase cDNAs resulted in the identification of an expressed sequence tag that encodes a portion of a novel mouse serine protease (now designated mouse tryptase 4 (mT4) because it is the fourth member of this family). 5'- and 3'-rapid amplification of cDNA ends approaches were carried out to deduce the nucleotide sequence of the full-length mT4 transcript. This information was then used to clone its approximately 5.0-kilobase pair gene. Chromosome mapping analysis of its gene, sequence analysis of its transcript, and comparative protein structure modeling of its translated product revealed that mT4 is a new member of the chromosome 17 family of mouse tryptases. mT4 is 40-44% identical to mMCP-6, mMCP-7, and mTMT, and this new serine protease has all of the structural features of a functional tryptase. Moreover, mT4 is enzymatically active when expressed in insect cells. Due to its 17-mer hydrophobic domain at its C terminus, mT4 is a membrane-anchored tryptase more analogous to mTMT than the other members of its family. As assessed by RNA blot, reverse transcriptase-polymerase chain reaction, and/or in situ hybridization analysis, mT4 is expressed in interleukin-5-dependent mouse eosinophils, as well as in ovaries and testes. The observation that recombinant mT4 is preferentially retained in the endoplasmic reticulum of transiently transfected COS-7 cells suggests a convertase-like role for this integral membrane serine protease.
|
36
|
Short-strong hydrogen bonds and a low barrier transition state for the proton transfer reaction in RNase A catalysis: a quantum chemical study. Biophys Chem 2001; 89:105-17. [PMID: 11254205 DOI: 10.1016/s0301-4622(00)00221-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
There is growing evidence that some enzymes catalyze reactions through the formation of short-strong hydrogen bonds as first suggested by Gerlt and Gassman. Support comes from several experimental and quantum chemical studies that include correlation energies on model systems. In the present study, the process of proton transfer between hydroxyl and imidazole groups, a model of the crucial step in the hydrolysis of RNA by the enzymes of the RNase A family, is investigated at the quantum mechanical level of density functional theory and perturbation theory at the MP2 level. The model focuses on the nature of the formation of a complex between the important residues of the protein and the hydroxyl group of the substrate. We have also investigated different configurations of the ground state that are important in the proton transfer reaction. The nature of bonding between the catalytic unit of the enzyme and the substrate in the model is investigated by Bader's atoms in molecule theory. The contributions of solvation and vibrational energies corresponding to the reactant, the transition state and the product configurations are also evaluated. Furthermore, the effect of protein environment is investigated by considering the catalytic unit surrounded by complete proteins--RNase A and Angiogenin. The results, in general, indicate the formation of a short-strong hydrogen bond and the formation of a low barrier transition state for the proton transfer model of the enzyme.
|
37
|
A molecular dynamics study based post facto free energy analysis of the binding of bovine angiogenin with UMP and CMP ligands. Indian Journal of Biochemistry & Biophysics 2001; 38:27-33. [PMID: 11563327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2023]
Abstract
Angiogenin is a protein belonging to the RNase A superfamily. The RNase activity of this protein is essential for its angiogenic activity. Although members of the RNase A family carry out RNase activity, they differ markedly in their strength and specificity. In this paper, we address the problem of the higher specificity of angiogenin towards cytosine over uracil in the first base binding position. We have carried out extensive nanosecond-level molecular dynamics (MD) computer simulations on the native bovine angiogenin and on the CMP and UMP complexes of this protein in aqueous medium with explicit molecular solvent. The structures thus generated were subjected to a rigorous free energy component analysis to arrive at a plausible molecular thermodynamic explanation for the substrate specificity of angiogenin.
|
38
|
Abstract
Structures of substrate bound human angiogenin complexes have been obtained for the first time by computer modeling. The dinucleotides CpA and UpA have been docked onto human angiogenin using a systematic grid search procedure in torsion and Eulerian angle space. The docking was guided throughout by the similarity of angiogenin-substrate interactions with interactions of RNase A and its substrate. The models were subjected to 1 nanosecond of molecular dynamics to assess their stability. Structures extracted from MD simulations were refined by simulated annealing. Stable hydrogen bonds that bridged protein and ligand residues during the MD simulations were taken as restraints for simulated annealing. Our analysis of the MD structures and annealed models explains the substrate specificity of human angiogenin and is in agreement with experimental results. This study also predicts the B2 binding site residues of angiogenin, for which no experimental information is available so far. In the case of one of the substrates, CpA, we have also identified the presence of a water molecule that invariantly bridges the B2 base with the protein. We have compared our results to the RNase A-substrate complex and highlight the similarities and differences.
|
39
|
Abstract
The shapes of most protein sequences will be modeled based on their similarity to experimentally determined protein structures. The current role, limitations, challenges and prospects for protein structure modeling (using information about genes and genomes) are discussed in the context of structural genomics.
|
40
|
Abstract
Molecular dynamics simulations have been carried out for 1 ns on human and bovine angiogenin systems in an effort to compare and contrast their dynamics. An analysis of their dynamics is done by examining the rms deviations, following hydrogen-bonding interactions and looking at the role of water in and around the protein. The C-terminus of bovine angiogenin moves appreciably during dynamics, suggesting a better structure for ligand binding. However, we do not find any evidence of a conformation where the glutamate residue that obstructs the active site takes on a different conformation. We observe a differential hydrogen-bonding pattern in the active site regions of bovine and human angiogenins, which could have a bearing on the different catalytic activities of the proteins. We also propose that the differential binding of the monoclonal antibody toward the two proteins might be due to sequence rather than conformational differences. Water molecules might play an important functional role in both proteins given their subtle functional differences. A simple computation on the molecular dynamics data has been carried out to identify locations in and around the protein that are invariably occupied by water. Nearly half the waters we have identified from the simulation as being invariant in bovine angiogenin occupy similar locations in the bovine angiogenin crystal structure. The positions of the waters identified in human angiogenin differ considerably from those of bovine angiogenin.
|
41
|
Abstract
Angiogenin belongs to the Ribonuclease superfamily and has a weak enzymatic activity that is crucial for its biological function of stimulating blood vessel growth. Structural studies on ligand bound Angiogenin will go a long way in understanding the mechanism of the protein as well as help in designing drugs against it. In this study we present the first available structure of nucleotide ligand bound Angiogenin obtained by computer modeling. Notwithstanding its importance in itself, this study is a precursor to modeling a full dinucleotide substrate onto Angiogenin. Bovine Angiogenin, the structure of which has been solved at a high resolution, was earlier subjected to Molecular Dynamics simulations for a nanosecond. The MD structures offer better starting points for docking as they offer less obstruction than the crystal structure to ligand binding. The MD structure with the least serious short contacts was modeled to obtain a steric free Angiogenin - 3' mononucleotide complex structure. The structures were energetically minimized and subjected to a brief spell of Molecular Dynamics. The results of the simulation show that all the ligand-Angiogenin interactions and hydrogen bonds are retained, supporting the structure and docking procedure. Further, by following ligand - protein interactions in the case of the ligands 3'-CMP and 3'-UMP, we were able to speculate on how Angiogenin, a predominantly pyrimidine specific ribonuclease, prefers Cytosine to Uracil in the first base position.
|
42
|
Nonlinear enzyme kinetics can lead to high metabolic flux control coefficients: implications for the evolution of dominance. J Theor Biol 1996; 182:299-302. [PMID: 8944161 DOI: 10.1006/jtbi.1996.0167] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.6] [Indexed: 02/03/2023]
Abstract
In a classic study, Kacser & Burns (1981, Genetics 97, 639-666) demonstrated that given certain plausible assumptions, the flux in a metabolic pathway was more or less indifferent to the activity of any of the enzymes in the pathway taken singly. It was inferred from this that the observed dominance of most wild-type alleles with respect to loss-of-function mutations did not require an adaptive, meaning selectionist, explanation. Cornish-Bowden (1987, J. theor. Biol. 125, 333-338) showed that the Kacser-Burns inference was not valid when substrate concentrations were large relative to the relevant Michaelis constants. We find that in a randomly constructed functional pathway, even when substrate levels are small, one can expect high values of the control coefficient for metabolic flux in the presence of significant nonlinearities, as exemplified by enzymes with Hill coefficients ranging from two to six, or by the existence of oscillatory loops. Under these conditions the flux can be quite sensitive to changes in enzyme activity as might be caused by inactivating one of the two alleles in a diploid. Therefore, the phenomenon of dominance cannot be a trivial "default" consequence of physiology but must be intimately linked to the manner in which metabolic networks have been moulded by natural selection.
|