1
|
Huynh AT, Nguyen TTN, Villegas CA, Montemorso S, Strauss B, Pearson RA, Graham JG, Oribello J, Suresh R, Lustig B, Wang N. Prediction and confirmation of a switch-like region within the N-terminal domain of hSIRT1. Biochem Biophys Rep 2022; 30:101275. [PMID: 35592613 PMCID: PMC9112024 DOI: 10.1016/j.bbrep.2022.101275] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Revised: 05/02/2022] [Accepted: 05/04/2022] [Indexed: 11/28/2022] Open
Abstract
Many proteins display conformational changes resulting from allosteric regulation. Often only a few residues are crucial in conveying these structural and functional allosteric changes. These regions that undergo a significant change in structure upon receiving an input signal, such as molecular recognition, are defined as switch-like regions. Identifying these key residues within switch-like regions can help elucidate the mechanism of allosteric regulation and provide guidance for synthetic regulation. In this study, we combine a novel computational workflow with biochemical methods to identify a switch-like region in the N-terminal domain of human SIRT1 (hSIRT1), a lysine deacetylase that plays important roles in regulating cellular pathways. Based on primary sequence, computational methods predicted a region between residues 186-193 in hSIRT1 to exhibit switch-like behavior. Mutations were then introduced in this region and the resulting mutants were tested for allosteric reactions to resveratrol, a known hSIRT1 allosteric regulator. After fine-tuning the mutations based on comparison of known secondary structures, we were able to pinpoint M193 as the residue essential for allosteric regulation, likely by communicating the allosteric signal. Mutation of this residue maintained enzyme activity but abolished allosteric regulation by resveratrol. Our findings suggest a method to predict switch-like regions in allosterically regulated enzymes based on the primary sequence. If further validated, this could be an efficient way to identify key residues in enzymes for therapeutic drug targeting and other applications.
Affiliation(s)
- Angelina T. Huynh
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Thi-Tina N. Nguyen
  - Department of Biological Sciences, San José State University, San José, California, 95192, USA
- Carina A. Villegas
  - Department of Biological Sciences, San José State University, San José, California, 95192, USA
- Saira Montemorso
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Benjamin Strauss
  - Department of Computer Science, San José State University, San José, California, 95192, USA
- Richard A. Pearson
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Jason G. Graham
  - Department of Biomedical, Chemical, and Materials Engineering, San José State University, San José, California, 95192, USA
- Jonathan Oribello
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Rohit Suresh
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Brooke Lustig
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Ningkun Wang
  - Department of Chemistry, San José State University, San José, California, 95192, USA
|
2
|
Sabater B. Entropy Perspectives of Molecular and Evolutionary Biology. Int J Mol Sci 2022; 23:ijms23084098. [PMID: 35456917 PMCID: PMC9029946 DOI: 10.3390/ijms23084098] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/14/2022] [Revised: 04/01/2022] [Accepted: 04/06/2022] [Indexed: 02/01/2023] Open
Abstract
Attempts to find and quantify the supposed low entropy of organisms and its preservation are revised. The absolute entropy of the mixed components of non-living biomass (approximately −1.6 × 10³ J K⁻¹ L⁻¹) is the reference to which other entropy decreases would be ascribed to life. The compartmentation of metabolites and the departure from the equilibrium of metabolic reactions account for reductions in entropy of 1 and 40–50 J K⁻¹ L⁻¹, respectively, and, though small, are distinctive features of living tissues. DNA and proteins do not supply significant decreases in thermodynamic entropy, but their low informational entropy is relevant for life and its evolution. No other living feature contributes significantly to the low entropy associated with life. The photosynthetic conversion of radiant energy to biomass energy accounts for most entropy (2.8 × 10⁵ J K⁻¹ carbon kg⁻¹) produced by living beings. The comparatively very low entropy produced in other processes (approximately 4.8 × 10² J K⁻¹ L⁻¹ day⁻¹ in the human body) must be rapidly exported outside as heat to preserve low entropy decreases due to compartmentation and non-equilibrium metabolism. Enzymes and genes are described, whose control minimizes the rate of production of entropy and could explain selective pressures in biological evolution and the rapid proliferation of cancer cells.
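As a rough consistency check on the entropy production quoted for the human body, the heat that must be exported can be estimated from the relation between heat flow and entropy flow at constant temperature. The body temperature of 310 K used here is an assumption for illustration and is not taken from the abstract:

```latex
\[
\dot{Q} \approx T\,\dot{S}
        = 310\ \mathrm{K} \times 4.8 \times 10^{2}\ \mathrm{J\,K^{-1}\,L^{-1}\,day^{-1}}
        \approx 1.5 \times 10^{5}\ \mathrm{J\,L^{-1}\,day^{-1}}
\]
```

For a body volume of roughly 70 L (also an assumption), this corresponds to about 10 MJ per day, which is of the same order as typical human daily energy expenditure.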
Affiliation(s)
- Bartolomé Sabater
  - Department of Life Sciences, University of Alcalá, 28805 Alcalá de Henares, Madrid, Spain
|
3
|
Shi W, Singha M, Srivastava G, Pu L, Ramanujam J, Brylinski M. Pocket2Drug: An Encoder-Decoder Deep Neural Network for the Target-Based Drug Design. Front Pharmacol 2022; 13:837715. [PMID: 35359869 PMCID: PMC8962739 DOI: 10.3389/fphar.2022.837715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Accepted: 02/10/2022] [Indexed: 11/13/2022] Open
Abstract
Computational modeling is an essential component of modern drug discovery. One of its most important applications is to select promising drug candidates for pharmacologically relevant target proteins. Because of continuing advances in structural biology, putative binding sites for small organic molecules are being discovered in numerous proteins linked to various diseases. These valuable data offer new opportunities to build efficient computational models predicting binding molecules for target sites through the application of data mining and machine learning. In particular, deep neural networks are powerful techniques capable of learning from complex data in order to make informed drug binding predictions. In this communication, we describe Pocket2Drug, a deep graph neural network model to predict binding molecules for a given ligand binding site. This approach first learns the conditional probability distribution of small molecules from a large dataset of pocket structures with supervised training, followed by the sampling of drug candidates from the trained model. Comprehensive benchmarking simulations show that using Pocket2Drug significantly improves the chances of finding molecules binding to target pockets compared to traditional drug selection procedures. Specifically, known binders are generated for as many as 80.5% of targets present in the testing set, which consists of data dissimilar from those used to train the deep graph neural network model. Overall, Pocket2Drug is a promising computational approach to inform the discovery of novel biopharmaceuticals.
Affiliation(s)
- Wentao Shi
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, United States
- Manali Singha
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
- Gopal Srivastava
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
- Limeng Pu
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
- J. Ramanujam
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, United States
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
  - *Correspondence: Michal Brylinski,
|
4
|
Mullick B, Magar R, Jhunjhunwala A, Barati Farimani A. Understanding mutation hotspots for the SARS-CoV-2 spike protein using Shannon Entropy and K-means clustering. Comput Biol Med 2021; 138:104915. [PMID: 34655896 PMCID: PMC8492016 DOI: 10.1016/j.compbiomed.2021.104915] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/27/2021] [Revised: 09/17/2021] [Accepted: 09/29/2021] [Indexed: 12/16/2022]
Abstract
The SARS-CoV-2 virus, like many other viruses, has continually changed, giving rise to new variants through mutations, most commonly substitutions and indels. In some cases these mutations give the virus a survival advantage, making the mutants dangerous. In general, laboratory investigation must be carried out to determine whether new variants have characteristics that make them more lethal and contagious. Complex and time-consuming analyses are therefore required to establish the exact impact of a particular mutation. The time these analyses take makes it difficult to understand the variants of concern and thereby limits the preventive action that can be taken against their rapid spread. In this analysis, we deploy a statistical technique, Shannon entropy, to identify the positions in the SARS-CoV-2 spike protein sequence that are most susceptible to mutation. We then use machine-learning-based clustering techniques to cluster known dangerous mutations by similarity in their properties. This work uses embeddings generated by a language model, ProtBERT, to identify mutations of a similar nature and to pick out regions of interest based on proneness to change. Our entropy-based analysis predicted fifteen hotspot regions, among which we were able to validate ten known variants of interest in six hotspot regions. As the SARS-CoV-2 situation evolves rapidly, we believe that the remaining nine mutational hotspots may contain variants that can emerge in the future. We believe this may help the research community devise therapeutics based on probable new mutation zones in the viral sequence and on resemblances in the properties of various mutations.
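The entropy step described in this abstract can be illustrated with a minimal sketch of per-position Shannon entropy over a set of aligned sequences. This is not the authors' pipeline: the function names, the gap handling, and the four-sequence toy alignment are assumptions made for illustration only.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # H = sum over residues of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(sequences):
    """Per-position entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy(col) for col in zip(*sequences)]


# Hypothetical toy alignment (not real spike protein data).
toy_alignment = ["MKTA", "MKSA", "MRTA", "MKTG"]
entropies = alignment_entropy(toy_alignment)

# Positions with the highest entropy are the most variable in the alignment.
most_variable = max(range(len(entropies)), key=lambda i: entropies[i])
```

A fully conserved column has entropy 0, and entropy grows as more residue types appear with more even frequencies, which is why high-entropy positions are natural candidates for mutation hotspots.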
Affiliation(s)
- Baishali Mullick
  - Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Rishikesh Magar
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Aastha Jhunjhunwala
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Amir Barati Farimani (corresponding author)
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
|
5
|
Prigozhin DM, Krasileva KV. Analysis of intraspecies diversity reveals a subset of highly variable plant immune receptors and predicts their binding sites. The Plant Cell 2021; 33:998-1015. [PMID: 33561286 PMCID: PMC8226289 DOI: 10.1093/plcell/koab013] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/21/2020] [Accepted: 12/28/2020] [Indexed: 05/21/2023]
Abstract
The evolution of recognition specificities by the immune system depends on the generation of receptor diversity and on connecting the binding of new antigens with the initiation of downstream signaling. In plant immunity, the innate Nucleotide-Binding Leucine-Rich Repeat (NLR) receptor family enables antigen binding and immune signaling. In this study, we surveyed the NLR complements of 62 ecotypes of Arabidopsis thaliana and 54 lines of Brachypodium distachyon and identified a limited number of NLR subfamilies that show high allelic diversity. We show that the predicted specificity-determining residues cluster on the surfaces of Leucine-Rich Repeat domains, but the locations of the clusters vary among NLR subfamilies. By comparing NLR phylogeny, allelic diversity, and known functions of the Arabidopsis NLRs, we formulate a hypothesis for the emergence of direct and indirect pathogen-sensing receptors and of the autoimmune NLRs. These findings reveal the recurring patterns of evolution of innate immunity and can inform NLR engineering efforts.
|
6
|
Jia K, Jernigan RL. New amino acid substitution matrix brings sequence alignments into agreement with structure matches. Proteins 2021; 89:671-682. [PMID: 33469973 PMCID: PMC8641535 DOI: 10.1002/prot.26050] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/02/2020] [Revised: 01/08/2021] [Accepted: 01/12/2021] [Indexed: 12/27/2022]
Abstract
Protein sequence matching presently fails to identify many structures that are highly similar, even when they are known to have the same function. The high packing densities in globular proteins lead to interdependent substitutions, which have not previously been considered for amino acid similarities. At present, sequence matching compares sequences based only upon the similarities of single amino acids, ignoring the fact that in densely packed proteins there are additional conservative substitutions representing exchanges between two interacting amino acids, such as a small-large pair changing to a large-small pair, substitutions that are not individually so conservative. Here we show that including information for such pairs of substitutions yields improved sequence matches, and that these yield significant gains in the agreement between sequence alignments and structure matches of the same protein pair. The result shows sequence segments matched where structure segments are aligned. There are gains for all 2002 collected cases where the sequence alignments were not previously congruent with the structure matches. Our results also demonstrate a significant gain in detecting homology for “twilight zone” protein sequences. The amino acid substitution metrics derived have many other potential applications, for annotations, protein design, mutagenesis design, and empirical potential derivation.
Affiliation(s)
- Kejue Jia
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa, USA
- Robert L Jernigan
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa, USA
|
7
|
Echave J. Beyond Stability Constraints: A Biophysical Model of Enzyme Evolution with Selection on Stability and Activity. Mol Biol Evol 2018; 36:613-620. [DOI: 10.1093/molbev/msy244] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Indexed: 12/14/2022] Open
Affiliation(s)
- Julian Echave
  - Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín (UNSAM), Buenos Aires, Argentina
|
8
|
Karthika P, Vadivalagan C, Thirumurugan D, Kumar RR, Murugan K, Canale A, Benelli G. DNA barcoding of five Japanese encephalitis mosquito vectors (Culex fuscocephala, Culex gelidus, Culex tritaeniorhynchus, Culex pseudovishnui and Culex vishnui). Acta Trop 2018; 183:84-91. [PMID: 29625090 DOI: 10.1016/j.actatropica.2018.04.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 02/20/2018] [Revised: 03/17/2018] [Accepted: 04/01/2018] [Indexed: 12/25/2022]
Abstract
Culex mosquitoes can act as vectors of several important diseases, including Japanese encephalitis, West Nile virus, St. Louis encephalitis and equine encephalitis. Besides the neurological sequelae caused in humans, Japanese encephalitis can lead to abortion in sows and encephalitis in horses. Effective vector control and early diagnosis, along with continuous serosurveillance in animals, are crucial to fight this arboviral disease. However, the success of vector control operations is linked with the fast and reliable identification of targeted species, and knowledge about their biology and ecology. Since the DNA barcoding of Culex vectors of Japanese encephalitis is scarcely explored, here we evaluated the efficacy of this tool to identify and analyze the variations among five overlooked Culex vectors of Japanese encephalitis, Culex fuscocephala, Culex gelidus, Culex tritaeniorhynchus, Culex pseudovishnui and Culex vishnui, relying on the analysis of the mitochondrial CO1 gene. Variations in their base pair range were elucidated by the entropy Hx plot. The differences among individual conspecifics and on base pair range across the same were studied. The C (501-750 bp) region showed a moderate variation among all the selected species. C. tritaeniorhynchus exhibited the highest variation in all the ranges. The observed genetic divergence was partially non-discriminatory, i.e., the overall intra- and inter-nucleotide divergence was 0.0920 (0.92%) and 0.125 (1.25%), respectively. However, the 10X rule accurately fits an intraspecies divergence of <3% for the five selected Culex species. The analysis of individual scatter plots showed threshold values (10X) of 0.008 (0.08%), 0.005 (0.05%), 0.123 (1.23%), 0.033 (0.33%) and 0.019 (0.19%) for C. fuscocephala, C. gelidus, C. tritaeniorhynchus, C. pseudovishnui and C. vishnui, respectively. The C. tritaeniorhynchus haplotypes KU497604, KU497603, AB690847 and AB690854 exhibited the highest divergence range, i.e., from 0.465 to 0.546. Comparatively, the intra-divergence among the other haplotypes of C. tritaeniorhynchus ranged from 0 to 0.056. The maximum parsimony tree was formed by distinctive conspecific clusters with appreciable branch values illustrating their close congruence and extensive genetic deviations. Overall, this study adds valuable knowledge to the molecular biology and systematics of five overlooked mosquito species acting as major vectors of Japanese encephalitis in Asian countries.
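The comparison of intraspecific and interspecific divergence that underlies barcoding can be sketched as follows. The abstract does not state which distance model was used, so this example substitutes the simplest one, the uncorrected p-distance (the proportion of differing sites). The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def mean_divergence(sequences_by_species):
    """Return (mean intraspecific, mean interspecific) p-distance."""
    labelled = [(sp, s) for sp, seqs in sequences_by_species.items() for s in seqs]
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(labelled, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(intra), mean(inter)


# Hypothetical 10-bp fragments, not real CO1 barcodes.
toy = {
    "species_A": ["ACGTACGTAC", "ACGTACGTAT"],
    "species_B": ["ACGAACCTAC", "ACGAACCTAC"],
}
intra, inter = mean_divergence(toy)
```

Barcoding discriminates species well when the interspecific value is clearly larger than the intraspecific one; the 10X rule mentioned in the abstract is one threshold of this kind.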
Affiliation(s)
- Pushparaj Karthika
  - Department of Zoology, Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore 641 043, Tamil Nadu, India
- Chithravel Vadivalagan
  - Department of Biotechnology, SRM Institute of Science and Technology, Kattankulathur, Chennai, Tamil Nadu 603203, India
  - Entomology Laboratory, Department of Zoology, Bharathiar University, Coimbatore, 641046, Tamil Nadu, India
- Durairaj Thirumurugan
  - Department of Biotechnology, SRM Institute of Science and Technology, Kattankulathur, Chennai, Tamil Nadu 603203, India
- Rangaswamy Ravi Kumar
  - Centre for Medical Entomology and Vector Control, National Center for Disease Control, M/o Health and Family Welfare, Govt. of India, 22-Shamnath Marg, Delhi, 110054, India
- Kadarkarai Murugan
  - Entomology Laboratory, Department of Zoology, Bharathiar University, Coimbatore, 641046, Tamil Nadu, India
  - Department of Biotechnology, Thiruvalluvar University, Serkkadu, Vellore 632 115, Tamil Nadu, India
- Angelo Canale
  - Department of Agriculture, Food and Environment, University of Pisa, via del Borghetto 80, 56124 Pisa, Italy
- Giovanni Benelli
  - Department of Agriculture, Food and Environment, University of Pisa, via del Borghetto 80, 56124 Pisa, Italy
  - The BioRobotics Institute, Sant'Anna School of Advanced Studies, viale Rinaldo Piaggio 34, 56025 Pontedera, Pisa, Italy
|
9
|
Mishra SK, Sankar K, Jernigan RL. Altered dynamics upon oligomerization corresponds to key functional sites. Proteins 2017; 85:1422-1434. [PMID: 28383162 DOI: 10.1002/prot.25302] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/27/2017] [Accepted: 04/03/2017] [Indexed: 12/18/2022]
Abstract
It is known that over half of the proteins encoded by most organisms function as oligomeric complexes. Oligomerization confers structural stability and dynamics changes in proteins. We investigate the effects of oligomerization on protein dynamics and its functional significance for a set of 145 multimeric proteins. Using coarse-grained elastic network models, we inspect the changes in residue fluctuations upon oligomerization and then compare with residue conservation scores to identify the functional significance of these changes. Our study reveals conservation of about ½ of the fluctuations, with ¼ of the residues increasing in their mobilities and ¼ having reduced fluctuations. The residues with dampened fluctuations are evolutionarily more conserved and can serve as orthosteric binding sites, indicating their importance. We also use triosephosphate isomerase as a test case to understand why certain enzymes function only in their oligomeric forms despite the monomer including all required catalytic residues. To this end, we compare the residue communities (groups of residues which are highly correlated in their fluctuations) in the monomeric and dimeric forms of the enzyme. We observe significant changes to the dynamical community architecture of the catalytic core of this enzyme. This relates to its functional mechanism and is seen only in the oligomeric form of the protein, answering why proteins are oligomeric structures.
Affiliation(s)
- Sambit Kumar Mishra
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa, 50011
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa, 50011
- Kannan Sankar
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa, 50011
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa, 50011
- Robert L Jernigan
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa, 50011
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa, 50011
|
10
|
Jackson EL, Shahmoradi A, Spielman SJ, Jack BR, Wilke CO. Intermediate divergence levels maximize the strength of structure-sequence correlations in enzymes and viral proteins. Protein Sci 2016; 25:1341-53. [PMID: 26971720 PMCID: PMC4918415 DOI: 10.1002/pro.2920] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 11/07/2015] [Accepted: 03/04/2016] [Indexed: 12/16/2022]
Abstract
Structural properties such as solvent accessibility and contact number predict site-specific sequence variability in many proteins. However, the strength and significance of these structure-sequence relationships vary widely among different proteins, with absolute correlation strengths ranging from 0 to 0.8. In particular, two recent works have made contradictory observations. Yeh et al. (Mol. Biol. Evol. 31:135-139, 2014) found that both relative solvent accessibility (RSA) and weighted contact number (WCN) are good predictors of sitewise evolutionary rate in enzymes, with WCN clearly out-performing RSA. Shahmoradi et al. (J. Mol. Evol. 79:130-142, 2014) considered these same predictors (as well as others) in viral proteins and found much weaker correlations and no clear advantage of WCN over RSA. Because these two studies had substantial methodological differences, however, a direct comparison of their results is not possible. Here, we reanalyze the datasets of the two studies with one uniform analysis pipeline, and we find that many apparent discrepancies between the two analyses can be attributed to the extent of sequence divergence in individual alignments. Specifically, the alignments of the enzyme dataset are much more diverged than those of the virus dataset, and proteins with higher divergence exhibit, on average, stronger structure-sequence correlations. However, the highest structure-sequence correlations are observed at intermediate divergence levels, where both highly conserved and highly variable sites are present in the same alignment.
Affiliation(s)
- Eleisha L Jackson
  - Department of Integrative Biology, The University of Texas at Austin, Austin, Texas, 78712
  - Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, Texas, 78712
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas, 78712
- Amir Shahmoradi
  - Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, Texas, 78712
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas, 78712
  - Department of Physics, The University of Texas at Austin, Austin, Texas, 78712
- Stephanie J Spielman
  - Department of Integrative Biology, The University of Texas at Austin, Austin, Texas, 78712
  - Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, Texas, 78712
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas, 78712
- Benjamin R Jack
  - Department of Integrative Biology, The University of Texas at Austin, Austin, Texas, 78712
  - Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, Texas, 78712
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas, 78712
- Claus O Wilke
  - Department of Integrative Biology, The University of Texas at Austin, Austin, Texas, 78712
  - Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, Texas, 78712
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas, 78712
|
11
|
Shahmoradi A, Wilke CO. Dissecting the roles of local packing density and longer-range effects in protein sequence evolution. Proteins 2016; 84:841-54. [PMID: 26990194 PMCID: PMC5292938 DOI: 10.1002/prot.25034] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 08/02/2015] [Revised: 02/01/2016] [Accepted: 02/24/2016] [Indexed: 11/07/2022]
Abstract
What are the structural determinants of protein sequence evolution? A number of site-specific structural characteristics have been proposed, most of which are broadly related to either the density of contacts or the solvent accessibility of individual residues. Most importantly, there has been disagreement in the literature over the relative importance of solvent accessibility and local packing density for explaining site-specific sequence variability in proteins. We show that this discussion has been confounded by the definition of local packing density. The most commonly used measures of local packing, such as contact number and the weighted contact number, represent the combined effects of local packing density and longer-range effects. As an alternative, we propose a truly local measure of packing density around a single residue, based on the Voronoi cell volume. We show that the Voronoi cell volume, when calculated relative to the geometric center of amino-acid side chains, behaves nearly identically to the relative solvent accessibility, and each individually can explain, on average, approximately 34% of the site-specific variation in evolutionary rate in a data set of 209 enzymes. An additional 10% of variation can be explained by nonlocal effects that are captured in the weighted contact number. Consequently, evolutionary variation at a site is determined by the combined effects of the immediate amino-acid neighbors of that site and effects mediated by more distant amino acids. We conclude that instead of contrasting solvent accessibility and local packing density, future research should emphasize the relative importance of immediate contacts and longer-range effects on evolutionary variation.
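To make concrete why the weighted contact number mixes local and longer-range effects, the sketch below computes it in its commonly used form, the sum of inverse squared distances from a residue to every other residue. The function name and the three-point toy coordinates are assumptions for illustration; the abstract does not specify which atoms or centers the authors used for the distances.

```python
import math


def weighted_contact_number(coords):
    """WCN_i = sum over j != i of 1 / r_ij**2, for a list of 3D coordinates."""
    wcn = []
    for i, a in enumerate(coords):
        total = 0.0
        for j, b in enumerate(coords):
            if i != j:
                # Every other residue contributes, however distant,
                # with a weight that decays as the squared distance.
                total += 1.0 / math.dist(a, b) ** 2
        wcn.append(total)
    return wcn


# Hypothetical coordinates: three points on a line, 3.8 Å apart.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
wcn = weighted_contact_number(toy_coords)
```

Because no distance cutoff is applied, distant residues still add to the sum, which is the sense in which the measure is not purely local. In the toy example the middle point has the largest value because it is close to both neighbors.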
Affiliation(s)
- Amir Shahmoradi
  - Department of Physics, The University of Texas at Austin
  - Center for Computational Biology and Bioinformatics, The University of Texas at Austin
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin
- Claus O. Wilke
  - Center for Computational Biology and Bioinformatics, The University of Texas at Austin
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin
  - Department of Integrative Biology, The University of Texas at Austin
|
12
|
Echave J, Spielman SJ, Wilke CO. Causes of evolutionary rate variation among protein sites. Nat Rev Genet 2016; 17:109-21. [PMID: 26781812 DOI: 10.1038/nrg.2015.18] [Citation(s) in RCA: 176] [Impact Index Per Article: 22.0] [Indexed: 12/13/2022]
Abstract
It has long been recognized that certain sites within a protein, such as sites in the protein core or catalytic residues in enzymes, are evolutionarily more conserved than other sites. However, our understanding of rate variation among sites remains surprisingly limited. Recent progress to address this includes the development of a wide array of reliable methods to estimate site-specific substitution rates from sequence alignments. In addition, several molecular traits have been identified that correlate with site-specific mutation rates, and novel mechanistic biophysical models have been proposed to explain the observed correlations. Nonetheless, current models explain, at best, approximately 60% of the observed variance, highlighting the limitations of current methods and models and the need for new research directions.
Affiliation(s)
- Julian Echave
  - Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, 1650 San Martín, Buenos Aires, Argentina
- Stephanie J Spielman
  - Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas 78712, USA
- Claus O Wilke
  - Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas 78712, USA
|
13
|
Sammond DW, Kastelowitz N, Himmel ME, Yin H, Crowley MF, Bomble YJ. Comparing Residue Clusters from Thermophilic and Mesophilic Enzymes Reveals Adaptive Mechanisms. PLoS One 2016; 11:e0145848. [PMID: 26741367 PMCID: PMC4704809 DOI: 10.1371/journal.pone.0145848] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 09/02/2015] [Accepted: 12/09/2015] [Indexed: 11/18/2022] Open
Abstract
Understanding how proteins adapt to function at high temperatures is important for deciphering the energetics that dictate protein stability and folding. While multiple principles important for thermostability have been identified, we lack a unified understanding of how internal protein structural and chemical environment determine qualitative or quantitative impact of evolutionary mutations. In this work we compare equivalent clusters of spatially neighboring residues between paired thermophilic and mesophilic homologues to evaluate adaptations under the selective pressure of high temperature. We find the residue clusters in thermophilic enzymes generally display improved atomic packing compared to mesophilic enzymes, in agreement with previous research. Unlike residue clusters from mesophilic enzymes, however, thermophilic residue clusters do not have significant cavities. In addition, anchor residues found in many clusters are highly conserved with respect to atomic packing between both thermophilic and mesophilic enzymes. Thus the improvements in atomic packing observed in thermophilic homologues are not derived from these anchor residues but from neighboring positions, which may serve to expand optimized protein core regions.
Affiliation(s)
- Deanne W Sammond
- Biosciences Center, National Renewable Energy Laboratory, Golden, Colorado, 80401, United States of America
| | - Noah Kastelowitz
- Department of Chemistry & Biochemistry and the BioFrontiers Institute, University of Colorado, Boulder, Colorado, 80309, United States of America
| | - Michael E Himmel
- Biosciences Center, National Renewable Energy Laboratory, Golden, Colorado, 80401, United States of America
| | - Hang Yin
- Department of Chemistry & Biochemistry and the BioFrontiers Institute, University of Colorado, Boulder, Colorado, 80309, United States of America
| | - Michael F Crowley
- Biosciences Center, National Renewable Energy Laboratory, Golden, Colorado, 80401, United States of America
| | - Yannick J Bomble
- Biosciences Center, National Renewable Energy Laboratory, Golden, Colorado, 80401, United States of America
|
14
|
Nepal R, Spencer J, Bhogal G, Nedunuri A, Poelman T, Kamath T, Chung E, Kantardjieff K, Gottlieb A, Lustig B. Logistic regression models to predict solvent accessible residues using sequence- and homology-based qualitative and quantitative descriptors applied to a domain-complete X-ray structure learning set. J Appl Crystallogr 2015; 48:1976-1984. [PMID: 26664348 PMCID: PMC4665666 DOI: 10.1107/s1600576715018531] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2014] [Accepted: 10/03/2015] [Indexed: 11/11/2022] Open
Abstract
A working example of relative solvent accessibility (RSA) prediction for proteins is presented. Novel logistic regression models with various qualitative descriptors that include amino acid type and quantitative descriptors that include 20- and six-term sequence entropy have been built and validated. A domain-complete learning set of over 1300 proteins is used to fit initial models with various sequence homology descriptors as well as query residue qualitative descriptors. Homology descriptors are derived from BLASTp sequence alignments, whereas the RSA values are determined directly from the crystal structure. The logistic regression models are fitted using dichotomous responses indicating buried or accessible solvent, with binary classifications obtained from the RSA values. The fitted models determine binary predictions of residue solvent accessibility with accuracies comparable to other less computationally intensive methods using the standard RSA threshold criteria 20 and 25% as solvent accessible. When an additional non-homology descriptor describing Lobanov-Galzitskaya residue disorder propensity is included, incremental improvements in accuracy are achieved with 25% threshold accuracies of 76.12 and 74.79% for the Manesh-215 and CASP(8+9) test sets, respectively. Moreover, the described software and the accompanying learning and validation sets allow students and researchers to explore the utility of RSA prediction with simple, physically intuitive models in any number of related applications.
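The two kinds of descriptor named in this abstract, per-position sequence entropy and a thresholded RSA class label, can be illustrated with a minimal sketch. This is not the authors' software: the six-class residue grouping, the function names, and the choice of `>=` at the threshold are assumptions made for illustration only, and the logistic regression fit itself is not shown.

```python
import math
from collections import Counter

# Hypothetical six-class grouping of the 20 amino acids. The abstract does not
# state which grouping underlies the six-term entropy, so this is an assumption.
SIX_GROUPS = {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "positive": "KR",
    "negative": "DE",
    "special": "GP",
}
RESIDUE_TO_GROUP = {aa: name for name, members in SIX_GROUPS.items() for aa in members}


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the symbol frequencies in an iterable."""
    counts = Counter(symbols)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return entropy + 0.0  # normalizes -0.0 to 0.0 for fully conserved columns


def column_entropies(column):
    """Return (20-term, six-term) entropy for one alignment column.

    Gaps and non-standard symbols are skipped (an assumption).
    """
    residues = [aa for aa in column.upper() if aa in RESIDUE_TO_GROUP]
    if not residues:
        raise ValueError("column contains no standard residues")
    e20 = shannon_entropy(residues)
    e6 = shannon_entropy(RESIDUE_TO_GROUP[aa] for aa in residues)
    return e20, e6


def is_accessible(rsa_percent, threshold=25.0):
    """Binary class label: accessible if RSA (in %) reaches the threshold.

    The abstract mentions 20% and 25% thresholds; using >= is an assumption.
    """
    return rsa_percent >= threshold
```

A column such as `"AV-D"` gives a higher 20-term than six-term entropy, because A and V fall into the same class under the assumed grouping.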
Affiliation(s)
- Reecha Nepal
- Department of Chemistry, San Jose State University, San Jose, CA 95192-0101, USA
| | - Joanna Spencer
- Department of Mathematics and Statistics, San Jose State University, San Jose, CA 95192-0101, USA
| | - Guneet Bhogal
- Department of Biomedical, Chemical and Materials Engineering, San Jose State University, San Jose, CA 95192-0101, USA
| | - Amulya Nedunuri
- Department of General Engineering, San Jose State University, San Jose, CA 95192-0101, USA
| | - Thomas Poelman
- Department of Chemistry and Biochemistry, Cal Poly San Luis Obispo, San Luis Obispo, CA 93407, USA
| | - Thejas Kamath
- Department of Bioengineering, University of California, San Diego, San Diego, CA 92093-0412, USA
| | - Edwin Chung
- Department of Biomedical, Chemical and Materials Engineering, San Jose State University, San Jose, CA 95192-0101, USA
| | - Katherine Kantardjieff
- College of Science and Mathematics, California State University San Marcos, San Marcos, CA 92096-0001, USA
| | - Andrea Gottlieb
- Department of Mathematics and Statistics, San Jose State University, San Jose, CA 95192-0101, USA
| | - Brooke Lustig
- Department of Chemistry, San Jose State University, San Jose, CA 95192-0101, USA
|
15
|
Martins F, Gonçalves R, Oliveira J, Cruz-Monteagudo M, Nieto-Villar JM, Paz-y-Miño C, Rebelo I, Tejera E. Unravelling the relationship between protein sequence and low-complexity regions entropies: Interactome implications. J Theor Biol 2015; 382:320-7. [PMID: 26164061 DOI: 10.1016/j.jtbi.2015.06.049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2015] [Revised: 06/12/2015] [Accepted: 06/28/2015] [Indexed: 10/23/2022]
Abstract
Low-complexity regions are sub-sequences of biased composition in a protein sequence. The influence of these regions on protein evolution, specific functions and highly interactive capacities is well known. Although protein sequence entropy has been widely studied, its relationship with low-complexity regions and the subsequent effects on protein function remain unclear. In this work we propose a theoretical and empirical model integrating the sequence entropy with local complexity parameters. Our results indicate that the protein sequence entropy is related to the protein length, the entropies inside and outside the low-complexity regions, and the number and average size of those regions. We found a small but significant increase in the sequence entropy of hub proteins. In agreement with our theoretical model, this increase is highly dependent on the balance between the increase in protein length and the average size of the low-complexity regions. Finally, our models and protein analysis provide evidence that changes in the average size of low-complexity regions are more relevant in hub proteins than changes in their number.
Affiliation(s)
- F Martins
- Department of Biochemistry, Faculty of Pharmacy, University of Porto, Portugal
| | - R Gonçalves
- Department of Biochemistry, Faculty of Pharmacy, University of Porto, Portugal
| | - J Oliveira
- Department of Biochemistry, Faculty of Pharmacy, University of Porto, Portugal
| | - M Cruz-Monteagudo
- Instituto de Investigaciones Biomédicas, Universidad de las Américas, Quito, Ecuador
| | - J M Nieto-Villar
- Dpto. de Química-Física, Fac. de Química, Universidad de La Habana, Cuba. Cátedra de Sistemas Complejos "H. Poincaré", Universidad de La Habana, Cuba
| | - C Paz-y-Miño
- Instituto de Investigaciones Biomédicas, Universidad de las Américas, Quito, Ecuador
| | - I Rebelo
- Department of Biochemistry, Faculty of Pharmacy, University of Porto, Portugal; UCIBIO@REQUIMTE, Portugal.
| | - E Tejera
- Instituto de Investigaciones Biomédicas, Universidad de las Américas, Quito, Ecuador
|
16
|
Huang TT, Hwang JK, Chen CH, Chu CS, Lee CW, Chen CC. (PS)2: protein structure prediction server version 3.0. Nucleic Acids Res 2015; 43:W338-42. [PMID: 25943546 PMCID: PMC4489310 DOI: 10.1093/nar/gkv454] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2015] [Accepted: 04/24/2015] [Indexed: 11/16/2022] Open
Abstract
Protein complexes are involved in many biological processes. Examining coupling between subunits of a complex would be useful to understand the molecular basis of protein function. Here, our updated (PS)2 web server predicts the three-dimensional structures of protein complexes based on comparative modeling; furthermore, this server examines the coupling between subunits of the predicted complex by combining structural and evolutionary considerations. The predicted complex structure could be indicated and visualized by Java-based 3D graphics viewers and the structural and evolutionary profiles are shown and compared chain-by-chain. For each subunit, considerations with or without the packing contribution of other subunits cause the differences in similarities between structural and evolutionary profiles, and these differences imply which form, complex or monomeric, is preferred in the biological condition for the subunit. We believe that the (PS)2 server would be a useful tool for biologists who are interested not only in the structures of protein complexes but also in the coupling between subunits of the complexes. The (PS)2 is freely available at http://ps2v3.life.nctu.edu.tw/.
Affiliation(s)
- Tsun-Tsao Huang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 30068, Taiwan; Center for Bioinformatics Research, National Chiao Tung University, Hsinchu 30068, Taiwan
| | - Jenn-Kang Hwang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 30068, Taiwan; Center for Bioinformatics Research, National Chiao Tung University, Hsinchu 30068, Taiwan; Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 30068, Taiwan; Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
| | - Chu-Huang Chen
- Department of Vascular and Medicinal Research, Texas Heart Institute, Houston, TX 77030, USA; Center for Lipid Biosciences, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80708, Taiwan; L5 Research Center, China Medical University Hospital, China Medical University, Taichung 40402, Taiwan; Section of Cardiovascular Research, Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
| | - Chih-Sheng Chu
- Center for Lipid Biosciences, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80708, Taiwan; Department of Internal Medicine, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
| | - Chi-Wen Lee
- Center for Bioinformatics Research, National Chiao Tung University, Hsinchu 30068, Taiwan; Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 30068, Taiwan
| | - Chih-Chieh Chen
- Center for Lipid Biosciences, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80708, Taiwan; Center for Lipid and Glycomedicine Research, Kaohsiung Medical University, Kaohsiung 80708, Taiwan; Institute of Medical Science and Technology, National Sun Yat-sen University, Kaohsiung 80424, Taiwan
|
17
|
Echave J, Jackson EL, Wilke CO. Relationship between protein thermodynamic constraints and variation of evolutionary rates among sites. Phys Biol 2015; 12:025002. [PMID: 25787027 PMCID: PMC4391963 DOI: 10.1088/1478-3975/12/2/025002] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023]
Abstract
Evolutionary-rate variation among sites within proteins depends on functional and biophysical properties that constrain protein evolution. It is generally accepted that proteins must be able to fold stably in order to function. However, the relationship between stability constraints and among-sites rate variation is not well understood. Here, we present a biophysical model that links the thermodynamic stability changes due to mutations at sites in proteins (ΔΔG) to the rate at which mutations accumulate at those sites over evolutionary time. We find that such a 'stability model' generally performs well, displaying correlations between predicted and empirically observed rates of up to 0.75 for some proteins. We further find that our model has predictive power comparable to that of an alternative, recently proposed 'stress model' that explains evolutionary-rate variation among sites in terms of the excess energy needed for mutants to adopt the correct active structure (ΔΔG*). The two models make distinct predictions, though, and for some proteins the stability model outperforms the stress model and vice versa. We conclude that both stability and stress constrain site-specific sequence evolution in proteins.
|
18
|
Shahmoradi A, Sydykova DK, Spielman SJ, Jackson EL, Dawson ET, Meyer AG, Wilke CO. Predicting evolutionary site variability from structure in viral proteins: buriedness, packing, flexibility, and design. J Mol Evol 2014; 79:130-42. [PMID: 25217382 PMCID: PMC4216736 DOI: 10.1007/s00239-014-9644-x] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2014] [Accepted: 08/31/2014] [Indexed: 12/27/2022]
Abstract
Several recent works have shown that protein structure can predict site-specific evolutionary sequence variation. In particular, sites that are buried and/or have many contacts with other sites in a structure have been shown to evolve more slowly, on average, than surface sites with few contacts. Here, we present a comprehensive study of the extent to which numerous structural properties can predict sequence variation. The quantities we considered include buriedness (as measured by relative solvent accessibility), packing density (as measured by contact number), structural flexibility (as measured by B factors, root-mean-square fluctuations, and variation in dihedral angles), and variability in designed structures. We obtained structural flexibility measures both from molecular dynamics simulations performed on nine non-homologous viral protein structures and from variation in homologous variants of those proteins, where they were available. We obtained measures of variability in designed structures from flexible-backbone design in the Rosetta software. We found that most of the structural properties correlate with site variation in the majority of structures, though the correlations are generally weak (correlation coefficients of 0.1-0.4). Moreover, we found that buriedness and packing density were better predictors of evolutionary variation than structural flexibility. Finally, variability in designed structures was a weaker predictor of evolutionary variability than buriedness or packing density, but it was comparable in its predictive power to the best structural flexibility measures. We conclude that simple measures of buriedness and packing density are better predictors of evolutionary variation than the more complicated predictors obtained from dynamic simulations, ensembles of homologous structures, or computational protein design.
Affiliation(s)
- Amir Shahmoradi
- Department of Physics, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, TX 78712, USA
| | - Dariya K. Sydykova
- Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, TX 78712, USA
| | - Stephanie J. Spielman
- Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, TX 78712, USA
| | - Eleisha L. Jackson
- Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, TX 78712, USA
| | - Eric T. Dawson
- Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, TX 78712, USA
| | - Austin G. Meyer
- Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, TX 78712, USA
| | - Claus O. Wilke
- Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, TX 78712, USA
|
19
|
Local packing density is the main structural determinant of the rate of protein sequence evolution at site level. Biomed Res Int 2014; 2014:572409. [PMID: 25121105 PMCID: PMC4119917 DOI: 10.1155/2014/572409] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/28/2014] [Revised: 06/06/2014] [Accepted: 06/09/2014] [Indexed: 01/02/2023]
Abstract
Functional and biophysical constraints result in site-dependent patterns of protein sequence variability. It is commonly assumed that the key structural determinant of site-specific rates of evolution is the Relative Solvent Accessibility (RSA). However, a recent study found that amino acid substitution rates correlate better with two Local Packing Density (LPD) measures, the Weighted Contact Number (WCN) and the Contact Number (CN), than with RSA. This work aims at a more thorough assessment. To this end, in addition to substitution rates, we considered four other sequence variability scores, four measures of solvent accessibility (SA), and other CN measures. We compared all properties for each protein of a structurally and functionally diverse representative dataset of monomeric enzymes. We show that the best sequence variability measures take into account phylogenetic tree topology. More importantly, we show that both LPD measures (WCN and CN) correlate better than all of the SA measures, regardless of the sequence variability score used. Moreover, the independent contribution of the best LPD measure is approximately four times larger than that of the best SA measure. This study strongly supports the conclusion that a site's packing density rather than its solvent accessibility is the main structural determinant of its rate of evolution.
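The two local packing density measures compared in this abstract can be sketched as follows, using their commonly stated definitions: CN counts the other residues within a cutoff radius, and WCN sums the inverse squared distances to all other residues. The choice of one representative coordinate per residue, the 13 Å default cutoff, and the function names are assumptions for illustration, not the settings used in the study.

```python
import math


def contact_number(coords, cutoff=13.0):
    """CN_i: count of other residues within `cutoff` (in Å) of residue i.

    `coords` holds one (x, y, z) tuple per residue, e.g. a C-alpha position.
    The default cutoff is an assumed, illustrative value.
    """
    return [
        sum(1 for j, b in enumerate(coords) if j != i and math.dist(a, b) <= cutoff)
        for i, a in enumerate(coords)
    ]


def weighted_contact_number(coords):
    """WCN_i: sum over all other residues j of 1 / r_ij**2."""
    return [
        sum(1.0 / math.dist(a, b) ** 2 for j, b in enumerate(coords) if j != i)
        for i, a in enumerate(coords)
    ]


# Toy example: three points on a line, 1 Å apart (not a real structure).
toy = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
cn = contact_number(toy, cutoff=1.5)
wcn = weighted_contact_number(toy)
```

In the toy example the middle point has the highest value of both measures, which matches the intuition that densely packed sites score higher.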
|
20
|
Matsuoka M, Kikuchi T. Sequence analysis on the information of folding initiation segments in ferredoxin-like fold proteins. BMC Struct Biol 2014; 14:15. [PMID: 24884463 PMCID: PMC4055915 DOI: 10.1186/1472-6807-14-15] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/23/2013] [Accepted: 05/15/2014] [Indexed: 02/06/2023]
Abstract
BACKGROUND: While some studies have shown that 3D protein structures are more conserved than their amino acid sequences, other experimental studies have shown that even if two proteins share the same topology, they may have different folding pathways. Many studies investigate this issue with molecular dynamics or Go-like model simulations; however, one should be able to obtain the same information by analyzing the proteins' amino acid sequences, if the sequences contain all the information about the 3D structures. In this study, we use information about protein sequences to predict the location of their folding segments. We focus on proteins with a ferredoxin-like fold, which has a characteristic topology. Some of these proteins have different folding segments. RESULTS: Despite the simplicity of our methods, we are able to correctly determine the experimentally identified folding segments by predicting the location of the compact regions considered to play an important role in structural formation. We also apply our sequence analyses to some homologues of each protein and confirm that there are highly conserved folding segments despite the homologues' sequence diversity. These homologues have similar folding segments even though the homology of two proteins' sequences is not so high. CONCLUSION: Our analyses have proven useful for investigating the common or different folding features of the proteins studied.
Affiliation(s)
- Takeshi Kikuchi
- Department of Bioinformatics, College of Life Sciences, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu, Shiga 525-8577, Japan.
|
21
|
Jiang J, Lai Z, Wang J, Mukamel S. Signatures of the Protein Folding Pathway in Two-Dimensional Ultraviolet Spectroscopy. J Phys Chem Lett 2014; 5:1341-1346. [PMID: 24803996 PMCID: PMC3999791 DOI: 10.1021/jz5002264] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2014] [Accepted: 03/19/2014] [Indexed: 05/24/2023]
Abstract
The function of proteins relies on their folding to assume the proper structure. Probing the structural variations during the folding process is crucial for understanding the underlying mechanism. We present a combined quantum mechanics/molecular dynamics simulation study that demonstrates how coherent resonant nonlinear ultraviolet spectra can be used to follow the fast folding dynamics of a mini-protein, Trp-cage. Two-dimensional ultraviolet signals of the backbone transitions carry rich information on both local (secondary) and global (tertiary) structures. The complexity of the signals decreases as the conformational entropy decreases in the course of the folding process. We show that the approximate entropy of the signals provides a quantitative marker of protein folding status, accessible by both theoretical calculations and experiments.
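The "approximate entropy" used here as a complexity marker can be illustrated with a minimal sketch of the standard ApEn(m, r) statistic for a one-dimensional series. How the authors apply it to two-dimensional spectra is not described in the abstract, so the 1D input, the default parameters `m=2` and `r=0.2`, and the function name are assumptions for illustration only.

```python
import math


def approximate_entropy(series, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) of a 1D sequence of numbers.

    Templates of length m and m + 1 are compared with the Chebyshev distance;
    self-matches are counted, as in the standard definition.
    """
    n = len(series)
    if n < m + 1:
        raise ValueError("series is too short for the chosen m")

    def phi(dim):
        templates = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for a in templates:
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)


# A strictly periodic series versus a chaotic logistic-map series.
periodic = [0.0, 1.0] * 15
chaotic = [0.3]
for _ in range(99):
    chaotic.append(4.0 * chaotic[-1] * (1.0 - chaotic[-1]))
```

A regular signal such as `periodic` yields an ApEn close to zero, while an irregular one such as `chaotic` yields a larger value, which is the sense in which lower ApEn indicates lower signal complexity.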
Affiliation(s)
- Jun Jiang
- Department of Chemical Physics, University of Science and Technology of China, No. 96, JinZhai Road Baohe District, Hefei 230026, China
- Chemistry Department, University of California Irvine, 433A Rowland Hall, Irvine, California 92697, United States
| | - Zaizhi Lai
- Department of Chemistry and Physics, University of New York at Stony Brook, Stony Brook, New York 11794, United States
| | - Jin Wang
- Department of Chemistry and Physics, University of New York at Stony Brook, Stony Brook, New York 11794, United States
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, No. 5625, Ren Min Street, Changchun, Jilin 130021, China
| | - Shaul Mukamel
- Chemistry Department, University of California Irvine, 433A Rowland Hall, Irvine, California 92697, United States
|
22
|
Huang TT, del Valle Marcos ML, Hwang JK, Echave J. A mechanistic stress model of protein evolution accounts for site-specific evolutionary rates and their relationship with packing density and flexibility. BMC Evol Biol 2014; 14:78. [PMID: 24716445 PMCID: PMC4101840 DOI: 10.1186/1471-2148-14-78] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2014] [Accepted: 03/21/2014] [Indexed: 12/29/2022] Open
Abstract
Background: Protein sites evolve at different rates due to functional and biophysical constraints. It is usually considered that the main structural determinant of a site’s rate of evolution is its Relative Solvent Accessibility (RSA). However, a recent comparative study has shown that the main structural determinant is the site’s Local Packing Density (LPD). LPD is related with dynamical flexibility, which has also been shown to correlate with sequence variability. Our purpose is to investigate the mechanism that connects a site’s LPD with its rate of evolution. Results: We consider two models: an empirical Flexibility Model and a mechanistic Stress Model. The Flexibility Model postulates a linear increase of site-specific rate of evolution with dynamical flexibility. The Stress Model, introduced here, models mutations as random perturbations of the protein’s potential energy landscape, for which we use simple Elastic Network Models (ENMs). To account for natural selection we assume a single active conformation and use basic statistical physics to derive a linear relationship between site-specific evolutionary rates and the local stress of the mutant’s active conformation. We compare both models on a large and diverse dataset of enzymes. In a protein-by-protein study we found that the Stress Model outperforms the Flexibility Model for most proteins. Pooling all proteins together we show that the Stress Model is strongly supported by the total weight of evidence. Moreover, it accounts for the observed nonlinear dependence of sequence variability on flexibility. Finally, when mutational stress is controlled for, there is very little remaining correlation between sequence variability and dynamical flexibility. Conclusions: We developed a mechanistic Stress Model of evolution according to which the rate of evolution of a site is predicted to depend linearly on the local mutational stress of the active conformation. Such local stress is proportional to LPD, so that this model explains the relationship between LPD and evolutionary rate. Moreover, the model also accounts for the nonlinear dependence between evolutionary rate and dynamical flexibility.
Affiliation(s)
- Julian Echave
- Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, Martín de Irigoyen 3100, 1650 San Martín, Buenos Aires Argentina.
|
23
|
Jiang J, Golchert KJ, Kingsley CN, Brubaker WD, Martin RW, Mukamel S. Exploring the aggregation propensity of γS-crystallin protein variants using two-dimensional spectroscopic tools. J Phys Chem B 2013; 117:14294-301. [PMID: 24219230 DOI: 10.1021/jp408000k] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
The formation of amyloid fibrils is associated with many serious diseases as well as diverse biological functions. Despite the importance of these aggregates, predicting the aggregation propensity of a particular sequence is a major challenge. We report a joint 2D nuclear magnetic resonance (NMR) and ultraviolet (2DUV) study of fibrillization in the wild-type and two aggregation-prone mutants of the eye lens protein γS-crystallin. Simulations show that the complexity of 2DUV signals as measured by their "approximate entropy" is a good indicator for the conformational entropy and in turn is strongly correlated with its aggregation propensity. These findings are in agreement with high-resolution NMR experiments and are corroborated for amyloid fibrils. The 2DUV technique is complementary to high-resolution structural methods and has the potential to make the evaluation of the aggregation propensity of protein variants more accessible to both theory and experiment. The approximate entropy of experimental 2DUV signals can be used for fast screening, enabling identification of variants with high fibrillization propensity for the much more time-consuming NMR structural studies, potentially expediting the characterization of protein variants associated with cataract and other protein aggregation diseases.
Affiliation(s)
- Jun Jiang
- Department of Chemical Physics, University of Science and Technology of China, Hefei, China
|
24
|
Yeh SW, Liu JW, Yu SH, Shih CH, Hwang JK, Echave J. Site-specific structural constraints on protein sequence evolutionary divergence: local packing density versus solvent exposure. Mol Biol Evol 2013; 31:135-9. [PMID: 24109601 DOI: 10.1093/molbev/mst178] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
Protein sequences evolve under selection pressures imposed by functional and biophysical requirements, resulting in site-dependent rates of amino acid substitution. Relative solvent accessibility (RSA) and local packing density (LPD) have emerged as the best candidates to quantify structural constraint. Recent research assumes that RSA is the main determinant of sequence divergence. However, it is not yet clear which is the best predictor of substitution rates. To address this issue, we compared RSA and LPD with site-specific rates of evolution for a diverse data set of enzymes. In contrast with recent studies, we found that LPD measures correlate better than RSA with evolutionary rate. Moreover, the independent contribution of RSA is minor. Taking into account that LPD is related to backbone flexibility, we put forward the possibility that the rate of evolution of a site is determined by the ease with which the backbone deforms to accommodate mutations.
Affiliation(s)
- So-Wei Yeh
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, HsinChu, Taiwan, ROC
|
25
|
Chang CM, Huang YW, Shih CH, Hwang JK. On the relationship between the sequence conservation and the packing density profiles of the protein complexes. Proteins 2013; 81:1192-9. [DOI: 10.1002/prot.24268] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2012] [Revised: 01/29/2013] [Accepted: 02/01/2013] [Indexed: 11/12/2022]
Affiliation(s)
- Chih-Min Chang
- Institute of Bioinformatics and Systems Biology; National Chiao Tung University; HsinChu 30050; Taiwan; Republic of China
| | - Yu-Wen Huang
- Institute of Bioinformatics and Systems Biology; National Chiao Tung University; HsinChu 30050; Taiwan; Republic of China
| | - Chien-Hua Shih
- Institute of Bioinformatics and Systems Biology; National Chiao Tung University; HsinChu 30050; Taiwan; Republic of China
| | - Jenn-Kang Hwang
- Institute of Bioinformatics and Systems Biology; National Chiao Tung University; HsinChu 30050; Taiwan; Republic of China
|
26
|
Gupta SK, Rai AK, Kanwar SS, Sharma TR. Comparative analysis of zinc finger proteins involved in plant disease resistance. PLoS One 2012; 7:e42578. [PMID: 22916136 PMCID: PMC3419713 DOI: 10.1371/journal.pone.0042578] [Citation(s) in RCA: 112] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2012] [Accepted: 07/10/2012] [Indexed: 11/19/2022] Open
Abstract
A meta-analysis was performed to understand the role of zinc finger domains in proteins of resistance (R) genes cloned from different crops. We analyzed protein sequences of seventy R genes of various crops, of which twenty-six proteins were found to have zinc finger domains along with nucleotide-binding site-leucine-rich repeat (NBS-LRR) domains. We identified thirty-four zinc finger domains in the R proteins of nine crops, which were grouped into 19 types of zinc fingers. The size of individual zinc finger domains within the R genes varied from 11 to 84 amino acids, whereas the size of proteins containing these domains varied from 263 to 1305 amino acids. The biophysical analysis revealed that the molecular weight of the Pi54 zinc finger was the lowest, whereas the highest was found in the rice Pib zinc finger named Transposes Transcription Factor (TTF). The instability (R² = 0.95) and aliphatic (R² = 0.94) index profiles of the zinc finger domains follow a polynomial distribution pattern. The pairwise identity analysis showed that the Lin11, Isl-1 & Mec-3 (LIM) zinc finger domain of the rice blast resistance protein pi21 has 12.3% similarity with the nuclear transcription factor, X-box binding-like 1 (NFX) type zinc finger domain of the Pi54 protein. For the first time, we report that Pi54 (Pi-k(h)-Tetep), a rice blast resistance (R) protein, has a small zinc finger domain of NFX type located toward the C-terminus, between the NBS and LRR domains of the R protein. Compositional analysis depicted by the helical wheel diagram revealed the presence of a hydrophobic region within this domain which might help in exposing the LRR region for a possible R-Avr interaction. This domain is unique among all other cloned plant disease resistance genes and might play an important role in the broad-spectrum nature of the rice blast resistance gene Pi54.
Affiliation(s)
- Santosh Kumar Gupta
- National Research Centre on Plant Biotechnology, Indian Agricultural Research Institute, New Delhi, India
- Department of Biotechnology, Himachal Pradesh University, Summer-Hill, Shimla, India
- Amit Kumar Rai
- National Research Centre on Plant Biotechnology, Indian Agricultural Research Institute, New Delhi, India
- Department of Biotechnology, Himachal Pradesh University, Summer-Hill, Shimla, India
- Shamsher Singh Kanwar
- Department of Biotechnology, Himachal Pradesh University, Summer-Hill, Shimla, India
- Tilak R. Sharma
- National Research Centre on Plant Biotechnology, Indian Agricultural Research Institute, New Delhi, India
|
27
|
Zimmermann MT, Leelananda SP, Kloczkowski A, Jernigan RL. Combining statistical potentials with dynamics-based entropies improves selection from protein decoys and docking poses. J Phys Chem B 2012; 116:6725-31. [PMID: 22490366 DOI: 10.1021/jp2120143] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/29/2022]
Abstract
Protein structure prediction and protein-protein docking are important and widely used tools, but methods to confidently evaluate the quality of a predicted structure or binding pose have had limited success. Typically, either knowledge-based or physics-based energy functions are employed to evaluate a set of predicted structures (termed "decoys" in structure prediction and "poses" in docking), with the lowest energy structure being assumed to be the one closest to the native state. While successful for many cases, failures are still common. Thus, improvements to structure evaluation methods are essential for future improvements. In this work, we combine multibody statistical potentials with dynamics models, evaluating fluctuation-based entropies that include contributions from the entire structure. This leads to enhanced selection of native-like structures for CASP9 decoys, refined ClusPro docking poses, as well as large sets of docking poses from the Benchmark 3.0 and Dockground data sets. The data used include both bound and unbound docking, and positive results are found for each type. Not only does this method yield improved average results, but for high quality docking poses, we often pick the best pose.
Affiliation(s)
- Michael T Zimmermann
- Bioinformatics and Computational Biology Interdepartmental Graduate Program, Iowa State University, Ames, Iowa 50011, USA
|
28
|
Shih CH, Chang CM, Lin YS, Lo WC, Hwang JK. Evolutionary information hidden in a single protein structure. Proteins 2012; 80:1647-57. [DOI: 10.1002/prot.24058] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 08/22/2011] [Revised: 02/07/2012] [Accepted: 02/12/2012] [Indexed: 11/07/2022]
|
29
|
Batista MV, Ferreira TA, Freitas AC, Balbino VQ. An entropy-based approach for the identification of phylogenetically informative genomic regions of Papillomavirus. Infection, Genetics and Evolution 2011; 11:2026-33. [DOI: 10.1016/j.meegid.2011.09.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 06/14/2011] [Revised: 09/09/2011] [Accepted: 09/14/2011] [Indexed: 11/17/2022]
|
30
|
Kadirvelraj R, Sennett NC, Polizzi SJ, Weitzel S, Wood ZA. Role of packing defects in the evolution of allostery and induced fit in human UDP-glucose dehydrogenase. Biochemistry 2011; 50:5780-9. [PMID: 21595445 DOI: 10.1021/bi2005637] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Abstract
Allosteric feedback inhibition is the mechanism by which metabolic end products regulate their own biosynthesis by binding to an upstream enzyme. Despite its importance in controlling metabolism, there are relatively few allosteric mechanisms understood in detail. This is because allostery does not have an identifiable structural motif, making the discovery of new allosteric enzymes a difficult process. The lack of a conserved motif implies that the evolution of each allosteric mechanism is unique. Here we describe an atypical allosteric mechanism in human UDP-α-d-glucose 6-dehydrogenase (hUGDH) based on an easily acquired and identifiable structural attribute: packing defects in the protein core. In contrast to classic allostery, the active and allosteric sites in hUGDH are present as a single, bifunctional site. Using two new crystal structures, we show that binding of the feedback inhibitor, UDP-α-d-xylose, elicits a distinct induced-fit response; a buried loop translates ∼4 Å along and rotates ∼180° about the main chain axis, requiring surrounding side chains to repack. This allosteric transition is facilitated by packing defects, which negate the steric conformational restraints normally imposed by the protein core. Sedimentation velocity studies show that this repacking favors the formation of an inactive hexameric complex with unusual symmetry. We present evidence that hUGDH and the unrelated enzyme dCTP deaminase have converged to very similar atypical allosteric mechanisms using the same adaptive strategy, the selection for packing defects. Thus, the selection for packing defects is a robust mechanism for the evolution of allostery and induced fit.
Affiliation(s)
- Renuka Kadirvelraj
- Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia 30602, USA
|
31
|
Rorick MM, Wagner GP. Protein structural modularity and robustness are associated with evolvability. Genome Biol Evol 2011; 3:456-75. [PMID: 21602570 PMCID: PMC3134980 DOI: 10.1093/gbe/evr046] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 01/21/2023] Open
Abstract
Theory suggests that biological modularity and robustness allow for maintenance of fitness under mutational change, and when this change is adaptive, for evolvability. Empirical demonstrations that these traits promote evolvability in nature remain scant however. This is in part because modularity, robustness, and evolvability are difficult to define and measure in real biological systems. Here, we address whether structural modularity and/or robustness confer evolvability at the level of proteins by looking for associations between indices of protein structural modularity, structural robustness, and evolvability. We propose a novel index for protein structural modularity: the number of regular secondary structure elements (helices and strands) divided by the number of residues in the structure. We index protein evolvability as the proportion of sites with evidence of being under positive selection multiplied by the average rate of adaptive evolution at these sites, and we measure this as an average over a phylogeny of 25 mammalian species. We use contact density as an index of protein designability, and thus, structural robustness. We find that protein evolvability is positively associated with structural modularity as well as structural robustness and that the effect of structural modularity on evolvability is independent of the structural robustness index. We interpret these associations to be the result of reduced constraints on amino acid substitutions in highly modular and robust protein structures, which results in faster adaptation through natural selection.
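The two indices defined in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the secondary-structure string uses DSSP-like letters ('H' for helix, 'E' for strand), the function names are hypothetical, and the averaging over the 25-species phylogeny mentioned in the abstract is not reproduced.

```python
from itertools import groupby

def modularity_index(ss):
    """Regular secondary-structure elements (helices, strands) per residue.

    `ss` is a per-residue string; 'H' = helix, 'E' = strand, anything else is
    treated as non-regular (DSSP-like letters, an assumption of this sketch).
    """
    if not ss:
        raise ValueError("empty secondary-structure string")
    # Each maximal run of identical letters counts as one element.
    n_elements = sum(1 for state, _ in groupby(ss) if state in ("H", "E"))
    return n_elements / len(ss)

def evolvability_index(is_positive, adaptive_rates):
    """Proportion of positively selected sites times their mean adaptive rate."""
    selected = [r for flag, r in zip(is_positive, adaptive_rates) if flag]
    if not selected:
        return 0.0
    return (len(selected) / len(is_positive)) * (sum(selected) / len(selected))

ss = "CCHHHHHHCCEEEEECCHHHHCC"  # hypothetical 23-residue assignment
m = modularity_index(ss)        # 3 elements over 23 residues
```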
Affiliation(s)
- Mary M Rorick
- Department of Genetics, Yale University, New Haven, Connecticut, USA.
|
32
|
Abstract
The quantitative underpinning of the information content of biosequences represents an elusive goal and yet also an obvious prerequisite to the quantitative modeling and study of biological function and evolution. Several past studies have addressed the question of what distinguishes biosequences from random strings, the latter being clearly unpalatable to the living cell. Such studies typically analyze the organization of biosequences in terms of their constituent characters or substrings and have, in particular, consistently exposed a tenacious lack of compressibility on behalf of biosequences. This article attempts, perhaps for the first time, an assessment of the structure and randomness of polypeptides in terms of newly introduced parameters that relate to the vocabulary of their (suitably constrained) subsequences rather than their substrings. It is shown that such parameters grasp structural/functional information, and are related to each other under a specific set of rules that span biochemically diverse polypeptides. Measures on subsequences separate few amino acid strings from their random permutations, but show that the random permutations of most polypeptides amass along specific linear loci.
Affiliation(s)
- Alberto Apostolico
- College of Computing, Georgia Institute of Technology, Atlanta, GA 30318, USA.
|
33
|
Dou Y, Zheng X, Wang J. Several appropriate background distributions for entropy-based protein sequence conservation measures. J Theor Biol 2009; 262:317-22. [PMID: 19808039 DOI: 10.1016/j.jtbi.2009.09.030] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 06/01/2009] [Revised: 09/25/2009] [Accepted: 09/25/2009] [Indexed: 11/25/2022]
Abstract
Amino acid background distribution is an important factor for entropy-based methods which extract sequence conservation information from protein multiple sequence alignments (MSAs). However, MSAs are usually not large enough to allow a reliable observed background distribution. In this paper, we propose two new estimations of background distribution. One is an integration of the observed background distribution and the position-specific residue distribution, and the other is a normalized square root of observed background frequency. To validate these new background distributions, they are applied to the relative entropy model to find catalytic sites and ligand binding sites from protein MSAs. Experimental results show that they are superior to the observed background distribution in predicting functionally important residues.
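A rough sketch of the second estimate named in this abstract, the normalized square root of the observed background frequency, plugged into a relative entropy score per alignment column. The toy alignment, the flat pseudocount, and all function names are assumptions for illustration; the first estimate (the integration with position-specific distributions) is not specified in enough detail here to sketch.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def frequencies(residues, pseudocount=1.0):
    """Residue frequencies with a flat pseudocount (the pseudocount is an assumption)."""
    counts = Counter(r for r in residues if r in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}

def sqrt_background(observed):
    """Normalized square root of the observed background frequencies."""
    roots = {a: math.sqrt(f) for a, f in observed.items()}
    norm = sum(roots.values())
    return {a: r / norm for a, r in roots.items()}

def relative_entropy(column, background):
    """Relative entropy (bits) of a column's residue distribution versus the background."""
    p = frequencies(column)
    return sum(p[a] * math.log2(p[a] / background[a]) for a in AMINO_ACIDS)

msa = ["MKVLA", "MKILA", "MRVLG", "MKVIA"]  # hypothetical toy alignment
observed = frequencies("".join(msa))        # observed background over the whole MSA
background = sqrt_background(observed)
scores = [relative_entropy(col, background) for col in zip(*msa)]
```

Taking the square root flattens the background, which dampens the influence of residues that happen to be over-represented in a small alignment.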
Affiliation(s)
- Yongchao Dou
- School of Mathematical Science, Dalian University of Technology, Dalian 116024, PR China
|
34
|
Marín D, Martín M, Sabater B. Entropy decrease associated to solute compartmentalization in the cell. Biosystems 2009; 98:31-6. [DOI: 10.1016/j.biosystems.2009.07.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 05/11/2009] [Revised: 07/01/2009] [Accepted: 07/02/2009] [Indexed: 10/20/2022]
|
35
|
Huggins W, Ghosh SK, Wollenzien P. Hydrogen bonding and packing density are factors most strongly connected to limiting sites of high flexibility in the 16S rRNA in the 30S ribosome. BMC Structural Biology 2009; 9:49. [PMID: 19643000 PMCID: PMC2731775 DOI: 10.1186/1472-6807-9-49] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 02/03/2009] [Accepted: 07/30/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND Conformational flexibility in structured RNA frequently is critical to function. The 30S ribosomal subunit exists in different conformations in different functional states due to changes in the central part of the 16S rRNA. We are interested in evaluating the factors that might be responsible for restricting flexibility to specific parts of the 16S rRNA using biochemical data obtained from the 30S subunit in solution. This problem was approached taking advantage of the observation that there must be a high degree of conformational flexibility at sites where UV photocrosslinking occurs and a lack of flexibility inhibits photoreactivity at many other sites that are otherwise suitable for reaction. RESULTS We used 30S x-ray structures to quantify the properties of the nucleotide pairs at UV- and UVA-s4U-induced photocrosslinking sites in 16S rRNA and compared these to the properties of many hundreds of additional sites that have suitable geometry but do not undergo photocrosslinking. Five factors that might affect RNA flexibility were investigated - RNA interactions with ribosomal proteins, interactions with Mg2+ ions, the presence of long-range A minor motif interactions, hydrogen bonding and the count of neighboring heavy atoms around the center of each nucleobase to estimate the neighbor packing density. The two factors that are very different in the unreactive inflexible pairs compared to the reactive ones are the average number of hydrogen bonds and the average value for the number of neighboring atoms. In both cases, these factors are greater for the unreactive nucleotide pairs at a statistically very significant level. CONCLUSION The greater extent of hydrogen bonding and neighbor atom density in the unreactive nucleotide pairs is consistent with reduced flexibility at a majority of the unreactive sites. 
The reactive photocrosslinking sites are clustered in the 30S subunit and this indicates nonuniform patterns of hydrogen bonding and packing density in the 16S rRNA tertiary structure. Because this analysis addresses inter-nucleotide distances and geometry between nucleotides distant in the primary sequence, the results indicate regional and global flexibility of the rRNA.
Affiliation(s)
- Wayne Huggins
- Department of Molecular and Structural Biochemistry, North Carolina State University, Raleigh, USA
- RTI International, Research Triangle Park, USA
- Sujit K Ghosh
- Department of Statistics, North Carolina State University, Raleigh, USA
- Paul Wollenzien
- Department of Molecular and Structural Biochemistry, North Carolina State University, Raleigh, USA
|
36
|
Melton SJ, Landry SJ. Three dimensional structure directs T-cell epitope dominance associated with allergy. Clin Mol Allergy 2008; 6:9. [PMID: 18793409 PMCID: PMC2553403 DOI: 10.1186/1476-7961-6-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 02/06/2008] [Accepted: 09/15/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND CD4+ T-cell epitope immunodominance is not adequately explained by peptide selectivity in class II major histocompatibility proteins, but it has been correlated with adjacent segments of conformational flexibility in several antigens. METHODS The published T-cell responses to two venom allergens and two aeroallergens were used to construct profiles of epitope dominance, which were correlated with the distribution of conformational flexibility, as measured by crystallographic B factors, solvent-accessible surface, COREX residue stability, and sequence entropy. RESULTS Epitopes associated with allergy tended to be excluded from and lie adjacent to flexible segments of the allergen. CONCLUSION During the initiation of allergy, the N- and/or C-terminal ends of proteolytic processing intermediates were preferentially loaded into antigen presenting proteins for the priming of CD4+ T cells.
Affiliation(s)
- Scott J Melton
- Biomedical Sciences Graduate Program, Tulane University Health Sciences Center, New Orleans, LA, 70112, USA.
|
37
|
Mirano-Bascos D, Tary-Lehmann M, Landry SJ. Antigen structure influences helper T-cell epitope dominance in the human immune response to HIV envelope glycoprotein gp120. Eur J Immunol 2008; 38:1231-7. [PMID: 18398933 DOI: 10.1002/eji.200738011] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 11/11/2022]
Abstract
The development of an effective vaccine against HIV/AIDS has been hampered, in part, by a poor understanding of the rules governing helper T-cell epitope immunodominance. Studies in mice have shown that antigen structure modulates epitope immunodominance by affecting the processing and subsequent presentation of helper T-cell epitopes. Previous epitope mapping studies showed that the immunodominant helper T-cell epitopes in mice immunized with gp120 were found flanking flexible loops of the protein. In this report, we show that helper T-cell epitopes against gp120 in humans infected with HIV are also found flanking flexible loops. Immunodominant epitopes were found to be located primarily in the outer domain, an average of 12 residues C-terminal to flexible loops. In the less immunogenic inner domain, epitopes were found an average of five residues N-terminal to conserved regions of the protein, once again placing the epitopes C-terminal to flexible loops. These results show that antigen structure plays a significant role in the shaping of the helper T-cell response against HIV gp120 in humans. This relationship between antigen structure and helper T-cell epitope immunodominance may prove to be useful in the development of rationally designed vaccines against pathogens such as HIV.
Affiliation(s)
- Denise Mirano-Bascos
- Interdisciplinary Program in the Biomedical Sciences, Tulane University, New Orleans, LA 70112, USA
|
38
|
Measuring the functional sequence complexity of proteins. Theoretical Biology & Medical Modelling 2007. [PMID: 18062814 DOI: 10.1186/1742-4682-4-47] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
Abstract
BACKGROUND Abel and Trevors have delineated three aspects of sequence complexity, Random Sequence Complexity (RSC), Ordered Sequence Complexity (OSC) and Functional Sequence Complexity (FSC) observed in biosequences such as proteins. In this paper, we provide a method to measure functional sequence complexity. METHODS AND RESULTS We have extended Shannon uncertainty by incorporating the data variable with a functionality variable. The resulting measured unit, which we call Functional bit (Fit), is calculated from the sequence data jointly with the defined functionality variable. To demonstrate the relevance to functional bioinformatics, a method to measure functional sequence complexity was developed and applied to 35 protein families. Considerations were made in determining how the measure can be used to correlate functionality when relating to the whole molecule and sub-molecule. In the experiment, we show that when the proposed measure is applied to the aligned protein sequences of ubiquitin, 6 of the 7 highest value sites correlate with the binding domain. CONCLUSION For future extensions, measures of functional bioinformatics may provide a means to evaluate potential evolving pathways from effects such as mutations, as well as analyzing the internal structural and functional relationships within the 3-D structure of proteins.
|
39
|
Durston KK, Chiu DKY, Abel DL, Trevors JT. Measuring the functional sequence complexity of proteins. Theor Biol Med Model 2007; 4:47. [PMID: 18062814 PMCID: PMC2217542 DOI: 10.1186/1742-4682-4-47] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 01/25/2007] [Accepted: 12/06/2007] [Indexed: 11/29/2022] Open
Abstract
Background Abel and Trevors have delineated three aspects of sequence complexity, Random Sequence Complexity (RSC), Ordered Sequence Complexity (OSC) and Functional Sequence Complexity (FSC) observed in biosequences such as proteins. In this paper, we provide a method to measure functional sequence complexity. Methods and Results We have extended Shannon uncertainty by incorporating the data variable with a functionality variable. The resulting measured unit, which we call Functional bit (Fit), is calculated from the sequence data jointly with the defined functionality variable. To demonstrate the relevance to functional bioinformatics, a method to measure functional sequence complexity was developed and applied to 35 protein families. Considerations were made in determining how the measure can be used to correlate functionality when relating to the whole molecule and sub-molecule. In the experiment, we show that when the proposed measure is applied to the aligned protein sequences of ubiquitin, 6 of the 7 highest value sites correlate with the binding domain. Conclusion For future extensions, measures of functional bioinformatics may provide a means to evaluate potential evolving pathways from effects such as mutations, as well as analyzing the internal structural and functional relationships within the 3-D structure of proteins.
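One way to read the measure described in this abstract is as the drop in Shannon uncertainty at each aligned site when moving from an unconstrained ground state to the set of functional sequences. The sketch below assumes an equiprobable 20-letter ground state and a toy alignment; both are assumptions for illustration, as are the function names, and the abstract itself does not fix these details.

```python
import math
from collections import Counter

def site_uncertainty(column):
    """Shannon uncertainty (bits) of the residues observed at one aligned site."""
    counts = Counter(column)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def functional_bits(alignment, alphabet_size=20):
    """Per-site drop in uncertainty relative to an equiprobable ground state."""
    ground = math.log2(alphabet_size)  # assumed ground-state uncertainty per site
    return [ground - site_uncertainty(col) for col in zip(*alignment)]

alignment = ["MQIFV", "MQIFV", "MEIYV", "MQLFV"]  # hypothetical functional sequences
fits_per_site = functional_bits(alignment)
total_fits = sum(fits_per_site)
```

Fully conserved sites contribute the maximum per-site value, so ranking `fits_per_site` highlights the positions most constrained by function.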
Affiliation(s)
- Kirk K Durston
- Department of Biophysics, University of Guelph, Guelph, ON, N1G 2W1, Canada.
|
40
|
|
41
|
Abstract
The recent structural elucidation of about one dozen channels (in which we include transporters) has provided further evidence that these membrane proteins typically undergo large movements during their function. However, it is still not well understood how these proteins achieve the necessary trade-off between stability and mobility. To identify specific structural properties of channels, we compared the helix-packing and hydrogen-bonding patterns of channels with those of membrane coils; the latter is a class of membrane proteins whose structures are expected to be more rigid. We describe in detail how in channels, helix pairs are usually arranged in packing motifs with large crossing angles (|τ| ≈ 40°), where the (small) side chains point away from the packing core and the backbones of the two helices are in close contact. We found that this contributes to a significant enrichment of Cα-H···O bonds and to a packing geometry where right-handed parallel (τ = −40° ± 10°) and antiparallel (τ = +140° ± 25°) arrangements are equally preferred. By sharp contrast, the interdigitation and hydrogen bonding of side chains in helix pairs of membrane coils results in narrowly distributed left-handed antiparallel arrangements with crossing angles τ = −160° ± 10° (|τ| ≈ 20°). In addition, we show that these different helix-packing modes of the two types of membrane proteins correspond to specific hydrogen-bonding patterns. In particular, in channels, three times as many of the hydrogen-bonded helix pairs are found in parallel right-handed motifs as are non-hydrogen-bonded helix pairs. Finally, we discuss how the presence of weak hydrogen bonds, water-containing cavities, and right-handed crossing angles may facilitate the required conformational flexibility between helix pairs of channels while maintaining sufficient structural stability.
|
42
|
Chakrabarti P, Bhattacharyya R. Geometry of nonbonded interactions involving planar groups in proteins. Progress in Biophysics and Molecular Biology 2007; 95:83-137. [PMID: 17629549 DOI: 10.1016/j.pbiomolbio.2007.03.016] [Citation(s) in RCA: 152] [Impact Index Per Article: 8.9] [Received: 05/24/2006] [Accepted: 03/18/2007] [Indexed: 11/26/2022]
Abstract
Although hydrophobic interaction is the main contributing factor to the stability of the protein fold, the specificity of the folding process depends on many directional interactions. An analysis has been carried out on the geometry of interaction between planar moieties of ten side chains (Phe, Tyr, Trp, His, Arg, Pro, Asp, Glu, Asn and Gln), the aromatic residues and the sulfide planes (of Met and cystine), and the aromatic residues and the peptide planes within the protein tertiary structures available in the Protein Data Bank. The occurrence of hydrogen bonds and other nonconventional interactions such as C-H···π, C-H···O, and electrophile-nucleophile interactions involving the planar moieties has been elucidated. The specific nature of the interactions constrains many of the residue pairs to occur with a fixed sequence difference, maintaining a sequential order, when located in secondary structural elements, such as alpha-helices and beta-turns. The importance of many of these interactions (for example, aromatic residues interacting with Pro or cystine sulfur atom) is revealed by the higher degree of conservation observed for them in protein structures and binding regions. The planar residues are well represented in the active sites, and the geometry of their interactions does not deviate from the general distribution. The geometrical relationship between interacting residues provides valuable insights into the process of protein folding and would be useful for the design of protein molecules and modulation of their binding properties.
Affiliation(s)
- Pinak Chakrabarti
- Department of Biochemistry and Bioinformatics Centre, Bose Institute, P-1/12 CIT Scheme VIIM, Kolkata 700054, India.
|
43
|
Jernigan RL, Kloczkowski A. Packing regularities in biological structures relate to their dynamics. Methods in Molecular Biology (Clifton, N.J.) 2006; 350:251-76. [PMID: 16957327 PMCID: PMC2039702 DOI: 10.1385/1-59745-189-4:251] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 03/28/2023]
Abstract
The high packing density inside proteins leads to certain geometric regularities and also is one of the most important contributors to the high extent of cooperativity manifested by proteins in their cohesive domain motions. The orientations between neighboring nonbonded residues in proteins largely follow similar geometric regularities, regardless of whether the residues are on the surface or buried, a direct result of hydrophobicity forces. These orientations are relatively fixed and correspond closely to small deformations from those of the face-centered cubic lattice, which is the way in which identical spheres pack at the highest density. Packing density also is related to the extent of conservation of residues, and we show this relationship for residue packing densities by averaging over a large sample of residue packings. There are three regimes: (1) over a broad range of packing densities the relationship between sequence entropy and inverse packing density is nearly linear, (2) over a limited range of low packing densities the sequence entropy is nearly constant, and (3) at extremely low packing densities the sequence entropy is highly variable. These packing results provide important justification for the simple elastic network models that have been shown for a large number of proteins to represent protein dynamics so successfully, even when the models are extremely coarse grained. Elastic network models for polymeric chains are simple and could be combined with these protein elastic networks to represent partially denatured parts of proteins. Finally, we show results of applications of the elastic network model to study the functional motions of the ribosome, based on its known structure. These results indicate expected correlations among its components for the step-wise processing steps in protein synthesis, and suggest ways to use these elastic network models to develop more detailed mechanisms, an important possibility because most experiments yield only static structures.
Affiliation(s)
- Robert L Jernigan
- Department of Biochemistry, Biophysics, and Molecular Biology, Laurence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA, USA
|
44
|
Mozo-Villarías A, Cedano J, Querol E. Hydrophobicity Density Profiles to Predict Thermal Stability Enhancement in Proteins. Protein J 2006; 25:529-35. [PMID: 17106643 DOI: 10.1007/s10930-006-9039-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
Abstract
A hydrophobicity density is defined for a protein through its hydrophobicity tensor (similar to the inertia tensor), by using the Eisenberg hydrophobicity scale of the hydrophobic amino acids of a protein. This allows calculation of the radii of the corresponding hydrophobic ellipsoid of a protein and thus subsequently of its hydrophobic density. A hydrophobicity density profile is then obtained by simulating point mutations of each amino acid of a protein either to a high hydrophobicity value or to zero hydrophobicity. It is found that an increase in the hydrophobic density of the protein correlates with an increase of its mid-point transition temperature. From this profile it is possible to determine the amino acids or domain stretches in a protein that are most amenable to mutation in order to increase the thermal stability. The model is tested to predict the thermostabilisation effects of two mutations in a beta-glucanase: M29G and M29F. This model is compared with other hydrophobicity-related profiles described by other authors.
Affiliation(s)
- Angel Mozo-Villarías
- Departament de Ciències Mèdiques Bàsiques, Universitat de Lleida, 25198, Lleida, Spain.
|
45
|
Sen TZ, Feng Y, Garcia JV, Kloczkowski A, Jernigan RL. The Extent of Cooperativity of Protein Motions Observed with Elastic Network Models Is Similar for Atomic and Coarser-Grained Models. J Chem Theory Comput 2006; 2:696-704. [PMID: 17710199 PMCID: PMC1948848 DOI: 10.1021/ct600060d] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.9] [Indexed: 11/28/2022]
Abstract
Coarse-grained elastic network models have been successful in determining functionally relevant collective motions. The level of coarse-graining, however, has usually focused on the level of one point per residue. In this work, we compare the applicability of elastic network models over a broader range of representational scales. We apply normal mode analysis for multiple scales on a high-resolution protein data set using various cutoff radii to define the residues considered to be interacting, or the extent of cooperativity of their motions. These scales include the residue-, atomic-, proton-, and explicit solvent-levels. Interestingly, atomic, proton, and explicit solvent level calculations all provide similar results at the same cutoff value, with the computed mean-square fluctuations showing only a slightly higher correlation (0.61) with the experimental temperature factors from crystallography than the results of the residue-level coarse-graining. The qualitative behavior of each level of coarse graining is similar at different cutoff values. The correlations between these fluctuations and the number of internal contacts improve with increased cutoff values. Our results demonstrate that atomic level elastic network models provide an improved representation for the collective motions of proteins compared to the coarse-grained models.
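The role of the cutoff radius can be illustrated with a minimal Gaussian network model, one common elastic network variant. This is a sketch, not the authors' code: the function names, the unit spring constant, and the toy coordinates are assumptions made here for illustration. Relative mean-square fluctuations are read from the diagonal of the pseudo-inverse of the Kirchhoff (connectivity) matrix, computed with the identity pinv(K) = inv(K + J/n) - J/n, which holds for a connected network (J is the all-ones matrix).

```python
import math

def kirchhoff(coords, cutoff):
    """Connectivity matrix: -1 for each pair of nodes within `cutoff`, degree on the diagonal."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix; the network is probably disconnected")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def relative_fluctuations(coords, cutoff):
    """Diagonal of the Kirchhoff pseudo-inverse (proportional to mean-square fluctuations)."""
    n = len(coords)
    gamma = kirchhoff(coords, cutoff)
    shifted = [[gamma[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inverse = invert(shifted)
    return [inverse[i][i] - 1.0 / n for i in range(n)]

# Hypothetical toy input: five nodes on a line, 3.8 units apart.
chain = [(3.8 * i, 0.0, 0.0) for i in range(5)]
fluctuations = relative_fluctuations(chain, cutoff=4.0)
```

In a real comparison, the nodes would be residues or atoms from a crystal structure, and the resulting values would be correlated against experimental temperature factors at each cutoff.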
Affiliation(s)
- Taner Z. Sen
- L. H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011-3020
- Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, Ames, IA 50011
- Yaping Feng
- Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, Ames, IA 50011
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50011
- John V. Garcia
- L. H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011-3020
- Andrzej Kloczkowski
- L. H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011-3020
- Robert L. Jernigan
- L. H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011-3020
- Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, Ames, IA 50011
|
46
|
Guharoy M, Chakrabarti P. Conservation and relative importance of residues across protein-protein interfaces. Proc Natl Acad Sci U S A 2005; 102:15447-52. [PMID: 16221766 PMCID: PMC1266102 DOI: 10.1073/pnas.0505425102] [Citation(s) in RCA: 190] [Impact Index Per Article: 10.0] [Received: 06/28/2005] [Accepted: 08/23/2005] [Indexed: 11/18/2022] Open
Abstract
A core region surrounded by a rim characterizes biological interfaces. We ascertain the importance of the core by showing the sequence entropies of the residues comprising the core to be smaller than those in the rim. Such a distinction is not seen in the 2-fold-related, nonphysiological interfaces formed in crystal lattices of monomeric proteins, thereby providing a procedure for characterizing the oligomeric state from crystal structures of protein molecules. This method is better than those that rely on the comparison of the sequence entropies in the interface and the rest of the protein surface, especially in cases where the surface harbors additional binding sites. To a good approximation there is a correlation between the accessible surface area lost because of complexation and ΔΔG values obtained through alanine-scanning mutagenesis (26-38 cal per Å² of the surface buried) for residues located in the core, a relationship that is not discernible for rim residues. If, however, a residue participates in hydrogen bonding across the interface, the extent of stabilization is 52 cal/mol per 1 Å² of the nonpolar surface area buried by the residue. As opposed to an amino acid classification used earlier, an environment-based grouping of residues yields a better discrimination in the sequence entropy between the core and the rim.
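The core-versus-rim comparison rests on per-position sequence entropy, which can be sketched as below. This is an illustrative sketch only: it assumes Shannon entropy with the natural logarithm over raw one-letter residue codes, whereas the paper groups residues by environment. The function names, the toy alignment, and the core and rim position lists are hypothetical.

```python
import math
from collections import Counter

def column_entropy(column, gap="-"):
    """Shannon entropy (natural log) of the residues in one alignment column, ignoring gaps."""
    residues = [aa for aa in column if aa != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum(-(c / total) * math.log(c / total) for c in counts.values())

def mean_entropy(alignment, positions):
    """Average column entropy over the given 0-based alignment positions."""
    columns = [[seq[p] for seq in alignment] for p in positions]
    return sum(column_entropy(col) for col in columns) / len(columns)

# Hypothetical toy alignment: positions 0-1 stand in for core residues,
# positions 2-3 for rim residues.
alignment = ["ACDE", "ACDF", "ACEG", "ACFH"]
core_entropy = mean_entropy(alignment, [0, 1])
rim_entropy = mean_entropy(alignment, [2, 3])
```

A lower mean entropy for the core positions than for the rim positions would correspond to the conservation pattern the abstract reports for biological interfaces.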
Affiliation(s)
- Mainak Guharoy
- Department of Biochemistry, Bose Institute, P-1/12 CIT Scheme VIIM, Calcutta 700 054, India
|