1
Jiang H, Li H, Wong WH, Fan X. Revealing Free Energy Landscape From MD Data via Conditional Angle Partition Tree. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1384-1394. [PMID: 35503836 DOI: 10.1109/tcbb.2022.3172352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Deciphering the free energy landscape of biomolecular structure space is crucial for understanding many complex molecular processes, such as protein-protein interaction, RNA folding, and protein folding. A major source of current dynamic structure data is Molecular Dynamics (MD) simulations. Several methods have been proposed to investigate the free energy landscape from MD data, but all of them rely on the assumption that kinetic similarity is associated with global geometric similarity, which may lead to unsatisfactory results. In this paper, we proposed a new method called Conditional Angle Partition Tree to reveal the hierarchical free energy landscape by correlating local geometric similarity with kinetic similarity. Its application on the benchmark alanine dipeptide MD data showed a much better performance than existing methods in exploring and understanding the free energy landscape. We also applied it to the MD data of Villin HP35. Our results are more reasonable on various aspects than those from other methods and very informative on the hierarchical structure of its energy landscape.
2
Gogoi CR, Rahman A, Saikia B, Baruah A. Protein Dihedral Angle Prediction: The State of the Art. ChemistrySelect 2023. [DOI: 10.1002/slct.202203427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Affiliation(s)
- Aziza Rahman
- Department of Chemistry, Dibrugarh University, Dibrugarh, Assam, India
- Bondeepa Saikia
- Department of Chemistry, Dibrugarh University, Dibrugarh, Assam, India
- Anupaul Baruah
- Department of Chemistry, Dibrugarh University, Dibrugarh, Assam, India
3
Mills R, Vogler RJ, Bernard M, Concolino J, Hersh LB, Wei Y, Hastings JT, Dziubla T, Baldridge KC, Bhattacharyya D. Aerosol capture and coronavirus spike protein deactivation by enzyme functionalized antiviral membranes. Communications Materials 2022; 3:34. [PMID: 36406238 PMCID: PMC9674191 DOI: 10.1038/s43246-022-00256-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/06/2021] [Accepted: 05/02/2022] [Indexed: 06/16/2023]
Abstract
The airborne nature of coronavirus transmission makes it critical to develop new barrier technologies that can simultaneously reduce aerosol and viral spread. Here, we report nanostructured membranes with tunable thickness and porosity for filtering coronavirus-sized aerosols, combined with antiviral enzyme functionalization that can denature spike glycoproteins of the SARS-CoV-2 virus in low-hydration environments. Thin, asymmetric membranes with subtilisin enzyme and methacrylic functionalization show more than 98.90% filtration efficiency for 100-nm unfunctionalized and protein-functionalized polystyrene latex aerosol particles. Unfunctionalized membranes provided a protection factor of 540 ± 380 for coronavirus-sized particles, above the Occupational Safety and Health Administration's standard of 10 for N95 masks. SARS-CoV-2 spike glycoprotein on the surface of coronavirus-sized particles was denatured in 30 s by subtilisin enzyme-functionalized membranes with 0.02-0.2% water content on the membrane surface.
Affiliation(s)
- Rollie Mills
- Department of Chemical and Materials Engineering, University of Kentucky, Lexington, KY 40506, USA
- Ronald J. Vogler
- Department of Chemical and Materials Engineering, University of Kentucky, Lexington, KY 40506, USA
- These authors contributed equally: Ronald J. Vogler, Matthew Bernard
- Matthew Bernard
- Department of Chemical and Materials Engineering, University of Kentucky, Lexington, KY 40506, USA
- These authors contributed equally: Ronald J. Vogler, Matthew Bernard
- Jacob Concolino
- Department of Chemical and Materials Engineering, University of Kentucky, Lexington, KY 40506, USA
- Louis B. Hersh
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, KY 40506, USA
- Yinan Wei
- Department of Chemistry, University of Kentucky, Lexington, KY 40506, USA
- Jeffrey Todd Hastings
- Department of Electrical and Computer Engineering, University of Kentucky, Lexington, KY 40506, USA
- Thomas Dziubla
- Department of Chemical and Materials Engineering, University of Kentucky, Lexington, KY 40506, USA
- Kevin C. Baldridge
- Department of Chemical and Materials Engineering, University of Kentucky, Lexington, KY 40506, USA
- Dibakar Bhattacharyya
- Department of Chemical and Materials Engineering, University of Kentucky, Lexington, KY 40506, USA
4
Investigation of machine learning techniques on proteomics: A comprehensive survey. Progress in Biophysics and Molecular Biology 2019; 149:54-69. [PMID: 31568792 DOI: 10.1016/j.pbiomolbio.2019.09.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/05/2019] [Revised: 09/16/2019] [Accepted: 09/23/2019] [Indexed: 11/21/2022]
Abstract
Proteomics is the large-scale study of proteins, and it has enabled the identification of ever-increasing numbers of proteins. Proteins are essential components of living organisms and have many functions. The proteome is the complete set of proteins produced or modified by an organism or system. The proteome varies over time and with the specific requirements, or stresses, that a cell or organism experiences. Proteomics is an interdisciplinary field that draws on the genetic information produced by various genome projects. Much proteomics data is collected with high-throughput techniques such as mass spectrometry and microarrays. Analyzing these data and making comparisons by hand would often take weeks or months. Biologists and chemists are therefore collaborating with computer scientists and mathematicians to build programs and pipelines that analyze protein data computationally. With bioinformatics techniques, researchers can analyze and store protein data more quickly. This paper briefly reviews machine learning techniques and their applications in proteomics.
5
Gao Y, Wang S, Deng M, Xu J. RaptorX-Angle: real-value prediction of protein backbone dihedral angles through a hybrid method of clustering and deep learning. BMC Bioinformatics 2018; 19:100. [PMID: 29745828 PMCID: PMC5998898 DOI: 10.1186/s12859-018-2065-x] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 12/16/2022]
Abstract
Background: Protein dihedral angles provide a detailed description of protein local conformation. Predicted dihedral angles can be used to narrow down the conformational space of the whole polypeptide chain significantly, thus aiding protein tertiary structure prediction. However, direct angle prediction from sequence alone is challenging. Results: In this article, we present a novel method (named RaptorX-Angle) to predict real-valued angles by combining clustering and deep learning. Tested on a subset of PDB25 and the targets in the latest two Critical Assessment of protein Structure Prediction (CASP) experiments, our method outperforms the existing state-of-the-art method SPIDER2 in terms of Pearson Correlation Coefficient (PCC) and Mean Absolute Error (MAE). Our results also show an approximately linear relationship between the real prediction errors and our estimated bounds. That is, the real prediction error can be well approximated by our estimated bounds. Conclusions: Our study provides an alternative and more accurate prediction of dihedral angles, which may facilitate protein structure prediction and functional study.
Affiliation(s)
- Yujuan Gao
- Center for Quantitative Biology, Peking University, Beijing, China; Toyota Technological Institute at Chicago, 6045 S Kenwood Ave., Chicago, USA
- Sheng Wang
- Toyota Technological Institute at Chicago, 6045 S Kenwood Ave., Chicago, USA
- Minghua Deng
- Center for Quantitative Biology, Peking University, Beijing, China; School of Mathematical Sciences, Beijing, China; Center for Statistical Sciences, Beijing, China
- Jinbo Xu
- Toyota Technological Institute at Chicago, 6045 S Kenwood Ave., Chicago, USA
6
Zhang Y, Zai-Rose V, Price CJ, Ezzell NA, Bidwell GL, Correia JJ, Fitzkee NC. Modeling the Early Stages of Phase Separation in Disordered Elastin-like Proteins. Biophys J 2018; 114:1563-1578. [PMID: 29642027 PMCID: PMC5954566 DOI: 10.1016/j.bpj.2018.01.045] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/26/2017] [Revised: 12/19/2017] [Accepted: 01/31/2018] [Indexed: 12/13/2022]
Abstract
Elastin-like proteins (ELPs) are known to undergo liquid-liquid phase separation reversibly above a concentration-dependent transition temperature. Previous studies suggested that, as temperature increases, ELPs experience an increased propensity for type II β-turns. However, how the ELPs behave below the phase transition temperature itself is still elusive. Here, we investigate the importance of β-turn formation during the early stages of ELP self-association. We examined the behavior of two ELPs, a 150-repeat construct that had been investigated previously (ELP[V5G3A2-150]) as well as a new 40-repeat construct (ELP40) suitable for nuclear magnetic resonance measurements. Structural analysis of ELP40 reveals a disordered conformation, and chemical shifts throughout the sequence are insensitive to changes in temperature over 20°C. However, a low population of β-turn conformation cannot be ruled out based on chemical shifts alone. To examine the structural consequences of β-turns in ELPs, a series of structural ensembles of ELP[V5G3A2-150] were generated, incorporating differing amounts of β-turn bias throughout the chain. To mimic the early stages of the phase change, two monomers were paired, assuming preferential interaction at β-turn regions. This approach was justified by the observation that buried hydrophobic turns are commonly observed to interact in the Protein Data Bank. After dimerization, the ensemble-averaged hydrodynamic properties were calculated for each degree of β-turn bias, and the results were compared with analytical ultracentrifugation experiments at various temperatures. We find that the temperature dependence of the sedimentation coefficient ($s^{0}_{20,w}$) can be reproduced by increasing the β-turn content in the structural ensemble. This analysis allows us to estimate the presence of β-turns and weak associations under experimental conditions. Because disordered proteins frequently exhibit weak biases in secondary structure propensity, these experimentally-driven ensemble calculations may complement existing methods for modeling disordered proteins generally.
Affiliation(s)
- Yue Zhang
- Department of Chemistry, Mississippi State University, Mississippi State, Mississippi
- Valeria Zai-Rose
- Department of Biochemistry, University of Mississippi Medical Center, Jackson, Mississippi
- Cody J Price
- Department of Chemistry, Mississippi State University, Mississippi State, Mississippi
- Nicholas A Ezzell
- Department of Chemistry, Mississippi State University, Mississippi State, Mississippi
- Gene L Bidwell
- Department of Biochemistry, University of Mississippi Medical Center, Jackson, Mississippi
- John J Correia
- Department of Biochemistry, University of Mississippi Medical Center, Jackson, Mississippi
- Nicholas C Fitzkee
- Department of Chemistry, Mississippi State University, Mississippi State, Mississippi
7
Tanner JJ. Empirical power laws for the radii of gyration of protein oligomers. Acta Crystallogr D Struct Biol 2016; 72:1119-1129. [PMID: 27710933 PMCID: PMC5053138 DOI: 10.1107/s2059798316013218] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 04/16/2016] [Accepted: 08/16/2016] [Indexed: 11/10/2022]
Abstract
The radius of gyration is a fundamental structural parameter that is particularly useful for describing polymers. It has been known since Flory's seminal work in the mid-20th century that polymers show a power-law dependence, where the radius of gyration is proportional to the number of residues raised to a power. The power-law exponent has been measured experimentally for denatured proteins and derived empirically for folded monomeric proteins using crystal structures. Here, the biological assemblies in the Protein Data Bank are surveyed to derive the power-law parameters for protein oligomers having degrees of oligomerization of 2-6 and 8. The power-law exponents for oligomers span a narrow range of 0.38-0.41, which is close to the value of 0.40 obtained for monomers. This result shows that protein oligomers exhibit essentially the same power-law behavior as monomers. A simple power-law formula is provided for estimating the oligomeric state from an experimental measurement of the radius of gyration. Several proteins in the Protein Data Bank are found to deviate substantially from power-law behavior by having an atypically large radius of gyration. Some of the outliers have highly elongated structures, such as coiled coils. For coiled coils, the radius of gyration does not follow a power law and instead scales linearly with the number of residues in the oligomer. Other outliers are proteins whose oligomeric state or quaternary structure is incorrectly annotated in the Protein Data Bank. The power laws could be used to identify such errors and help prevent them in future depositions.
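As a rough illustration of the power-law relationship described in this abstract (radius of gyration proportional to the number of residues raised to a power), the following sketch fits the exponent by linear regression in log-log space. The function name `fit_power_law`, the prefactor 2.0, and the synthetic data points are assumptions made for illustration only; they are not taken from the paper.

```python
import math

def fit_power_law(n_residues, radii):
    """Least-squares fit of log(Rg) = log(a) + nu * log(N); returns (a, nu)."""
    xs = [math.log(n) for n in n_residues]
    ys = [math.log(r) for r in radii]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    nu = sxy / sxx                       # slope in log-log space = power-law exponent
    a = math.exp(mean_y - nu * mean_x)   # intercept in log-log space = log of prefactor
    return a, nu

# Synthetic data generated from Rg = 2.0 * N**0.40 (hypothetical values, not from the paper).
sizes = [100, 200, 400, 800, 1600]
rg = [2.0 * n ** 0.40 for n in sizes]
prefactor, exponent = fit_power_law(sizes, rg)
```

With a fitted prefactor and exponent, the relation can in principle be inverted to estimate a residue count from a measured radius of gyration, which is the kind of use the abstract suggests for inferring oligomeric state.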
Affiliation(s)
- John J. Tanner
- Departments of Biochemistry and Chemistry, University of Missouri-Columbia, Columbia, MO 65211, USA
8
Hao XH, Zhang GJ, Zhou XG, Yu XF. A Novel Method Using Abstract Convex Underestimation in Ab-Initio Protein Structure Prediction for Guiding Search in Conformational Feature Space. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2016; 13:887-900. [PMID: 26552093 DOI: 10.1109/tcbb.2015.2497226] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 06/05/2023]
Abstract
To address the searching problem of protein conformational space in ab-initio protein structure prediction, a novel method using abstract convex underestimation (ACUE) based on the framework of evolutionary algorithm was proposed. Computing such conformations, essential to associate structural and functional information with gene sequences, is challenging due to the high-dimensionality and rugged energy surface of the protein conformational space. As a consequence, the dimension of protein conformational space should be reduced to a proper level. In this paper, the high-dimensional original conformational space was converted into a feature space whose dimension is considerably reduced by a feature extraction technique. The underestimate space could then be constructed according to abstract convex theory. Thus, the entropy effect caused by searching in the high-dimensional conformational space could be avoided through such conversion. The tight lower bound estimate information was obtained to guide the searching direction, and the invalid searching area in which the global optimal solution is not located could be eliminated in advance. Moreover, instead of expensively calculating the energy of conformations in the original conformational space, the estimate value is employed to judge if the conformation is worth exploring to reduce the evaluation time, thereby making computational cost lower and the searching process more efficient. Additionally, fragment assembly and the Monte Carlo method are combined to generate a series of metastable conformations by sampling in the conformational space. The proposed method provides a novel technique to solve the searching problem of protein conformational space. Twenty small-to-medium structurally diverse proteins were tested, and the proposed ACUE method was compared with ItFix, HEA, Rosetta and the developed method LEDE without underestimate information. Test results show that the ACUE method can more rapidly and more efficiently obtain the near-native protein structure.
9
DasGupta D, Kaushik R, Jayaram B. From Ramachandran Maps to Tertiary Structures of Proteins. J Phys Chem B 2015; 119:11136-45. [DOI: 10.1021/acs.jpcb.5b02999] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 11/29/2022]
Affiliation(s)
- Debarati DasGupta
- Department of Chemistry, Supercomputing Facility for Bioinformatics & Computational Biology, and Kusuma School of Biological Sciences, Indian Institute of Technology, Hauz Khas, New Delhi-110016, India
- Rahul Kaushik
- Department of Chemistry, Supercomputing Facility for Bioinformatics & Computational Biology, and Kusuma School of Biological Sciences, Indian Institute of Technology, Hauz Khas, New Delhi-110016, India
- B. Jayaram
- Department of Chemistry, Supercomputing Facility for Bioinformatics & Computational Biology, and Kusuma School of Biological Sciences, Indian Institute of Technology, Hauz Khas, New Delhi-110016, India
10
Saleh S, Olson B, Shehu A. A population-based evolutionary search approach to the multiple minima problem in de novo protein structure prediction. BMC Structural Biology 2013; 13 Suppl 1:S4. [PMID: 24565020 PMCID: PMC3953177 DOI: 10.1186/1472-6807-13-s1-s4] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]
Abstract
Background: Elucidating the native structure of a protein molecule from its sequence of amino acids, a problem known as de novo structure prediction, is a long standing challenge in computational structural biology. Difficulties in silico arise due to the high dimensionality of the protein conformational space and the ruggedness of the associated energy surface. The issue of multiple minima is a particularly troublesome hallmark of energy surfaces probed with current energy functions. In contrast to the true energy surface, these surfaces are weakly-funneled and rich in comparably deep minima populated by non-native structures. For this reason, many algorithms seek to be inclusive and obtain a broad view of the low-energy regions through an ensemble of low-energy (decoy) conformations. Conformational diversity in this ensemble is key to increasing the likelihood that the native structure has been captured. Methods: We propose an evolutionary search approach to address the multiple-minima problem in decoy sampling for de novo structure prediction. Two population-based evolutionary search algorithms are presented that follow the basic approach of treating conformations as individuals in an evolving population. Coarse graining and molecular fragment replacement are used to efficiently obtain protein-like child conformations from parents. Potential energy is used both to bias parent selection and determine which subset of parents and children will be retained in the evolving population. The effect on the decoy ensemble of sampling minima directly is measured by additionally mapping a conformation to its nearest local minimum before considering it for retainment. The resulting memetic algorithm thus evolves not just a population of conformations but a population of local minima. Results and conclusions: Results show that both algorithms are effective in terms of sampling conformations in proximity of the known native structure. The additional minimization is shown to be key to enhancing sampling capability and obtaining a diverse ensemble of decoy conformations, circumventing premature convergence to sub-optimal regions in the conformational space, and approaching the native structure with proximity that is comparable to state-of-the-art decoy sampling methods. The results are shown to be robust and valid when using two representative state-of-the-art coarse-grained energy functions.
11
Olson BS, Shehu A. Rapid sampling of local minima in protein energy surface and effective reduction through a multi-objective filter. Proteome Sci 2013; 11:S12. [PMID: 24564970 PMCID: PMC3908317 DOI: 10.1186/1477-5956-11-s1-s12] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
Background: Many problems in protein modeling require obtaining a discrete representation of the protein conformational space as an ensemble of conformations. In ab-initio structure prediction, in particular, where the goal is to predict the native structure of a protein chain given its amino-acid sequence, the ensemble needs to satisfy energetic constraints. Given the thermodynamic hypothesis, an effective ensemble contains low-energy conformations which are similar to the native structure. The high-dimensionality of the conformational space and the ruggedness of the underlying energy surface currently make it very difficult to obtain such an ensemble. Recent studies have proposed that Basin Hopping is a promising probabilistic search framework to obtain a discrete representation of the protein energy surface in terms of local minima. Basin Hopping performs a series of structural perturbations followed by energy minimizations with the goal of hopping between nearby energy minima. This approach has been shown to be effective in obtaining conformations near the native structure for small systems. Recent work by us has extended this framework to larger systems through employment of the molecular fragment replacement technique, resulting in rapid sampling of large ensembles. Methods: This paper investigates the algorithmic components in Basin Hopping to both understand and control their effect on the sampling of near-native minima. Realizing that such an ensemble is reduced before further refinement in full ab-initio protocols, we take an additional step and analyze the quality of the ensemble retained by ensemble reduction techniques. We propose a novel multi-objective technique based on the Pareto front to filter the ensemble of sampled local minima. Results and conclusions: We show that controlling the magnitude of the perturbation allows directly controlling the distance between consecutively-sampled local minima and, in turn, steering the exploration towards conformations near the native structure. For the minimization step, we show that the addition of Metropolis Monte Carlo-based minimization is no more effective than a simple greedy search. Finally, we show that the size of the ensemble of sampled local minima can be effectively and efficiently reduced by a multi-objective filter to obtain a simpler representation of the probed energy surface.
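The perturb-then-minimize loop that this abstract attributes to Basin Hopping can be sketched on a toy problem. The following is a minimal illustration, not the authors' implementation: the one-dimensional `energy` function stands in for a protein energy function, the greedy fixed-step minimizer and the "accept only if not worse" rule are assumptions, and all parameter values are hypothetical.

```python
import math
import random

def energy(x):
    # Toy rugged 1-D surface (hypothetical stand-in for a protein energy function).
    return 0.1 * x * x + math.sin(3.0 * x)

def greedy_minimize(x, step=0.01, max_iters=10000):
    """Greedy local search: step downhill until neither neighbor has lower energy."""
    e = energy(x)
    for _ in range(max_iters):
        best_x, best_e = x, e
        for cand in (x - step, x + step):
            ce = energy(cand)
            if ce < best_e:
                best_x, best_e = cand, ce
        if best_x == x:
            break  # no downhill neighbor: x is a local minimum at this step size
        x, e = best_x, best_e
    return x, e

def basin_hopping(x0, n_hops=200, perturbation=1.5, seed=0):
    """Perturb the current minimum, minimize again, and keep the hop if it is not worse."""
    rng = random.Random(seed)
    x, e = greedy_minimize(x0)
    minima = [(x, e)]  # every sampled local minimum, accepted or not
    for _ in range(n_hops):
        trial = x + rng.uniform(-perturbation, perturbation)
        tx, te = greedy_minimize(trial)
        minima.append((tx, te))
        if te <= e:  # acceptance rule is an assumption made for this sketch
            x, e = tx, te
    return x, e, minima

best_x, best_e, sampled = basin_hopping(x0=4.0)
```

In this sketch the `perturbation` argument plays the role the abstract assigns to perturbation magnitude: smaller values keep consecutive minima close together, while larger values let the search jump across barriers.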
Affiliation(s)
- Brian S Olson
- Department of Computer Science, George Mason University, 4400 University Dr., Fairfax, VA, 22030, USA
- Amarda Shehu
- Department of Computer Science, George Mason University, 4400 University Dr., Fairfax, VA, 22030, USA
- Department of Bioengineering, George Mason University, 4400 University Dr., Fairfax, VA, 22030, USA
- School of Systems Biology, George Mason University, 10900 University Blvd., Manassas, VA, 20110, USA
12
Molloy K, Saleh S, Shehu A. Probabilistic search and energy guidance for biased decoy sampling in ab initio protein structure prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2013; 10:1162-1175. [PMID: 24384705 DOI: 10.1109/tcbb.2013.29] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 06/03/2023]
Abstract
Adequate sampling of the conformational space is a central challenge in ab initio protein structure prediction. In the absence of a template structure, a conformational search procedure guided by an energy function explores the conformational space, gathering an ensemble of low-energy decoy conformations. If the sampling is inadequate, the native structure may be missed altogether. Even if reproduced, a subsequent stage that selects a subset of decoys for further structural detail and energetic refinement may discard near-native decoys if they are high energy or insufficiently represented in the ensemble. Sampling should produce a decoy ensemble that facilitates the subsequent selection of near-native decoys. In this paper, we investigate a robotics-inspired framework that allows directly measuring the role of energy in guiding sampling. Testing demonstrates that a soft energy bias steers sampling toward a diverse decoy ensemble less prone to exploiting energetic artifacts and thus more likely to facilitate retainment of near-native conformations by selection techniques. We employ two different energy functions, the associative memory Hamiltonian with water and Rosetta. Results show that enhanced sampling provides a rigorous testing of energy functions and exposes different deficiencies in them, thus promising to guide development of more accurate representations and energy functions.
13
Basin Hopping as a General and Versatile Optimization Framework for the Characterization of Biological Macromolecules. ACTA ACUST UNITED AC 2012. [DOI: 10.1155/2012/674832] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 12/19/2022]
Abstract
Since its introduction, the basin hopping (BH) framework has proven useful for hard nonlinear optimization problems with multiple variables and modalities. Applications span a wide range, from packing problems in geometry to characterization of molecular states in statistical physics. BH is seeing a reemergence in computational structural biology due to its ability to obtain a coarse-grained representation of the protein energy surface in terms of local minima. In this paper, we show that the BH framework is general and versatile, allowing it to address problems related to the characterization of protein structure, assembly, and motion due to its fundamental ability to sample minima in a high-dimensional variable space. We show how specific implementations of the main components in BH yield algorithmic realizations that attain state-of-the-art results in the context of ab initio protein structure prediction and rigid protein-protein docking. We also show that BH can map intermediate minima related to motions connecting diverse stable functionally relevant states in a protein molecule, thus serving as a first step towards the characterization of transition trajectories connecting these states.
14
Maadooliat M, Gao X, Huang JZ. Assessing protein conformational sampling methods based on bivariate lag-distributions of backbone angles. Brief Bioinform 2012; 14:724-36. [PMID: 22926831 DOI: 10.1093/bib/bbs052] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 01/20/2023]
Abstract
Despite considerable progress in the past decades, protein structure prediction remains one of the major unsolved problems in computational biology. Angular-sampling-based methods have been extensively studied recently due to their ability to capture the continuous conformational space of protein structures. The literature has focused on using a variety of parametric models of the sequential dependencies between angle pairs along the protein chains. In this article, we present a thorough review of angular-sampling-based methods by assessing three main questions: What is the best distribution type to model the protein angles? What is a reasonable number of components in a mixture model that should be considered to accurately parameterize the joint distribution of the angles? What is the order of the local sequence-structure dependency that should be considered by a prediction method? We assess the model fits for different methods using bivariate lag-distributions of the dihedral/planar angles. Moreover, the main information across the lags can be extracted using a technique called Lag singular value decomposition (LagSVD), which considers the joint distribution of the dihedral/planar angles over different lags using a nonparametric approach and monitors the behavior of the lag-distribution of the angles using singular value decomposition. As a result, we developed graphical tools and numerical measurements to compare and evaluate the performance of different model fits. Furthermore, we developed a web-tool (http://www.stat.tamu.edu/~madoliat/LagSVD) that can be used to produce informative animations.
Affiliation(s)
- Mehdi Maadooliat
- Mathematical and Computer Sciences and Engineering Division, 4700 King Abdullah University of Science and Technology, Thuwal 23955-6900, Kingdom of Saudi Arabia
- Jianhua Z. Huang
- Department of Statistics, 447 Blocker Building, Texas A&M University, 3143 TAMU, College Station, TX 77843-3143, USA
15
Chellapa GD, Rose GD. Reducing the dimensionality of the protein-folding search problem. Protein Sci 2012; 21:1231-40. [PMID: 22692765 DOI: 10.1002/pro.2106] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/19/2012] [Revised: 06/04/2012] [Accepted: 06/05/2012] [Indexed: 11/10/2022]
Abstract
How does a folding protein negotiate a vast, featureless conformational landscape and adopt its native structure in biological real time? Motivated by this search problem, we developed a novel algorithm to compare protein structures. Procedures to identify structural analogs are typically conducted in three-dimensional space: the tertiary structure of a target protein is matched against each candidate in a database of structures, and goodness of fit is evaluated by a distance-based measure, such as the root-mean-square distance between target and candidate. This is an expensive approach because three-dimensional space is complex. Here, we transform the problem into a simpler one-dimensional procedure. Specifically, we identify and label the 11 most populated residue basins in a database of high-resolution protein structures. Using this 11-letter alphabet, any protein's three-dimensional structure can be transformed into a one-dimensional string by mapping each residue onto its corresponding basin. Similarity between the resultant basin strings can then be evaluated by conventional sequence-based comparison. The disorder → order folding transition is abridged on both sides. At the onset, folding conditions necessitate formation of hydrogen-bonded scaffold elements on which proteins are assembled, severely restricting the magnitude of accessible conformational space. Near the end, chain topology is established prior to emergence of the close-packed native state. At this latter stage of folding, the chain remains molten, and residues populate natural basins that are approximated by the 11 basins derived here. In essence, our algorithm reduces the protein-folding search problem to mapping the amino acid sequence onto a restricted basin string.
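The idea in this abstract of converting a three-dimensional structure into a one-dimensional basin string can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not list the 11 basins, so the four basin centers below are hypothetical placeholders; nearest-center assignment is an assumed mapping rule; and the position-wise `identity` score is a simple stand-in for the "conventional sequence-based comparison" the abstract mentions.

```python
# Hypothetical basin centers as (phi, psi) in degrees; NOT the paper's 11-basin alphabet.
BASINS = {
    "A": (-63.0, -43.0),
    "B": (-120.0, 130.0),
    "P": (-75.0, 145.0),
    "L": (57.0, 47.0),
}

def angular_diff(a, b):
    """Smallest absolute difference between two angles in degrees (handles wrap-around)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def to_basin_string(dihedrals):
    """Map each residue's (phi, psi) pair to the letter of the nearest basin center."""
    letters = []
    for phi, psi in dihedrals:
        label = min(
            BASINS,
            key=lambda k: angular_diff(phi, BASINS[k][0]) ** 2
            + angular_diff(psi, BASINS[k][1]) ** 2,
        )
        letters.append(label)
    return "".join(letters)

def identity(s1, s2):
    """Fraction of matching positions, normalized by the longer string."""
    if not s1 and not s2:
        return 1.0
    matches = sum(a == b for a, b in zip(s1, s2))
    return matches / max(len(s1), len(s2))

# Illustrative input angles (made up for this example).
structure = [(-60.0, -45.0)] * 3 + [(-119.0, 128.0), (60.0, 45.0)]
basin_string = to_basin_string(structure)
score = identity(basin_string, "AAAPL")
```

Once structures are reduced to strings in this way, any standard string or sequence comparison routine could replace the simple `identity` function used here.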
Affiliation(s)
- George D Chellapa
- TC Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
|
16
|
Sun JM, Li TH, Cong PS, Tang SN, Xiong WW. Retrieving backbone string neighbors provides insights into structural modeling of membrane proteins. Mol Cell Proteomics 2012; 11:M111.016808. [PMID: 22415040 DOI: 10.1074/mcp.m111.016808] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/12/2023] Open
Abstract
Identification of protein structural neighbors to a query is fundamental in structure and function prediction. Here we present BS-align, a systematic method to retrieve backbone string neighbors from primary sequences as templates for protein modeling. The backbone conformation of a protein is represented by the backbone string, as defined in Ramachandran space. The backbone string of a query can be accurately predicted by two innovative technologies: a knowledge-driven sequence alignment and encoding of a backbone string element profile. Then, the predicted backbone string is employed to align against a backbone string database and retrieve a set of backbone string neighbors. The backbone string neighbors were shown to be close to native structures of query proteins. BS-align was successfully employed to predict models of 10 membrane proteins with lengths ranging between 229 and 595 residues, and whose high-resolution structural determinations were difficult to elucidate both by experiment and prediction. The obtained TM-scores and root mean square deviations of the models confirmed that the models based on the backbone string neighbors retrieved by the BS-align were very close to the native membrane structures although the query and the neighbor shared a very low sequence identity. The backbone string system represents a new road for the prediction of protein structure from sequence, and suggests that the similarity of the backbone string would be more informative than describing a protein as belonging to a fold.
Affiliation(s)
- Jiang-Ming Sun
- Department of Chemistry, Tongji University, 1239 Siping Road, Shanghai 200092, China
|
17
|
Perskie LL, Rose GD. Physical-chemical determinants of coil conformations in globular proteins. Protein Sci 2010; 19:1127-36. [PMID: 20512968 DOI: 10.1002/pro.399] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]
Abstract
We present a method with the potential to generate a library of coil segments from first principles. Proteins are built from alpha-helices and/or beta-strands interconnected by these coil segments. Here, we investigate the conformational determinants of short coil segments, with particular emphasis on chain turns. Toward this goal, we extracted a comprehensive set of two-, three-, and four-residue turns from X-ray-elucidated proteins and classified them by conformation. A remarkably small number of unique conformers account for most of this experimentally determined set, whereas remaining members span a large number of rare conformers, many occurring only once in the entire protein database. Factors determining conformation were identified via Metropolis Monte Carlo simulations devised to test the effectiveness of various energy terms. Simulated structures were validated by comparison to experimental counterparts. After filtering rare conformers, we found that 98% of the remaining experimentally determined turn population could be reproduced by applying a hydrogen bond energy term to an exhaustively generated ensemble of clash-free conformers in which no backbone polar group lacks a hydrogen-bond partner. Further, at least 90% of longer coil segments, ranging from 5- to 20 residues, were found to be structural composites of these shorter primitives. These results are pertinent to protein structure prediction, where approaches can be divided into either empirical or ab initio methods. Empirical methods use database-derived information; ab initio methods rely on physical-chemical principles exclusively. Replacing the database-derived coil library with one generated from first principles would transform any empirically based method into its corresponding ab initio homologue.
Affiliation(s)
- Lauren L Perskie
- T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
|
18
|
DeBartolo J, Hocky G, Wilde M, Xu J, Freed KF, Sosnick TR. Protein structure prediction enhanced with evolutionary diversity: SPEED. Protein Sci 2010; 19:520-34. [PMID: 20066664 DOI: 10.1002/pro.330] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 12/31/2022]
Abstract
For naturally occurring proteins, similar sequence implies similar structure. Consequently, multiple sequence alignments (MSAs) often are used in template-based modeling of protein structure and have been incorporated into fragment-based assembly methods. Our previous homology-free structure prediction study introduced an algorithm that mimics the folding pathway by coupling the formation of secondary and tertiary structure. Moves in the Monte Carlo procedure involve only a change in a single pair of phi,psi backbone dihedral angles that are obtained from a Protein Data Bank-based distribution appropriate for each amino acid, conditional on the type and conformation of the flanking residues. We improve this method by using MSAs to enrich the sampling distribution, but in a manner that does not require structural knowledge of any protein sequence (i.e., not homologous fragment insertion). In combination with other tools, including clustering and refinement, the accuracies of the predicted secondary and tertiary structures are substantially improved and a global and position-resolved measure of confidence is introduced for the accuracy of the predictions. Performance of the method in the Critical Assessment of Structure Prediction (CASP8) is discussed.
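The single-residue move described in this abstract can be sketched as a generic Metropolis step in Python. This is a minimal illustration, not the authors' implementation: `energy_fn` and `propose_fn` are hypothetical callables supplied by the caller, and the plain Metropolis test below assumes a symmetric proposal. Draws from a biased, database-derived distribution would generally need a Hastings correction or an energy function that already accounts for the bias.

```python
import math
import random


def metropolis_step(angles, energy_fn, propose_fn, kt, rng):
    """Try to replace the (phi, psi) pair of one randomly chosen residue.

    angles     -- list of (phi, psi) tuples, one per residue
    energy_fn  -- hypothetical callable: list of angles -> energy
    propose_fn -- hypothetical callable: (residue index, rng) -> (phi, psi)
    kt         -- thermal energy, in the same units as energy_fn
    rng        -- random.Random instance

    Returns (new_angles, accepted).
    """
    i = rng.randrange(len(angles))
    trial = list(angles)
    trial[i] = propose_fn(i, rng)
    delta = energy_fn(trial) - energy_fn(angles)
    # Metropolis criterion: always accept downhill moves, accept uphill
    # moves with probability exp(-delta / kT).
    if delta <= 0 or rng.random() < math.exp(-delta / kt):
        return trial, True
    return angles, False


def toy_energy(angles, target=(-60.0, -45.0)):
    """Toy energy (an assumption): squared deviation from a target pair."""
    return sum((p - target[0]) ** 2 + (s - target[1]) ** 2 for p, s in angles)
```

Only one residue changes per step, so the energy difference could be computed locally in a real application; the sketch recomputes the full energy for clarity.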
Affiliation(s)
- Joe DeBartolo
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, USA
|
19
|
Guiding the Search for Native-like Protein Conformations with an Ab-initio Tree-based Exploration. Int J Rob Res 2010. [DOI: 10.1177/0278364910371527] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Indexed: 11/15/2022]
Abstract
In this paper we propose a robotics-inspired method to enhance sampling of native-like conformations when employing only amino acid sequence information for a protein at hand. Computing such conformations, essential to associating structural and functional information with gene sequences, is challenging due to the high dimensionality and the rugged energy surface of the protein conformational space. The contribution of this paper is a novel two-layered method to enhance the sampling of geometrically distinct low-energy conformations at a coarse-grained level of detail. The method grows a tree in conformational space reconciling two goals: (i) guiding the tree towards lower energies; and (ii) not oversampling geometrically similar conformations. Discretizations of the energy surface and a low-dimensional projection space are employed so that low-energy conformations in under-explored regions of the conformational space are selected more often for expansion. The tree is expanded with low-energy conformations through a Metropolis Monte Carlo framework that uses a move set of physical fragment configurations. Testing on sequences of eight small-to-medium structurally diverse proteins shows that the method rapidly samples native-like conformations in a few hours on a single CPU. Analysis shows that computed conformations are good candidates for further detailed energetic refinements by larger studies in protein engineering and design.
|
20
|
Electrostatic solvation energy for two oppositely charged ions in a solvated protein system: salt bridges can stabilize proteins. Biophys J 2010; 98:470-7. [PMID: 20141761 DOI: 10.1016/j.bpj.2009.10.031] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 08/18/2009] [Revised: 10/21/2009] [Accepted: 10/22/2009] [Indexed: 11/23/2022] Open
Abstract
Born-type electrostatic continuum methods have been an indispensable ingredient in a variety of implicit-solvent methods that reduce computational effort by orders of magnitude compared to explicit-solvent MD simulations and thus enable treatment using larger systems and/or longer times. An analysis of the limitations and failures of the Born approaches serves as a guide for fundamental improvements without diminishing the importance of prior works. One of the major limitations of the Born theory is the lack of a liquidlike description of the response of solvent dipoles to the electrostatic field of the solute and the changes therein, a feature contained in the continuum Langevin-Debye (LD) model applied here to investigate how Coulombic interactions depend on the location of charges relative to the protein/water boundary. This physically more realistic LD model is applied to study the stability of salt bridges. When compared head to head using the same (independently measurable) physical parameters (radii, dielectric constants, etc.), the LD model is in good agreement with observations, whereas the Born model is grossly in error. Our calculations also suggest that a salt bridge on the protein's surface can be stabilizing when the charge separation is ≤4 Å.
|
21
|
Abstract
Motivation: Rapid methods for protein structure search enable biological discoveries based on flexibly defined structural similarity, unleashing the power of the ever greater number of solved protein structures. Projection methods show promise for the development of fast structural database search solutions. Projection methods map a structure to a point in a high-dimensional space and compare two structures by measuring distance between their projected points. These methods offer a tremendous increase in speed over residue-level structural alignment methods. However, current projection methods are not practical, partly because they are unable to identify local similarities. Results: We propose a new projection-based approach that can rapidly detect global as well as local structural similarities. Local structural search is enabled by a topology-inspired writhe decomposition protocol that produces a small number of fragments while ensuring that similar structures are cut in a similar manner. In benchmark tests, we show that our method, writher, improves accuracy over existing projection methods in terms of recognizing SCOP domains out of multi-domain proteins, while maintaining accuracy comparable with existing projection methods in a standard single-domain benchmark test. Availability: The source code is available at the following website: http://compbio.berkeley.edu/proj/writher/ Contact: dzhi@compbio.berkeley.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Degui Zhi
- Department of Plant and Microbial Biology, UC Berkeley and Physical Biosciences Division, LBNL, Berkeley, CA 94720, USA.
|
22
|
Faraggi E, Yang Y, Zhang S, Zhou Y. Predicting continuous local structure and the effect of its substitution for secondary structure in fragment-free protein structure prediction. Structure 2010; 17:1515-27. [PMID: 19913486 DOI: 10.1016/j.str.2009.09.006] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.5] [Received: 04/15/2009] [Revised: 09/01/2009] [Accepted: 09/03/2009] [Indexed: 11/30/2022]
Abstract
Local structures predicted from protein sequences are used extensively in every aspect of modeling and prediction of protein structure and function. For more than 50 years, they have been predicted at a low-resolution coarse-grained level (e.g., three-state secondary structure). Here, we combine a two-state classifier with a real-value predictor to predict local structure in continuous representation by backbone torsion angles. The accuracy of the angles predicted by this approach is close to that derived from NMR chemical shifts. Their substitution for predicted secondary structure as restraints for ab initio structure prediction doubles the success rate. This result demonstrates the potential of predicted local structure for fragment-free tertiary-structure prediction. It further implies potentially significant benefits from using predicted real-valued torsion angles as a replacement for or supplement to the secondary-structure prediction tools used almost exclusively in many computational methods ranging from sequence alignment to function prediction.
Affiliation(s)
- Eshel Faraggi
- Indiana University School of Informatics, Indiana University-Purdue University and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
|
23
|
Wolff K, Vendruscolo M, Porto M. Efficient identification of near-native conformations in ab initio protein structure prediction using structural profiles. Proteins 2010; 78:249-58. [PMID: 19701942 DOI: 10.1002/prot.22533] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Abstract
One of the major bottlenecks in many ab initio protein structure prediction methods is currently the selection of a small number of candidate structures for high-resolution refinement from large sets of low-resolution decoys. This step often includes a scoring by low-resolution energy functions and a clustering of conformations by their pairwise root mean square deviations (RMSDs). As an efficient selection is crucial to reduce the overall computational cost of the predictions, any improvement in this direction can increase the overall performance of the predictions and the range of protein structures that can be predicted. We show here that the use of structural profiles, which can be predicted with good accuracy from the amino acid sequences of proteins, provides an efficient means to identify good candidate structures.
Affiliation(s)
- Katrin Wolff
- Institut für Festkörperphysik, Technische Universität Darmstadt, 64289 Darmstadt, Germany
|
24
|
Zhou T, Shu N, Hovmöller S. A novel method for accurate one-dimensional protein structure prediction based on fragment matching. Bioinformatics 2009; 26:470-7. [DOI: 10.1093/bioinformatics/btp679] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/13/2022] Open
|
25
|
Shehu A, Kavraki LE, Clementi C. Multiscale characterization of protein conformational ensembles. Proteins 2009; 76:837-51. [PMID: 19280604 DOI: 10.1002/prot.22390] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.7] [Indexed: 11/11/2022]
Abstract
We propose a multiscale exploration method to characterize the conformational space populated by a protein at equilibrium. The method efficiently obtains a large set of equilibrium conformations in two stages: first exploring the entire space at a coarse-grained level of detail, then narrowing a refined exploration to selected low-energy regions. The coarse-grained exploration periodically adds all-atom detail to selected conformations to ensure that the search leads to regions which maintain low energies in all-atom detail. The second stage reconstructs selected low-energy coarse-grained conformations in all-atom detail. A low-dimensional energy landscape associated with all-atom conformations allows focusing the exploration to energy minima and their conformational ensembles. The lowest energy ensembles are enriched with additional all-atom conformations through further multiscale exploration. The lowest energy ensembles obtained from the application of the method to three different proteins correctly capture the known functional states of the considered systems.
Affiliation(s)
- Amarda Shehu
- Department of Computer Science, Rice University, Houston, Texas 77005, USA
|
26
|
Robustelli P, Cavalli A, Dobson CM, Vendruscolo M, Salvatella X. Folding of Small Proteins by Monte Carlo Simulations with Chemical Shift Restraints without the Use of Molecular Fragment Replacement or Structural Homology. J Phys Chem B 2009; 113:7890-6. [DOI: 10.1021/jp900780b] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022]
Affiliation(s)
- Paul Robustelli
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K., and ICREA and Institute for Research in Biomedicine Barcelona, Baldiri Reixac 10-12, 08028 Barcelona, Spain
- Andrea Cavalli
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K., and ICREA and Institute for Research in Biomedicine Barcelona, Baldiri Reixac 10-12, 08028 Barcelona, Spain
- Christopher M. Dobson
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K., and ICREA and Institute for Research in Biomedicine Barcelona, Baldiri Reixac 10-12, 08028 Barcelona, Spain
- Michele Vendruscolo
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K., and ICREA and Institute for Research in Biomedicine Barcelona, Baldiri Reixac 10-12, 08028 Barcelona, Spain
- Xavier Salvatella
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K., and ICREA and Institute for Research in Biomedicine Barcelona, Baldiri Reixac 10-12, 08028 Barcelona, Spain
|
27
|
|
28
|
Zhao F, Li S, Sterner BW, Xu J. Discriminative learning for protein conformation sampling. Proteins 2009; 73:228-40. [PMID: 18412258 DOI: 10.1002/prot.22057] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 11/10/2022]
Abstract
Protein structure prediction without using templates (i.e., ab initio folding) is one of the most challenging problems in structural biology. In particular, conformation sampling is a major bottleneck of ab initio folding. This article presents CRFSampler, an extensible protein conformation sampler built on a probabilistic graphical model, Conditional Random Fields (CRFs). Using a discriminative learning method, CRFSampler can automatically learn more than ten thousand parameters quantifying the relationship among primary sequence, secondary structure, and (pseudo) backbone angles. Using only compactness and self-avoiding constraints, CRFSampler can efficiently generate protein-like conformations from primary sequence and predicted secondary structure. CRFSampler is also very flexible in that a variety of model topologies and feature sets can be defined to model the sequence-structure relationship without worrying about parameter estimation. Our experimental results demonstrate that using a simple set of features, CRFSampler can generate decoys with much higher quality than the most recent HMM model.
Affiliation(s)
- Feng Zhao
- Toyota Technological Institute at Chicago, Chicago, Illinois, USA
|
29
|
Santos J, Sica MP, Buslje CM, Garrote AM, Ermácora MR, Delfino JM. Structural selection of a native fold by peptide recognition. Insights into the thioredoxin folding mechanism. Biochemistry 2009; 48:595-607. [PMID: 19119857 DOI: 10.1021/bi801969w] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Abstract
Thioredoxins (TRXs) are monomeric alpha/beta proteins with a fold characterized by a central twisted beta-sheet surrounded by alpha-helical elements. The interaction of the C-terminal alpha-helix 5 of TRX against the remainder of the protein involves a close packing of hydrophobic surfaces, offering the opportunity of studying a fine-tuned molecular recognition phenomenon with long-range consequences on the acquisition of tertiary structure. In this work, we focus on the significance of interactions involving residues L94, L99, E101, F102, L103 and L107 on the formation of the noncovalent complex between reduced TRX1-93 and TRX94-108. The conformational status of the system was assessed experimentally by circular dichroism, intrinsic fluorescence emission and enzymic activity; and theoretically by molecular dynamics simulations (MDS). Alterations in tertiary structure of the complexes, resulting as a consequence of site specific mutation, were also examined. To distinguish the effect of alanine scanning mutagenesis on secondary structure stability, the intrinsic helix-forming ability of the mutant peptides was monitored experimentally by far-UV CD spectroscopy upon the addition of 2,2,2-trifluoroethanol, and also theoretically by Monte Carlo conformational search and MDS. This evidence suggests a key role of residues L99, F102 and L103 on the stabilization of the secondary structure of alpha-helix 5, and on the acquisition of tertiary structure upon complex formation. We hypothesize that the transition between a partially folded and a native-like conformation of reduced TRX1-93 would fundamentally depend on the consolidation of a cooperative tertiary unit based on the interaction between alpha-helix 3 and alpha-helix 5.
Affiliation(s)
- Javier Santos
- Department of Biological Chemistry and Institute of Biochemistry and Biophysics (IQUIFIB), School of Pharmacy and Biochemistry, University of Buenos Aires, Junín 956, C1113AAD, Buenos Aires, Argentina
|
30
|
Li SC, Bu D, Xu J, Li M. Fragment-HMM: a new approach to protein structure prediction. Protein Sci 2008; 17:1925-34. [PMID: 18723665 DOI: 10.1110/ps.036442.108] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Indexed: 10/21/2022]
Abstract
We designed a simple position-specific hidden Markov model to predict protein structure. Our new framework naturally repeats itself to converge to a final target, conglomerating fragment assembly, clustering, target selection, refinement, and consensus, all in one process. Our initial implementation of this theory converges to within 6 Å of the native structures for 100% of decoys on all six standard benchmark proteins used in ROSETTA (discussed by Simons and colleagues in a recent paper), which achieved only 14%-94% for the same data. The qualities of the best decoys and the final decoys our theory converges to are also notably better.
Affiliation(s)
- Shuai Cheng Li
- David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario N2L3G1, Canada
|
31
|
Influence of nonlinear electrostatics on transfer energies between liquid phases: charge burial is far less expensive than Born model. Proc Natl Acad Sci U S A 2008; 105:11146-51. [PMID: 18678891 DOI: 10.1073/pnas.0804506105] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Indexed: 11/18/2022] Open
Abstract
The widely used Born model describes the electrostatic response of continuous media using static dielectric constants. However, when applied to a liquid environment, a comparison of Born model predictions with experimental values (e.g., transfer free energies and pK(a) shifts) found that agreement is only achieved by using physically unrealistic dielectric constants for proteins, lipids, etc., and/or equally unrealistic atomic radii. This leads to questions concerning the physical origins for this failure of the Born model. We partially resolve this question by applying the Langevin-Debye (LD) model of a continuous distribution of point, polarizable dipoles, a model that contains an added dependence of the electrostatic response on the solvent's optical dielectric constant and both gas- and liquid-phase dipole moments, features absent in the Born model to which the LD model reduces for weak fields. The LD model is applied to simple representations of three biologically relevant systems: (i) globular proteins, (ii) lipid bilayers, and (iii) membrane proteins. The linear Born treatment greatly overestimates both the self-energy and the transfer free energy from water to hydrophobic environments (e.g., a protein interior). By using the experimental dielectric constant, the energy cost of charge burial in either globular or membrane proteins of the Born model is reduced by almost 50% with the nonlinear theory as is the pK(a) shift, and the shifts agree well with experimental trends.
|
32
|
Malkov SN, Zivković MV, Beljanski MV, Hall MB, Zarić SD. A reexamination of the propensities of amino acids towards a particular secondary structure: classification of amino acids based on their chemical structure. J Mol Model 2008; 14:769-75. [PMID: 18504624 DOI: 10.1007/s00894-008-0313-0] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.2] [Received: 12/07/2007] [Accepted: 04/08/2008] [Indexed: 10/22/2022]
Abstract
The correlation between the primary and secondary structures of proteins was analysed using a large data set from the Protein Data Bank. Clear preferences of amino acids towards certain secondary structures classify amino acids into four groups: alpha-helix preferrers, strand preferrers, turn and bend preferrers, and His and Cys (the latter two amino acids show no clear preference for any secondary structure). Amino acids in the same group have similar structural characteristics at their Cbeta and Cgamma atoms that predict their preference for a particular secondary structure. All alpha-helix preferrers have neither polar heteroatoms on Cbeta and Cgamma atoms, nor branching or aromatic groups on the Cbeta atom. All strand preferrers have aromatic groups or branching groups on the Cbeta atom. All turn and bend preferrers have a polar heteroatom on the Cbeta or Cgamma atoms or do not have a Cbeta atom at all. These new rules could be helpful in making predictions about non-natural amino acids.
Affiliation(s)
- Sasa N Malkov
- Department of Mathematics, University of Belgrade, Studentski trg 16, 11000, Belgrade, Serbia
|
33
|
Perskie LL, Street TO, Rose GD. Structures, basins, and energies: a deconstruction of the Protein Coil Library. Protein Sci 2008; 17:1151-61. [PMID: 18434497 DOI: 10.1110/ps.035055.108] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Indexed: 10/22/2022]
Abstract
Globular proteins adopt complex folds, composed of organized assemblies of alpha-helix and beta-sheet together with irregular regions that interconnect these scaffold elements. Here, we seek to parse the irregular regions into their structural constituents and to rationalize their formative energetics. Toward this end, we dissected the Protein Coil Library, a structural database of protein segments that are neither alpha-helix nor beta-strand, extracted from high-resolution protein structures. The backbone dihedral angles of residues from coil library segments are distributed indiscriminately across the phi,psi map, but when contoured, seven distinct basins emerge clearly. The structures and energetics associated with the two least-studied basins are the primary focus of this article. Specifically, the structural motifs associated with these basins were characterized in detail and then assessed in simple simulations designed to capture their energetic determinants. It is found that conformational constraints imposed by excluded volume and hydrogen bonding are sufficient to reproduce the observed phi,psi distributions of these motifs; no additional energy terms are required. These three motifs in conjunction with alpha-helices, strands of beta-sheet, canonical beta-turns, and polyproline II conformers comprise approximately 90% of all protein structure.
Affiliation(s)
- Lauren L Perskie
- TC Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
|
34
|
Assessing the solvent-dependent surface area of unfolded proteins using an ensemble model. Proc Natl Acad Sci U S A 2008; 105:3321-6. [PMID: 18305164 DOI: 10.1073/pnas.0712240105] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Indexed: 11/18/2022] Open
Abstract
We present a physically rigorous method to calculate solvent-dependent accessible surface areas (ASAs) of amino acid residues in unfolded proteins. ASA values will be larger in a good solvent, where solute-solvent interactions dominate and promote chain extension. Conversely, they will be smaller in a poor solvent, where solute-solute interactions dominate and promote chain collapse. In the method described here, these solvent-dependent effects are modeled by Boltzmann-weighting a simulated ensemble for solvent quality, good or poor. Solvent quality is parameterized as intramolecular hydrogen bond strength, using a "hydrogen bond dial" that can be varied from "off" to "high" (i.e., from 0 to -6 kcal/mol per hydrogen bond). When plotted as a function of hydrogen bond strength, the Boltzmann-weighted distribution of conformers describes a sigmoidal curve, with a transition midpoint near 1.5 kcal/mol per hydrogen bond. ASA tables for the 20 residues are provided under good solvent conditions and at this transition midpoint. For the backbone, these midpoint ASA values are found to be in good agreement with the earlier estimate of unfolded state ASA given by the mean of Creamer's upper and lower bounds [Creamer TP, et al. (1997) Biochemistry 36:2832-2835], a gratifying result in that cosolvents of experimental interest, such as urea (good solvent) and trimethylamine N-oxide (poor solvent), are known to affect the backbone predominantly. Unanticipated results from our simulations predict that a significant population of three-residue, hydrogen-bonded turns (inverse gamma-turns) will be detectable in blocked polyalanyl heptamers in poor solvent, an experimentally verifiable conjecture.
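The "hydrogen bond dial" reweighting described in this abstract can be sketched in a few lines of Python. This is a hedged illustration only: it assumes each conformer's energy is simply its intramolecular hydrogen-bond count times the dial value, uses kT ≈ 0.593 kcal/mol (about 298 K), and the function names and sample numbers are hypothetical rather than taken from the paper.

```python
import math

# Assumed thermal energy at roughly 298 K, in kcal/mol.
KT_KCAL_PER_MOL = 0.593


def boltzmann_weights(hbond_counts, e_hb, kt=KT_KCAL_PER_MOL):
    """Normalized Boltzmann weights for conformers of a simulated ensemble.

    hbond_counts -- number of intramolecular hydrogen bonds per conformer
    e_hb         -- dial setting in kcal/mol per hydrogen bond (0 to -6)
    Assumption: conformer energy = hbond count * e_hb, nothing else.
    """
    energies = [n * e_hb for n in hbond_counts]
    e_min = min(energies)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - e_min) / kt) for e in energies]
    z = sum(factors)
    return [f / z for f in factors]


def weighted_mean(values, weights):
    """Ensemble average of a per-conformer property such as ASA."""
    return sum(v * w for v, w in zip(values, weights))
```

With the dial at 0 every conformer is weighted equally (good solvent); making the dial more negative shifts weight toward hydrogen-bonded, compact conformers, which lowers the ensemble-averaged ASA when those conformers bury more surface.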
|
35
|
Chen K, Liu Z, Zhou C, Bracken WC, Kallenbach NR. Spin relaxation enhancement confirms dominance of extended conformations in short alanine peptides. Angew Chem Int Ed Engl 2008; 46:9036-9. [PMID: 17943945 DOI: 10.1002/anie.200703376] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Indexed: 11/08/2022]
Affiliation(s)
- Kang Chen
- Laboratory of Molecular Biophysics, National Heart, Lung and Blood Institute, National Institutes of Health, 50 South Drive, Bethesda, MD 20892, USA
|
36
|
Chen K, Liu Z, Zhou C, Bracken W, Kallenbach N. Spin Relaxation Enhancement Confirms Dominance of Extended Conformations in Short Alanine Peptides. Angew Chem Int Ed Engl 2007. [DOI: 10.1002/ange.200703376] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/08/2022]
|
37
|
Gong H, Shen Y, Rose GD. Building native protein conformation from NMR backbone chemical shifts using Monte Carlo fragment assembly. Protein Sci 2007; 16:1515-21. [PMID: 17656574 PMCID: PMC2203357 DOI: 10.1110/ps.072988407] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022]
Abstract
We have been analyzing the extent to which protein secondary structure determines protein tertiary structure in simple protein folds. An earlier paper demonstrated that three-dimensional structure can be obtained successfully using only highly approximate backbone torsion angles for every residue. Here, the initial information is further diluted by introducing a realistic degree of experimental uncertainty into this process. In particular, we tackle the practical problem of determining three-dimensional structure solely from backbone chemical shifts, which can be measured directly by NMR and are known to be correlated with a protein's backbone torsion angles. Extending our previous algorithm to incorporate these experimentally determined data, clusters of structures compatible with the experimentally determined chemical shifts were generated by fragment assembly Monte Carlo. The cluster that corresponds to the native conformation was then identified based on four energy terms: steric clash, solvent-squeezing, hydrogen-bonding, and hydrophobic contact. Currently, the method has been applied successfully to five small proteins with simple topology. Although still under development, this approach offers promise for high-throughput NMR structure determination.
Affiliation(s)
- Haipeng Gong
- T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
|
38
|
Cavalli A, Salvatella X, Dobson CM, Vendruscolo M. Protein structure determination from NMR chemical shifts. Proc Natl Acad Sci U S A 2007; 104:9615-20. [PMID: 17535901 PMCID: PMC1887584 DOI: 10.1073/pnas.0610313104] [Citation(s) in RCA: 406] [Impact Index Per Article: 23.9] [Received: 11/21/2006] [Indexed: 11/18/2022] Open
Abstract
NMR spectroscopy plays a major role in the determination of the structures and dynamics of proteins and other biological macromolecules. Chemical shifts are the most readily and accurately measurable NMR parameters, and they reflect with great specificity the conformations of native and nonnative states of proteins. We show, using 11 examples of proteins representative of the major structural classes and containing up to 123 residues, that it is possible to use chemical shifts as structural restraints in combination with a conventional molecular mechanics force field to determine the conformations of proteins at a resolution of 2 angstroms or better. This strategy should be widely applicable and, subject to further development, will enable quantitative structural analysis to be carried out to address a range of complex biological problems not accessible to current structural techniques.
Affiliation(s)
- Andrea Cavalli
- Department of Chemistry, Cambridge University, Cambridge CB2 1EW, United Kingdom
- Xavier Salvatella
- Department of Chemistry, Cambridge University, Cambridge CB2 1EW, United Kingdom
- Michele Vendruscolo
- Department of Chemistry, Cambridge University, Cambridge CB2 1EW, United Kingdom
|
39
|
Rose GD, Fleming PJ, Banavar JR, Maritan A. A backbone-based theory of protein folding. Proc Natl Acad Sci U S A 2006; 103:16623-33. [PMID: 17075053 PMCID: PMC1636505 DOI: 10.1073/pnas.0606843103] [Citation(s) in RCA: 344] [Impact Index Per Article: 19.1] [Indexed: 11/18/2022] Open
Abstract
Under physiological conditions, a protein undergoes a spontaneous disorder ⇌ order transition called "folding." The protein polymer is highly flexible when unfolded but adopts its unique native, three-dimensional structure when folded. Current experimental knowledge comes primarily from thermodynamic measurements in solution or the structures of individual molecules, elucidated by either x-ray crystallography or NMR spectroscopy. From the former, we know the enthalpy, entropy, and free energy differences between the folded and unfolded forms of hundreds of proteins under a variety of solvent/cosolvent conditions. From the latter, we know the structures of approximately 35,000 proteins, which are built on scaffolds of hydrogen-bonded structural elements, alpha-helix and beta-sheet. Anfinsen showed that the amino acid sequence alone is sufficient to determine a protein's structure, but the molecular mechanism responsible for self-assembly remains an open question, probably the most fundamental open question in biochemistry. This perspective is a hybrid: partly review, partly proposal. First, we summarize key ideas regarding protein folding developed over the past half-century and culminating in the current mindset. In this view, the energetics of side-chain interactions dominate the folding process, driving the chain to self-organize under folding conditions. Next, having taken stock, we propose an alternative model that inverts the prevailing side-chain/backbone paradigm. Here, the energetics of backbone hydrogen bonds dominate the folding process, with preorganization in the unfolded state. Then, under folding conditions, the resultant fold is selected from a limited repertoire of structural possibilities, each corresponding to a distinct hydrogen-bonded arrangement of alpha-helices and/or strands of beta-sheet.
Affiliation(s)
- George D Rose
- T. C. Jenkins Department of Biophysics, The Johns Hopkins University, Jenkins Hall, 3400 North Charles Street, Baltimore, MD 21218, USA.
|
40
|
Colubri A, Jha AK, Shen MY, Sali A, Berry RS, Sosnick TR, Freed KF. Minimalist representations and the importance of nearest neighbor effects in protein folding simulations. J Mol Biol 2006; 363:835-57. [PMID: 16982067 DOI: 10.1016/j.jmb.2006.08.035] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 03/09/2006] [Revised: 07/18/2006] [Accepted: 08/16/2006] [Indexed: 10/24/2022]
Abstract
In order to investigate the level of representation required to simulate folding and predict structure, we test the ability of a variety of reduced representations to identify native states in decoy libraries and to recover the native structure given the advanced knowledge of the very broad native Ramachandran basin assignments. Simplifications include the removal of the entire side-chain or the retention of only the Cβ atoms. Scoring functions are derived from an all-atom statistical potential that distinguishes between atoms and different residue types. Structures are obtained by minimizing the scoring function with a computationally rapid simulated annealing algorithm. Results are compared for simulations in which backbone conformations are sampled from a Protein Data Bank-based backbone rotamer library generated by either ignoring or including a dependence on the identity and conformation of the neighboring residues. Only when the Cβ atoms and nearest neighbor effects are included do the lowest energy structures generally fall within 4 Å of the native backbone root-mean-square deviation (RMSD), despite the initial configuration being highly expanded with an average RMSD ≥ 10 Å. The side-chains are reinserted into the Cβ models with minimal steric clash. Therefore, the detailed, all-atom information lost in descending to a Cβ-level representation is recaptured to a large measure using backbone dihedral angle sampling that includes nearest neighbor effects and an appropriate scoring function.
Affiliation(s)
- Andrés Colubri
- Department of Chemistry, The University of Chicago, Chicago, IL 60637, USA
|
41
|
Möglich A, Joder K, Kiefhaber T. End-to-end distance distributions and intrachain diffusion constants in unfolded polypeptide chains indicate intramolecular hydrogen bond formation. Proc Natl Acad Sci U S A 2006; 103:12394-9. [PMID: 16894178 PMCID: PMC1567890 DOI: 10.1073/pnas.0604748103] [Citation(s) in RCA: 202] [Impact Index Per Article: 11.2] [Received: 06/07/2006] [Indexed: 11/18/2022] Open
Abstract
Characterization of the unfolded state is essential for the understanding of the protein folding reaction. We performed time-resolved FRET measurements to gain information on the dimensions and the internal dynamics of unfolded polypeptide chains. Using an approach based on global analysis of data obtained from two different donor-acceptor pairs allowed for the determination of distance distribution functions and diffusion constants between the chromophores. Results on a polypeptide chain consisting of 16 Gly-Ser repeats between the FRET chromophores reveal an increase in the average end-to-end distance from 18.9 to 39.2 Å between 0 and 8 M GdmCl. The increase in chain dimensions is accompanied by an increase in the end-to-end diffusion constant from (3.6 ± 1.0) × 10⁻⁷ cm² s⁻¹ in water to (14.8 ± 2.5) × 10⁻⁷ cm² s⁻¹ in 8 M GdmCl. This finding suggests that intrachain interactions in water exist even in very flexible chains lacking hydrophobic groups, which indicates intramolecular hydrogen bond formation. The interactions are broken upon denaturant binding, which leads to increased chain flexibility and longer average end-to-end distances. This finding implies that rapid collapse of polypeptide chains during refolding of denaturant-unfolded proteins is an intrinsic property of polypeptide chains and can, at least in part, be ascribed to nonspecific intramolecular hydrogen bonding. Despite decreased intrachain diffusion constants, the conformational search is accelerated in the collapsed state because of shorter diffusion distances. The measured distance distribution functions and diffusion constants in combination with Szabo-Schulten-Schulten theory were able to reproduce experimentally determined rate constants for end-to-end loop formation.
Affiliation(s)
- Andreas Möglich
- Division of Biophysical Chemistry, Biozentrum der Universität Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
- Karin Joder
- Division of Biophysical Chemistry, Biozentrum der Universität Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
- Thomas Kiefhaber
- Division of Biophysical Chemistry, Biozentrum der Universität Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
|
42
|
Abstract
Using a test set of 13 small, compact proteins, we demonstrate that a remarkably simple protocol can capture native topology from secondary structure information alone, in the absence of long-range interactions. It has been a long-standing open question whether such information is sufficient to determine a protein's fold. Indeed, even the far simpler problem of reconstructing the three-dimensional structure of a protein from its exact backbone torsion angles has remained a difficult challenge owing to the small, but cumulative, deviations from ideality in backbone planarity, which, if ignored, cause large errors in structure. As a familiar example, a small change in an elbow angle causes a large displacement at the end of your arm; the longer the arm, the larger the displacement. Here, correct secondary structure assignments (alpha-helix, beta-strand, beta-turn, polyproline II, coil) were used to constrain polypeptide backbone chains devoid of side chains, and the most stable folded conformations were determined, using Monte Carlo simulation. Just three terms were used to assess stability: molecular compaction, steric exclusion, and hydrogen bonding. For nine of the 13 proteins, this protocol restricts the main chain to a surprisingly small number of energetically favorable topologies, with the native one prominent among them.
Affiliation(s)
- Patrick J Fleming
- TC Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
|