1
Torrisi M, Pollastri G. Brewery: deep learning and deeper profiles for the prediction of 1D protein structure annotations. Bioinformatics 2020; 36:3897-3898. [PMID: 32207516 DOI: 10.1093/bioinformatics/btaa204] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/21/2020] [Revised: 03/17/2020] [Accepted: 03/19/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Protein structural annotations (PSAs) are essential abstractions to deal with the prediction of protein structures. Many increasingly sophisticated PSAs have been devised in the last few decades. However, the need for annotations that are easy to compute, process and predict has not diminished. This is especially true for protein structures that are hardest to predict, such as novel folds. RESULTS: We propose Brewery, a suite of ab initio predictors of 1D PSAs. Brewery uses multiple sources of evolutionary information to achieve state-of-the-art predictions of secondary structure, structural motifs, relative solvent accessibility and contact density. AVAILABILITY AND IMPLEMENTATION: The web server, standalone program, Docker image and training sets of Brewery are available at http://distilldeep.ucd.ie/brewery/. CONTACT: gianluca.pollastri@ucd.ie.
Affiliation(s)
- Mirko Torrisi
  - School of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland
- Gianluca Pollastri
  - School of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland
2
Wang Z, Zhou X, Zuo G. EspcTM: Kinetic Transition Network Based on Trajectory Mapping in Effective Energy Rescaling Space. Front Mol Biosci 2020; 7:589718. [PMID: 33195438 PMCID: PMC7653181 DOI: 10.3389/fmolb.2020.589718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2020] [Accepted: 09/24/2020] [Indexed: 11/27/2022] Open
Abstract
The transition network provides a key to reveal the thermodynamic and kinetic properties of biomolecular systems. In this paper, we introduce a new method, named effective energy rescaling space trajectory mapping (EspcTM), to detect metastable states and construct transition networks based on the simulation trajectories of the complex biomolecular system. It mapped simulation trajectories into an orthogonal function space, whose bases were rescaled by effective energy, and clustered the interrelation between these trajectories to locate metastable states. By using the EspcTM method, we identified the metastable states and elucidated interstate transition kinetics of a Brownian particle and a dodecapeptide. It was found that the scaling parameters of effective energy also provided a clue to the dominating factors in dynamics. We believe that the EspcTM method is a useful tool for the studies of dynamics of the complex system and may provide new insight into the understanding of thermodynamics and kinetics of biomolecular systems.
Affiliation(s)
- Zhenyu Wang
  - T-Life Research Center, State Key Laboratory of Surface Physics, Department of Physics, Fudan University, Shanghai, China
- Xin Zhou
  - School of Physical Sciences, University of Chinese Academy of Sciences, Beijing, China
- Guanghong Zuo
  - T-Life Research Center, State Key Laboratory of Surface Physics, Department of Physics, Fudan University, Shanghai, China
3
Chen X, Yang B, Lin Z. A random forest learning assisted "divide and conquer" approach for peptide conformation search. Sci Rep 2018; 8:8796. [PMID: 29891960 PMCID: PMC5995823 DOI: 10.1038/s41598-018-27167-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/14/2018] [Accepted: 05/17/2018] [Indexed: 11/09/2022] Open
Abstract
Computational determination of peptide conformations is challenging as it is a problem of finding minima in a high-dimensional space. The "divide and conquer" approach is promising for reliably reducing the search space size. A random forest learning model is proposed here to expand the scope of applicability of the "divide and conquer" approach. A random forest classification algorithm is used to characterize the distributions of the backbone φ-ψ units ("words"). A random forest supervised learning model is developed to analyze the combinations of the φ-ψ units ("grammar"). It is found that amino acid residues may be grouped as equivalent "words", while the φ-ψ combinations in low-energy peptide conformations follow a distinct "grammar". The finding of equivalent words empowers the "divide and conquer" method with the flexibility of fragment substitution. The learnt grammar is used to improve the efficiency of the "divide and conquer" method by removing unfavorable φ-ψ combinations without the need of dedicated human effort. The machine learning assisted search method is illustrated by efficiently searching the conformations of GGG/AAA/GGGG/AAAA/GGGGG through assembling the structures of GFG/GFGG. Moreover, the computational cost of the new method is shown to increase rather slowly with the peptide length.
Affiliation(s)
- Xin Chen
  - Hefei National Laboratory for Physical Sciences at Microscales & CAS Key Laboratory of Strongly-Coupled Quantum Matter Physics, Department of Physics, University of Science and Technology of China, Hefei, 230026, China
- Bing Yang
  - Hefei National Laboratory for Physical Sciences at Microscales & CAS Key Laboratory of Strongly-Coupled Quantum Matter Physics, Department of Physics, University of Science and Technology of China, Hefei, 230026, China
- Zijing Lin
  - Hefei National Laboratory for Physical Sciences at Microscales & CAS Key Laboratory of Strongly-Coupled Quantum Matter Physics, Department of Physics, University of Science and Technology of China, Hefei, 230026, China
4
Zhang C, Yu J, Zhou X. Imaging Metastable States and Transitions in Proteins by Trajectory Map. J Phys Chem B 2017; 121:4678-4686. [PMID: 28425289 DOI: 10.1021/acs.jpcb.7b00664] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 02/03/2023]
Abstract
It has been a long-standing and intriguing issue to develop robust methods to identify metastable states and interstate transitions from simulations or experimental data to understand the functional conformational changes of proteins. It is usually hard to define the complicated boundaries of the states in the conformational space using most of the existing methods, and they often lead to parameter-sensitive results. Here, we present a new approach, visualized Trajectory Map (vTM), to identify the metastable states and the rare interstate transitions, by considering both the conformational similarity and the temporal successiveness of conformations. The vTM is able to give an unambiguous description of slow dynamics. The case study of a β-hairpin peptide shows that the vTM can reveal the states and transitions from all-atom MD trajectory data even when a single observable (i.e., a one-dimensional reaction coordinate) is used. We also use the vTM to refine the folding/unfolding mechanism of HP35 in explicit water by analyzing a 125 μs all-atom MD trajectory and obtain folding/unfolding rates of about 1/μs, which are in good agreement with the experimental values.
Affiliation(s)
- Chuanbiao Zhang
  - School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Jin Yu
  - Beijing Computer Science Research Center, Beijing 100193, China
- Xin Zhou
  - School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
5
Abstract
More than two decades of research have enabled dihedral angle predictions at an accuracy that makes them an interesting alternative or supplement to secondary structure prediction, one that provides detailed local structure information for every residue of a protein. The evolution of dihedral angle prediction methods is closely linked to advancements in machine learning and other relevant technologies. Consequently, recent improvements in large-scale training of deep neural networks have led to the best method currently available, which achieves a mean absolute error of 19° for phi and 30° for psi. This performance opens interesting perspectives for the application of dihedral angle prediction in the comparison, prediction, and design of protein structures.
Affiliation(s)
- Olav Zimmermann
  - Jülich Supercomputing Centre (JSC), Institute for Advanced Simulation (IAS), Forschungszentrum Jülich GmbH, 52425, Jülich, Germany.
6
Abstract
The potential energy landscape of pentapeptides was mapped in a collective coordinate principal conformational subspace derived from principal component analysis of a nonredundant representative set of protein structures from the PDB. Three pentapeptide sequences that are known to be distinct in terms of their secondary structure characteristics, (Ala)5, (Gly)5, and Val.Asn.Thr.Phe.Val, were considered. Partitioning the landscapes into different energy valleys allowed for calculation of the relative propensities of the peptide secondary structures in a statistical mechanical framework. The distribution of the observed conformations of pentapeptide data showed good correspondence to the topology of the energy landscape of the (Ala)5 sequence where, in accord with reported trends, the α-helix showed a predominant propensity at 298 K. The topography of the landscapes indicates that the stabilization of the α-helix in the (Ala)5 sequence is enthalpic in nature while entropic factors are important for stabilization of the β-sheet in the Val.Asn.Thr.Phe.Val sequence. The results indicate that local interactions within small pentapeptide segments can lead to conformational preference of one secondary structure over the other where account of conformational entropy is important in order to reveal such preference. The method, therefore, can provide critical structural information for ab initio protein folding methods.
7
Singh K, Senthil V, Arokiaraj AWR, Leprince J, Lefranc B, Vaudry D, Allam AA, Ajarem J, Chow BKC. Structure-Activity Relationship Studies of N- and C-Terminally Modified Secretin Analogs for the Human Secretin Receptor. PLoS One 2016; 11:e0149359. [PMID: 26930505 PMCID: PMC4773067 DOI: 10.1371/journal.pone.0149359] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/31/2014] [Accepted: 01/03/2016] [Indexed: 11/18/2022] Open
Abstract
The pleiotropic role of human secretin (hSCT) validates its potential use as a therapeutic agent. Nevertheless, the structure of secretin in complex with its receptor is necessary to develop a suitable therapeutic agent. Therefore, in an effort to design a three-dimensional virtual homology model and identify a peptide agonist and/or antagonist for the human secretin receptor (hSR), the significance of the primary sequence of secretin peptides in allosteric binding and activation was elucidated using virtual docking, FRET competitive binding and assessment of the cAMP response. Secretin analogs containing various N- or C-terminal modifications were prepared based on previous findings of the role of these domains in receptor binding and activation. These analogs exhibited very low or no binding affinity in a virtual model, and were found to neither exhibit in vitro binding nor agonistic or antagonistic properties. A parallel analysis of the analogs in the virtual model and in vitro studies revealed instability of these peptide analogs to bind and activate the receptor.
Affiliation(s)
- Kailash Singh
  - School of Biological Sciences, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, China
- Vijayalakshmi Senthil
  - School of Biological Sciences, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, China
- Jérôme Leprince
  - Laboratory of Neuronal and Neuroendocrine Differentiation and Communication, Neurotrophic Factors and Neuronal Differentiation Team, Inserm U982, Associated International Laboratory Samuel de Champlain, Regional Platform for Cell Imaging of Haute-Normandie (PRIMACEN), University of Rouen, Mont-Saint-Aignan, France
- Benjamin Lefranc
  - Laboratory of Neuronal and Neuroendocrine Differentiation and Communication, Neurotrophic Factors and Neuronal Differentiation Team, Inserm U982, Associated International Laboratory Samuel de Champlain, Regional Platform for Cell Imaging of Haute-Normandie (PRIMACEN), University of Rouen, Mont-Saint-Aignan, France
- David Vaudry
  - Laboratory of Neuronal and Neuroendocrine Differentiation and Communication, Neurotrophic Factors and Neuronal Differentiation Team, Inserm U982, Associated International Laboratory Samuel de Champlain, Regional Platform for Cell Imaging of Haute-Normandie (PRIMACEN), University of Rouen, Mont-Saint-Aignan, France
- Ahmed A. Allam
  - Department of Zoology, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
  - Department of Zoology, Faculty of Science, Beni-Suef University, Beni-Suef, Egypt
- Jamaan Ajarem
  - Department of Zoology, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
- Billy K. C. Chow
  - School of Biological Sciences, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, China
8
Kandathil SM, Handl J, Lovell SC. Toward a detailed understanding of search trajectories in fragment assembly approaches to protein structure prediction. Proteins 2016; 84:411-26. [PMID: 26799916 PMCID: PMC4982100 DOI: 10.1002/prot.24987] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 07/07/2015] [Revised: 12/03/2015] [Accepted: 12/31/2015] [Indexed: 11/30/2022]
Abstract
Energy functions, fragment libraries, and search methods constitute three key components of fragment‐assembly methods for protein structure prediction, which are all crucial for their ability to generate high‐accuracy predictions. All of these components are tightly coupled; efficient searching becomes more important as the quality of fragment libraries decreases. Given these relationships, there is currently a poor understanding of the strengths and weaknesses of the sampling approaches currently used in fragment‐assembly techniques. Here, we determine how the performance of search techniques can be assessed in a meaningful manner, given the above problems. We describe a set of techniques that aim to reduce the impact of the energy function, and assess exploration in view of the search space defined by a given fragment library. We illustrate our approach using Rosetta and EdaFold, and show how certain features of these methods encourage or limit conformational exploration. We demonstrate that individual trajectories of Rosetta are susceptible to local minima in the energy landscape, and that this can be linked to non‐uniform sampling across the protein chain. We show that EdaFold's novel approach can help balance broad exploration with locating good low‐energy conformations. This occurs through two mechanisms which cannot be readily differentiated using standard performance measures: exclusion of false minima, followed by an increasingly focused search in low‐energy regions of conformational space. Measures such as ours can be helpful in characterizing new fragment‐based methods in terms of the quality of conformational exploration realized.
Affiliation(s)
- Shaun M Kandathil
  - Faculty of Life Sciences, the University of Manchester, Manchester, M13 9PL, United Kingdom
- Julia Handl
  - Alliance Manchester Business School, Faculty of Humanities, the University of Manchester, Manchester, M13 9PL, United Kingdom
- Simon C Lovell
  - Faculty of Life Sciences, the University of Manchester, Manchester, M13 9PL, United Kingdom
9
Systematically constructing kinetic transition network in polypeptide from top to down: trajectory mapping. PLoS One 2015; 10:e0125932. [PMID: 25962177 PMCID: PMC4427365 DOI: 10.1371/journal.pone.0125932] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 11/14/2014] [Accepted: 03/24/2015] [Indexed: 11/23/2022] Open
Abstract
Molecular dynamics (MD) simulation is an important tool for understanding biomolecules at microscopic temporal and spatial scales. Besides the demand for improved simulation techniques that approach experimental scales, it is becoming increasingly crucial to develop robust methodology for precisely and objectively interpreting massive MD simulation data. In our previous work [J Phys Chem B 114, 10266 (2010)], the trajectory mapping (TM) method was presented to analyze simulation trajectories and then construct a kinetic transition network of metastable states. In this work, we further present a top-down implementation of TM to systematically detect complicated features of conformational space. We first look at longer MD trajectory pieces to get a coarse picture of the transition network at larger time scales, and then gradually cut the trajectory pieces into shorter ones for more detail. A robust clustering algorithm is designed to more effectively identify the metastable states and transition events. We applied this TM method to detect the hierarchical structure in the conformational space of alanine-dodeca-peptide from microsecond to nanosecond time scales. The results show a downhill folding process of the peptide through multiple pathways. Even in this simple system, we found that a single commonly used order parameter is not sufficient either for distinguishing the metastable states or for predicting the transition kinetics among these states.
10
Therrien E, Weill N, Tomberg A, Corbeil CR, Lee D, Moitessier N. Docking Ligands into Flexible and Solvated Macromolecules. 7. Impact of Protein Flexibility and Water Molecules on Docking-Based Virtual Screening Accuracy. J Chem Inf Model 2014; 54:3198-210. [DOI: 10.1021/ci500299h] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 12/18/2022]
Affiliation(s)
- Eric Therrien
  - Department of Chemistry, McGill University, 801 Sherbrooke Street W., Montréal, Québec, Canada H3A 0B8
- Nathanael Weill
  - Department of Chemistry, McGill University, 801 Sherbrooke Street W., Montréal, Québec, Canada H3A 0B8
- Anna Tomberg
  - Department of Chemistry, McGill University, 801 Sherbrooke Street W., Montréal, Québec, Canada H3A 0B8
- Christopher R. Corbeil
  - Department of Chemistry, McGill University, 801 Sherbrooke Street W., Montréal, Québec, Canada H3A 0B8
- Devin Lee
  - Department of Chemistry, McGill University, 801 Sherbrooke Street W., Montréal, Québec, Canada H3A 0B8
- Nicolas Moitessier
  - Department of Chemistry, McGill University, 801 Sherbrooke Street W., Montréal, Québec, Canada H3A 0B8
11
Hollingsworth SA, Lewis MC, Berkholz DS, Wong WK, Karplus PA. (φ,ψ)₂ motifs: a purely conformation-based fine-grained enumeration of protein parts at the two-residue level. J Mol Biol 2011; 416:78-93. [PMID: 22198294 DOI: 10.1016/j.jmb.2011.12.022] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 10/21/2011] [Revised: 12/05/2011] [Accepted: 12/09/2011] [Indexed: 10/14/2022]
Abstract
A deep understanding of protein structure benefits from the use of a variety of classification strategies that enhance our ability to effectively describe local patterns of conformation. Here, we use a clustering algorithm to analyze 76,533 all-trans segments from protein structures solved at 1.2 Å resolution or better to create a purely φ,ψ-based comprehensive empirical categorization of common conformations adopted by two adjacent φ,ψ pairs (i.e., (φ,ψ)₂ motifs). The clustering algorithm works in an origin-shifted four-dimensional space based on the two φ,ψ pairs to yield a parameter-dependent list of (φ,ψ)₂ motifs, in order of their prominence. The results are remarkably distinct from and complementary to the standard hydrogen-bond-centered view of secondary structure. New insights include an unprecedented level of precision in describing the φ,ψ angles of both previously known and novel motifs, ordering of these motifs by their population density, a data-driven recommendation that the standard Cα(i)…Cα(i+3) < 7 Å criterion for defining turns be changed to 6.5 Å, identification of β-strand and turn capping motifs, and identification of conformational capping by residues in polypeptide II conformation. We further document that the conformational preferences of a residue are substantially influenced by the conformation of its neighbors, and we suggest that accounting for these dependencies will improve protein modeling accuracy. Although the CUEVAS-4D(r₁₀ε₁₄) 'parts list' presented here is only an initial exploration of the complex (φ,ψ)₂ landscape of proteins, it shows that there is value to be had from this approach, and it opens the door to more in-depth characterizations at the (φ,ψ)₂ level and at higher dimensions.
Affiliation(s)
- Scott A Hollingsworth
  - Department of Biochemistry and Biophysics, Oregon State University, Corvallis, OR 97331, USA
12
Maps of protein structure space reveal a fundamental relationship between protein structure and function. Proc Natl Acad Sci U S A 2011; 108:12301-6. [PMID: 21737750 DOI: 10.1073/pnas.1102727108] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.9] [Indexed: 11/18/2022] Open
Abstract
To study the protein structure-function relationship, we propose a method to efficiently create three-dimensional maps of structure space using a very large dataset of > 30,000 Structural Classification of Proteins (SCOP) domains. In our maps, each domain is represented by a point, and the distance between any two points approximates the structural distance between their corresponding domains. We use these maps to study the spatial distributions of properties of proteins, and in particular those of local vicinities in structure space such as structural density and functional diversity. These maps provide a unique broad view of protein space and thus reveal previously undescribed fundamental properties thereof. At the same time, the maps are consistent with previous knowledge (e.g., domains cluster by their SCOP class) and organize in a unified, coherent representation previous observation concerning specific protein folds. To investigate the function-structure relationship, we measure the functional diversity (using the Gene Ontology controlled vocabulary) in local structural vicinities. Our most striking finding is that functional diversity varies considerably across structure space: The space has a highly diverse region, and diversity abates when moving away from it. Interestingly, the domains in this region are mostly alpha/beta structures, which are known to be the most ancient proteins. We believe that our unique perspective of structure space will open previously undescribed ways of studying proteins, their evolution, and the relationship between their structure and function.
13
From the Cover: Simplifying the representation of complex free-energy landscapes using sketch-map. Proc Natl Acad Sci U S A 2011; 108:13023-8. [PMID: 21730167 DOI: 10.1073/pnas.1108486108] [Citation(s) in RCA: 201] [Impact Index Per Article: 15.5] [Indexed: 11/18/2022] Open
Abstract
A new scheme, sketch-map, for obtaining a low-dimensional representation of the region of phase space explored during an enhanced dynamics simulation is proposed. We show evidence, from an examination of the distribution of pairwise distances between frames, that some features of the free-energy surface are inherently high-dimensional. This makes dimensionality reduction problematic because the data does not satisfy the assumptions made in conventional manifold learning algorithms. We therefore propose that when dimensionality reduction is performed on trajectory data one should think of the resultant embedding as a quickly sketched set of directions rather than a road map. In other words, the embedding tells one about the connectivity between states but does not provide the vectors that correspond to the slow degrees of freedom. This realization informs the development of sketch-map, which endeavors to reproduce the proximity information from the high-dimensionality description in a space of lower dimensionality even when a faithful embedding is not possible.
14
Mooney C, Davey N, Martin AJM, Walsh I, Shields DC, Pollastri G. In silico protein motif discovery and structural analysis. Methods Mol Biol 2011; 760:341-53. [PMID: 21780007 DOI: 10.1007/978-1-61779-176-5_21] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/14/2022]
Abstract
A wealth of in silico tools is available for protein motif discovery and structural analysis. The aim of this chapter is to collect some of the most common and useful tools and to guide the biologist in their use. A detailed explanation is provided for the use of Distill, a suite of web servers for the prediction of protein structural features and the prediction of full-atom 3D models from a protein sequence. Besides this, we also provide pointers to many other tools available for motif discovery and secondary and tertiary structure prediction from a primary amino acid sequence. The prediction of protein intrinsic disorder and the prediction of functional sites and SLiMs are also briefly discussed. Given that user queries vary greatly in size, scope and character, the trade-offs in speed, accuracy and scale need to be considered when choosing which methods to adopt.
Affiliation(s)
- Catherine Mooney
  - Complex and Adaptive Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland.
15
Perskie LL, Rose GD. Physical-chemical determinants of coil conformations in globular proteins. Protein Sci 2010; 19:1127-36. [PMID: 20512968 DOI: 10.1002/pro.399] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]
Abstract
We present a method with the potential to generate a library of coil segments from first principles. Proteins are built from alpha-helices and/or beta-strands interconnected by these coil segments. Here, we investigate the conformational determinants of short coil segments, with particular emphasis on chain turns. Toward this goal, we extracted a comprehensive set of two-, three-, and four-residue turns from X-ray-elucidated proteins and classified them by conformation. A remarkably small number of unique conformers account for most of this experimentally determined set, whereas remaining members span a large number of rare conformers, many occurring only once in the entire protein database. Factors determining conformation were identified via Metropolis Monte Carlo simulations devised to test the effectiveness of various energy terms. Simulated structures were validated by comparison to experimental counterparts. After filtering rare conformers, we found that 98% of the remaining experimentally determined turn population could be reproduced by applying a hydrogen bond energy term to an exhaustively generated ensemble of clash-free conformers in which no backbone polar group lacks a hydrogen-bond partner. Further, at least 90% of longer coil segments, ranging from 5 to 20 residues, were found to be structural composites of these shorter primitives. These results are pertinent to protein structure prediction, where approaches can be divided into either empirical or ab initio methods. Empirical methods use database-derived information; ab initio methods rely on physical-chemical principles exclusively. Replacing the database-derived coil library with one generated from first principles would transform any empirically based method into its corresponding ab initio homologue.
Affiliation(s)
- Lauren L Perskie
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
16
Gong L, Zhou X. Kinetic Transition Network Based on Trajectory Mapping. J Phys Chem B 2010; 114:10266-76. [DOI: 10.1021/jp100737g] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 11/28/2022]
Affiliation(s)
- Linchen Gong
  - Asia Pacific Center for Theoretical Physics, Pohang, Gyeongbuk 790-784, Korea, Institute for Advanced Study, Tsinghua University, Beijing 100080, China, and Department of Physics, Pohang University of Science and Technology, Pohang, Gyeongbuk 790-784, Korea
- Xin Zhou
  - Asia Pacific Center for Theoretical Physics, Pohang, Gyeongbuk 790-784, Korea, Institute for Advanced Study, Tsinghua University, Beijing 100080, China, and Department of Physics, Pohang University of Science and Technology, Pohang, Gyeongbuk 790-784, Korea
17
Manson A, Whitten ST, Ferreon JC, Fox RO, Hilser VJ. Characterizing the role of ensemble modulation in mutation-induced changes in binding affinity. J Am Chem Soc 2009; 131:6785-93. [PMID: 19397330 PMCID: PMC2711448 DOI: 10.1021/ja809133u] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 01/15/2023]
Abstract
Protein conformational fluctuations are key contributors to biological function, mediating important processes such as enzyme catalysis, molecular recognition, and allosteric signaling. To better understand the role of conformational fluctuations in substrate/ligand recognition, we analyzed, experimentally and computationally, the binding reaction between an SH3 domain and the recognition peptide of its partner protein. The fluctuations in this SH3 domain were enumerated by using an algorithm based on the hard sphere collision model, and the binding energetics resulting from these fluctuations were calculated using a structure-based energy function parametrized to solvent accessible surface areas. Surprisingly, this simple model reproduced the effects of mutations on the experimentally determined SH3 binding energetics, within the uncertainties of the measurements, indicating that conformational fluctuations in SH3, and in particular the RT loop region, are structurally diverse and are well-approximated by the randomly configured states. The mutated positions in SH3 were distant to the binding site and involved Ala and Gly substitutions of solvent exposed positions in the RT loop. To characterize these fluctuations, we applied principal coordinate analysis to the computed ensembles, uncovering the principal modes of conformational variation. It is shown that the observed differences in binding affinity between each mutant, and thus the apparent coupling between the mutated sites, can be described in terms of the changes in these principal modes. These results indicate that dynamic loops in proteins can populate a broad conformational ensemble and that a quantitative understanding of molecular recognition requires consideration of the entire distribution of states.
Affiliation(s)
- Anthony Manson
- Department of Biochemistry and Molecular Biology, and Sealy Center for Structural Biology and Biophysics, University of Texas Medical Branch, Galveston, TX 77555, USA
- Steven T Whitten
- Department of Biochemistry and Molecular Biology, and Sealy Center for Structural Biology and Biophysics, University of Texas Medical Branch, Galveston, TX 77555, USA
- RedStorm Scientific, Inc., Galveston, TX 77550, USA
- Josephine C. Ferreon
- Department of Biochemistry and Molecular Biology, and Sealy Center for Structural Biology and Biophysics, University of Texas Medical Branch, Galveston, TX 77555, USA
- Robert O Fox
- Department of Biochemistry and Molecular Biology, and Sealy Center for Structural Biology and Biophysics, University of Texas Medical Branch, Galveston, TX 77555, USA
- Vincent J Hilser
- Department of Biochemistry and Molecular Biology, and Sealy Center for Structural Biology and Biophysics, University of Texas Medical Branch, Galveston, TX 77555, USA
|
18
|
Mooney C, Pollastri G. Beyond the Twilight Zone: Automated prediction of structural properties of proteins by recursive neural networks and remote homology information. Proteins 2009; 77:181-90. [DOI: 10.1002/prot.22429] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Indexed: 11/06/2022]
|
19
|
Perskie LL, Street TO, Rose GD. Structures, basins, and energies: a deconstruction of the Protein Coil Library. Protein Sci 2008; 17:1151-61. [PMID: 18434497 DOI: 10.1110/ps.035055.108] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Indexed: 10/22/2022]
Abstract
Globular proteins adopt complex folds, composed of organized assemblies of alpha-helix and beta-sheet together with irregular regions that interconnect these scaffold elements. Here, we seek to parse the irregular regions into their structural constituents and to rationalize their formative energetics. Toward this end, we dissected the Protein Coil Library, a structural database of protein segments that are neither alpha-helix nor beta-strand, extracted from high-resolution protein structures. The backbone dihedral angles of residues from coil library segments are distributed indiscriminately across the phi,psi map, but when contoured, seven distinct basins emerge clearly. The structures and energetics associated with the two least-studied basins are the primary focus of this article. Specifically, the structural motifs associated with these basins were characterized in detail and then assessed in simple simulations designed to capture their energetic determinants. It is found that conformational constraints imposed by excluded volume and hydrogen bonding are sufficient to reproduce the observed phi,psi distributions of these motifs; no additional energy terms are required. These three motifs in conjunction with alpha-helices, strands of beta-sheet, canonical beta-turns, and polyproline II conformers comprise approximately 90% of all protein structure.
Affiliation(s)
- Lauren L Perskie
- TC Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
|
20
|
Shao J, Tanner SW, Thompson N, Cheatham TE. Clustering Molecular Dynamics Trajectories: 1. Characterizing the Performance of Different Clustering Algorithms. J Chem Theory Comput 2007; 3:2312-34. [DOI: 10.1021/ct700119m] [Citation(s) in RCA: 614] [Impact Index Per Article: 36.1] [Indexed: 11/29/2022]
Affiliation(s)
- Jianyin Shao
- Departments of Medicinal Chemistry, Pharmaceutics and Pharmaceutical Chemistry, and Bioengineering, College of Pharmacy, University of Utah, 2000 East 30 South, Skaggs Hall 201, Salt Lake City, Utah 84112
- Stephen W. Tanner
- Departments of Medicinal Chemistry, Pharmaceutics and Pharmaceutical Chemistry, and Bioengineering, College of Pharmacy, University of Utah, 2000 East 30 South, Skaggs Hall 201, Salt Lake City, Utah 84112
- Nephi Thompson
- Departments of Medicinal Chemistry, Pharmaceutics and Pharmaceutical Chemistry, and Bioengineering, College of Pharmacy, University of Utah, 2000 East 30 South, Skaggs Hall 201, Salt Lake City, Utah 84112
- Thomas E. Cheatham
- Departments of Medicinal Chemistry, Pharmaceutics and Pharmaceutical Chemistry, and Bioengineering, College of Pharmacy, University of Utah, 2000 East 30 South, Skaggs Hall 201, Salt Lake City, Utah 84112
|
21
|
Altis A, Nguyen PH, Hegger R, Stock G. Dihedral angle principal component analysis of molecular dynamics simulations. J Chem Phys 2007; 126:244111. [PMID: 17614541 DOI: 10.1063/1.2746330] [Citation(s) in RCA: 227] [Impact Index Per Article: 13.4] [Indexed: 11/14/2022] Open
Abstract
It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {phi(n)} to the metric coordinate space {x(n)=cos phi(n),y(n)=sin phi(n)} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300 ns molecular dynamics simulation, a critical comparison of the various methods is given.
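The transformation described in this abstract, mapping each dihedral angle phi(n) to the pair (cos phi(n), sin phi(n)) before a principal component analysis, can be illustrated with a minimal sketch. The function name `dihedral_pca`, the use of NumPy, and the input layout (frames in rows, angles in columns, radians) are assumptions made for illustration and are not taken from the paper; the complex variant of dPCA mentioned in the abstract is not covered.

```python
import numpy as np


def dihedral_pca(angles_rad):
    """Illustrative real-valued dPCA on an (n_frames, n_angles) array in radians."""
    angles_rad = np.asarray(angles_rad, dtype=float)
    # Map every angle to a point on the unit circle so that periodicity
    # (e.g. -pi and +pi being the same conformation) is respected.
    features = np.concatenate([np.cos(angles_rad), np.sin(angles_rad)], axis=1)
    # Centre the 2N features and build their covariance matrix.
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Project each frame onto the principal components.
    projections = centered @ eigvecs
    return eigvals, eigvecs, projections
```

In this real-valued form, N angles yield 2N features and therefore 2N eigenvalues, which is the redundancy that the complex version introduced in the paper is said to avoid.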
Affiliation(s)
- Alexandros Altis
- Institute of Physical and Theoretical Chemistry, J. W. Goethe University, Max-von-Laue-Strasse 7, D-60438 Frankfurt, Germany
|
22
|
Morozov AN, Lin SH. Modeling of folding and unfolding mechanisms in alanine-based alpha-helical polypeptides. J Phys Chem B 2007; 110:20555-61. [PMID: 17034243 DOI: 10.1021/jp061781e] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
alpha-Helix formation is known to be opposed by the entropy loss due to the folding and favored by the energy of molecular interactions. However, the underlying mechanism of these factors is still being discussed. Here we have used the experimental and calculation data for short alanine-based peptides embedded in water to model the mechanism of helix folding and unfolding and to calculate microscopically the free energy factors of alanine in the frame of helix-coil conformational integrals. Classical helix-coil transition theories take into account the interactions in a peptide chain only if the i, i + 3 peptide bond participates in hydrogen bonding. But quantum mechanical calculations showed that interactions of the i, i + 2 peptide bond play an important role in helix folding too. We also included the short-range repulsive interactions due to molecular steric clashes and the end effects due to polar/hydrogen-bonding interactions at the N and C termini. The helix and coil regions of peptide conformational space were defined using an experimental steric criterion for hydrogen bonding. Arginine helix propensity was discussed and estimated. Monte Carlo numerical simulations of thermodynamics and kinetics for the 21 amino acid alpha-helical polypeptide Ac-A5(AAARA)3A-NMe were carried out and found to be in agreement with the experimental results.
Affiliation(s)
- Alexander N Morozov
- Institute of Atomic and Molecular Sciences, Academia Sinica, PO Box 23-166, Taipei, Taiwan, Republic of China.
|
23
|
Schlag EW, Sheu SY, Yang DY, Selzle HL, Lin SH. Distal charge transport in peptides. Angew Chem Int Ed Engl 2007; 46:3196-210. [PMID: 17372995 DOI: 10.1002/anie.200601623] [Citation(s) in RCA: 88] [Impact Index Per Article: 5.2] [Indexed: 11/12/2022]
Abstract
Biological systems often transport charges and reactive processes over substantial distances. Traditional models of chemical kinetics generally do not describe such extreme distal processes. In this Review, an atomistic model for a distal transport of information, which was specifically developed for peptides, is considered. Chemical reactivity is taken as the result of distal effects based on two-step bifunctional kinetics involving unique, very rapid motional properties of peptides in the subpicosecond regime. The bifunctional model suggests highly efficient transport of charge and reactivity in an isolated peptide over a substantial distance; conversely, a very low efficiency in a water environment was found. The model suggests ultrafast transport of charge and reactivity over substantial molecular distances in a peptide environment. Many such domains can be active in a protein.
Affiliation(s)
- Edward W Schlag
- Institut für Physikalische und Theoretische Chemie, Technische Universität München, Lichtenbergstrasse 4, 85748 Garching, Germany.
|
24
|
Schlag E, Sheu SY, Yang DY, Selzle H, Lin S. Distaler Ladungstransport in Peptiden. Angew Chem Int Ed Engl 2007. [DOI: 10.1002/ange.200601623] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022]
|
25
|
Nguyen PH. Conformational states and folding pathways of peptides revealed by principal-independent component analyses. Proteins 2007; 67:579-92. [PMID: 17348012 DOI: 10.1002/prot.21317] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
Principal component analysis is a powerful method for projecting multidimensional conformational space of peptides or proteins onto lower dimensional subspaces in which the main conformations are present, making it easier to reveal the structures of molecules from e.g. molecular dynamics simulation trajectories. However, the identification of all conformational states is still difficult if the subspaces consist of more than two dimensions. This is mainly due to the fact that the principal components are not independent of each other, and states in the subspaces cannot be visualized. In this work, we propose a simple and fast scheme that allows one to obtain all conformational states in the subspaces. The basic idea is that instead of directly identifying the states in the subspace spanned by principal components, we first transform this subspace into another subspace formed by components that are independent of one another. These independent components are obtained from the principal components by employing the independent component analysis method. Because of independence between components, all states in this new subspace are defined as all possible combinations of the states obtained from each single independent component. This makes the conformational analysis much simpler. We test the performance of the method by analyzing the conformations of the glycine tripeptide and the alanine hexapeptide. The analyses show that our method is simple and quickly reveals all conformational states in the subspaces. The folding pathways between the identified states of the alanine hexapeptide are analyzed and discussed in some detail.
Affiliation(s)
- Phuong H Nguyen
- Institute of Physical and Theoretical Chemistry, J. W. Goethe University, Max-von-Laue-Str. 7, D-60438 Frankfurt, Germany.
|
26
|
Mooney C, Vullo A, Pollastri G. Protein structural motif prediction in multidimensional phi-psi space leads to improved secondary structure prediction. J Comput Biol 2007; 13:1489-502. [PMID: 17061924 DOI: 10.1089/cmb.2006.13.1489] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Indexed: 11/12/2022] Open
Abstract
A significant step towards establishing the structure and function of a protein is the prediction of the local conformation of the polypeptide chain. In this article, we present systems for the prediction of three new alphabets of local structural motifs. The motifs are built by applying multidimensional scaling (MDS) and clustering to pair-wise angular distances for multiple phi-psi angle values collected from high-resolution protein structures. The predictive systems, based on ensembles of bidirectional recurrent neural network architectures, and trained on a large non-redundant set of protein structures, achieve 72%, 66%, and 60% correct motif prediction on an independent test set for di-peptides (six classes), tri-peptides (eight classes) and tetra-peptides (14 classes), respectively, 28-30% above baseline statistical predictors. We then build a further system, based on ensembles of two-layered bidirectional recurrent neural networks, to map structural motif predictions into a traditional 3-class (helix, strand, coil) secondary structure. This system achieves 79.5% correct prediction using the "hard" CASP 3-class assignment, and 81.4% with a more lenient assignment, outperforming a sophisticated state-of-the-art predictor (Porter) trained in the same experimental conditions. The structural motif predictor is publicly available at: http://distill.ucd.ie/porter+/.
Affiliation(s)
- Catherine Mooney
- School of Computer Science and Informatics, University College Dublin, Belfield, Dublin, Ireland
|
27
|
Baú D, Martin AJM, Mooney C, Vullo A, Walsh I, Pollastri G. Distill: a suite of web servers for the prediction of one-, two- and three-dimensional structural features of proteins. BMC Bioinformatics 2006; 7:402. [PMID: 16953874 PMCID: PMC1574355 DOI: 10.1186/1471-2105-7-402] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.3] [Received: 07/06/2006] [Accepted: 09/05/2006] [Indexed: 04/13/2023] Open
Abstract
BACKGROUND We describe Distill, a suite of servers for the prediction of protein structural features: secondary structure; relative solvent accessibility; contact density; backbone structural motifs; residue contact maps at 6, 8 and 12 Angstrom; coarse protein topology. The servers are based on large-scale ensembles of recursive neural networks and trained on large, up-to-date, non-redundant subsets of the Protein Data Bank. Together with structural feature predictions, Distill includes a server for prediction of Calpha traces for short proteins (up to 200 amino acids). RESULTS The servers are state-of-the-art, with secondary structure predicted correctly for nearly 80% of residues (currently the top performance on EVA), 2-class solvent accessibility nearly 80% correct, and contact maps exceeding 50% precision on the top non-diagonal contacts. A preliminary implementation of the predictor of protein Calpha traces featured among the top 20 Novel Fold predictors at the last CASP6 experiment as group Distill (ID 0348). The majority of the servers, including the Calpha trace predictor, now take into account homology information from the PDB, when available, resulting in greatly improved reliability. CONCLUSION All predictions are freely available through a simple joint web interface and the results are returned by email. In a single submission the user can send protein sequences for a total of up to 32k residues to all or a selection of the servers. Distill is accessible at the address: http://distill.ucd.ie/distill/.
Affiliation(s)
- Davide Baú
- School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland
- Alberto JM Martin
- School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland
- Catherine Mooney
- School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland
- Alessandro Vullo
- School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland
- Ian Walsh
- School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland
- Gianluca Pollastri
- School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland
|
28
|
Abstract
The local structures of protein segments were classified and their distribution was analyzed to explore the structural diversity of proteins. Representative proteins were divided into short segments using a sliding L-residue window. Each set of local structures consisting of consecutive 1-31 amino acids was classified using a single-pass clustering method. The results demonstrate that the local structures of proteins are very unevenly distributed in the protein universe. The distribution of local structures of relatively long segments shows a power-law behavior that is formulated well by Zipf's law, implying that a protein structure possesses recursive and fractal characteristics. The degree of effective conformational freedom per residue as well as the structure entropy per residue decreases gradually with an increasing value of L and then converges to constant values. This suggests that the number of protein conformations resides within the range between 1.2^L and 1.5^L and that 10- to 20-residue segments are already proteinlike in terms of their structural diversity.
Affiliation(s)
- Yoshito Sawada
- National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan
|
29
|
Sims GE, Kim SH. A method for evaluating the structural quality of protein models by using higher-order phi-psi pairs scoring. Proc Natl Acad Sci U S A 2006; 103:4428-32. [PMID: 16537409 PMCID: PMC1401231 DOI: 10.1073/pnas.0511333103] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022] Open
Abstract
A method is presented for scoring the model quality of experimental and theoretical protein structures. The structural model to be evaluated is dissected into small fragments via a sliding window, where each fragment is represented by a vector of multiple phi-psi angles. The sliding window ranges in size from a length of 1-10 phi-psi pairs (3-12 residues). In this method, the conformation of each fragment is scored based on the fit of multiple phi-psi angles of the fragment to a database of multiple phi-psi angles from high-resolution x-ray crystal structures. We show that measuring the fit of predicted structural models to the allowed conformational space of longer fragments is a significant discriminator for model quality. Reasonable models have higher-order phi-psi score fit values (m) > -1.00.
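The fragment-extraction step in this abstract, a sliding window that turns a chain of phi-psi pairs into vectors of multiple phi-psi angles, can be sketched as follows. The function name `phi_psi_fragments`, the degree units, and the example angle values are hypothetical; the scoring against a database of high-resolution structures is not specified in the abstract and is deliberately left out.

```python
import numpy as np


def phi_psi_fragments(phi_psi, n_pairs):
    """Return one flattened vector of 2 * n_pairs angles per window position."""
    phi_psi = np.asarray(phi_psi, dtype=float)
    if phi_psi.ndim != 2 or phi_psi.shape[1] != 2:
        raise ValueError("phi_psi must have shape (n_residues, 2)")
    if n_pairs < 1:
        raise ValueError("n_pairs must be at least 1")
    n_windows = phi_psi.shape[0] - n_pairs + 1
    if n_windows < 1:
        # Chain shorter than the window: no fragments can be formed.
        return np.empty((0, 2 * n_pairs))
    # Each window of consecutive (phi, psi) pairs is flattened in chain order:
    # [phi_i, psi_i, phi_i+1, psi_i+1, ...].
    return np.stack([phi_psi[i:i + n_pairs].ravel() for i in range(n_windows)])


# Hypothetical four-residue stretch, angles in degrees.
chain = [(-60.0, -45.0), (-120.0, 130.0), (-65.0, -40.0), (60.0, 45.0)]
fragments = phi_psi_fragments(chain, n_pairs=2)
```

Following the abstract, `n_pairs` would range from 1 to 10, giving one set of fragment vectors per window size.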
Affiliation(s)
- Gregory E. Sims
- Berkeley Structural Genomics Center, Lawrence Berkeley National Laboratory, Berkeley, CA 94720
- Sung-Hou Kim
- Berkeley Structural Genomics Center, Lawrence Berkeley National Laboratory, Berkeley, CA 94720
- Department of Chemistry, University of California, Berkeley, CA 94720
- To whom correspondence should be addressed.
|
30
|
Tendulkar AV, Sohoni MA, Ogunnaike B, Wangikar PP. A geometric invariant-based framework for the analysis of protein conformational space. Bioinformatics 2005; 21:3622-8. [PMID: 16096349 DOI: 10.1093/bioinformatics/bti621] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Characterization of the restricted nature of the protein local conformational space has remained a challenge, thereby necessitating a computationally expensive conformational search in protein modeling. Moreover, owing to the lack of unilateral structural descriptors, conventional data mining techniques, such as clustering and classification, have not been applied in protein structure analysis. RESULTS We first map the local conformations in a fixed dimensional space by using a carefully selected suite of geometric invariants (GIs) and then reduce the number of dimensions via principal component analysis (PCA). Distribution of the conformations in the space spanned by the first four PCs is visualized as a set of conditional bivariate probability distribution plots, where the peaks correspond to the preferred conformations. The locations of the different canonical structures in the PC-space have been interpreted in the context of the weights of the GIs to the first four PCs. Clustering of the available conformations reveals that the number of preferred local conformations is several orders of magnitude smaller than that suggested previously. SUPPLEMENTARY INFORMATION www.it.iitb.ac.in/~ashish/bioinfo2005/.
Affiliation(s)
- Ashish V Tendulkar
- Kanwal Rekhi School of Information Technology, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
|