1
|
Harihar B, Saravanan KM, Gromiha MM, Selvaraj S. Importance of Inter-residue Contacts for Understanding Protein Folding and Unfolding Rates, Remote Homology, and Drug Design. Mol Biotechnol 2024:10.1007/s12033-024-01119-4. [PMID: 38498284 DOI: 10.1007/s12033-024-01119-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Accepted: 02/10/2024] [Indexed: 03/20/2024]
Abstract
Inter-residue interactions in protein structures provide valuable insights into protein folding and stability. Understanding these interactions can be helpful in many crucial applications, including rational design of therapeutic small molecules and biologics, locating functional protein sites, and predicting protein-protein and protein-ligand interactions. The process of developing machine learning models incorporating inter-residue interactions has been improved recently. This review highlights the theoretical models incorporating inter-residue interactions in predicting folding and unfolding rates of proteins. Utilizing contact maps to depict inter-residue interactions aids researchers in developing computer models for detecting remote homologs and interface residues within protein-protein complexes which, in turn, enhances our knowledge of the relationship between sequence and structure of proteins. Further, the application of contact maps derived from inter-residue interactions is highlighted in the field of drug discovery. Overall, this review presents an extensive assessment of the significant models that use inter-residue interactions to investigate folding rates, unfolding rates, remote homology, and drug development, providing potential future advancements in constructing efficient computational models in structural biology.
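The contact-map idea described in this abstract can be illustrated with a short sketch. The 8 Å cutoff on C-alpha distances, the 12-residue sequence separation for "long-range" contacts, and the toy coordinates are all assumptions chosen for illustration; they are not taken from the review.

```python
import math

def contact_map(coords, cutoff=8.0, min_separation=1):
    """Return a symmetric 0/1 matrix marking residue pairs within `cutoff` angstroms.

    coords: list of (x, y, z) tuples, one per residue (e.g. C-alpha atoms).
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

def long_range_contacts(cmap, min_separation=12):
    """Count contacts between residues at least `min_separation` apart in sequence."""
    n = len(cmap)
    return sum(cmap[i][j] for i in range(n) for j in range(i + min_separation, n))

# Hypothetical five-residue trace that folds back on itself (angstroms).
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
              (7.6, 3.8, 0.0), (3.8, 3.8, 0.0)]
toy_map = contact_map(toy_coords)
```

Quantities derived from such a map, for example the number of long-range contacts, are the kind of structural descriptor that folding-rate models of this type take as input.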
Affiliation(s)
- Balasubramanian Harihar
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Konda Mani Saravanan
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
  - Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai, Tamil Nadu, 600073, India
- Michael M Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Samuel Selvaraj
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
|
2
|
Tripathi N, Saraf P, Bhardwaj N, Shrivastava SK, Jain SK. Identifying inflammation-related targets of natural lactones using network pharmacology, molecular modeling and in vitro approaches. J Biomol Struct Dyn 2024:1-16. [PMID: 38334283 DOI: 10.1080/07391102.2024.2310783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2023] [Accepted: 01/20/2024] [Indexed: 02/10/2024]
Abstract
Natural lactones have been used in traditional and folklore medicine for centuries owing to their anti-inflammatory properties. The study uses a multifaceted approach to identify lead anti-inflammatory lactones from the SISTEMATX natural products database. Analysis of the natural lactone database revealed 18 lactones linked to inflammation targets. The primary targets were PTGES, PTGS1, COX-2, ALOX5 and IL1B. STX 12273 was the best hit, with the lowest binding energy and potential for inhibiting the COX-2 enzyme. The study proposes the natural lactone STX 12273 from the SISTEMATX database as an anti-inflammatory candidate and postulates its use for inflammation treatment or prevention. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Nancy Tripathi
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (BHU), Varanasi, India
- Poorvi Saraf
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (BHU), Varanasi, India
- Nivedita Bhardwaj
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (BHU), Varanasi, India
- Sushant Kumar Shrivastava
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (BHU), Varanasi, India
- Shreyans K Jain
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (BHU), Varanasi, India
|
3
|
Plé T, Lagardère L, Piquemal JP. Force-field-enhanced neural network interactions: from local equivariant embedding to atom-in-molecule properties and long-range effects. Chem Sci 2023; 14:12554-12569. [PMID: 38020379 PMCID: PMC10646944 DOI: 10.1039/d3sc02581k] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/22/2023] [Accepted: 10/03/2023] [Indexed: 12/01/2023] Open
Abstract
We introduce FENNIX (Force-Field-Enhanced Neural Network InteraXions), a hybrid approach between machine-learning and force-fields. We leverage state-of-the-art equivariant neural networks to predict local energy contributions and multiple atom-in-molecule properties that are then used as geometry-dependent parameters for physically-motivated energy terms which account for long-range electrostatics and dispersion. Using high-accuracy ab initio data (small organic molecules/dimers), we trained a first version of the model. Exhibiting accurate gas-phase energy predictions, FENNIX is transferable to the condensed phase. It is able to produce stable Molecular Dynamics simulations, including nuclear quantum effects, for water predicting accurate liquid properties. The extrapolating power of the hybrid physically-driven machine learning FENNIX approach is exemplified by computing: (i) the solvated alanine dipeptide free energy landscape; (ii) the reactive dissociation of small molecules.
Affiliation(s)
- Thomas Plé
  - Sorbonne Université, LCT, UMR 7616 CNRS, F-75005 Paris, France (thomas.ple@sorbonne-université)
- Louis Lagardère
  - Sorbonne Université, LCT, UMR 7616 CNRS, F-75005 Paris, France (louis.lagardere@sorbonne-université)
- Jean-Philip Piquemal
  - Sorbonne Université, LCT, UMR 7616 CNRS, F-75005 Paris, France (jean-philip.piquemal@sorbonne-université)
|
4
|
Soleymani F, Paquet E, Viktor H, Michalowski W, Spinello D. Protein-protein interaction prediction with deep learning: A comprehensive review. Comput Struct Biotechnol J 2022; 20:5316-5341. [PMID: 36212542 PMCID: PMC9520216 DOI: 10.1016/j.csbj.2022.08.070] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 06/22/2022] [Revised: 08/29/2022] [Accepted: 08/30/2022] [Indexed: 11/15/2022] Open
Abstract
Most proteins perform their biological function by interacting with themselves or other molecules. Thus, one may obtain biological insights into protein functions, disease prevalence, and therapy development by identifying protein-protein interactions (PPI). However, finding the interacting and non-interacting protein pairs through experimental approaches is labour-intensive and time-consuming, owing to the variety of proteins. Hence, protein-protein interaction and protein-ligand binding problems have drawn attention in the fields of bioinformatics and computer-aided drug discovery. Deep learning methods paved the way for scientists to predict the 3-D structure of proteins from genomes, predict the functions and attributes of a protein, and modify and design new proteins to provide desired functions. This review focuses on recent deep learning methods applied to problems including predicting protein functions, protein-protein interaction and their sites, protein-ligand binding, and protein design.
Affiliation(s)
- Farzan Soleymani
  - Department of Mechanical Engineering, University of Ottawa, Ottawa, ON, Canada
- Eric Paquet
  - National Research Council, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada
- Herna Viktor
  - School of Electrical Engineering and Computer Science, University of Ottawa, ON, Canada
- Davide Spinello
  - Department of Mechanical Engineering, University of Ottawa, Ottawa, ON, Canada
|
5
|
rsRNASP: A residue-separation-based statistical potential for RNA 3D structure evaluation. Biophys J 2022; 121:142-156. [PMID: 34798137 PMCID: PMC8758408 DOI: 10.1016/j.bpj.2021.11.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/20/2021] [Revised: 10/23/2021] [Accepted: 11/10/2021] [Indexed: 01/07/2023] Open
Abstract
Knowledge-based statistical potentials have been shown to be rather effective in protein 3-dimensional (3D) structure evaluation and prediction. Recently, several statistical potentials have been developed for RNA 3D structure evaluation, while their performances are either still at a low level for the test datasets from structure prediction models or dependent on the "black-box" process through neural networks. In this work, we have developed an all-atom distance-dependent statistical potential based on residue separation for RNA 3D structure evaluation, namely rsRNASP, which is composed of short- and long-ranged potentials distinguished by residue separation. The extensive examinations against available RNA test datasets show that rsRNASP has apparently higher performance than the existing statistical potentials for the realistic test datasets with large RNAs from structure prediction models, including the newly released RNA-Puzzles dataset, and is comparable to the existing top statistical potentials for the test datasets with small RNAs or near-native decoys. In addition, rsRNASP is superior to RNA3DCNN, a recently developed scoring function through 3D convolutional neural networks. rsRNASP and the relevant databases are available to the public.
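The general form of a distance-dependent statistical potential can be sketched with the inverse-Boltzmann relation E = -kT ln(P_obs / P_ref). The sketch below is generic and is not the rsRNASP implementation: the reference state is a simple pooled distance sample, the bin width and kT are arbitrary, and the distances are hypothetical. A residue-separation-based scheme such as the one the abstract describes would build separate tables of this kind for short- and long-ranged pairs.

```python
import math
from collections import Counter

def distance_potential(observed, reference, bin_width=1.0, kT=1.0):
    """Inverse-Boltzmann energy per distance bin: E = -kT * ln(P_obs / P_ref).

    observed:  distances (angstroms) for one atom-pair type in native structures.
    reference: distances drawn from the chosen reference state (an assumption here).
    Bins with no reference counts are skipped rather than given an energy.
    """
    obs = Counter(int(d // bin_width) for d in observed)
    ref = Counter(int(d // bin_width) for d in reference)
    energies = {}
    for b, count in obs.items():
        if ref[b] == 0:
            continue
        p_obs = count / len(observed)
        p_ref = ref[b] / len(reference)
        energies[b] = -kT * math.log(p_obs / p_ref)
    return energies

# Hypothetical samples: the pair type is enriched at 3-4 A relative to the reference.
toy_energies = distance_potential([3.2, 3.5, 3.9, 7.1], [3.1, 7.2, 7.5, 7.9])
```

Bins that are over-represented in native structures get negative (favourable) energies, and under-represented bins get positive ones.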
|
6
|
Haratipour Z, Aldabagh H, Li Y, Greene LH. Network Connectivity, Centrality and Fragmentation in the Greek-Key Protein Topology. Protein J 2020; 38:497-505. [PMID: 31317305 DOI: 10.1007/s10930-019-09850-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Understanding and computationally predicting the protein folding process remains one of the most challenging scientific problems and has uniquely garnered the interdisciplinary efforts of researchers from the biological, chemical, physical and computational disciplines. Previous studies have demonstrated the importance of long-range interactions in guiding the native structure. However, predicting how the native long-range interaction network forms to generate a specific topology from among all other conformations remains unresolved. The present exploratory study aims to identify amino acids and long-range interactions that have the potential to play a key role in building and maintaining the protein topology. Towards this end, network science is applied and developed to analyze the structures of a group of proteins that share a common Greek-key topology but differ in sequence, secondary structure and function. We investigate the idea that the residues with a high betweenness centrality score are potentially significant in maintaining the protein network and in governing the Greek-key topology. This hypothesis is tested by two different computational methods: a fragmentation test and an analysis of diameter impacts. In summary, we find a subset of selected residues in similar geographical positions in all model proteins, which demonstrates the role of these specific residues and regions in governing the Greek-key topology from a network perspective.
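The two network measures named in this abstract can be illustrated on a toy residue graph. The sketch below computes betweenness centrality with Brandes' algorithm for an unweighted, undirected graph and then checks how many connected components remain when one node is removed. The graph, the function names and this particular fragmentation check are assumptions for illustration; the abstract does not specify how the authors built their networks or ran their tests.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm on an unweighted, undirected graph {node: set(neighbours)}."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each undirected pair was counted twice

def components_without(adj, removed):
    """Number of connected components left after deleting node `removed`."""
    seen, count = {removed}, 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
    return count

# Hypothetical network: two triangles of residues joined through node 3.
toy_network = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4},
               4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
scores = betweenness(toy_network)
hub = max(scores, key=scores.get)
```

In this toy graph the bridging node has the highest betweenness, and removing it splits the network in two, which is the kind of behaviour a fragmentation test looks for.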
Affiliation(s)
- Zeinab Haratipour
  - Department of Chemistry and Biochemistry, Old Dominion University, Norfolk, VA, 23529, USA
- Hind Aldabagh
  - Department of Computer Science, Old Dominion University, Norfolk, VA, 23529, USA
- Yaohang Li
  - Department of Computer Science, Old Dominion University, Norfolk, VA, 23529, USA
- Lesley H Greene
  - Department of Chemistry and Biochemistry, Old Dominion University, Norfolk, VA, 23529, USA
|
7
|
Culka M, Rulíšek L. Interplay between Conformational Strain and Intramolecular Interaction in Protein Structures: Which of Them Is Evolutionarily Conserved? J Phys Chem B 2020; 124:3252-3260. [PMID: 32237747 DOI: 10.1021/acs.jpcb.9b11784] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/15/2023]
Abstract
By computing strain energies of peptide fragments within protein structures and their intramolecular interaction energies, we attempt to reveal general biophysical trends behind the secondary structure formation in the context of protein evolution. Our "protein basis set" consisted of 1143 representatives of different folds obtained from the curated SCOPe database, and for each member of the set, the strain and intramolecular energies were calculated on a "rolling tripeptide" basis, employing the DFT-D3/COSMO-RS method for the former and the QM-calibrated force field method (MM) for the latter. The calculated data, strain and interactions, were correlated with the conservation of amino acid residues in secondary structure elements and also with the level of the residue burial within the protein three-dimensional structure. It allowed us to formulate several observations concerning fundamental differences between two main secondary structure motifs: α-helices and β-strands. We have shown that a strong interaction is one of the determining characteristics of the β-sheet formation, at least at the level of tripeptides (and likely penta- or heptapeptides, too), and that the β-strand is a prevailing secondary structure in the strongly-interacting regions of the protein folds conserved by evolution. On the other hand, low strain was neither proven to be an important physicochemical property conserved by evolution nor does it correlate with the propensity for the α-helix and β-strand. Finally, it has been demonstrated that the strong interaction has a certain level of connection with residue burial; however, we demonstrate that these two characteristics should be rather regarded as two complementary factors. These findings represent an important contribution to understanding protein folding from first principles, which is a complementary approach to ongoing efforts to solve the protein folding problem by knowledge-based approaches and machine-learning.
Affiliation(s)
- Martin Culka
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha 6, Czech Republic
- Lubomír Rulíšek
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha 6, Czech Republic
|
8
|
DistAA: Database of amino acid distances in proteins and web application for statistical review of distances. Comput Biol Chem 2019; 83:107130. [PMID: 31593887 DOI: 10.1016/j.compbiolchem.2019.107130] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/25/2018] [Revised: 09/07/2019] [Accepted: 09/17/2019] [Indexed: 11/22/2022]
Abstract
The three-dimensional structure of a protein chain is determined by its amino acid interactions. One approach to the analysis of amino acid interactions relies on geometric distances between amino acid pairs in polypeptide chains. For a detailed analysis of amino acid distances, a database with three types of amino acid distances in a set of chains was created. The web application Distances of Amino Acids has also been developed to enable scientists to explore interactions of amino acids with different properties based on distances stored in the database. The web application calculates and displays descriptive statistics and graphs of amino acid pair distances for selected properties, such as geometric distance threshold, corresponding SCOP class of proteins and secondary structure types. In addition to the analysis of pre-calculated distances stored in the database, the amino acid distances of a single protein with a specified PDB identifier can also be analyzed. The web application is available at http://andromeda.matf.bg.ac.rs/aadis_dynamic/.
|
9
|
Mayol E, Campillo M, Cordomí A, Olivella M. Inter-residue interactions in alpha-helical transmembrane proteins. Bioinformatics 2019; 35:2578-2584. [PMID: 30566615 DOI: 10.1093/bioinformatics/bty978] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/11/2018] [Revised: 10/19/2018] [Accepted: 12/17/2018] [Indexed: 01/23/2023] Open
Abstract
MOTIVATION: The number of available membrane protein structures has markedly increased in recent years and, in parallel, so has the reliability of the methods to detect transmembrane (TM) segments. In the present report, we characterized inter-residue interactions in α-helical membrane proteins using a dataset of 3462 TM helices from 430 proteins. This is by far the largest analysis published to date. RESULTS: Our analysis of residue-residue interactions in TM segments of membrane proteins shows that almost all interactions involve aliphatic residues and Phe. There is a lack of polar-polar, polar-charged and charged-charged interactions except for those between Thr or Ser sidechains and the backbone carbonyl of aliphatic and Phe residues. The results are discussed in the context of the preferences of amino acids to be in the protein core or exposed to the lipid bilayer and to occupy specific positions along the TM segment. Comparison to datasets of β-barrel membrane proteins and of α-helical globular proteins unveils the specific patterns of interactions and residue composition characteristic of α-helical membrane proteins that are the clue to understanding their structure. AVAILABILITY AND IMPLEMENTATION: Results data and datasets used are available at http://lmc.uab.cat/TMalphaDB/interactions.php. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Eduardo Mayol
  - Laboratori de Medicina Computacional, Unitat de Bioestadística, Facultat de Medicina, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Mercedes Campillo
  - Laboratori de Medicina Computacional, Unitat de Bioestadística, Facultat de Medicina, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Arnau Cordomí
  - Laboratori de Medicina Computacional, Unitat de Bioestadística, Facultat de Medicina, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Mireia Olivella
  - Bioinformatics Area, School of International Studies, ESCI-UPF, Barcelona, Spain
  - Bioinformatics and Medical Statistics Group, U Science Tech, Central University of Catalonia, Vic, Barcelona, Spain
|
10
|
Bigman LS, Levy Y. Stability Effects of Protein Mutations: The Role of Long-Range Contacts. J Phys Chem B 2018; 122:11450-11459. [DOI: 10.1021/acs.jpcb.8b07379] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
Affiliation(s)
- Lavi S. Bigman
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot 76100, Israel
- Yaakov Levy
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot 76100, Israel
|
11
|
Kulandaisamy A, Srivastava A, Nagarajan R, Gromiha MM. Dissecting and analyzing key residues in protein-DNA complexes. J Mol Recognit 2017; 31. [DOI: 10.1002/jmr.2692] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 07/13/2017] [Revised: 11/06/2017] [Accepted: 11/06/2017] [Indexed: 02/03/2023]
Affiliation(s)
- A. Kulandaisamy
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences; Indian Institute of Technology Madras; Chennai 600 036 Tamilnadu India
- Ambuj Srivastava
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences; Indian Institute of Technology Madras; Chennai 600 036 Tamilnadu India
- R. Nagarajan
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences; Indian Institute of Technology Madras; Chennai 600 036 Tamilnadu India
- M. Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences; Indian Institute of Technology Madras; Chennai 600 036 Tamilnadu India
|
12
|
Towards designing new nano-scale protein architectures. Essays Biochem 2017; 60:315-324. [PMID: 27903819 DOI: 10.1042/ebc20160018] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 01/21/2016] [Revised: 08/11/2016] [Accepted: 08/18/2016] [Indexed: 11/17/2022]
Abstract
The complexity of designed bionano-scale architectures is rapidly increasing mainly due to the expanding field of DNA-origami technology and accurate protein design approaches. The major advantage offered by polypeptide nanostructures compared with most other polymers resides in their highly programmable complexity. Proteins allow in vivo formation of well-defined structures with a precise spatial arrangement of functional groups, providing extremely versatile nano-scale scaffolds. Extending beyond existing proteins that perform a wide range of functions in biological systems, it became possible in the last few decades to engineer and predict properties of completely novel protein folds, opening the field of protein nanostructure design. This review offers an overview on rational and computational design approaches focusing on the main achievements of novel protein nanostructure design.
|
13
|
Saravanan KM, Selvaraj S. Dihedral angle preferences of amino acid residues forming various non-local interactions in proteins. J Biol Phys 2017; 43:265-278. [PMID: 28577238 PMCID: PMC5471173 DOI: 10.1007/s10867-017-9451-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 07/13/2016] [Accepted: 04/13/2017] [Indexed: 12/22/2022] Open
Abstract
In theory, a polypeptide chain can adopt a vast number of conformations, each corresponding to a set of backbone rotation angles. Many of these conformations are excluded due to steric overlaps. Ramachandran and coworkers were the first to look into this problem by plotting backbone dihedral angles in a two-dimensional plot. The conformational space in the Ramachandran map is further refined by considering the energetic contributions of various non-bonded interactions. Alternatively, the conformation adopted by a polypeptide chain may also be examined by investigating interactions between the residues. Since the Ramachandran map essentially focuses on local interactions (residues closer in sequence), out of interest, we have analyzed the dihedral angle preferences of residues that make non-local interactions (residues far away in sequence and closer in space) in the folded structures of proteins. The non-local interactions have been grouped into different types such as hydrogen bond, van der Waals interactions between hydrophobic groups, ion pairs (salt bridges), and π-π stacking interactions. The results show the propensity of amino acid residues in proteins forming local and non-local interactions. Our results point to the vital role of different types of non-local interactions and their effect on dihedral angles in forming secondary and tertiary structural elements to adopt their native fold.
Affiliation(s)
- Konda Mani Saravanan
  - Centre of Advanced Study in Crystallography & Biophysics, University of Madras, Guindy Campus, Chennai, Tamil Nadu, 600 025, India
- Samuel Selvaraj
  - Centre of Advanced Study in Crystallography & Biophysics, University of Madras, Guindy Campus, Chennai, Tamil Nadu, 600 025, India
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
|
14
|
Assessing Predicted Contacts for Building Protein Three-Dimensional Models. Methods Mol Biol 2016. [PMID: 27787823 DOI: 10.1007/978-1-4939-6406-2_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Recent successes of contact-guided protein structure prediction methods have revived interest in solving the long-standing problem of ab initio protein structure prediction. With homology modeling failing for many protein sequences that do not have templates, contact-guided structure prediction has shown promise, and consequently, contact prediction has gained a lot of interest recently. Although a few dozen contact prediction tools are already available as web servers and downloadables, not enough research has been done towards using existing measures like precision and recall to evaluate these contacts with the goal of building three-dimensional models. Moreover, when we do not have a native structure for a set of predicted contacts, the only analysis we can perform is a simple contact map visualization of the predicted contacts. A wider and more rigorous assessment of the predicted contacts is needed in order to build tertiary structure models. This chapter discusses instructions and protocols for using tools and applying techniques in order to assess predicted contacts for building three-dimensional models.
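The precision and recall measures mentioned in this abstract can be sketched as follows, assuming predicted contacts arrive as (i, j, score) triples and native contacts as a set of residue-index pairs. The data layout, function names and toy values are assumptions for illustration; they are not the chapter's protocol.

```python
def top_predictions(predicted, top_k):
    """Keep the `top_k` highest-scoring predicted contacts as ordered (i, j) pairs."""
    ranked = sorted(predicted, key=lambda c: c[2], reverse=True)[:top_k]
    return [(min(i, j), max(i, j)) for i, j, _ in ranked]

def contact_precision(predicted, native, top_k):
    """Fraction of the top-k predicted contacts that are present in the native set."""
    chosen = top_predictions(predicted, top_k)
    if not chosen:
        return 0.0
    return sum(pair in native for pair in chosen) / len(chosen)

def contact_recall(predicted, native, top_k):
    """Fraction of native contacts recovered among the top-k predictions."""
    if not native:
        return 0.0
    chosen = set(top_predictions(predicted, top_k))
    return len(chosen & native) / len(native)

# Hypothetical predictions (residue i, residue j, confidence) and native pairs.
toy_predicted = [(1, 10, 0.9), (2, 20, 0.8), (30, 3, 0.1)]
toy_native = {(1, 10), (3, 30)}
precision_top2 = contact_precision(toy_predicted, toy_native, top_k=2)
recall_top2 = contact_recall(toy_predicted, toy_native, top_k=2)
```

In practice the cutoff is often tied to the sequence length L (for example the top L/5 or top L contacts), but that choice is left to the caller here.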
|
15
|
Hu Y, Guo Y, Shi Y, Li M, Pu X. A consensus subunit-specific model for annotation of substrate specificity for ABC transporters. RSC Adv 2015. [DOI: 10.1039/c5ra05304h] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022] Open
Abstract
A consensus classification model was built by considering three subunit proteins individually to predict the substrate specificity of ABC transporters.
Affiliation(s)
- Yayun Hu
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Yanzhi Guo
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Yinan Shi
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Menglong Li
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Xuemei Pu
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
|
16
|
Zhang J, Sun P, Zhao X, Ma Z. PECM: Prediction of extracellular matrix proteins using the concept of Chou’s pseudo amino acid composition. J Theor Biol 2014; 363:412-8. [DOI: 10.1016/j.jtbi.2014.08.002] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 05/23/2014] [Revised: 07/30/2014] [Accepted: 08/01/2014] [Indexed: 12/11/2022]
|
17
|
Mishra NK, Chang J, Zhao PX. Prediction of membrane transport proteins and their substrate specificities using primary sequence information. PLoS One 2014; 9:e100278. [PMID: 24968309 PMCID: PMC4072671 DOI: 10.1371/journal.pone.0100278] [Citation(s) in RCA: 74] [Impact Index Per Article: 7.4] [Received: 03/03/2014] [Accepted: 05/23/2014] [Indexed: 11/18/2022] Open
Abstract
Background: Membrane transport proteins (transporters) move hydrophilic substrates across hydrophobic membranes and play vital roles in most cellular functions. Transporters represent a diverse group of proteins that differ in topology, energy coupling mechanism, and substrate specificity as well as sequence similarity. Among the functional annotations of transporters, information about their transporting substrates is especially important. The experimental identification and characterization of transporters is currently costly and time-consuming. The development of robust bioinformatics-based methods for the prediction of membrane transport proteins and their substrate specificities is therefore an important and urgent task. Results: Support vector machine (SVM)-based computational models, which comprehensively utilize integrative protein sequence features such as amino acid composition, dipeptide composition, physico-chemical composition, biochemical composition, and position-specific scoring matrices (PSSM), were developed to predict the substrate specificity of seven transporter classes: amino acid, anion, cation, electron, protein/mRNA, sugar, and other transporters. An additional model to differentiate transporters from non-transporters was also developed. Among the developed models, the biochemical composition and PSSM hybrid model outperformed other models and achieved an overall average prediction accuracy of 76.69% with a Matthews correlation coefficient (MCC) of 0.49 and a receiver operating characteristic area under the curve (AUC) of 0.833 on our main dataset. This model also achieved an overall average prediction accuracy of 78.88% and MCC of 0.41 on an independent dataset. Conclusions: Our analyses suggest that evolutionary information (i.e., the PSSM) and the AAIndex are key features for the substrate specificity prediction of transport proteins. In comparison, similarity-based methods such as BLAST, PSI-BLAST, and hidden Markov models do not provide accurate predictions for the substrate specificity of membrane transport proteins. TrSSP: The Transporter Substrate Specificity Prediction Server, a web server that implements the SVM models developed in this paper, is freely available at http://bioinfo.noble.org/TrSSP.
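Two of the sequence features listed in this abstract, amino acid composition and dipeptide composition, are simple to compute. The sketch below is one common way of defining them (a 20-value and a 400-value frequency vector); the function names and the handling of non-standard residues are assumptions, and this is not code from TrSSP.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def _clean(seq):
    """Upper-case the sequence and drop anything that is not a standard residue."""
    return "".join(aa for aa in seq.upper() if aa in AMINO_ACIDS)

def amino_acid_composition(seq):
    """Fraction of each of the 20 standard amino acids, in AMINO_ACIDS order."""
    seq = _clean(seq)
    if not seq:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]

def dipeptide_composition(seq):
    """Fraction of each of the 400 adjacent residue pairs, in DIPEPTIDES order."""
    seq = _clean(seq)
    total = len(seq) - 1
    if total <= 0:
        return [0.0] * len(DIPEPTIDES)
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total):
        counts[seq[i:i + 2]] += 1
    return [counts[d] / total for d in DIPEPTIDES]

# Hypothetical three-residue peptide.
toy_aac = amino_acid_composition("AAC")
toy_dpc = dipeptide_composition("AAC")
```

Vectors like these have a fixed length regardless of sequence length, which is what allows them to be fed to a classifier such as an SVM.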
Affiliation(s)
- Nitish K. Mishra
  - Plant Biology Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
- Junil Chang
  - Plant Biology Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
- Patrick X. Zhao
  - Plant Biology Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
|
18
|
PSNO: predicting cysteine S-nitrosylation sites by incorporating various sequence-derived features into the general form of Chou's PseAAC. Int J Mol Sci 2014; 15:11204-19. [PMID: 24968264 PMCID: PMC4139777 DOI: 10.3390/ijms150711204] [Citation(s) in RCA: 76] [Impact Index Per Article: 7.6] [Received: 04/14/2014] [Revised: 05/26/2014] [Accepted: 05/27/2014] [Indexed: 11/16/2022] Open
Abstract
S-nitrosylation (SNO) is one of the most universal reversible post-translational modifications involved in many biological processes. Malfunction or dysregulation of SNO leads to a series of severe diseases, such as developmental abnormalities. Therefore, the identification of SNO sites (SNOs) provides insights into disease progression and drug development. In this paper, a new bioinformatics tool, named PSNO, is proposed to identify SNOs from protein sequences. Firstly, we explore various promising sequence-derived discriminative features, including the evolutionary profile, the predicted secondary structure and the physicochemical properties. Secondly, rather than simply combining the features, which may bring about information redundancy and unwanted noise, we use the relative entropy selection and incremental feature selection approach to select the optimal feature subsets. Thirdly, we train our model with the k-nearest neighbor algorithm. Using both informative features and an elaborate feature selection scheme, our method, PSNO, achieves good prediction performance with a mean Matthews correlation coefficient (MCC) value of about 0.5119 on the training dataset using 10-fold cross-validation. These results indicate that PSNO can be used as a competitive predictor among the state-of-the-art SNO site prediction tools. A web server, named PSNO, which implements the proposed method, is freely available at http://59.73.198.144:8088/PSNO/.
|
19
|
Lavanya P, Ramaiah S, Anbarasu A. Computational analysis of N–H⋯π interactions and its impact on the structural stability of β-lactamases. Comput Biol Med 2014; 46:22-8. [DOI: 10.1016/j.compbiomed.2013.12.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 09/29/2013] [Revised: 12/12/2013] [Accepted: 12/15/2013] [Indexed: 10/25/2022]
|
20
|
Gao J, Zhang N, Ruan J. Prediction of protein modification sites of gamma-carboxylation using position specific scoring matrices based evolutionary information. Comput Biol Chem 2013; 47:215-20. [DOI: 10.1016/j.compbiolchem.2013.09.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 08/23/2013] [Revised: 09/12/2013] [Accepted: 09/12/2013] [Indexed: 11/28/2022]
|
21
|
Vaideeswaran S, Ramaiah S. Investigations on the role of π-π interactions and π-π networks in eNOS and nNOS proteins. Bioorg Chem 2013; 49:16-23. [PMID: 23845761 DOI: 10.1016/j.bioorg.2013.06.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/01/2012] [Revised: 04/16/2013] [Accepted: 06/03/2013] [Indexed: 10/26/2022]
Abstract
π-π interactions play an important role in the stability of protein structures. In the present study, we have analyzed the influence of π-π interactions in eNOS and nNOS proteins. The contribution of these π-π interacting residues in sequential separation, secondary structure involvement, solvent accessibility and stabilization centers has been evaluated. π-π interactions stabilize the core regions within eNOS and nNOS proteins. π-π interacting residues are evolutionarily conserved. There is a significant number of π-π interactions in spite of the lower natural occurrence of π-residues in eNOS and nNOS proteins. In addition to π-π interactions, π-residues also form π-π networks in both eNOS and nNOS proteins, which might play an important role in the structural stability of these proteins.
Affiliation(s)
- Sivasakthi Vaideeswaran
- Bioinformatics Division, School of Biosciences and Technology, VIT University, Vellore 632 014, Tamil Nadu, India
|
22
|
Saravanan KM, Selvaraj S. Performance of secondary structure prediction methods on proteins containing structurally ambivalent sequence fragments. Biopolymers 2013; 100:148-53. [DOI: 10.1002/bip.22178] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 07/03/2012] [Revised: 09/04/2012] [Accepted: 09/23/2012] [Indexed: 11/12/2022]
Affiliation(s)
- K. Mani Saravanan
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, 620024, Tamil Nadu, India
- Samuel Selvaraj
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, 620024, Tamil Nadu, India
|
23
|
Cheng X, Xiao X, Wu ZC, Wang P, Lin WZ. Swfoldrate: predicting protein folding rates from amino acid sequence with sliding window method. Proteins 2012; 81:140-8. [PMID: 22933332 DOI: 10.1002/prot.24171] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 05/02/2012] [Revised: 07/20/2012] [Accepted: 08/25/2012] [Indexed: 01/18/2023]
Abstract
Protein folding is the process by which a protein proceeds from its denatured state to its specific biologically active conformation. Understanding the relationship between sequences and the folding rates of proteins remains an important challenge. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. In this study, the long-range and short-range contacts in proteins were used to derive an extended version of the pseudo amino acid composition based on a sliding window method. This method is capable of predicting protein folding rates from the amino acid sequence alone, without the aid of any structural class information. We systematically studied the contributions of individual features to folding rate prediction. The optimal features were selected by combining forward feature selection with sequential backward selection. Using the jackknife cross-validation test, the method was demonstrated on a large dataset. The predictor was built on a large number of physicochemical and statistical features of proteins using a nonlinear support vector machine (SVM) regression model, and it obtained excellent agreement between predicted and experimentally observed folding rates of proteins. The correlation coefficient is 0.9313 and the standard error is 2.2692. The prediction server is freely available at http://www.jci-bioinfo.cn/swfrate/input.jsp.
Affiliation(s)
- Xiang Cheng
- Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen 333403, China
|
24
|
Albrecht L, Boyd RJ. Visualizing Internal Stabilization in Weakly Bound Systems Using Atomic Energies: Hydrogen Bonding in Small Water Clusters. J Phys Chem A 2012; 116:3946-51. [DOI: 10.1021/jp301006g] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
Affiliation(s)
- Laura Albrecht
- Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada B3H 4R2
- Russell J. Boyd
- Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada B3H 4R2
|
25
|
Sun W, He J. From isotropic to anisotropic side chain representations: comparison of three models for residue contact estimation. PLoS One 2011; 6:e19238. [PMID: 21552527 PMCID: PMC3084275 DOI: 10.1371/journal.pone.0019238] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 11/10/2010] [Accepted: 03/29/2011] [Indexed: 11/19/2022]
Abstract
The criterion to determine residue contact is a fundamental problem in deriving knowledge-based mean-force potential energy calculations for protein structures. A frequently used criterion is to require the side chain center-to-center distance or the Cα-to-Cα atom distance to be within a pre-determined cutoff distance. However, the spatially anisotropic nature of the side chain makes it challenging to identify the contact pairs. This study compares three side chain contact models: the Atom Distance Criteria (ADC) model, the Isotropic Sphere Side chain (ISS) model and the Anisotropic Ellipsoid Side chain (AES) model, using 424 high-resolution protein structures in the Protein Data Bank. The results indicate that the ADC model is the most accurate and the ISS model is the least accurate. The AES model eliminates about 95% of the incorrectly counted contact pairs in the ISS model. Algorithm analysis shows that the AES model is the most computationally intensive, while the ADC model has moderate computational cost. We derived a dataset of the contact pairs mis-estimated by the AES model. The most misjudged pairs are Arg-Glu, Arg-Asp and Arg-Tyr. Such a dataset can be useful for developing an improved AES model by incorporating pair-specific information for the cutoff distance.
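To make the idea of an atom-distance contact criterion concrete, here is a minimal sketch in which two residues count as being in contact when any pair of their atoms lies within a cutoff. The 4.5 Å default, the function names and the toy coordinates are illustrative assumptions. The abstract does not state the cutoff or the exact rule used in the paper, so this should not be read as the authors' ADC implementation.

```python
from itertools import combinations
from math import dist

# A residue is represented as a list of (x, y, z) atom coordinates in Å.


def residues_in_contact(atoms_a, atoms_b, cutoff=4.5):
    """Return True if any atom of one residue is within `cutoff` of any atom of the other."""
    return any(dist(p, q) <= cutoff for p in atoms_a for q in atoms_b)


def contact_pairs(residues, cutoff=4.5, min_separation=1):
    """List index pairs (i, j), i < j, of residues judged to be in contact.

    min_separation skips pairs closer than that many positions along the chain.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(residues)), 2)
        if j - i >= min_separation
        and residues_in_contact(residues[i], residues[j], cutoff)
    ]


# Toy three-residue chain (hypothetical coordinates).
toy_chain = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(3.8, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(20.0, 0.0, 0.0)],
]
print(contact_pairs(toy_chain))  # [(0, 1)]
```

An isotropic-sphere criterion would instead compare a single center-to-center distance per residue pair, which is cheaper but ignores side chain orientation.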
Affiliation(s)
- Weitao Sun
- Zhou Pei-Yuan Center for Applied Mathematics, Tsinghua University, Beijing, China.
|
26
|
Chen SCC, Chuang TJ, Li WH. The relationships among microRNA regulation, intrinsically disordered regions, and other indicators of protein evolutionary rate. Mol Biol Evol 2011; 28:2513-20. [PMID: 21398349 DOI: 10.1093/molbev/msr068] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 12/21/2022]
Abstract
Many indicators of protein evolutionary rate have been proposed, but some of them are interrelated. The purpose of this study is to disentangle their correlations. We assess the strength of each indicator by controlling for the other indicators under study. We find that the number of microRNA (miRNA) types that regulate a gene is the strongest rate indicator (a negative correlation), followed by disorder content (the percentage of disordered regions in a protein, a positive correlation); the strength of disorder content as a rate indicator is substantially increased after controlling for the number of miRNA types. By dividing proteins into lowly and highly intrinsically disordered proteins (L-IDPs and H-IDPs), we find that proteins interacting with more H-IDPs tend to evolve more slowly, which largely explains the previous observation of a negative correlation between the number of protein-protein interactions and evolutionary rate. Moreover, all of the indicators examined here, except for the number of miRNA types, have different strengths in L-IDPs and in H-IDPs. Finally, the number of phosphorylation sites is weakly correlated with the number of miRNA types, and its strength as a rate indicator is substantially reduced when other indicators are considered. Our study reveals the relative strength of each rate indicator and increases our understanding of protein evolution.
Affiliation(s)
- Sean Chun-Chang Chen
- Institute of BioMedical Informatics, National Yang-Ming University, Taipei, Taiwan
|
27
|
Esque J, Oguey C, de Brevern AG. Comparative Analysis of Threshold and Tessellation Methods for Determining Protein Contacts. J Chem Inf Model 2011; 51:493-507. [DOI: 10.1021/ci100195t] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 11/30/2022]
Affiliation(s)
- Jeremy Esque
- LPTM, CNRS UMR 8089, Université de Cergy Pontoise, 2 av. Adolphe Chauvin, 95302 Cergy-Pontoise, France
- INSERM UMR-S 665, Dynamique des Structures et Interactions des Macromolécules Biologiques (DSIMB), Université Paris Diderot, Paris 7, INTS, 6, rue Alexandre Cabanel, 75739 Paris Cedex 15, France
- Christophe Oguey
- LPTM, CNRS UMR 8089, Université de Cergy Pontoise, 2 av. Adolphe Chauvin, 95302 Cergy-Pontoise, France
- Alexandre G. de Brevern
- INSERM UMR-S 665, Dynamique des Structures et Interactions des Macromolécules Biologiques (DSIMB), Université Paris Diderot, Paris 7, INTS, 6, rue Alexandre Cabanel, 75739 Paris Cedex 15, France
|
28
|
De Sancho D, Muñoz V. Integrated prediction of protein folding and unfolding rates from only size and structural class. Phys Chem Chem Phys 2011; 13:17030-43. [PMID: 21670826 DOI: 10.1039/c1cp20402e] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.2] [Indexed: 11/21/2022]
Affiliation(s)
- David De Sancho
- Centro de Investigaciones Biológicas, Spanish National Research Council (CSIC), Ramiro de Maeztu 9, Madrid 28040, Spain
|
29
|
Saravanan KM, Balasubramanian H, Nallusamy S, Samuel S. Sequence and structural analysis of two designed proteins with 88% identity adopting different folds. Protein Eng Des Sel 2010; 23:911-8. [PMID: 20952437 DOI: 10.1093/protein/gzq070] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022]
Abstract
Protein folding is a natural phenomenon by which a sequence of amino acids folds into a unique functional three-dimensional structure. Although the sequence code that governs folding remains a mystery, one can identify key inter-residue contacts responsible for a given topology. In nature, there are many pairs of proteins of a given length that share little or no sequence identity. Similarly, there are many proteins that share a common topology but lack significant evidence of homology. In order to tackle this problem, protein engineering studies have been used to determine the minimal number of amino acid residues that codes for a particular fold. In recent years, the coupling of theoretical models and experiments in the study of protein folding has provided some fruitful clues. He et al. have designed two proteins with 88% sequence identity, which adopt different folds and functions. In this work, we have systematically analysed these two proteins by performing pentapeptide search, secondary structure predictions, variation in inter-residue interactions and residue-residue pair preferences, surrounding hydrophobicity computations, conformational switching and energy computations. We conclude that the local secondary structural preference of the two designed proteins at the N- and C-terminal ends to adopt either coil or strand conformation may be a crucial factor in adopting the different folds. Early on during the process of folding, both proteins may choose different energetically favourable pathways to attain the different folds.
Affiliation(s)
- K Mani Saravanan
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, TN 620024, India
|
30
|
Ou YY, Chen SA, Gromiha MM. Classification of transporters using efficient radial basis function networks with position-specific scoring matrices and biochemical properties. Proteins 2010; 78:1789-97. [DOI: 10.1002/prot.22694] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.8] [Indexed: 11/12/2022]
|
31
|
Harihar B, Selvaraj S. Refinement of the long-range order parameter in predicting folding rates of two-state proteins. Biopolymers 2009; 91:928-35. [DOI: 10.1002/bip.21281] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]
|
32
|
Ruvinsky AM, Vakser IA. The ruggedness of protein-protein energy landscape and the cutoff for 1/r(n) potentials. Bioinformatics 2009; 25:1132-6. [PMID: 19237445 DOI: 10.1093/bioinformatics/btp108] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022]
Abstract
MOTIVATION Computational studies of the energetics of protein association are important for revealing the underlying fundamental principles and for designing better tools to model protein complexes. The interaction cutoff contribution to the ruggedness of protein-protein energy landscape is studied in terms of relative energy fluctuations for 1/r(n) potentials based on a simplistic model of a protein complex. This artificial ruggedness exists for short cutoffs and gradually disappears with the cutoff increase. RESULTS The critical values of the cutoff were calculated for each of 11 popular power-type potentials with n = 0 to 9 and n = 12, and for two thresholds of 5% and 10%. The artificial ruggedness decreases to tolerable thresholds for cutoffs larger than the critical ones. The results showed that for both thresholds the critical cutoff is a non-monotonic function of the potential power n. The functions reach the maximum at n = 3 to 4 and then decrease with the increase of the potential power. The difference between two cutoffs for 5% and 10% artificial ruggedness becomes negligible for potentials decreasing faster than 1/r(12). The analytical results obtained for the simple model of protein complexes agree with the analysis of artificial ruggedness in a dataset of 62 protein-protein complexes, with different parameterizations of soft Lennard-Jones potential and two types of protein representations: all-atom and coarse-grained. The results suggest that cutoffs larger than the critical ones can be recommended for protein-protein potentials.
Affiliation(s)
- Anatoly M Ruvinsky
- Center for Bioinformatics, The University of Kansas, Lawrence, KS 66047, USA
|
33
|
Huang LT, Gromiha MM. Analysis and prediction of protein folding rates using quadratic response surface models. J Comput Chem 2008; 29:1675-83. [PMID: 18351617 DOI: 10.1002/jcc.20925] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Abstract
Understanding the relationship between amino acid sequences and folding rates of proteins is an important task in computational and molecular biology. In this work, we have systematically analyzed the composition of amino acid residues for proteins with different ranges of folding rates. We observed that the polar residues, Asn, Gln, Ser, and Lys, are dominant in fast folding proteins whereas the hydrophobic residues, Ala, Cys, Gly, and Leu, prefer to be in slow folding proteins. Further, we have developed a method based on quadratic response surface models for predicting the folding rates of 77 two- and three-state proteins. Our method showed a correlation of 0.90 between experimental and predicted protein folding rates using the leave-one-out cross-validation method. The classification of proteins based on structural class improved the correlation to 0.98, and it is 0.99, 0.98, and 0.96, respectively, for all-alpha, all-beta, and mixed class proteins. In addition, we have utilized Bayesian classification theory for discriminating two- and three-state proteins, which showed an accuracy of 90%. We have developed a web server for predicting protein folding rates and it is available at http://bioinformatics.myweb.hinet.net/foldrate.htm.
Affiliation(s)
- Liang-Tsung Huang
- Department of Computer Science and Information Engineering, Ming-Dao University, Changhua 523, Taiwan
|
34
|
Faure G, Bornot A, de Brevern AG. Protein contacts, inter-residue interactions and side-chain modelling. Biochimie 2008; 90:626-39. [DOI: 10.1016/j.biochi.2007.11.007] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Received: 06/14/2007] [Accepted: 11/22/2007] [Indexed: 10/22/2022]
|
35
|
Abstract
Protein burying depth (BD) is a structural descriptor that is exploited not only to find whether a residue is exposed or buried, but also to determine how deep a residue is buried. The widely used solvent accessible surface area focuses mainly on protein surface residues, while protein BD can provide more detailed information about the arrangement of buried residues, which may be used to study deep-level protein structure and the formation of the protein folding nucleus. In this work, we analyse the relationship between protein BD and sequence, and describe it by nonlinear functions estimated by support vector machines. We examine the functions by cross-validation tests and find a strong correlation between residue BD and local sequence environment. By further taking into account the size of the molecule where a residue is located, we find that the correlation coefficient between predicted and observed depths improves from 0.60 to 0.65. Moreover, nearly half of the deepest 10% of residues in a protein sequence can be correctly predicted. Our study suggests that the extent to which a residue is buried can be predicted, to some degree, from the residue itself and its local neighbouring residues. The methods used to estimate the sequence-depth functions are expected to become more useful in the investigation of protein structures and folding mechanisms.
Affiliation(s)
- Zheng Yuan
- Institute for Molecular Bioscience and ARC Centre in Bioinformatics, The University of Queensland, Brisbane, Australia.
|
36
|
Gromiha MM, Selvaraj S, Thangakani AM. A Statistical Method for Predicting Protein Unfolding Rates from Amino Acid Sequence. J Chem Inf Model 2006; 46:1503-8. [PMID: 16711769 DOI: 10.1021/ci050417u] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
The prediction of protein unfolding rates from amino acid sequences is one of the most important challenges in computational biology and chemistry. The analysis on the relationship between protein unfolding rates and physical-chemical, energetic, and conformational properties of amino acid residues provides valuable information to understand and predict the unfolding rates of two- and three-state proteins. We found that the classification of proteins into different structural classes shows an excellent correlation between amino acid properties and unfolding rates of two- and three-state proteins, indicating the importance of native-state topology in determining the protein unfolding rates. We have formulated three independent linear regression equations to different structural classes of proteins for predicting their unfolding rates from amino acid sequences and obtained an excellent agreement between predicted and experimentally observed unfolding rates of proteins; the correlation coefficients are 0.999, 0.990, and 0.992, respectively, for all-alpha, all-beta, and mixed-class proteins. Further, we have derived a general equation applicable to all structural classes of proteins, which can be used for predicting the unfolding rates for proteins of an unknown structural class. We observed a correlation of 0.987 and 0.930, respectively, for back-check and jack-knife tests. These accuracy levels are better than those of other methods in the literature.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), AIST Tokyo Waterfront Bio-IT Research Building, 2-42 Aomi, Tokyo 135-0064, Japan.
|
37
|
Chen C, Li L, Xiao Y. All-atom contact potential approach to protein thermostability analysis. Biopolymers 2006; 85:28-37. [PMID: 16964601 DOI: 10.1002/bip.20600] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Abstract
In this paper we use all-atom potential energy to define and analyze the inter-residue contacts in mesophilic and thermophilic proteins. Fifteen families of proteins are selected and each family has two representative proteins with greatly different preferred environmental temperatures. We find that both the number and energy of the contacts defined in this way show stronger correlations with the preferred temperatures of proteins than other factors used before. We also find that the charged-polar and charged-nonpolar residue contacts not only have larger contact numbers but also have lower single contact energies. Most importantly, most of the thermophilic proteins have more charged-polar and charged-nonpolar residue contacts than their mesophilic counterparts. This suggests that these contacts may play an important role in the thermostability of proteins, in addition to the usual charged-charged and nonpolar-nonpolar residue contacts. Charged residues may exert their profound influence by forming contacts not only with other charged residues but also with polar or nonpolar residues, thus further increasing the strength of the contact network and, in turn, the thermostability of proteins.
Affiliation(s)
- Changjun Chen
- Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
|
38
|
|
39
|
Faísca PFN, Telo da Gama MM, Nunes A. The Gō model revisited: Native structure and the geometric coupling between local and long-range contacts. Proteins 2005; 60:712-22. [PMID: 16021621 DOI: 10.1002/prot.20521] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/11/2022]
Abstract
Monte Carlo simulations show that long-range interactions play a major role in determining the folding rates of 48-mer three-dimensional lattice polymers modeled by the Gō potential. For three target structures with different native geometries we found a sharp increase in the folding time when the relative contribution of the long-range interactions to the native state's energy is decreased from approximately 50% towards zero. However, the dispersion of the simulated folding times is strongly dependent on native geometry, and Gō polymers folding to one of the target structures exhibit folding times spanning three orders of magnitude. We have also found that, depending on the target geometry, a strong geometric coupling may exist between local and long-range contacts; when this coupling exists, the formation of long-range contacts is forced by the previous formation of local contacts. The absence of a strong geometric coupling results in kinetics that are more sensitive to the interaction energy parameters; in this case, the formation of local contacts is not capable of promoting the establishment of long-range ones when the latter are strongly penalized energetically, and this results in longer folding times.
Affiliation(s)
- Patrícia F N Faísca
- Centro de Física Teórica e Computacional da Universidade de Lisboa, Lisboa Codex, Portugal.
|
40
|
Gromiha MM, Santhosh C, Ahmad S. Structural analysis of cation-pi interactions in DNA binding proteins. Int J Biol Macromol 2005; 34:203-11. [PMID: 15225993 DOI: 10.1016/j.ijbiomac.2004.04.003] [Citation(s) in RCA: 93] [Impact Index Per Article: 4.9] [Indexed: 11/24/2022]
Abstract
Cation-pi interactions play an important role in the stability of protein structures. In this work, we have analyzed the influence of cation-pi interactions in DNA binding proteins. We observed cation-pi interactions in 45 out of 62 DNA binding proteins and there is no significant correlation between the number of amino acid residues and number of cation-pi interactions. These interactions are mainly formed by long-range contacts, and the role of short and medium-range contacts is minimal. The preference of Arg is higher than Lys to form cation-pi interactions. The pair-wise cation-pi interaction energy between aromatic and positively charged residues shows that Arg-Tyr energy is the strongest among the possible six pairs. The structural analysis of cation-pi interaction forming residues shows that Lys, Trp, and Tyr prefer to be in the binding site of protein-DNA complexes. Further, the accessible surface areas of cation-pi interaction forming cationic residues are significantly less than that of other residues. The preference of cation-pi interaction forming residues in different secondary structures shows that Lys prefers to be in strand and Phe prefers to be in turn regions. The results obtained in the present study will be useful in understanding the contribution of cation-pi interactions to the stability and specificity of protein-DNA complexes.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Aomi Frontier Building 17F, 2-43 Aomi, Koto-ku, Tokyo 135-0064, Japan.
|
41
|
Gromiha MM. A Statistical Model for Predicting Protein Folding Rates from Amino Acid Sequence with Structural Class Information. J Chem Inf Model 2005; 45:494-501. [PMID: 15807515 DOI: 10.1021/ci049757q] [Citation(s) in RCA: 89] [Impact Index Per Article: 4.7] [Indexed: 11/29/2022]
Abstract
Prediction of protein folding rates from amino acid sequences is one of the most important challenges in molecular biology. In this work, I have related the protein folding rates with physical-chemical, energetic and conformational properties of amino acid residues. I found that the classification of proteins into different structural classes shows an excellent correlation between amino acid properties and folding rates of two- and three-state proteins, indicating the importance of native state topology in determining the protein folding rates. I have formulated a simple linear regression model for predicting the protein folding rates from amino acid sequences along with structural class information and obtained an excellent agreement between predicted and experimentally observed folding rates of proteins; the correlation coefficients are 0.99, 0.96 and 0.95, respectively, for all-alpha, all-beta and mixed class proteins. This is the first available method, which is capable of predicting the protein folding rates just from the amino acid sequence with the aid of generic amino acid properties and structural class information.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Aomi Frontier Building 17F, 2-43 Aomi, Koto-ku, Tokyo 135-0064, Japan.
|
42
|
Faisca PFN, Telo da Gama MM. Native geometry and the dynamics of protein folding. Biophys Chem 2004; 115:169-75. [PMID: 15752600 DOI: 10.1016/j.bpc.2004.12.022] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 06/02/2004] [Revised: 10/28/2004] [Accepted: 12/10/2004] [Indexed: 11/24/2022]
Abstract
In this paper, we investigate the role of native geometry on the kinetics of protein folding based on simple lattice models and Monte Carlo simulations. Results obtained within the scope of the Miyazawa-Jernigan model indicate the existence of two dynamical folding regimes depending on the protein chain length. For chains larger than 80 amino acids, the folding performance is sensitive to the native state's conformation. Smaller chains, with fewer than 80 amino acids, fold via two-state kinetics and exhibit a significant correlation between the contact order parameter and the logarithmic folding times. In particular, chains with N=48 amino acids were found to belong to two broad classes of folding, characterized by different cooperativity, depending on the contact order parameter. Preliminary results based on the Gō model show that the effect of long-range contact interaction strength on the folding kinetics is largely dependent on the native state's geometry.
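The contact order parameter mentioned in the abstract above is commonly computed as the mean sequence separation of native contacts divided by the chain length. The sketch below follows that widely used relative contact order definition. Whether the paper uses exactly this normalisation is an assumption, and the function name and toy contact list are hypothetical.

```python
def relative_contact_order(contacts, chain_length):
    """Relative contact order: sum of |i - j| over contacts, divided by (L * N).

    contacts     -- iterable of (i, j) residue index pairs that are in contact
    chain_length -- number of residues L in the chain
    N is the number of contacts.
    """
    contacts = list(contacts)
    if not contacts or chain_length <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (chain_length * len(contacts))


# Toy example: two contacts in a 10-residue chain (hypothetical data).
toy_contacts = [(1, 5), (2, 10)]
toy_rco = relative_contact_order(toy_contacts, chain_length=10)  # (4 + 8) / (10 * 2)
```

Higher values indicate that native contacts are, on average, between residues far apart in sequence, which is the situation generally associated with slower two-state folding.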
Affiliation(s)
- P F N Faisca
- CFTC, Av. Prof. Gama Pinto 2, 1649-003 Lisboa Codex, Portugal.
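The contact order parameter mentioned in the abstract above can be illustrated with a short sketch. It assumes the usual definition of relative contact order (the mean sequence separation of contacting residue pairs divided by the chain length) and, for the lattice case, treats non-bonded beads on adjacent lattice sites as contacts. The function names and the toy conformation are hypothetical and are not taken from the paper.

```python
def lattice_contacts(coords):
    """Return non-bonded contact pairs (i, j) for a lattice conformation.

    Assumption: two beads are in contact when they sit on adjacent
    lattice sites (Manhattan distance 1) and are not chain neighbours.
    """
    pairs = []
    n = len(coords)
    for i in range(n):
        for j in range(i + 2, n):  # skip bonded neighbours (j = i + 1)
            manhattan = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
            if manhattan == 1:
                pairs.append((i, j))
    return pairs


def relative_contact_order(contacts, chain_length):
    """Mean sequence separation of contacts divided by chain length."""
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    if not contacts:
        return 0.0
    total_separation = sum(abs(j - i) for i, j in contacts)
    return total_separation / (len(contacts) * chain_length)


# Toy 4-bead chain folded into a square on a 2D lattice (illustrative only).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
contacts = lattice_contacts(square)
print(contacts)                                        # [(0, 3)]
print(relative_contact_order(contacts, len(square)))   # 0.75
```

A low value indicates that contacts are mostly local in sequence, and a high value indicates that many contacts join residues far apart along the chain.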
|
43
|
Gromiha MM, Selvaraj S. Inter-residue interactions in protein folding and stability. Prog Biophys Mol Biol 2004; 86:235-77. [PMID: 15288760 DOI: 10.1016/j.pbiomolbio.2003.09.003] [Citation(s) in RCA: 225] [Impact Index Per Article: 11.3] [Indexed: 12/01/2022]
Abstract
During the process of protein folding, the amino acid residues along the polypeptide chain interact with each other in a cooperative manner to form the stable native structure. The knowledge about inter-residue interactions in protein structures is very helpful to understand the mechanism of protein folding and stability. In this review, we introduce the classification of inter-residue interactions into short, medium and long range based on a simple geometric approach. The features of these interactions in different structural classes of globular and membrane proteins, and in various folds have been delineated. The development of contact potentials and the application of inter-residue contacts for predicting the structural class and secondary structures of globular proteins, solvent accessibility, fold recognition and ab initio tertiary structure prediction have been evaluated. Further, the relationship between inter-residue contacts and protein-folding rates has been highlighted. Moreover, the importance of inter-residue interactions in protein-folding kinetics and for understanding the stability of proteins has been discussed. In essence, the information gained from the studies on inter-residue interactions provides valuable insights for understanding protein folding and de novo protein design.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Aomi Frontier Building 17F, 2-43 Aomi, Koto-ku, Tokyo 135-0064, Japan.
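The short, medium and long range classification described in the abstract above can be sketched as follows. The abstract does not state the thresholds, so the values used here are assumptions: an 8 Å C-alpha distance cutoff, with sequence separations of 1-2 counted as short range, 3-4 as medium range, and more than 4 as long range. The function name and the toy coordinates are hypothetical.

```python
import math


def classify_contacts(ca_coords, cutoff=8.0):
    """Count short-, medium- and long-range contacts in a chain.

    ca_coords: list of (x, y, z) C-alpha coordinates in Angstroms.
    Assumed ranges: separation 1-2 = short, 3-4 = medium, >4 = long.
    """
    counts = {"short": 0, "medium": 0, "long": 0}
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) > cutoff:
                continue  # residues too far apart to be in contact
            separation = j - i
            if separation <= 2:
                counts["short"] += 1
            elif separation <= 4:
                counts["medium"] += 1
            else:
                counts["long"] += 1
    return counts


# Toy U-shaped chain of six residues spaced 3.8 Angstroms apart.
hairpin = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]
print(classify_contacts(hairpin))  # {'short': 9, 'medium': 3, 'long': 1}
```

Counting each pair once keeps the totals easy to check by hand; a per-residue version would instead increment a counter for both residues in each pair.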
|
44
|
Sun T, Zhang L. Effect of secondary structure on the conformations and folding behaviors of protein-like chains. Polymer 2004. [DOI: 10.1016/j.polymer.2004.08.069] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 10/26/2022]
|
45
|
Gromiha MM, Parry DAD. Characteristic features of amino acid residues in coiled-coil protein structures. Biophys Chem 2004; 111:95-103. [PMID: 15381307 DOI: 10.1016/j.bpc.2004.05.001] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.2] [Received: 03/05/2004] [Revised: 05/06/2004] [Accepted: 05/06/2004] [Indexed: 11/21/2022]
Abstract
Detailed analyses of protein structures provide an opportunity to understand conformation and function in terms of amino acid sequence and composition. In this work, we have systematically analyzed the characteristic features of the amino acid residues found in alpha-helical coiled-coils and, in so doing, have developed indices for their properties, conformational parameters, surrounding hydrophobicity and flexibility. As expected, there is a preference for hydrophobic (Ala, Leu), positively charged (Lys, Arg) and negatively charged (Glu) residues in coiled-coil domains. However, the surrounding hydrophobicity of residues in coiled-coil domains is significantly less than that for residues in other regions of coiled-coil proteins. The analysis of temperature factors in coiled-coil proteins shows that the residues in these domains are more stable than those in other regions. Further, we have delineated the medium- and long-range contacts in coiled-coil domains and compared the results with those obtained for other (non-coiled-coil) parts of the same proteins and non-coiled-coil helical segments of globular proteins. The residues in coiled-coil domains are largely influenced by medium-range contacts, whereas long-range interactions play a dominant role in other regions of these same proteins as well as in non-coiled-coil helices. We have also revealed the preference of amino acid residues to form cation-pi interactions, and we found that Arg is more likely to form such interactions than Lys. The parameters developed in this work can be used to understand the folding and stability of coiled-coil proteins in general.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Aomi Frontier Building 17F, 2-43 Aomi, Koto, Tokyo 135-0064, Japan.
|
46
|
|
47
|
Nölting B, Schälike W, Hampel P, Grundig F, Gantert S, Sips N, Bandlow W, Qi PX. Structural determinants of the rate of protein folding. J Theor Biol 2003; 223:299-307. [PMID: 12850450 DOI: 10.1016/s0022-5193(03)00091-2] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.5] [Indexed: 11/20/2022]
Abstract
To understand the mechanism of protein folding and to assist the rational design of fast-folding, non-aggregating and stable artificial enzymes, it is essential to determine the structural parameters that govern the rate constants of folding, kf. It has been found that -log kf is a linear function of the so-called chain topology parameter (CTP) within the range 10^-1 s^-1 ≤ kf ≤ 10^8 s^-1. The correlation between -log kf and CTP is much better than that obtained with the previously published contact order (CO) method. It has been further suggested that short sequence separations may be preferred for the establishment of stable interactions in the design of novel artificial enzymes and the modification of slow-folding proteins with aggregating intermediates.
Affiliation(s)
- Bengt Nölting
- Prussian Private Institute of Technology at Berlin, Am Schlosspark 30, Berlin D-13187, Germany.
|
48
|
Chen J, Zhang L, Jing L, Wang Y, Jiang Z, Zhao D. Predicting protein structure from long-range contacts. Biophys Chem 2003; 105:11-21. [PMID: 12932575 DOI: 10.1016/s0301-4622(03)00033-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
Short-range and long-range contacts are important in forming protein structure. Proteins can be grouped into four structural classes according to the content and topology of alpha-helices and beta-strands: all-alpha, all-beta, alpha/beta and alpha+beta proteins. However, the statistical properties of these classes differ considerably. In this paper, we discuss protein structure in terms of the relative number of long-range (short-range) contacts for each residue. We find that the percentage of residues having a large number of long-range contacts is small in the all-alpha class and large in the all-beta class, whereas it is almost the same in the alpha/beta and alpha+beta classes. We calculate the percentage of residues whose number of long-range contacts is greater than or equal to N(L) = 5 and N(L) = 7 for 428 proteins. With N(L) = 5, the average percentage is 13.3%, 54.8%, 41.4% and 37.0% for the all-alpha, all-beta, alpha/beta and alpha+beta classes, respectively. As N(L) increases, the percentage decreases, especially for the all-alpha class. Meanwhile, the percentage of residues whose number of short-range contacts is greater than or equal to N(S) is large for the all-alpha class and small for the all-beta class, especially for large N(S). We also investigate the ability of amino acid residues to form a large number of long-range and short-range contacts. Cys, Val, Ile, Tyr, Trp and Phe readily form a large number of long-range contacts, whereas Glu, Lys, Asp, Gln, Arg and Asn do so only with difficulty. We also discuss the relative ability of the 20 amino acid residues to form short-range contacts. A comparison between the Fauchere-Pliska hydrophobicity scale and the percentage of residues having a large number of long-range contacts is also made. This investigation can provide some insights into protein structure.
Affiliation(s)
- Jin Chen
- Department of Physics, Zhejiang University, Hangzhou 310028, PR China
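The per-residue statistic described in the abstract above (the percentage of residues with at least N(L) long-range contacts) can be sketched in a few lines. The abstract does not define when a contact counts as long range, so the sequence-separation threshold used here (`min_separation=5`) is an assumption, as are the function name and the toy contact list.

```python
def percent_residues_with_long_range(contacts, chain_length,
                                     n_l=5, min_separation=5):
    """Percentage of residues having at least n_l long-range contacts.

    contacts: iterable of residue index pairs (i, j), zero-based.
    min_separation: assumed minimum |i - j| for a long-range contact.
    """
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    per_residue = [0] * chain_length
    for i, j in contacts:
        if abs(i - j) >= min_separation:
            # a contact counts once for each of its two residues
            per_residue[i] += 1
            per_residue[j] += 1
    hits = sum(1 for count in per_residue if count >= n_l)
    return 100.0 * hits / chain_length


# Toy 10-residue chain: residue 0 touches residues 5-9, plus one local contact.
toy_contacts = [(0, 5), (0, 6), (0, 7), (0, 8), (0, 9), (0, 1)]
print(percent_residues_with_long_range(toy_contacts, 10, n_l=5))  # 10.0
```

Raising `n_l` can only keep or lower the percentage, which matches the trend reported in the abstract for increasing N(L).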
|
49
|
Selvaraj S, Gromiha MM. Role of hydrophobic clusters and long-range contact networks in the folding of (alpha/beta)8 barrel proteins. Biophys J 2003; 84:1919-25. [PMID: 12609894 PMCID: PMC1302761 DOI: 10.1016/s0006-3495(03)75000-0] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.5] [Received: 04/24/2002] [Accepted: 11/13/2002] [Indexed: 10/21/2022]
Abstract
Analysis of the three-dimensional structures of (alpha/beta)8 barrel proteins sheds light on the factors responsible for directing and maintaining their common fold. In this work, hydrophobically enriched clusters are identified in 92% of the (alpha/beta)8 barrel proteins considered. The residue segments with hydrophobic clusters have high thermal stability. Further, these clusters are formed and stabilized through long-range interactions. Specifically, a network of long-range contacts connects adjacent beta-strands of the (alpha/beta)8 barrel domain and the hydrophobic clusters. The implications of hydrophobic clusters and long-range networks in providing a feasible common mechanism for the folding of (alpha/beta)8 barrel proteins are proposed.
Affiliation(s)
- S Selvaraj
- Computational Biology Research Center (CBRC), Institute of Advanced Industrial Science and Technology (AIST) 2-41-6 Aomi, Koto-ku, Tokyo 135-0064, Japan
|
50
|
Kumarevel TS, Gromiha MM, Selvaraj S, Gayatri K, Kumar PKR. Influence of medium- and long-range interactions in different folding types of globular proteins. Biophys Chem 2002; 99:189-98. [PMID: 12377369 DOI: 10.1016/s0301-4622(02)00183-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 10/27/2022]
Abstract
Recognition of the protein fold from the amino acid sequence is a challenging task. The structure and stability of proteins from different folds are mainly dictated by inter-residue interactions. In our earlier work, we successfully used medium- and long-range contacts for predicting protein folding rates, discriminating globular and membrane proteins, and distinguishing protein structural classes. In this work, we analyze the role of inter-residue interactions in commonly occurring folds of globular proteins in order to understand their folding mechanisms. In the medium-range contacts, the globin fold and four-helical bundle proteins have more contacts than the DNA-RNA fold, although they all belong to the all-alpha class. In long-range contacts, only the ribonuclease fold prefers the 4-10 range, whereas the other folding types in the alpha/beta class prefer the 21-30 range. Further, the preferred residues and residue pairs influenced by these different folds are discussed. The information about the preference of medium- and long-range contacts exhibited by the 20 amino acid residues can be effectively used to predict the folding type of each protein.
Affiliation(s)
- T S Kumarevel
- National Institute of Advanced Industrial Science and Technology (AIST), Institute of Molecular and Cell Biology, Functional Nucleic Acids Group, Tsukuba Central 6, 1-1 Higashi, Tsukuba Science City, Ibaraki, Japan.
|