51
|
Affiliation(s)
- Melissa Coates Ford
- Department of Biochemistry & Molecular Biology, Colorado State University, Fort Collins, Colorado 80523-1870, United States
| | - P. Shing Ho
- Department of Biochemistry & Molecular Biology, Colorado State University, Fort Collins, Colorado 80523-1870, United States
|
52
|
Ozbaykal G, Rana Atilgan A, Atilgan C. In silico mutational studies of Hsp70 disclose sites with distinct functional attributes. Proteins 2015; 83:2077-90. [DOI: 10.1002/prot.24925] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/07/2015] [Revised: 08/31/2015] [Accepted: 09/02/2015] [Indexed: 12/24/2022]
Affiliation(s)
- Gizem Ozbaykal
- Faculty of Engineering and Natural Sciences; Sabanci University; Tuzla Istanbul 34956 Turkey
| | - Ali Rana Atilgan
- Faculty of Engineering and Natural Sciences; Sabanci University; Tuzla Istanbul 34956 Turkey
| | - Canan Atilgan
- Faculty of Engineering and Natural Sciences; Sabanci University; Tuzla Istanbul 34956 Turkey
|
53
|
Elhefnawy W, Chen L, Han Y, Li Y. ICOSA: A Distance-Dependent, Orientation-Specific Coarse-Grained Contact Potential for Protein Structure Modeling. J Mol Biol 2015; 427:2562-2576. [DOI: 10.1016/j.jmb.2015.05.022] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 04/10/2015] [Accepted: 05/21/2015] [Indexed: 11/16/2022]
|
54
|
Kozma D, Tusnády GE. TMFoldRec: a statistical potential-based transmembrane protein fold recognition tool. BMC Bioinformatics 2015; 16:201. [PMID: 26123059 PMCID: PMC4486421 DOI: 10.1186/s12859-015-0638-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 02/25/2015] [Accepted: 06/06/2015] [Indexed: 12/26/2022]
Abstract
Background Transmembrane proteins (TMPs) are the key components of signal transduction, cell-cell adhesion and energy and material transport into and out from the cells. For the deep understanding of these processes, structure determination of transmembrane proteins is indispensable. However, due to technical difficulties, only a few transmembrane protein structures have been determined experimentally. Large-scale genomic sequencing provides increasing amounts of sequence information on the proteins and whole proteomes of living organisms resulting in the challenge of bioinformatics; how the structural information should be gained from a sequence. Results Here, we present a novel method, TMFoldRec, for fold prediction of membrane segments in transmembrane proteins. TMFoldRec based on statistical potentials was tested on a benchmark set containing 124 TMP chains from the PDBTM database. Using a 10-fold jackknife method, the native folds were correctly identified in 77 % of the cases. This accuracy overcomes the state-of-the-art methods. In addition, a key feature of TMFoldRec algorithm is the ability to estimate the reliability of the prediction and to decide with an accuracy of 70 %, whether the obtained, lowest energy structure is the native one. Conclusion These results imply that the membrane embedded parts of TMPs dictate the TM structures rather than the soluble parts. Moreover, predictions with reliability scores make in this way our algorithm applicable for proteome-wide analyses. Availability The program is available upon request for academic use. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0638-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Dániel Kozma
- "Momentum" Membrane Protein Bioinformatics Research Group, Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H 1518, Budapest, Hungary.
| | - Gábor E Tusnády
- "Momentum" Membrane Protein Bioinformatics Research Group, Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H 1518, Budapest, Hungary.
|
55
|
Huang Q, You Z, Zhang X, Zhou Y. Prediction of protein-protein interactions with clustered amino acids and weighted sparse representation. Int J Mol Sci 2015; 16:10855-69. [PMID: 25984606 PMCID: PMC4463679 DOI: 10.3390/ijms160510855] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 04/09/2015] [Revised: 05/06/2015] [Accepted: 05/07/2015] [Indexed: 01/22/2023]
Abstract
With the completion of the Human Genome Project, bioscience has entered into the era of the genome and proteome. Therefore, protein–protein interactions (PPIs) research is becoming more and more important. Life activities and the protein–protein interactions are inseparable, such as DNA synthesis, gene transcription activation, protein translation, etc. Though many methods based on biological experiments and machine learning have been proposed, they all spent a long time to learn and obtained an imprecise accuracy. How to efficiently and accurately predict PPIs is still a big challenge. To take up such a challenge, we developed a new predictor by incorporating the reduced amino acid alphabet (RAAA) information into the general form of pseudo-amino acid composition (PseAAC) and with the weighted sparse representation-based classification (WSRC). The remarkable advantages of introducing the reduced amino acid alphabet is being able to avoid the notorious dimensionality disaster or overfitting problem in statistical prediction. Additionally, experiments have proven that our method achieved good performance in both a low- and high-dimensional feature space. Among all of the experiments performed on the PPIs data of Saccharomyces cerevisiae, the best one achieved 90.91% accuracy, 94.17% sensitivity, 87.22% precision and a 83.43% Matthews correlation coefficient (MCC) value. In order to evaluate the prediction ability of our method, extensive experiments are performed to compare with the state-of-the-art technique, support vector machine (SVM). The achieved results show that the proposed approach is very promising for predicting PPIs, and it can be a helpful supplement for PPIs prediction.
Affiliation(s)
- Qiaoying Huang
- Shenzhen Graduate School, Harbin Institute of Technology, HIT Campus of University Town of Shenzhen, Shenzhen 518055, China.
| | - Zhuhong You
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China.
| | - Xiaofeng Zhang
- Shenzhen Graduate School, Harbin Institute of Technology, HIT Campus of University Town of Shenzhen, Shenzhen 518055, China.
| | - Yong Zhou
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China.
|
56
|
Tang K, Wong SWK, Liu JS, Zhang J, Liang J. Conformational sampling and structure prediction of multiple interacting loops in soluble and β-barrel membrane proteins using multi-loop distance-guided chain-growth Monte Carlo method. Bioinformatics 2015; 31:2646-52. [PMID: 25861965 DOI: 10.1093/bioinformatics/btv198] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 11/11/2014] [Accepted: 04/03/2015] [Indexed: 11/13/2022]
Abstract
MOTIVATION Loops in proteins are often involved in biochemical functions. Their irregularity and flexibility make experimental structure determination and computational modeling challenging. Most current loop modeling methods focus on modeling single loops. In protein structure prediction, multiple loops often need to be modeled simultaneously. As interactions among loops in spatial proximity can be rather complex, sampling the conformations of multiple interacting loops is a challenging task. RESULTS In this study, we report a new method called multi-loop Distance-guided Sequential chain-Growth Monte Carlo (M-DiSGro) for prediction of the conformations of multiple interacting loops in proteins. Our method achieves an average RMSD of 1.93 Å for lowest energy conformations of 36 pairs of interacting protein loops with the total length ranging from 12 to 24 residues. We further constructed a data set containing proteins with 2, 3 and 4 interacting loops. For the most challenging target proteins with four loops, the average RMSD of the lowest energy conformations is 2.35 Å. Our method is also tested for predicting multiple loops in β-barrel membrane proteins. For outer-membrane protein G, the lowest energy conformation has a RMSD of 2.62 Å for the three extracellular interacting loops with a total length of 34 residues (12, 12 and 10 residues in each loop). AVAILABILITY AND IMPLEMENTATION The software is freely available at: tanto.bioe.uic.edu/m-DiSGro. CONTACT jinfeng@stat.fsu.edu or jliang@uic.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ke Tang
- Richard and Loan Hill Department of Bioengineering, University of Illinois at Chicago, Chicago, IL
| | - Samuel W K Wong
- Department of Statistics, University of Florida, Gainesville, FL
| | - Jun S Liu
- Department of Statistics, Harvard University, Science Center, Cambridge, MA
| | - Jinfeng Zhang
- Department of Statistics, Florida State University, Tallahassee, FL, USA
| | - Jie Liang
- Richard and Loan Hill Department of Bioengineering, University of Illinois at Chicago, Chicago, IL
|
57
|
Grinter SZ, Zou X. Challenges, applications, and recent advances of protein-ligand docking in structure-based drug design. Molecules 2014; 19:10150-76. [PMID: 25019558 PMCID: PMC6270832 DOI: 10.3390/molecules190710150] [Citation(s) in RCA: 123] [Impact Index Per Article: 12.3] [Received: 04/23/2014] [Revised: 06/13/2014] [Accepted: 07/02/2014] [Indexed: 11/16/2022]
Abstract
The docking methods used in structure-based virtual database screening offer the ability to quickly and cheaply estimate the affinity and binding mode of a ligand for the protein receptor of interest, such as a drug target. These methods can be used to enrich a database of compounds, so that more compounds that are subsequently experimentally tested are found to be pharmaceutically interesting. In addition, like all virtual screening methods used for drug design, structure-based virtual screening can focus on curated libraries of synthesizable compounds, helping to reduce the expense of subsequent experimental verification. In this review, we introduce the protein-ligand docking methods used for structure-based drug design and other biological applications. We discuss the fundamental challenges facing these methods and some of the current methodological topics of interest. We also discuss the main approaches for applying protein-ligand docking methods. We end with a discussion of the challenging aspects of evaluating or benchmarking the accuracy of docking methods for their improvement, and discuss future directions.
Affiliation(s)
- Sam Z Grinter
- Informatics Institute, University of Missouri, Columbia, MO 65211, USA.
| | - Xiaoqin Zou
- Informatics Institute, University of Missouri, Columbia, MO 65211, USA.
|
58
|
Computational and experimental approaches to reveal the effects of single nucleotide polymorphisms with respect to disease diagnostics. Int J Mol Sci 2014; 15:9670-717. [PMID: 24886813 PMCID: PMC4100115 DOI: 10.3390/ijms15069670] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 04/08/2014] [Revised: 05/15/2014] [Accepted: 05/16/2014] [Indexed: 12/25/2022]
Abstract
DNA mutations are the cause of many human diseases and they are the reason for natural differences among individuals by affecting the structure, function, interactions, and other properties of DNA and expressed proteins. The ability to predict whether a given mutation is disease-causing or harmless is of great importance for the early detection of patients with a high risk of developing a particular disease and would pave the way for personalized medicine and diagnostics. Here we review existing methods and techniques to study and predict the effects of DNA mutations from three different perspectives: in silico, in vitro and in vivo. It is emphasized that the problem is complicated and successful detection of a pathogenic mutation frequently requires a combination of several methods and a knowledge of the biological phenomena associated with the corresponding macromolecules.
|
59
|
Jiang F, Zhou CY, Wu YD. Residue-Specific Force Field Based on the Protein Coil Library. RSFF1: Modification of OPLS-AA/L. J Phys Chem B 2014; 118:6983-98. [DOI: 10.1021/jp5017449] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.7] [Indexed: 12/26/2022]
Affiliation(s)
- Fan Jiang
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
| | - Chen-Yang Zhou
- College of Chemistry, Peking University, Beijing 100871, China
| | - Yun-Dong Wu
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
- College of Chemistry, Peking University, Beijing 100871, China
|
60
|
Tang K, Zhang J, Liang J. Fast protein loop sampling and structure prediction using distance-guided sequential chain-growth Monte Carlo method. PLoS Comput Biol 2014; 10:e1003539. [PMID: 24763317 PMCID: PMC3998890 DOI: 10.1371/journal.pcbi.1003539] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 08/29/2013] [Accepted: 02/01/2014] [Indexed: 11/18/2022]
Abstract
Loops in proteins are flexible regions connecting regular secondary structures. They are often involved in protein functions through interacting with other molecules. The irregularity and flexibility of loops make their structures difficult to determine experimentally and challenging to model computationally. Conformation sampling and energy evaluation are the two key components in loop modeling. We have developed a new method for loop conformation sampling and prediction based on a chain growth sequential Monte Carlo sampling strategy, called Distance-guided Sequential chain-Growth Monte Carlo (DISGRO). With an energy function designed specifically for loops, our method can efficiently generate high quality loop conformations with low energy that are enriched with near-native loop structures. The average minimum global backbone RMSD for 1,000 conformations of 12-residue loops is 1.53 Å, with a lowest energy RMSD of 2.99 Å, and an average ensemble RMSD of 5.23 Å. A novel geometric criterion is applied to speed up calculations. The computational cost of generating 1,000 conformations for each of the x loops in a benchmark dataset is only about 10 cpu minutes for 12-residue loops, compared to ca 180 cpu minutes using the FALCm method. Test results on benchmark datasets show that DISGRO performs comparably or better than previous successful methods, while requiring far less computing time. DISGRO is especially effective in modeling longer loops (10-17 residues).
Affiliation(s)
- Ke Tang
- Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, United States of America
| | - Jinfeng Zhang
- Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
| | - Jie Liang
- Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, United States of America
|
61
|
Predicting the types of J-proteins using clustered amino acids. Biomed Res Int 2014; 2014:935719. [PMID: 24804260 PMCID: PMC3996952 DOI: 10.1155/2014/935719] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 01/24/2014] [Revised: 03/04/2014] [Accepted: 03/13/2014] [Indexed: 01/24/2023]
Abstract
J-proteins are molecular chaperones and present in a wide variety of organisms from prokaryote to eukaryote. Based on their domain organizations, J-proteins can be classified into 4 types, that is, Type I, Type II, Type III, and Type IV. Different types of J-proteins play distinct roles in influencing cancer properties and cell death. Thus, reliably annotating the types of J-proteins is essential to better understand their molecular functions. In the present work, a support vector machine based method was developed to identify the types of J-proteins using the tripeptide composition of reduced amino acid alphabet. In the jackknife cross-validation, the maximum overall accuracy of 94% was achieved on a stringent benchmark dataset. We also analyzed the amino acid compositions by using analysis of variance and found the distinct distributions of amino acids in each family of the J-proteins. To enhance the value of the practical applications of the proposed model, an online web server was developed and can be freely accessed.
|
62
|
Abstract
By focusing on essential features, while averaging over less important details, coarse-grained (CG) models provide significant computational and conceptual advantages with respect to more detailed models. Consequently, despite dramatic advances in computational methodologies and resources, CG models enjoy surging popularity and are becoming increasingly equal partners to atomically detailed models. This perspective surveys the rapidly developing landscape of CG models for biomolecular systems. In particular, this review seeks to provide a balanced, coherent, and unified presentation of several distinct approaches for developing CG models, including top-down, network-based, native-centric, knowledge-based, and bottom-up modeling strategies. The review summarizes their basic philosophies, theoretical foundations, typical applications, and recent developments. Additionally, the review identifies fundamental inter-relationships among the diverse approaches and discusses outstanding challenges in the field. When carefully applied and assessed, current CG models provide highly efficient means for investigating the biological consequences of basic physicochemical principles. Moreover, rigorous bottom-up approaches hold great promise for further improving the accuracy and scope of CG models for biomolecular systems.
Affiliation(s)
- W G Noid
- Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
|
63
|
Grinter SZ, Zou X. A Bayesian statistical approach of improving knowledge-based scoring functions for protein-ligand interactions. J Comput Chem 2014; 35:932-43. [DOI: 10.1002/jcc.23579] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 11/12/2013] [Revised: 01/29/2014] [Accepted: 02/11/2014] [Indexed: 01/06/2023]
Affiliation(s)
- Sam Z. Grinter
- Informatics Institute, University of Missouri; Columbia Missouri 65211
- Dalton Cardiovascular Research Center, University of Missouri; Columbia Missouri 65211
| | - Xiaoqin Zou
- Informatics Institute, University of Missouri; Columbia Missouri 65211
- Dalton Cardiovascular Research Center, University of Missouri; Columbia Missouri 65211
- Department of Physics and Astronomy; University of Missouri; Columbia Missouri 65211
- Department of Biochemistry; University of Missouri; Columbia Missouri 65211
|
64
|
Huang SY, Zou X. A knowledge-based scoring function for protein-RNA interactions derived from a statistical mechanics-based iterative method. Nucleic Acids Res 2014; 42:e55. [PMID: 24476917 PMCID: PMC3985650 DOI: 10.1093/nar/gku077] [Citation(s) in RCA: 94] [Impact Index Per Article: 9.4] [Indexed: 01/15/2023]
Abstract
Protein-RNA interactions play important roles in many biological processes. Given the high cost and technique difficulties in experimental methods, computationally predicting the binding complexes from individual protein and RNA structures is pressingly needed, in which a reliable scoring function is one of the critical components. Here, we have developed a knowledge-based scoring function, referred to as ITScore-PR, for protein-RNA binding mode prediction by using a statistical mechanics-based iterative method. The pairwise distance-dependent atomic interaction potentials of ITScore-PR were derived from experimentally determined protein–RNA complex structures. For validation, we have compared ITScore-PR with 10 other scoring methods on four diverse test sets. For bound docking, ITScore-PR achieved a success rate of up to 86% if the top prediction was considered and up to 94% if the top 10 predictions were considered, respectively. For truly unbound docking, the respective success rates of ITScore-PR were up to 24 and 46%. ITScore-PR can be used stand-alone or easily implemented in other docking programs for protein–RNA recognition.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211, USA
|
65
|
Huang SY, Zou X. ITScorePro: an efficient scoring program for evaluating the energy scores of protein structures for structure prediction. Methods Mol Biol 2014; 1137:71-81. [PMID: 24573475 PMCID: PMC11121506 DOI: 10.1007/978-1-4939-0366-5_6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/25/2023]
Abstract
One important component in protein structure prediction is to evaluate the free energy of a given conformation. Given the enormous number of possible conformations for a sequence, it is extremely challenging to quickly and accurately score the energies of these conformations and predict a reasonable structure within a practical computational time. Here, we describe an efficient program for energy evaluation, referred to as ITScorePro (Copyright © 2012). The energy scoring function in the ITScorePro program is based on the distance-dependent, pairwise atomic potentials for protein structure prediction that we recently derived by using statistical mechanics principles (Huang and Zou, Proteins 79:2648-2661, 2011). ITScorePro is a stand-alone program and can also be easily implemented in other software suites for protein structure prediction.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, Dalton Cardiovascular Research Center, Informatics Institute, University of Missouri, Columbia, MO, USA
|
66
|
Feng PM, Chen W, Lin H, Chou KC. iHSP-PseRAAAC: Identifying the heat shock protein families using pseudo reduced amino acid alphabet composition. Anal Biochem 2013; 442:118-25. [DOI: 10.1016/j.ab.2013.05.024] [Citation(s) in RCA: 230] [Impact Index Per Article: 20.9] [Received: 04/12/2013] [Revised: 05/21/2013] [Accepted: 05/22/2013] [Indexed: 01/22/2023]
|
67
|
Moal IH, Torchala M, Bates PA, Fernández-Recio J. The scoring of poses in protein-protein docking: current capabilities and future directions. BMC Bioinformatics 2013; 14:286. [PMID: 24079540 PMCID: PMC3850738 DOI: 10.1186/1471-2105-14-286] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.9] [Received: 06/18/2013] [Accepted: 09/25/2013] [Indexed: 12/16/2022]
Abstract
BACKGROUND Protein-protein docking, which aims to predict the structure of a protein-protein complex from its unbound components, remains an unresolved challenge in structural bioinformatics. An important step is the ranking of docked poses using a scoring function, for which many methods have been developed. There is a need to explore the differences and commonalities of these methods with each other, as well as with functions developed in the fields of molecular dynamics and homology modelling. RESULTS We present an evaluation of 115 scoring functions on an unbound docking decoy benchmark covering 118 complexes for which a near-native solution can be found, yielding top 10 success rates of up to 58%. Hierarchical clustering is performed, so as to group together functions which identify near-natives in similar subsets of complexes. Three set theoretic approaches are used to identify pairs of scoring functions capable of correctly scoring different complexes. This shows that functions in different clusters capture different aspects of binding and are likely to work together synergistically. CONCLUSIONS All functions designed specifically for docking perform well, indicating that functions are transferable between sampling methods. We also identify promising methods from the field of homology modelling. Further, differential success rates by docking difficulty and solution quality suggest a need for flexibility-dependent scoring. Investigating pairs of scoring functions, the set theoretic measures identify known scoring strategies as well as a number of novel approaches, indicating promising augmentations of traditional scoring methods. Such augmentation and parameter combination strategies are discussed in the context of the learning-to-rank paradigm.
Affiliation(s)
- Iain H Moal
- Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, Barcelona 08034, Spain
| | - Mieczyslaw Torchala
- Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London WC2A 3LY, UK
| | - Paul A Bates
- Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London WC2A 3LY, UK
| | - Juan Fernández-Recio
- Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, Barcelona 08034, Spain
|
68
|
Yan Z, Wang J. Optimizing scoring function of protein-nucleic acid interactions with both affinity and specificity. PLoS One 2013; 8:e74443. [PMID: 24098651 PMCID: PMC3787031 DOI: 10.1371/journal.pone.0074443] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 06/01/2013] [Accepted: 08/02/2013] [Indexed: 12/14/2022]
Abstract
Protein-nucleic acid (protein-DNA and protein-RNA) recognition is fundamental to the regulation of gene expression. Determination of the structures of the protein-nucleic acid recognition and insight into their interactions at molecular level are vital to understanding the regulation function. Recently, quantitative computational approach has been becoming an alternative of experimental technique for predicting the structures and interactions of biomolecular recognition. However, the progress of protein-nucleic acid structure prediction, especially protein-RNA, is far behind that of the protein-ligand and protein-protein structure predictions due to the lack of reliable and accurate scoring function for quantifying the protein-nucleic acid interactions. In this work, we developed an accurate scoring function (named as SPA-PN, SPecificity and Affinity of the Protein-Nucleic acid interactions) for protein-nucleic acid interactions by incorporating both the specificity and affinity into the optimization strategy. Specificity and affinity are two requirements of highly efficient and specific biomolecular recognition. Previous quantitative descriptions of the biomolecular interactions considered the affinity, but often ignored the specificity owing to the challenge of specificity quantification. We applied our concept of intrinsic specificity to connect the conventional specificity, which circumvents the challenge of specificity quantification. In addition to the affinity optimization, we incorporated the quantified intrinsic specificity into the optimization strategy of SPA-PN. The testing results and comparisons with other scoring functions validated that SPA-PN performs well on both the prediction of binding affinity and identification of native conformation. In terms of its performance, SPA-PN can be widely used to predict the protein-nucleic acid structures and quantify their interactions.
Affiliation(s)
- Zhiqiang Yan
- Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, New York, United States of America
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
| | - Jin Wang
- Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, New York, United States of America
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
|
69
|
Truong HH, Kim BL, Schafer NP, Wolynes PG. Funneling and frustration in the energy landscapes of some designed and simplified proteins. J Chem Phys 2013; 139:121908. [PMID: 24089720 PMCID: PMC3732306 DOI: 10.1063/1.4813504] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 04/15/2013] [Accepted: 06/26/2013] [Indexed: 11/15/2022]
Abstract
We explore the similarities and differences between the energy landscapes of proteins that have been selected by nature and those of some proteins designed by humans. Natural proteins have evolved to function as well as fold, and this is a source of energetic frustration. The sequence of Top7, on the other hand, was designed with architecture alone in mind using only native state stability as the optimization criterion. Its topology had not previously been observed in nature. Experimental studies show that the folding kinetics of Top7 is more complex than the kinetics of folding of otherwise comparable naturally occurring proteins. In this paper, we use structure prediction tools, frustration analysis, and free energy profiles to illustrate the folding landscapes of Top7 and two other proteins designed by Takada. We use both perfectly funneled (structure-based) and predictive (transferable) models to gain insight into the role of topological versus energetic frustration in these systems and show how they differ from those found for natural proteins. We also study how robust the folding of these designs would be to the simplification of the sequences using fewer amino acid types. Simplification using a five amino acid type code results in comparable quality of structure prediction to the full sequence in some cases, while the two-letter simplification scheme dramatically reduces the quality of structure prediction.
Affiliation(s)
- Ha H Truong
- Department of Chemistry, Rice University, Houston, Texas 77005, USA
|
70
|
Liu Y, Xu Z, Yang Z, Chen K, Zhu W. A knowledge-based halogen bonding scoring function for predicting protein-ligand interactions. J Mol Model 2013; 19:5015-30. [PMID: 24072554 DOI: 10.1007/s00894-013-2005-7] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 07/22/2013] [Accepted: 09/08/2013] [Indexed: 11/30/2022]
Abstract
Halogen bonding, a non-covalent interaction between the halogen σ-hole and Lewis bases, could not be properly characterized by majority of current scoring functions. In this study, a knowledge-based halogen bonding scoring function, termed XBPMF, was developed by an iterative method for predicting protein-ligand interactions. Three sets of pairwise potentials were derived from two training sets of protein-ligand complexes from the Protein Data Bank. It was found that two-dimensional pairwise potentials could characterize appropriately the distance and angle profiles of halogen bonding, which is superior to one-dimensional pairwise potentials. With comparison to six widely used scoring functions, XBPMF was evaluated to have moderate power for predicting protein-ligand interactions in terms of "docking power", "ranking power" and "scoring power". Especially, it has a rather satisfactory performance for the systems with typical halogen bonds. To the best of our knowledge, XBPMF is the first halogen bonding scoring function that is not dependent on any dummy atom, and is practical for high-throughput virtual screening. Therefore, this scoring function should be useful for the study and application of halogen bonding interactions like molecular docking and lead optimization.
Affiliation(s)
- Yingtao Liu
- Drug Discovery and Design Center, CAS Key Laboratory of Receptor Structure and Function, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China,
|
71
|
Mashaghi A, Kramer G, Bechtluft P, Zachmann-Brand B, Driessen AJM, Bukau B, Tans SJ. Reshaping of the conformational search of a protein by the chaperone trigger factor. Nature 2013; 500:98-101. [DOI: 10.1038/nature12293] [Citation(s) in RCA: 100] [Impact Index Per Article: 9.1] [Received: 12/05/2012] [Accepted: 05/14/2013] [Indexed: 12/20/2022]
|
72
|
Stephenson JD, Freeland SJ. Unearthing the root of amino acid similarity. J Mol Evol 2013; 77:159-69. [PMID: 23743923 PMCID: PMC6763418 DOI: 10.1007/s00239-013-9565-0] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 03/22/2013] [Accepted: 05/08/2013] [Indexed: 12/31/2022]
Abstract
Similarities and differences between amino acids define the rates at which they substitute for one another within protein sequences and the patterns by which these sequences form protein structures. However, there exist many ways to measure similarity, whether one considers the molecular attributes of individual amino acids, the roles that they play within proteins, or some nuanced contribution of each. One popular approach to representing these relationships is to divide the 20 amino acids of the standard genetic code into groups, thereby forming a simplified amino acid alphabet. Here, we develop a method to compare or combine different simplified alphabets, and apply it to 34 simplified alphabets from the scientific literature. We use this method to show that while different suggestions vary and agree in non-intuitive ways, they combine to reveal a consensus view of amino acid similarity that is clearly rooted in physico-chemistry.
Affiliation(s)
- James D Stephenson
- NASA Astrobiology Institute, University of Hawaii, Honolulu, HI, 96822, USA,
|
73
|
Capturing native/native like structures with a physico-chemical metric (pcSM) in protein folding. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2013; 1834:1520-31. [PMID: 23665455 DOI: 10.1016/j.bbapap.2013.04.023] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 02/15/2013] [Revised: 04/12/2013] [Accepted: 04/15/2013] [Indexed: 12/15/2022]
Abstract
Specification of the three dimensional structure of a protein from its amino acid sequence, also called a "Grand Challenge" problem, has eluded a solution for over six decades. A modestly successful strategy has evolved over the last couple of decades based on development of scoring functions (e.g. mimicking free energy) that can capture native or native-like structures from an ensemble of decoys generated as plausible candidates for the native structure. A scoring function must be fast enough in discriminating the native from unfolded/misfolded structures, and requires validation on a large data set(s) to generate sufficient confidence in the score. Here we develop a scoring function called pcSM that detects true native structure in the top 5 with 93% accuracy from an ensemble of candidate structures. If we eliminate the native from ensemble of decoys then pcSM is able to capture near native structure (RMSD ≤ 5 Å) in top 10 with 86% accuracy. The parameters considered in pcSM are a C-alpha Euclidean metric, secondary structural propensity, surface areas and an intramolecular energy function. pcSM has been tested on 415 systems consisting 142,698 decoys (public and CASP-largest reported hitherto in literature). The average rank for the native is 2.38, a significant improvement over that existing in literature. In-silico protein structure prediction requires robust scoring technique(s). Therefore, pcSM is easily amenable to integration into a successful protein structure prediction strategy. The tool is freely available at http://www.scfbio-iitd.res.in/software/pcsm.jsp.
|
74
|
Zheng Z, Merz KM. Development of the knowledge-based and empirical combined scoring algorithm (KECSA) to score protein-ligand interactions. J Chem Inf Model 2013; 53:1073-83. [PMID: 23560465 DOI: 10.1021/ci300619x] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Indexed: 01/18/2023]
Abstract
We describe a novel knowledge-based protein-ligand scoring function that employs a new definition for the reference state, allowing us to relate a statistical potential to a Lennard-Jones (LJ) potential. In this way, the LJ potential parameters were generated from protein-ligand complex structural data contained in the Protein Databank (PDB). Forty-nine (49) types of atomic pairwise interactions were derived using this method, which we call the knowledge-based and empirical combined scoring algorithm (KECSA). Two validation benchmarks were introduced to test the performance of KECSA. The first validation benchmark included two test sets that address the training set and enthalpy/entropy of KECSA. The second validation benchmark suite included two large-scale and five small-scale test sets, to compare the reproducibility of KECSA, with respect to two empirical score functions previously developed in our laboratory (LISA and LISA+), as well as to other well-known scoring methods. Validation results illustrate that KECSA shows improved performance in all test sets when compared with other scoring methods, especially in its ability to minimize the root mean square error (RMSE). LISA and LISA+ displayed similar performance using the correlation coefficient and Kendall τ as the metric of quality for some of the small test sets. Further pathways for improvement are discussed for which would allow KECSA to be more sensitive to subtle changes in ligand structure.
Affiliation(s)
- Zheng Zheng
- Department of Chemistry and the Quantum Theory Project, University of Florida, Gainesville, Florida 32611-8435, United States
|
75
|
Yan Z, Guo L, Hu L, Wang J. Specificity and affinity quantification of protein-protein interactions. Bioinformatics 2013; 29:1127-33. [DOI: 10.1093/bioinformatics/btt121] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022] Open
|
76
|
Wu Y, Dai X, Huang N, Zhao L. A partition function-based weighting scheme in force field parameter development using ab initio calculation results in global configurational space. J Comput Chem 2013; 34:1271-82. [DOI: 10.1002/jcc.23249] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/07/2012] [Revised: 01/16/2013] [Accepted: 01/20/2013] [Indexed: 11/09/2022]
|
77
|
Abstract
Empirical protein folding potential functions should have a global minimum near the native conformation of globular proteins that fold stably, and they should give the correct free energy of folding. We demonstrate that otherwise very successful potentials fail to have even a local minimum anywhere near the native conformation, and a seemingly well validated method of estimating the thermodynamic stability of the native state is extremely sensitive to small perturbations in atomic coordinates. These are both indicative of fitting a great deal of irrelevant detail. Here we show how to devise a robust potential function that succeeds very well at both tasks, at least for a limited set of proteins, and this involves developing a novel representation of the denatured state. Predicted free energies of unfolding for 25 mutants of barnase are in close agreement with the experimental values, while for 17 mutants there are substantial discrepancies.
Affiliation(s)
- M Chhajer
- Department of Chemistry, University of North Carolina, Chapel Hill, NC 27599 U.S.A
|
78
|
Pasi M, Lavery R, Ceres N. PaLaCe: A Coarse-Grain Protein Model for Studying Mechanical Properties. J Chem Theory Comput 2012; 9:785-93. [DOI: 10.1021/ct3007925] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Indexed: 01/21/2023]
Affiliation(s)
- Marco Pasi
- Bases Moléculaires et Structurales des Systèmes Infectieux, Univ. Lyon I/CNRS UMR 5086, IBCP, 7 Passage du Vercors, 69367 Lyon, France
- Richard Lavery
- Bases Moléculaires et Structurales des Systèmes Infectieux, Univ. Lyon I/CNRS UMR 5086, IBCP, 7 Passage du Vercors, 69367 Lyon, France
- Nicoletta Ceres
- Bases Moléculaires et Structurales des Systèmes Infectieux, Univ. Lyon I/CNRS UMR 5086, IBCP, 7 Passage du Vercors, 69367 Lyon, France
|
79
|
Abstract
One of the key issues in the theoretical prediction of RNA folding is the prediction of loop structure from the sequence. RNA loop free energies are dependent on the loop sequence content. However, most current models account only for the loop length-dependence. The previously developed “Vfold” model (a coarse-grained RNA folding model) provides an effective method to generate the complete ensemble of coarse-grained RNA loop and junction conformations. However, due to the lack of sequence-dependent scoring parameters, the method is unable to identify the native and near-native structures from the sequence. In this study, using a previously developed iterative method for extracting the knowledge-based potential parameters from the known structures, we derive a set of dinucleotide-based statistical potentials for RNA loops and junctions. A unique advantage of the approach is its ability to go beyond the (known) native structures by accounting for the full free energy landscape, including all the nonnative folds. The benchmark tests indicate that for given loop/junction sequences, the statistical potentials enable successful predictions for the coarse-grained 3D structures from the complete conformational ensemble generated by the Vfold model. The predicted coarse-grained structures can provide useful initial folds for further detailed structural refinement.
Affiliation(s)
- Liang Liu
- Department of Physics and Department of Biochemistry, University of Missouri, Columbia, Missouri, United States of America
- Shi-Jie Chen
- Department of Physics and Department of Biochemistry, University of Missouri, Columbia, Missouri, United States of America
|
80
|
Narasimhan SL, Rajarajan AK, Vardharaj L. HP-sequence design for lattice proteins—An exact enumeration study on diamond as well as square lattice. J Chem Phys 2012; 137:115102. [DOI: 10.1063/1.4752479] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023] Open
|
81
|
Atilgan AR, Atilgan C. Local motifs in proteins combine to generate global functional moves. Brief Funct Genomics 2012; 11:479-88. [PMID: 22811517 DOI: 10.1093/bfgp/els027] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
Abstract
Literature on the topological properties of folded proteins that has emerged as a field in its own right in the past decade is reviewed. Physics-based construction of coarse-grained models of proteins from knowledge of all-atom coordinates of the average structure is discussed. Once network is thus obtained with the node and link information, local motifs provide plethora of information on protein function. The hierarchical structure of the proteins manifested in the interrelations of local motifs is emphasized. Motifs are also related to modularity of the structure, and they quantify shifts in the landscapes upon conformational changes induced by, e.g. ligand binding. Redundancy emerges as a balance between local and global network descriptors and is related to the collectivity of the protein motions. Introducing weight on links followed by sequential removal of least cohesive contacts allows interactions in proteins to be represented as the superposition of essential and redundant sets. Lack of the former makes the network non-functional, while the latter ensures robust functioning under a wide range of perturbation scenarios.
Affiliation(s)
- Ali Rana Atilgan
- Faculty of Engineering and Natural Sciences, Sabanci University, 34956 Istanbul, Turkey
|
82
|
Pancsa R, Fuxreiter M. Interactions via intrinsically disordered regions: what kind of motifs? IUBMB Life 2012; 64:513-20. [PMID: 22535488 DOI: 10.1002/iub.1034] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.6] [Received: 01/31/2012] [Accepted: 03/06/2012] [Indexed: 12/22/2022]
Abstract
Proteins containing intrinsically disordered (ID) regions are widespread in eukaryotic organisms and are mostly utilized in regulatory processes. ID regions can mediate binary interactions of proteins or promote organization of large assemblies. Post-translational modifications of ID regions often serve as decision points in signaling pathways. Why Nature distinguished ID proteins in molecular recognition functions? In a simple view, binding of ID regions is accompanied by a large entropic penalty as compared to folded proteins. Even in complexes however, ID regions can preserve their conformational freedom, thereby recruit further partners and perform various functions. What sort of benefits ID regions offer for molecular interactions and which properties are exploited in the corresponding complexes? Here, we review models explaining the recognition mechanisms of ID proteins. Motif-based interactions are central to all proposed scenarios, including prestructured elements, anchoring sites and linear motifs. We aim to extract consensus features of the models, which could be used to predict ID-binding sites for a variety of partners.
Affiliation(s)
- Rita Pancsa
- VIB Department of Structural Biology, Vrije Universiteit Brussel, Brussels, Belgium
|
83
|
Specificity quantification of biomolecular recognition and its implication for drug discovery. Sci Rep 2012; 2:309. [PMID: 22413060 PMCID: PMC3298884 DOI: 10.1038/srep00309] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Received: 01/24/2012] [Accepted: 02/09/2012] [Indexed: 11/09/2022] Open
Abstract
Highly efficient and specific biomolecular recognition requires both affinity and specificity. Previous quantitative descriptions of biomolecular recognition were mostly driven by improving the affinity prediction, but lack of quantification of specificity. We developed a novel method SPA (SPecificity and Affinity) based on our funneled energy landscape theory. The strategy is to simultaneously optimize the quantified specificity of the "native" protein-ligand complex discriminating against "non-native" binding modes and the affinity prediction. The benchmark testing of SPA shows the best performance against 16 other popular scoring functions in industry and academia on both prediction of binding affinity and "native" binding pose. For the target COX-2 of nonsteroidal anti-inflammatory drugs, SPA successfully discriminates the drugs from the diversity set, and the selective drugs from non-selective drugs. The remarkable performance demonstrates that SPA has significant potential applications in identifying lead compounds for drug discovery.
|
84
|
Atilgan C, Okan OB, Atilgan AR. Network-based models as tools hinting at nonevident protein functionality. Annu Rev Biophys 2012; 41:205-25. [PMID: 22404685 DOI: 10.1146/annurev-biophys-050511-102305] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Indexed: 11/09/2022]
Abstract
Network-based models of proteins are popular tools employed to determine dynamic features related to the folded structure. They encompass all topological and geometric computational approaches idealizing proteins as directly interacting nodes. Topology makes use of neighborhood information of residues, and geometry includes relative placement of neighbors. Coarse-grained approaches efficiently predict alternative conformations because of inherent collectivity in the protein structure. Such collectivity is moderated by topological characteristics that also tune neighborhood structure: That rich residues have richer neighbors secures robustness toward random loss of interactions/nodes due to environmental fluctuations/mutations. Geometry conveys the additional information of force balance to network models, establishing the local shape of the energy landscape. Here, residue and/or bond perturbations are critically evaluated to suggest new experiments, as network-based computational techniques prove useful in capturing domain movements and conformational shifts resulting from environmental alterations. Evolutionarily conserved residues are optimally connected, defining a subnetwork that may be utilized for further coarsening.
Affiliation(s)
- Canan Atilgan
- Faculty of Engineering and Natural Sciences, Sabanci University, 34956 Istanbul, Turkey
|
85
|
Abstract
Proteins bind to other proteins efficiently and specifically to carry on many cell functions such as signaling, activation, transport, enzymatic reactions, and more. To determine the geometry and strength of binding of a protein pair, an energy function is required. An algorithm to design an optimal energy function, based on empirical data of protein complexes, is proposed and applied. Emphasis is made on negative design in which incorrect geometries are presented to the algorithm that learns to avoid them. For the docking problem the search for plausible geometries can be performed exhaustively. The possible geometries of the complex are generated on a grid with the help of a fast Fourier transform algorithm. A novel formulation of negative design makes it possible to investigate iteratively hundreds of millions of negative examples while monotonically improving the quality of the potential. Experimental structures for 640 protein complexes are used to generate positive and negative examples for learning parameters. The algorithm designed in this work finds the correct binding structure as the lowest energy minimum in 318 cases of the 640 examples. Further benchmarks on independent sets confirm the significant capacity of the scoring function to recognize correct modes of interactions.
Affiliation(s)
- D V S Ravikant
- Department of Computer Science, Cornell University, 4130 Upson Hall, Ithaca, New York 14853, USA
|
86
|
Moughon SE, Samudrala R. LoCo: a novel main chain scoring function for protein structure prediction based on local coordinates. BMC Bioinformatics 2011; 12:368. [PMID: 21920038 PMCID: PMC3184297 DOI: 10.1186/1471-2105-12-368] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/28/2011] [Accepted: 09/15/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Successful protein structure prediction requires accurate low-resolution scoring functions so that protein main chain conformations that are close to the native can be identified. Once that is accomplished, a more detailed and time-consuming treatment to produce all-atom models can be undertaken. The earliest low-resolution scoring used simple distance-based "contact potentials," but more recently, the relative orientations of interacting amino acids have been taken into account to improve performance. RESULTS We developed a new knowledge-based scoring function, LoCo, that locates the interaction partners of each individual residue within a local coordinate system based only on the position of its main chain N, Cα and C atoms. LoCo was trained on a large set of experimentally determined structures and optimized using standard sets of modeled structures, or "decoys." No structure used to train or optimize the function was included among those used to test it. When tested against 29 other published main chain functions on a group of 77 commonly used decoy sets, our function outperformed all others in Cα RMSD rank of the best-scoring decoy, with statistically significant p-values < 0.05 for 26 out of the 29 other functions considered. LoCo is fast, requiring on average less than 6 microseconds per residue for interaction and scoring on commonly-used computer hardware. CONCLUSIONS Our function demonstrates an unmatched combination of accuracy, speed, and simplicity and shows excellent promise for protein structure prediction. Broader applications may include protein-protein interactions and protein design.
Affiliation(s)
- Stewart E Moughon
- Department of Microbiology, University of Washington, Box 357735, Seattle, Washington 98195-7242, USA.
|
87
|
Huang SY, Zou X. Statistical mechanics-based method to extract atomic distance-dependent potentials from protein structures. Proteins 2011; 79:2648-61. [PMID: 21732421 PMCID: PMC11108592 DOI: 10.1002/prot.23086] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 01/08/2011] [Revised: 04/21/2011] [Accepted: 05/09/2011] [Indexed: 12/25/2022]
Abstract
In this study, we have developed a statistical mechanics-based iterative method to extract statistical atomic interaction potentials from known, nonredundant protein structures. Our method circumvents the long-standing reference state problem in deriving traditional knowledge-based scoring functions, by using rapid iterations through a physical, global convergence function. The rapid convergence of this physics-based method, unlike other parameter optimization methods, warrants the feasibility of deriving distance-dependent, all-atom statistical potentials to keep the scoring accuracy. The derived potentials, referred to as ITScore/Pro, have been validated using three diverse benchmarks: the high-resolution decoy set, the AMBER benchmark decoy set, and the CASP8 decoy set. Significant improvement in performance has been achieved. Finally, comparisons between the potentials of our model and potentials of a knowledge-based scoring function with a randomized reference state have revealed the reason for the better performance of our scoring function, which could provide useful insight into the development of other physical scoring functions. The potentials developed in this study are generally applicable for structural selection in protein structure prediction.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
- Xiaoqin Zou
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
|
88
|
Huang SY, Zou X. Scoring and lessons learned with the CSAR benchmark using an improved iterative knowledge-based scoring function. J Chem Inf Model 2011; 51:2097-106. [PMID: 21830787 DOI: 10.1021/ci2000727] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 11/28/2022]
Abstract
Based on a statistical mechanics-based iterative method, we have extracted a set of distance-dependent, all-atom pairwise potentials for protein-ligand interactions from the crystal structures of 1300 protein-ligand complexes. The iterative method circumvents the long-standing reference state problem in knowledge-based scoring functions. The resulted scoring function, referred to as ITScore 2.0, has been tested with the CSAR (Community Structure-Activity Resource, 2009 release) benchmark of 345 diverse protein-ligand complexes. ITScore 2.0 achieved a Pearson correlation of R² = 0.54 in binding affinity prediction. A comparative analysis has been done on the scoring performances of ITScore 2.0, the van der Waals (VDW) scoring function, the VDW with heavy atoms only, and the force field (FF) scoring function of DOCK which consists of a VDW term and an electrostatic term. The results reveal several important factors that affect the scoring performances, which could be helpful for the improvement of scoring functions.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri 65211, United States
|
89
|
Horejs C, Mitra MK, Pum D, Sleytr UB, Muthukumar M. Monte Carlo study of the molecular mechanisms of surface-layer protein self-assembly. J Chem Phys 2011; 134:125103. [PMID: 21456703 DOI: 10.1063/1.3565457] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 01/11/2023] Open
Abstract
The molecular mechanisms guiding the self-assembly of proteins into functional or pathogenic large-scale structures can be only understood by studying the correlation between the structural details of the monomer and the eventual mesoscopic morphologies. Among the myriad structural details of protein monomers and their manifestations in the self-assembled morphologies, we seek to identify the most crucial set of structural features necessary for the spontaneous selection of desired morphologies. Using a combination of the structural information and a Monte Carlo method with a coarse-grained model, we have studied the functional protein self-assembly into S(surface)-layers, which constitute the crystallized outer most cell envelope of a great variety of bacterial cells. We discover that only few and mainly hydrophobic amino acids, located on the surface of the monomer, are responsible for the formation of a highly ordered anisotropic protein lattice. The coarse-grained model presented here reproduces accurately many experimentally observed features including the pore formation, chemical description of the pore structure, location of specific amino acid residues at the protein-protein interfaces, and surface accessibility of specific amino acid residues. In addition to elucidating the molecular mechanisms and explaining experimental findings in the S-layer assembly, the present work offers a tool, which is chemical enough to capture details of primary sequences and coarse-grained enough to explore morphological structures with thousands of protein monomers, to promulgate design rules for spontaneous formation of specific protein assemblies.
Affiliation(s)
- Christine Horejs
- Department for Nanobiotechnology, University of Natural Resources and Applied Life Sciences, 1190 Vienna, Austria
|
90
|
Liang S, Zhou Y, Grishin N, Standley DM. Protein side chain modeling with orientation-dependent atomic force fields derived by series expansions. J Comput Chem 2011; 32:1680-6. [PMID: 21374632 PMCID: PMC3072444 DOI: 10.1002/jcc.21747] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Received: 10/28/2010] [Revised: 12/10/2010] [Accepted: 12/11/2010] [Indexed: 11/09/2022]
Abstract
We describe the development of new force fields for protein side chain modeling called optimized side chain atomic energy (OSCAR). The distance-dependent energy functions (OSCAR-d) and side-chain dihedral angle potential energy functions were represented as power and Fourier series, respectively. The resulting 802 adjustable parameters were optimized by discriminating the native side chain conformations from non-native conformations, using a training set of 12,000 side chains for each residue type. In the course of optimization, for every residue, its side chain was replaced by varying rotamers, whereas conformations for all other residues were kept as they appeared in the crystal structure. Then, the OSCAR-d were multiplied by an orientation-dependent function to yield OSCAR-o. A total of 1087 parameters of the orientation-dependent energy functions (OSCAR-o) were optimized by maximizing the energy gap between the native conformation and subrotamers calculated as low energy by OSCAR-d. When OSCAR-o with optimized parameters were used to model side chain conformations simultaneously for 218 recently released protein structures, the prediction accuracies were 88.8% for χ₁, 79.7% for χ₁₊₂, 1.24 Å overall root mean square deviation (RMSD), and 0.62 Å RMSD for core residues, respectively, compared with the next-best performing side-chain modeling program which achieved 86.6% for χ₁, 75.7% for χ₁₊₂, 1.40 Å overall RMSD, and 0.86 Å RMSD for core residues, respectively. The continuous energy functions obtained in this study are suitable for gradient-based optimization techniques for protein structure refinement. A program with built-in OSCAR for protein side chain prediction is available for download at http://sysimm.ifrec.osaka-u.ac.jp/OSCAR/.
Affiliation(s)
- Shide Liang
- Systems Immunology Lab, Immunology Frontier Research Center, Osaka University, Suita, Osaka 565-0871, Japan.
|
91
|
Jones GJ, Bagaini F, Hewinson RG, Vordermeier HM. The use of binding-prediction models to identify M. bovis-specific antigenic peptides for screening assays in bovine tuberculosis. Vet Immunol Immunopathol 2011; 141:239-45. [DOI: 10.1016/j.vetimm.2011.03.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 12/08/2010] [Revised: 02/08/2011] [Accepted: 03/02/2011] [Indexed: 11/28/2022]
|
92
|
Saravanan KM, Selvaraj S. Search for identical octapeptides in unrelated proteins: Structural plasticity revisited. Biopolymers 2011; 98:11-26. [PMID: 23325556 DOI: 10.1002/bip.21676] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 01/05/2011] [Revised: 03/31/2011] [Accepted: 05/10/2011] [Indexed: 12/22/2022]
Abstract
Because proteins are dynamic, they can alter their local structure in response to changes in environmental factors such as temperature, pH, phosphorylation, and the binding of small molecules. These conformational changes are extremely important for the correct folding and functioning of proteins. A number of diseases, such as amyloid diseases, are also associated with protein conformational change. To stimulate research into the factors that specify one conformation over another, different theoretical models have been proposed and tested against protein fragments that are similar in sequence but distant in structure. To reduce the computational complexity of identifying conformational changes in proteins, various research groups have employed local sequence search algorithms and examined structural plasticity in unrelated proteins. In the present work, we revisit the mechanism of structural plasticity in unrelated proteins, using the increased number of structures in the Protein Data Bank, by comparing identical octapeptides in unrelated proteins with the dictionary of protein secondary structure extracted from existing experimental data. Our goal is to bring out the influence of hydrophobic residues, hydrophilic residues, flanking residues, differences in the secondary structural propensities of surrounding residues, differences in phi-psi angles, and local and nonlocal interactions in identical octapeptides adopting different conformations. We have also used surrounding hydrophobicity, environment-dependent interaction energy, atomic mean force potential, structural unit contacts, and difference profile models to explore the factors that cause structural plasticity. The results discussed here may provide insights into protein folding, design, and function.
Affiliation(s)
- K M Saravanan
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, Tamil Nadu, India
|
93
|
Song Y, Tyka M, Leaver-Fay A, Thompson J, Baker D. Structure-guided forcefield optimization. Proteins 2011; 79:1898-909. [PMID: 21488100 PMCID: PMC3457920 DOI: 10.1002/prot.23013] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Received: 08/24/2010] [Revised: 01/06/2011] [Accepted: 01/20/2011] [Indexed: 11/06/2022]
Abstract
Accurate modeling of biomolecular systems requires accurate forcefields. Widely used molecular mechanics (MM) forcefields obtain parameters from experimental data and quantum chemistry calculations on small molecules but do not have a clear way to take advantage of the information in high-resolution macromolecular structures. In contrast, knowledge-based methods largely ignore the physical chemistry of interatomic interactions, and instead derive parameters almost exclusively from macromolecular structures. This can involve considerable double counting of the same physical interactions. Here, we describe a method for forcefield improvement that combines the strengths of the two approaches. We use this method to improve the Rosetta all-atom forcefield, in which the total energy is expressed as the sum of terms representing different physical interactions as in MM forcefields and the parameters are tuned to reproduce the properties of macromolecular structures. To resolve inaccuracies resulting from possible double counting of interactions, we compare distribution functions from low-energy modeled structures to those from crystal structures. The structural and physical bases of the deviations between the modeled and reference structures are identified and used to guide forcefield improvements. We describe improvements resolving double counting between backbone hydrogen bond interactions and Lennard-Jones interactions in helices; between sidechain-backbone hydrogen bonds and the backbone torsion potential; and between the sidechain torsion potential and Lennard-Jones interactions. Discrepancies between computed and observed distributions are also used to guide the incorporation of an explicit Cα-hydrogen bond in β sheets. The method can be used generally to integrate different sources of information for forcefield improvement.
Affiliation(s)
- Yifan Song
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
|
94
|
Wu C, Shea JE. Coarse-grained models for protein aggregation. Curr Opin Struct Biol 2011; 21:209-20. [DOI: 10.1016/j.sbi.2011.02.002] [Citation(s) in RCA: 148] [Impact Index Per Article: 11.4] [Received: 12/06/2010] [Revised: 02/03/2011] [Accepted: 02/07/2011] [Indexed: 01/09/2023]
|
95
|
Mittal A, Jayaram B. Backbones of Folded Proteins Reveal Novel Invariant Amino Acid Neighborhoods. J Biomol Struct Dyn 2011; 28:443-54. [DOI: 10.1080/073911011010524954] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 01/08/2023]
|
96
|
Shen Q, Xiong B, Zheng M, Luo X, Luo C, Liu X, Du Y, Li J, Zhu W, Shen J, Jiang H. Knowledge-Based Scoring Functions in Drug Design: 2. Can the Knowledge Base Be Enriched? J Chem Inf Model 2010; 51:386-97. [DOI: 10.1021/ci100343j] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Affiliation(s)
- Qiancheng Shen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Bing Xiong
- State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Xiaomin Luo
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Cheng Luo
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Xian Liu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Yun Du
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Jing Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Weiliang Zhu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Jingkang Shen
- State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| | - Hualiang Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
- School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
|
97
|
Hamelryck T, Borg M, Paluszewski M, Paulsen J, Frellsen J, Andreetta C, Boomsma W, Bottaro S, Ferkinghoff-Borg J. Potentials of mean force for protein structure prediction vindicated, formalized and generalized. PLoS One 2010; 5:e13714. [PMID: 21103041 PMCID: PMC2978081 DOI: 10.1371/journal.pone.0013714] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Received: 07/07/2010] [Accepted: 10/04/2010] [Indexed: 11/26/2022] Open
Abstract
Understanding protein structure is of crucial importance in science, medicine and biotechnology. For about two decades, knowledge-based potentials based on pairwise distances – so-called “potentials of mean force” (PMFs) – have been center stage in the prediction and design of protein structure and the simulation of protein folding. However, the validity, scope and limitations of these potentials are still vigorously debated and disputed, and the optimal choice of the reference state – a necessary component of these potentials – is an unsolved problem. PMFs are loosely justified by analogy to the reversible work theorem in statistical physics, or by a statistical argument based on a likelihood function. Both justifications are insightful but leave many questions unanswered. Here, we show for the first time that PMFs can be seen as approximations to quantities that do have a rigorous probabilistic justification: they naturally arise when probability distributions over different features of proteins need to be combined. We call these quantities “reference ratio distributions” deriving from the application of the “reference ratio method.” This new view is not only of theoretical relevance but leads to many insights that are of direct practical use: the reference state is uniquely defined and does not require external physical insights; the approach can be generalized beyond pairwise distances to arbitrary features of protein structure; and it becomes clear for which purposes the use of these quantities is justified. We illustrate these insights with two applications, involving the radius of gyration and hydrogen bonding. In the latter case, we also show how the reference ratio method can be iteratively applied to sculpt an energy funnel. Our results considerably increase the understanding and scope of energy functions derived from known biomolecular structures.
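The pairwise-distance potentials discussed in this abstract are commonly written in inverse-Boltzmann form, E(r) = -kT ln[P_obs(r) / P_ref(r)]. The sketch below computes that quantity from binned counts. The function name, the bin counts, and the choice kT = 1 are assumptions for illustration, and the sketch does not implement the paper's reference ratio method itself.

```python
import math

def pairwise_pmf(observed_counts, reference_counts, kT=1.0):
    """Per-bin energy -kT * ln(P_obs / P_ref) from two distance histograms.

    Both histograms must use the same bins and have strictly positive
    counts; a real implementation would need a policy for empty bins.
    """
    if len(observed_counts) != len(reference_counts):
        raise ValueError("histograms must have the same number of bins")
    if min(observed_counts) <= 0 or min(reference_counts) <= 0:
        raise ValueError("all bin counts must be positive in this sketch")
    obs_total = sum(observed_counts)
    ref_total = sum(reference_counts)
    return [
        -kT * math.log((o / obs_total) / (r / ref_total))
        for o, r in zip(observed_counts, reference_counts)
    ]

# Made-up counts for three distance bins.
example_energies = pairwise_pmf([10, 30, 60], [20, 30, 50])
```

Bins observed less often than the reference state predicts receive positive (unfavorable) energies, and bins observed more often receive negative ones, which is why the choice of reference state matters so much for these potentials.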
Affiliation(s)
- Thomas Hamelryck
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Mikael Borg
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Martin Paluszewski
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Jonas Paulsen
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Jes Frellsen
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Christian Andreetta
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Wouter Boomsma
- Biomedical Engineering, Technical University of Denmark (DTU) Elektro, Technical University of Denmark, Lyngby, Denmark
- Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - Sandro Bottaro
- Biomedical Engineering, Technical University of Denmark (DTU) Elektro, Technical University of Denmark, Lyngby, Denmark
| | - Jesper Ferkinghoff-Borg
- Biomedical Engineering, Technical University of Denmark (DTU) Elektro, Technical University of Denmark, Lyngby, Denmark
|
98
|
Abstract
We extend PRIME, an intermediate-resolution protein model previously used in simulations of the aggregation of polyalanine and polyglutamine, to the description of the geometry and energetics of peptides containing all 20 amino acid residues. The 20 amino acid side chains are classified into 14 groups according to their hydrophobicity, polarity, size, charge, and potential for side chain hydrogen bonding. The parameters for extended PRIME, called PRIME 20, include hydrogen-bonding energies, side chain interaction range and energy, and excluded volume. The parameters are obtained by applying a perceptron-learning algorithm and a modified stochastic learning algorithm that optimizes the energy gap between 711 known native states from the PDB and decoy structures generated by gapless threading. The number of independent pair interaction parameters is chosen to be small enough to be physically meaningful yet large enough to give reasonably accurate results in discriminating decoys from native structures. The most physically meaningful results are obtained with 19 energy parameters.
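The energy-gap optimization described here can be pictured with a plain perceptron update. In this minimal sketch, each structure is reduced to a vector of interaction counts and its energy is the dot product with the parameter vector. The linear energy model, the function names, the learning rate, and the toy data are all assumptions for illustration; this is not the PRIME 20 procedure, and the modified stochastic learning step is not shown.

```python
def dot(weights, features):
    """Linear energy: sum of parameter times interaction count."""
    return sum(w * f for w, f in zip(weights, features))

def train_energy_parameters(natives, decoy_sets, n_params, epochs=100, lr=0.1):
    """Perceptron-style fit so that every decoy scores above its native.

    natives[i] is the feature vector of native structure i, and
    decoy_sets[i] is a list of feature vectors for its decoys.
    """
    weights = [0.0] * n_params
    for _ in range(epochs):
        updated = False
        for native, decoys in zip(natives, decoy_sets):
            for decoy in decoys:
                gap = dot(weights, decoy) - dot(weights, native)
                if gap <= 0:  # decoy is not higher in energy: correct it
                    weights = [
                        w + lr * (d - n)
                        for w, d, n in zip(weights, decoy, native)
                    ]
                    updated = True
        if not updated:  # all gaps positive, training has converged
            break
    return weights

# Toy data: one native with two decoys, two interaction types.
toy_natives = [[3, 0]]
toy_decoys = [[[1, 2], [0, 3]]]
toy_weights = train_energy_parameters(toy_natives, toy_decoys, n_params=2)
```

Each update moves the parameters along the difference between the decoy and native feature vectors, which raises the decoy's energy relative to the native's. Keeping the number of parameters small, as the abstract notes, limits how freely such a fit can adapt to the training decoys.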
Affiliation(s)
- Mookyung Cheon
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, North Carolina, USA
|
99
|
Abstract
Knowledge-based approaches frequently employ empirical relations to determine effective potentials for coarse-grained protein models directly from protein databank structures. Although these approaches have enjoyed considerable success and widespread popularity in computational protein science, their fundamental basis has been widely questioned. It is well established that conventional knowledge-based approaches do not correctly treat many-body correlations between amino acids. Moreover, the physical significance of potentials determined by using structural statistics from different proteins has remained obscure. In the present work, we address both of these concerns by introducing and demonstrating a theory for calculating transferable potentials directly from a databank of protein structures. This approach assumes that the databank structures correspond to representative configurations sampled from equilibrium solution ensembles for different proteins. Given this assumption, this physics-based theory exactly treats many-body structural correlations and directly determines the transferable potentials that provide a variationally optimized approximation to the free energy landscape for each protein. We illustrate this approach by first constructing a databank of protein structures using a model potential and then quantitatively recovering this potential from the structure databank. The proposed framework will clarify the assumptions and physical significance of knowledge-based potentials, allow for their systematic improvement, and provide new insight into many-body correlations and cooperativity in folded proteins.
|
100
|
Huang SY, Grinter SZ, Zou X. Scoring functions and their evaluation methods for protein-ligand docking: recent advances and future directions. Phys Chem Chem Phys 2010; 12:12899-908. [PMID: 20730182 PMCID: PMC11103779 DOI: 10.1039/c0cp00151a] [Citation(s) in RCA: 294] [Impact Index Per Article: 21.0] [Indexed: 11/21/2022]
Abstract
The scoring function is one of the most important components in structure-based drug design. Despite considerable success, accurate and rapid prediction of protein-ligand interactions is still a challenge in molecular docking. In this perspective, we have reviewed three basic types of scoring functions (force-field, empirical, and knowledge-based) and the consensus scoring technique that are used for protein-ligand docking. The commonly-used assessment criteria and publicly available protein-ligand databases for performance evaluation of the scoring functions have also been presented and discussed. We end with a discussion of the challenges faced by existing scoring functions and possible future directions for developing improved scoring functions.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute University of Missouri Columbia, MO 65211
| | - Sam Z. Grinter
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute University of Missouri Columbia, MO 65211
| | - Xiaoqin Zou
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute University of Missouri Columbia, MO 65211
|