1
Barradas-Bautista D, Almajed A, Oliva R, Kalnis P, Cavallo L. Improving classification of correct and incorrect protein-protein docking models by augmenting the training set. Bioinformatics Advances 2023; 3:vbad012. [PMID: 36789292 PMCID: PMC9923443 DOI: 10.1093/bioadv/vbad012] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/27/2022] [Revised: 01/20/2023] [Accepted: 02/01/2023] [Indexed: 02/04/2023]
Abstract
Motivation: Protein-protein interactions drive many relevant biological events, such as infection, replication and recognition. To control or engineer such events, we need access to the molecular details of the interaction provided by experimental 3D structures. However, such experiments are slow and expensive; moreover, current technology cannot keep up with the high discovery rate of new interactions. Computational modeling, such as protein-protein docking, can help fill this gap by generating docking poses. Protein-protein docking generally consists of two parts, sampling and scoring. Sampling is an exhaustive search of three-dimensional space. Its caveat is that it generates a large number of incorrect poses, producing a highly unbalanced dataset, which limits the utility of the data for training machine learning classifiers.
Results: Using weak supervision, we developed a data augmentation method that we named hAIkal. Using hAIkal, we increased the labeled training data used to train several algorithms. Of the classifiers we obtained, the best has 81% accuracy and a Matthews correlation coefficient of 0.51 on the test set, surpassing state-of-the-art scoring functions.
Availability and implementation: Docking models from Benchmark 5 are available at https://doi.org/10.5281/zenodo.4012018. Processed tabular data are available at https://repository.kaust.edu.sa/handle/10754/666961. A Google Colab notebook is available at https://colab.research.google.com/drive/1vbVrJcQSf6_C3jOAmZzgQbTpuJ5zC1RP?usp=sharing.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
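The accuracy and Matthews correlation coefficient (MCC) quoted in this abstract can both be computed from a confusion matrix. The sketch below is illustrative only and is not the authors' code: the function names and the toy labels (10 correct poses among 100) are assumptions, chosen to show why MCC is more informative than accuracy on an unbalanced docking set.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, 1 = correct pose, 0 = incorrect."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all poses that were labeled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy, made-up data: 10 correct poses among 100 (hypothetical imbalance).
y_true = [1] * 10 + [0] * 90

# A trivial "classifier" that calls every pose incorrect.
always_incorrect = [0] * 100
counts = confusion_counts(y_true, always_incorrect)

print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
```

On this toy set the trivial classifier reaches high accuracy while its MCC is zero, which is why both numbers are usually reported together for unbalanced data.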
Affiliation(s)
- Ali Almajed
  - Computer, Electrical and Mathematical Science and Engineering Division, Kaust Extreme Computing Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Romina Oliva
  - Department of Sciences and Technologies, University of Naples “Parthenope”, I-80143 Naples, Italy
- Panos Kalnis
  - Computer, Electrical and Mathematical Science and Engineering Division, Kaust Extreme Computing Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Luigi Cavallo
  - Physical Sciences and Engineering Division, Kaust Catalysis Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
2
Wang X, Huang SY. Integrating Bonded and Nonbonded Potentials in the Knowledge-Based Scoring Function for Protein Structure Prediction. J Chem Inf Model 2019; 59:3080-3090. [DOI: 10.1021/acs.jcim.9b00057] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 01/06/2023]
Affiliation(s)
- Xinxiang Wang
  - Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
3
Jiang F, Wu HN, Kang W, Wu YD. Developments and Applications of Coil-Library-Based Residue-Specific Force Fields for Molecular Dynamics Simulations of Peptides and Proteins. J Chem Theory Comput 2019; 15:2761-2773. [DOI: 10.1021/acs.jctc.8b00794] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Affiliation(s)
- Fan Jiang
  - Laboratory of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
- Hao-Nan Wu
  - Laboratory of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
- Wei Kang
  - Laboratory of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
  - College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yun-Dong Wu
  - Laboratory of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
  - College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
4
Wang X, Zhang D, Huang SY. New Knowledge-Based Scoring Function with Inclusion of Backbone Conformational Entropies from Protein Structures. J Chem Inf Model 2018; 58:724-732. [DOI: 10.1021/acs.jcim.7b00601] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023]
Affiliation(s)
- Xinxiang Wang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Di Zhang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
5
Khoury GA, Smadbeck J, Kieslich CA, Koskosidis AJ, Guzman YA, Tamamis P, Floudas CA. Princeton_TIGRESS 2.0: High refinement consistency and net gains through support vector machines and molecular dynamics in double-blind predictions during the CASP11 experiment. Proteins 2017; 85:1078-1098. [PMID: 28241391 DOI: 10.1002/prot.25274] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/27/2016] [Revised: 02/01/2017] [Accepted: 02/14/2017] [Indexed: 12/28/2022]
Abstract
Protein structure refinement is the challenging problem of operating on any protein structure prediction to improve its accuracy with respect to the native structure in a blind fashion. Although many approaches have been developed and tested during the last four CASP experiments, a majority of the methods continue to degrade models rather than improve them. Princeton_TIGRESS (Khoury et al., Proteins 2014;82:794-814) was developed previously and utilizes separate sampling and selection stages involving Monte Carlo and molecular dynamics simulations and classification using an SVM predictor. The initial implementation was shown to consistently refine protein structures 76% of the time in our own internal benchmarking on CASP 7-10 targets. In this work, we improved the sampling and selection stages and tested the method in blind predictions during CASP11. We added a decomposition of physics-based and hybrid energy functions, as well as a coordinate-free representation of the protein structure through distance-binning Cα-Cα distances to capture fine-grained movements. We performed parameter estimation to optimize the adjustable SVM parameters to maximize precision while balancing sensitivity and specificity across all cross-validated data sets, finding enrichment in our ability to select models from the populations of similar decoys generated for targets in CASPs 7-10. The MD stage was enhanced such that larger structures could be further refined. Among refinement methods that are currently implemented as web-servers, Princeton_TIGRESS 2.0 demonstrated the most consistent and most substantial net refinement in blind predictions during CASP11. The enhanced refinement protocol Princeton_TIGRESS 2.0 is freely available as a web server at http://atlas.engr.tamu.edu/refinement/.
Affiliation(s)
- George A Khoury
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey
- James Smadbeck
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey
- Chris A Kieslich
  - Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas
  - Texas A&M Energy Institute, Texas A&M University, College Station, Texas
- Alexandra J Koskosidis
  - Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas
  - Texas A&M Energy Institute, Texas A&M University, College Station, Texas
- Yannis A Guzman
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey
  - Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas
  - Texas A&M Energy Institute, Texas A&M University, College Station, Texas
- Phanourios Tamamis
  - Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas
  - Texas A&M Energy Institute, Texas A&M University, College Station, Texas
- Christodoulos A Floudas
  - Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas
  - Texas A&M Energy Institute, Texas A&M University, College Station, Texas
6
Grudinin S, Kadukova M, Eisenbarth A, Marillet S, Cazals F. Predicting binding poses and affinities for protein-ligand complexes in the 2015 D3R Grand Challenge using a physical model with a statistical parameter estimation. J Comput Aided Mol Des 2016; 30:791-804. [DOI: 10.1007/s10822-016-9976-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 06/13/2016] [Accepted: 09/19/2016] [Indexed: 12/14/2022]
7
Grudinin S, Popov P, Neveu E, Cheremovskiy G. Predicting Binding Poses and Affinities in the CSAR 2013–2014 Docking Exercises Using the Knowledge-Based Convex-PL Potential. J Chem Inf Model 2015; 56:1053-62. [DOI: 10.1021/acs.jcim.5b00339] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 01/15/2023]
Affiliation(s)
- Sergei Grudinin
  - University Grenoble Alpes, LJK, F-38000 Grenoble, France
  - CNRS, LJK, F-38000 Grenoble, France
  - Inria, F-38000 Grenoble, France
- Petr Popov
  - University Grenoble Alpes, LJK, F-38000 Grenoble, France
  - CNRS, LJK, F-38000 Grenoble, France
  - Inria, F-38000 Grenoble, France
  - Moscow Institute of Physics and Technology, Dolgoprudniy, 141700, Russia
- Emilie Neveu
  - University Grenoble Alpes, LJK, F-38000 Grenoble, France
  - CNRS, LJK, F-38000 Grenoble, France
  - Inria, F-38000 Grenoble, France
- Georgy Cheremovskiy
  - University Grenoble Alpes, LJK, F-38000 Grenoble, France
  - CNRS, LJK, F-38000 Grenoble, France
  - Inria, F-38000 Grenoble, France
  - Moscow Institute of Physics and Technology, Dolgoprudniy, 141700, Russia
8
Popov P, Grudinin S. Knowledge of Native Protein–Protein Interfaces Is Sufficient To Construct Predictive Models for the Selection of Binding Candidates. J Chem Inf Model 2015; 55:2242-55. [DOI: 10.1021/acs.jcim.5b00372] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022]
Affiliation(s)
- Petr Popov
  - Université Grenoble Alpes, Laboratoire Jean Kuntzmann (LJK), F-38000 Grenoble, France
  - CNRS, LJK, F-38000 Grenoble, France
  - Inria, F-38000 Grenoble, France
  - Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Sergei Grudinin
  - Université Grenoble Alpes, Laboratoire Jean Kuntzmann (LJK), F-38000 Grenoble, France
  - CNRS, LJK, F-38000 Grenoble, France
  - Inria, F-38000 Grenoble, France
9
Carlsen M, Røgen P. Protein structure refinement by optimization. Proteins 2015; 83:1616-24. [DOI: 10.1002/prot.24846] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 03/10/2015] [Revised: 06/02/2015] [Accepted: 06/08/2015] [Indexed: 12/28/2022]
Affiliation(s)
- Martin Carlsen
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby DK-2800, Denmark
- Peter Røgen
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby DK-2800, Denmark
10
Carlsen M, Koehl P, Røgen P. On the importance of the distance measures used to train and test knowledge-based potentials for proteins. PLoS One 2014; 9:e109335. [PMID: 25411785 PMCID: PMC4239004 DOI: 10.1371/journal.pone.0109335] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 05/18/2014] [Accepted: 08/31/2014] [Indexed: 12/15/2022] Open
Abstract
Knowledge-based potentials are energy functions derived from the analysis of databases of protein structures and sequences. They can be divided into two classes. Potentials from the first class are based on a direct conversion of the distributions of some geometric properties observed in native protein structures into energy values, while potentials from the second class are trained to mimic quantitatively the geometric differences between incorrectly folded models and native structures. In this paper, we focus on the relationship between energy and geometry when training the second class of knowledge-based potentials. We assume that the difference in energy between a decoy structure and the corresponding native structure is linearly related to the distance between the two structures. We trained two distance-based knowledge-based potentials accordingly, one based on all inter-residue distances (PPD), while the other had the set of all distances filtered to reflect consistency in an ensemble of decoys (PPE). We tested four types of metric to characterize the distance between the decoy and the native structure, two based on extrinsic geometry (RMSD and GTD-TS*), and two based on intrinsic geometry (Q* and MT). The corresponding eight potentials were tested on a large collection of decoy sets. We found that it is usually better to train a potential using an intrinsic distance measure. We also found that PPE outperforms PPD, emphasizing the benefits of capturing consistent information in an ensemble. The relevance of these results for the design of knowledge-based potentials is discussed.
Affiliation(s)
- Martin Carlsen
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
- Patrice Koehl
  - Department of Computer Science and Genome Center, University of California Davis, Davis, CA, United States of America
- Peter Røgen
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
11
Moal IH, Jiménez-García B, Fernández-Recio J. CCharPPI web server: computational characterization of protein-protein interactions from structure. Bioinformatics 2014; 31:123-5. [PMID: 25183488 DOI: 10.1093/bioinformatics/btu594] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Indexed: 02/03/2023] Open
Abstract
SUMMARY The atomic structures of protein-protein interactions are central to understanding their role in biological systems, and a wide variety of biophysical functions and potentials have been developed for their characterization and the construction of predictive models. These tools are scattered across a multitude of stand-alone programs, and are often available only as model parameters requiring reimplementation. This acts as a significant barrier to their widespread adoption. CCharPPI integrates many of these tools into a single web server. It calculates up to 108 parameters, including models of electrostatics, desolvation and hydrogen bonding, as well as interface packing and complementarity scores, empirical potentials at various resolutions, docking potentials and composite scoring functions. AVAILABILITY AND IMPLEMENTATION The server does not require registration by the user and is freely available for non-commercial academic use at http://life.bsc.es/pid/ccharppi.
Affiliation(s)
- Iain H Moal
  - Joint BSC-IRB Research Programme in Computational Biology, Department of Life Sciences, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
- Brian Jiménez-García
  - Joint BSC-IRB Research Programme in Computational Biology, Department of Life Sciences, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
- Juan Fernández-Recio
  - Joint BSC-IRB Research Programme in Computational Biology, Department of Life Sciences, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
12
Smadbeck J, Chan KH, Khoury GA, Xue B, Robinson RC, Hauser CAE, Floudas CA. De novo design and experimental characterization of ultrashort self-associating peptides. PLoS Comput Biol 2014; 10:e1003718. [PMID: 25010703 PMCID: PMC4091692 DOI: 10.1371/journal.pcbi.1003718] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 01/07/2014] [Accepted: 05/31/2014] [Indexed: 12/19/2022] Open
Abstract
Self-association is a common phenomenon in biology and one that can have positive and negative impacts, from the construction of the architectural cytoskeleton of cells to the formation of fibrils in amyloid diseases. Understanding the nature and mechanisms of self-association is important for modulating these systems and in creating biologically-inspired materials. Here, we present a two-stage de novo peptide design framework that can generate novel self-associating peptide systems. The first stage uses a simulated multimeric template structure as input into the optimization-based Sequence Selection to generate low potential energy sequences. The second stage is a computational validation procedure that calculates Fold Specificity and/or Approximate Association Affinity (K*association) based on metrics that we have devised for multimeric systems. This framework was applied to the design of self-associating tripeptides using the known self-associating tripeptide, Ac-IVD, as a structural template. Six computationally predicted tripeptides (Ac-LVE, Ac-YYD, Ac-LLE, Ac-YLD, Ac-MYD, Ac-VIE) were chosen for experimental validation in order to illustrate the self-association outcomes predicted by the three metrics. Self-association and electron microscopy studies revealed that Ac-LLE formed bead-like microstructures, Ac-LVE and Ac-YYD formed fibrillar aggregates, Ac-VIE and Ac-MYD formed hydrogels, and Ac-YLD crystallized under ambient conditions. An X-ray crystallographic study was carried out on a single crystal of Ac-YLD, which revealed that each molecule adopts a β-strand conformation and that the molecules stack together to form parallel β-sheets. As an additional validation of the approach, the hydrogel-forming sequences of Ac-MYD and Ac-VIE were shuffled. The shuffled sequences were computationally predicted to have lower K*association values and were experimentally verified to not form hydrogels. This illustrates the robustness of the framework in predicting self-associating tripeptides. We expect that this enhanced multimeric de novo peptide design framework will find future application in creating novel self-associating peptides based on unnatural amino acids, and inhibitor peptides of detrimental self-aggregating biological proteins.
Affiliation(s)
- James Smadbeck
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey, United States of America
- Kiat Hwa Chan
  - Institute of Bioengineering and Nanotechnology, Singapore, Singapore
- George A. Khoury
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey, United States of America
- Bo Xue
  - Institute of Molecular and Cell Biology, A*STAR (Agency of Science, Technology and Research), Biopolis, Singapore, Singapore
- Robert C. Robinson
  - Institute of Molecular and Cell Biology, A*STAR (Agency of Science, Technology and Research), Biopolis, Singapore, Singapore
- Christodoulos A. Floudas
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey, United States of America
13
Krüger DM, Ignacio Garzón J, Chacón P, Gohlke H. DrugScorePPI knowledge-based potentials used as scoring and objective function in protein-protein docking. PLoS One 2014; 9:e89466. [PMID: 24586799 PMCID: PMC3931789 DOI: 10.1371/journal.pone.0089466] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 11/12/2013] [Accepted: 01/20/2014] [Indexed: 02/06/2023] Open
Abstract
The distance-dependent knowledge-based DrugScorePPI potentials, previously developed for in silico alanine scanning and hot spot prediction on given structures of protein-protein complexes, are evaluated as a scoring and objective function for the structure prediction of protein-protein complexes. When applied for ranking “unbound perturbation” (“unbound docking”) decoys generated by Baker and coworkers, a 4-fold (1.5-fold) enrichment of acceptable docking solutions in the top ranks compared to a random selection is found. When applied as an objective function in FRODOCK for bound protein-protein docking on 97 complexes of the ZDOCK benchmark 3.0, DrugScorePPI/FRODOCK finds up to 10% (15%) more high accuracy solutions in the top 1 (top 10) predictions than the original FRODOCK implementation. When used as an objective function for global unbound protein-protein docking, fair docking success rates are obtained, which improve by ∼2-fold to 18% (58%) for an at least acceptable solution in the top 10 (top 100) predictions when performing knowledge-driven unbound docking. This suggests that DrugScorePPI balances well several different types of interactions important for protein-protein recognition. The results are discussed in view of the influence of crystal packing and the type of protein-protein complex docked. Finally, a simple criterion is provided with which to estimate a priori if unbound docking with DrugScorePPI/FRODOCK will be successful.
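The "enrichment ... compared to a random selection" reported in this abstract can be illustrated with a common definition of the enrichment factor: the hit rate among the top-ranked decoys divided by the hit rate over the whole decoy set. The sketch below assumes that definition, which may differ in detail from the one used in the paper; the function name and the toy ranking are hypothetical.

```python
def enrichment_factor(ranked_labels, top_n):
    """Enrichment of hits in the top_n ranks relative to random selection.

    ranked_labels: list of 1 (acceptable decoy) / 0 (not acceptable),
                   ordered from best score to worst score.
    """
    total = len(ranked_labels)
    if not 0 < top_n <= total:
        raise ValueError("top_n must be between 1 and the number of decoys")
    total_hits = sum(ranked_labels)
    if total_hits == 0:
        raise ValueError("enrichment is undefined when there are no hits")
    top_hits = sum(ranked_labels[:top_n])
    # (top_hits / top_n) / (total_hits / total), written with a single division
    return (top_hits * total) / (top_n * total_hits)


# Toy, made-up ranking: 100 decoys, 10 acceptable, 4 of them in the top 10.
ranked = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0] + [1] * 6 + [0] * 84

print(enrichment_factor(ranked, top_n=10))
```

A value of 1 corresponds to random selection; in this toy ranking the top 10 contain four times as many acceptable decoys as a random draw of 10 would be expected to.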
Affiliation(s)
- Dennis M. Krüger
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich-Heine-University, Düsseldorf, Germany
- José Ignacio Garzón
  - Rocasolano Physical Chemistry Institute, Consejo Superior de Investigaciones Científicas, Madrid, Spain
- Pablo Chacón
  - Rocasolano Physical Chemistry Institute, Consejo Superior de Investigaciones Científicas, Madrid, Spain
- Holger Gohlke
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich-Heine-University, Düsseldorf, Germany
14
Huang SY, Zou X. ITScorePro: an efficient scoring program for evaluating the energy scores of protein structures for structure prediction. Methods Mol Biol 2014; 1137:71-81. [PMID: 24573475 PMCID: PMC11121506 DOI: 10.1007/978-1-4939-0366-5_6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/25/2023]
Abstract
One important component in protein structure prediction is to evaluate the free energy of a given conformation. Given the enormous number of possible conformations for a sequence, it is extremely challenging to quickly and accurately score the energies of these conformations and predict a reasonable structure within a practical computational time. Here, we describe an efficient program for energy evaluation, referred to as ITScorePro (Copyright © 2012). The energy scoring function in the ITScorePro program is based on the distance-dependent, pairwise atomic potentials for protein structure prediction that we recently derived by using statistical mechanics principles (Huang and Zou, Proteins 79:2648-2661, 2011). ITScorePro is a stand-alone program and can also be easily implemented in other software suites for protein structure prediction.
Affiliation(s)
- Sheng-You Huang
  - Department of Physics and Astronomy, Dalton Cardiovascular Research Center, Informatics Institute, University of Missouri, Columbia, MO, USA
15
Chakraborty S, Venkatramani R, Rao BJ, Asgeirsson B, Dandekar AM. The electrostatic profile of consecutive Cβ atoms applied to protein structure quality assessment. F1000Res 2013; 2:243. [PMID: 25506420 PMCID: PMC4257144 DOI: 10.12688/f1000research.2-243.v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Accepted: 09/16/2014] [Indexed: 02/10/2024] Open
Abstract
The structure of a protein provides insight into its physiological interactions with other components of the cellular soup. Methods that predict putative structures from sequences typically yield multiple, closely-ranked possibilities. A critical component in the process is the model quality assessing program (MQAP), which selects the best candidate from this pool of structures. Here, we present a novel MQAP based on the physical properties of sidechain atoms. We propose a method for assessing the quality of protein structures based on the electrostatic potential difference (EPD) of Cβ atoms in consecutive residues. We demonstrate that the EPDs of Cβ atoms on consecutive residues provide unique signatures of the amino acid types. The EPDs of Cβ atoms are learnt from a set of 1000 non-homologous protein structures with a resolution cutoff of 1.6 Å obtained from the PISCES database. Based on the Boltzmann hypothesis that lower energy conformations are proportionately sampled more, and on Anfinsen's thermodynamic hypothesis that the native structure of a protein is the minimum free energy state, we hypothesize that the deviation of observed EPD values from the mean values obtained in the learning phase is minimized in the native structure. We achieved an average specificity of 0.91, 0.94 and 0.93 on the hg_structal, 4state_reduced and ig_structal decoy sets, respectively, taken from the Decoys 'R' Us database. The source code and manual are available at https://github.com/sanchak/mqap and permanently archived under DOI 10.5281/zenodo.7134.
Affiliation(s)
- Sandeep Chakraborty
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Ravindra Venkatramani
  - Department of Chemical Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Basuthkar J. Rao
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Bjarni Asgeirsson
  - Science Institute, Department of Biochemistry, University of Iceland, IS-107 Reykjavik, Iceland
- Abhaya M. Dandekar
  - Plant Sciences Department, University of California, Davis, CA, 95616, USA
16
Ghosh R, Roy S, Bagchi B. Solvent Sensitivity of Protein Unfolding: Dynamical Study of Chicken Villin Headpiece Subdomain in Water–Ethanol Binary Mixture. J Phys Chem B 2013; 117:15625-38. [DOI: 10.1021/jp406255z] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 01/30/2023]
Affiliation(s)
- Rikhia Ghosh
  - Solid State and Structural Chemistry Unit, Indian Institute of Science, C. V. Raman Avenue, Bangalore 560012, India
- Susmita Roy
  - Solid State and Structural Chemistry Unit, Indian Institute of Science, C. V. Raman Avenue, Bangalore 560012, India
- Biman Bagchi
  - Solid State and Structural Chemistry Unit, Indian Institute of Science, C. V. Raman Avenue, Bangalore 560012, India
17
Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JPGLM, Trellet M, Bonvin AMJJ, Cui M, Rooman M, Gillis D, Dehouck Y, Moal I, Romero-Durana M, Perez-Cano L, Pallara C, Jimenez B, Fernandez-Recio J, Flores S, Pacella M, Kilambi KP, Gray JJ, Popov P, Grudinin S, Esquivel-Rodríguez J, Kihara D, Zhao N, Korkin D, Zhu X, Demerdash ONA, Mitchell JC, Kanamori E, Tsuchiya Y, Nakamura H, Lee H, Park H, Seok C, Sarmiento J, Liang S, Teraguchi S, Standley DM, Shimoyama H, Terashi G, Takeda-Shitaka M, Iwadate M, Umeyama H, Beglov D, Hall DR, Kozakov D, Vajda S, Pierce BG, Hwang H, Vreven T, Weng Z, Huang Y, Li H, Yang X, Ji X, Liu S, Xiao Y, Zacharias M, Qin S, Zhou HX, Huang SY, Zou X, Velankar S, Janin J, Wodak SJ, Baker D. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 2013; 81:1980-7. [PMID: 23843247 PMCID: PMC4143140 DOI: 10.1002/prot.24356] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.2] [Received: 03/04/2013] [Revised: 06/13/2013] [Accepted: 06/18/2013] [Indexed: 12/25/2022]
Abstract
Community-wide blind prediction experiments such as CAPRI and CASP provide an objective measure of the current state of predictive methodology. Here we describe a community-wide assessment of methods to predict the effects of mutations on protein-protein interactions. Twenty-two groups predicted the effects of comprehensive saturation mutagenesis for two designed influenza hemagglutinin binders and the results were compared with experimental yeast display enrichment data obtained using deep sequencing. The most successful methods explicitly considered the effects of mutation on monomer stability in addition to binding affinity, carried out explicit side-chain sampling and backbone relaxation, evaluated packing, electrostatic, and solvation effects, and correctly identified around a third of the beneficial mutations. Much room for improvement remains for even the best techniques, and large-scale fitness landscapes should continue to provide an excellent test bed for continued evaluation of both existing and new prediction methodologies.
Affiliation(s)
- Rocco Moretti
  - Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
- Sarel J. Fleishman
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel
- Rudi Agius
  - Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London, WC2A 3LY, UK
- Mieczyslaw Torchala
  - Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London, WC2A 3LY, UK
- Paul A. Bates
  - Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London, WC2A 3LY, UK
- Panagiotis L. Kastritis
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CG, Utrecht, the Netherlands
- João P. G. L. M. Rodrigues
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CG, Utrecht, the Netherlands
- Mikaël Trellet
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CG, Utrecht, the Netherlands
- Alexandre M. J. J. Bonvin
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CG, Utrecht, the Netherlands
- Meng Cui
  - Department of Physiology and Biophysics, Virginia Commonwealth University, Richmond, VA 23298, USA
- Marianne Rooman
  - Department of BioModelling, BioInformatics and BioProcesses, Université Libre de Bruxelles (ULB), 1050 Brussels, Belgium
- Dimitri Gillis
  - Department of BioModelling, BioInformatics and BioProcesses, Université Libre de Bruxelles (ULB), 1050 Brussels, Belgium
- Yves Dehouck
  - Department of BioModelling, BioInformatics and BioProcesses, Université Libre de Bruxelles (ULB), 1050 Brussels, Belgium
- Iain Moal
  - Joint BSC-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
- Miguel Romero-Durana
  - Joint BSC-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
- Laura Perez-Cano
  - Joint BSC-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
- Chiara Pallara
  - Joint BSC-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
- Brian Jimenez
  - Joint BSC-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
- Juan Fernandez-Recio
  - Joint BSC-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
| | - Samuel Flores
- Department of Cell and Molecular Biology, Uppsala University, Uppsala, 75124, Sweden
| | - Michael Pacella
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, USA
| | - Krishna Praneeth Kilambi
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
| | - Jeffrey J. Gray
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, Maryland, USA
| | - Petr Popov
- NANO-D, INRIA Grenoble-Rhone-Alpes Research Center, 38334 Saint Ismier Cedex, Montbonnot, France; CNRS, Laboratoire Jean Kuntzmann, BP 53, Grenoble Cedex 9, France
| | - Sergei Grudinin
- NANO-D, INRIA Grenoble-Rhone-Alpes Research Center, 38334 Saint Ismier Cedex, Montbonnot, France; CNRS, Laboratoire Jean Kuntzmann, BP 53, Grenoble Cedex 9, France
| | | | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
| | - Nan Zhao
- Informatics Institute and Department of Computer Science, University of Missouri-Columbia, MO 65211, USA
| | - Dmitry Korkin
- Informatics Institute and Department of Computer Science, University of Missouri-Columbia, MO 65211, USA
| | - Xiaolei Zhu
- Departments of Mathematics and Biochemistry, University of Wisconsin, Madison, WI 53706, USA
| | - Omar N. A. Demerdash
- Departments of Mathematics and Biochemistry, University of Wisconsin, Madison, WI 53706, USA
| | - Julie C. Mitchell
- Departments of Mathematics and Biochemistry, University of Wisconsin, Madison, WI 53706, USA
| | - Eiji Kanamori
- Japan Biological Informatics Consortium, Tokyo, Japan
| | - Yuko Tsuchiya
- Division of Life Sciences, Graduate School of Humanities and Sciences, Ochanomizu University, Tokyo, Japan
| | - Haruki Nakamura
- Institute for Protein Research, Osaka University, Osaka, Japan
| | - Hasup Lee
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
| | - Hahnbeom Park
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
| | - Chaok Seok
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
| | - Jamica Sarmiento
- Systems Immunology Lab, WPI Immunology Frontier Research Center (IFReC), Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Japan
| | - Shide Liang
- Systems Immunology Lab, WPI Immunology Frontier Research Center (IFReC), Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Japan
| | - Shusuke Teraguchi
- Systems Immunology Lab, WPI Immunology Frontier Research Center (IFReC), Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Japan
| | - Daron M. Standley
- Systems Immunology Lab, WPI Immunology Frontier Research Center (IFReC), Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Japan
| | | | | | | | - Mitsuo Iwadate
- Department of Biological Sciences, Faculty of Science and Engineering, Chuo University
| | - Hideaki Umeyama
- Department of Biological Sciences, Faculty of Science and Engineering, Chuo University
| | - Dmitri Beglov
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
| | - David R. Hall
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
| | - Dima Kozakov
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
| | - Brian G. Pierce
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - Howook Hwang
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - Thom Vreven
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - Yangyu Huang
- Huazhong University of Science and Technology, China
| | - Haotian Li
- Huazhong University of Science and Technology, China
| | - Xiufeng Yang
- Huazhong University of Science and Technology, China
| | - Xiaofeng Ji
- Huazhong University of Science and Technology, China
| | - Shiyong Liu
- Huazhong University of Science and Technology, China
| | - Yi Xiao
- Huazhong University of Science and Technology, China
| | - Martin Zacharias
- Physics Department, Technical University Munich, 85748 Garching, Germany
| | - Sanbo Qin
- Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, FL 32306, USA
| | - Huan-Xiang Zhou
- Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, FL 32306, USA
| | - Sheng-You Huang
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, Informatics Institute; University of Missouri-Columbia; Columbia, MO 65211, USA
| | - Xiaoqin Zou
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, Informatics Institute; University of Missouri-Columbia; Columbia, MO 65211, USA
| | - Sameer Velankar
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Joël Janin
- IBBMC, Université Paris-Sud, 91405-Orsay, France
| | - Shoshana J. Wodak
- Department of Biochemistry, University of Toronto, Ontario, Canada M5S 1A8
- Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M5K 1X8, Canada
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, United States
|
18
|
Mirzaie M, Sadeghi M. Delaunay-based nonlocal interactions are sufficient and accurate in protein fold recognition. Proteins 2013; 82:415-23. [DOI: 10.1002/prot.24407] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/15/2013] [Revised: 08/12/2013] [Accepted: 08/21/2013] [Indexed: 01/05/2023]
Affiliation(s)
- Mehdi Mirzaie
- Department of Basic Sciences, Faculty of Paramedical Sciences; Shahid Beheshti University of Medical Sciences; Tehran Iran
- Department of Bioinformatics; School of Computer Science, Institute for Research in Fundamental Sciences (IPM); Tehran Iran
| | - Mehdi Sadeghi
- Department of Bioinformatics, National Institute of Genetic Engineering and Biotechnology; Tehran Iran
|
19
|
Chakraborty S, Venkatramani R, Rao BJ, Asgeirsson B, Dandekar AM. Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms. F1000Res 2013; 2:211. [PMID: 24555103 PMCID: PMC3892923 DOI: 10.12688/f1000research.2-211.v1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Accepted: 12/16/2013] [Indexed: 06/29/2024] Open
Abstract
Predicting the three dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence - ab initio/de novo methods and template-based modeling - both of which typically generate multiple possible native state structures. Model quality assessment programs (MQAP) validate these predicted structures in order to identify the correct native state structure. Here, we propose a MQAP for assessing the quality of protein structures based on the distances of consecutive Cα atoms. We hypothesize that the root-mean-square deviation of the distance of consecutive Cα (RDCC) atoms from the ideal value of 3.8 Å, derived from a statistical analysis of high quality protein structures (top100H database), is minimized in native structures. Based on tests with the top100H set, we propose a RDCC cutoff value of 0.012 Å, above which a structure can be filtered out as a non-native structure. We applied the RDCC discriminator on decoy sets from the Decoys 'R' Us database to show that the native structures in all decoy sets tested have RDCC below the 0.012 Å cutoff. While most decoy sets were either indistinguishable using this discriminator or had very few violations, all the decoy structures in the fisa decoy set were discriminated by applying the RDCC criterion. This highlights the physical non-viability of the fisa decoy set, and possible issues in benchmarking other methods using this set. The source code and manual is made available at https://github.com/sanchak/mqap and permanently available on 10.5281/zenodo.7134.
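The RDCC criterion described in this abstract can be illustrated with a minimal sketch. It assumes Cα coordinates are already available as a list of (x, y, z) tuples in Å and that the chain has no breaks; the function names `rdcc` and `looks_non_native` are hypothetical and are not taken from the authors' released code, whose exact normalization may differ.

```python
import math

IDEAL_CA_DISTANCE = 3.8  # Å, ideal consecutive Cα distance quoted in the abstract
RDCC_CUTOFF = 0.012      # Å, cutoff proposed in the abstract


def rdcc(ca_coords):
    """Root-mean-square deviation of consecutive Cα-Cα distances from 3.8 Å.

    Assumption: ca_coords is an ordered list of (x, y, z) tuples for one
    unbroken chain.
    """
    if len(ca_coords) < 2:
        raise ValueError("need at least two Cα atoms")
    squared_devs = [
        (math.dist(a, b) - IDEAL_CA_DISTANCE) ** 2
        for a, b in zip(ca_coords, ca_coords[1:])
    ]
    return math.sqrt(sum(squared_devs) / len(squared_devs))


def looks_non_native(ca_coords, cutoff=RDCC_CUTOFF):
    """True when the structure exceeds the RDCC cutoff and would be filtered out."""
    return rdcc(ca_coords) > cutoff


# Toy chains on a straight line (illustrative data, not real structures).
ideal_chain = [(3.8 * i, 0.0, 0.0) for i in range(5)]
stretched_chain = [(4.0 * i, 0.0, 0.0) for i in range(5)]

print(looks_non_native(ideal_chain))      # spacing matches the ideal value
print(looks_non_native(stretched_chain))  # every spacing is off by 0.2 Å
```

In practice the coordinates would come from a structure parser, and chain breaks would need to be handled before computing the deviation.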
Affiliation(s)
- Sandeep Chakraborty
- Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
| | - Ravindra Venkatramani
- Department of Chemical Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
| | - Basuthkar J. Rao
- Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
| | - Bjarni Asgeirsson
- Science Institute, Department of Biochemistry, University of Iceland, Reykjavik, IS-107, Iceland
| | - Abhaya M. Dandekar
- Plant Sciences Department, University of California, Davis, CA 95616, USA
|
20
|
Chakraborty S, Venkatramani R, Rao BJ, Asgeirsson B, Dandekar AM. Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms. F1000Res 2013; 2:211. [PMID: 24555103 DOI: 10.12688/f1000research.2-211.v1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Accepted: 10/10/2013] [Indexed: 01/22/2023] Open
Affiliation(s)
- Sandeep Chakraborty
- Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
| | - Ravindra Venkatramani
- Department of Chemical Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
| | - Basuthkar J Rao
- Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
| | - Bjarni Asgeirsson
- Science Institute, Department of Biochemistry, University of Iceland, Reykjavik, IS-107, Iceland
| | - Abhaya M Dandekar
- Plant Sciences Department, University of California, Davis, CA 95616, USA
|
21
|
Moal IH, Torchala M, Bates PA, Fernández-Recio J. The scoring of poses in protein-protein docking: current capabilities and future directions. BMC Bioinformatics 2013; 14:286. [PMID: 24079540 PMCID: PMC3850738 DOI: 10.1186/1471-2105-14-286] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.9] [Received: 06/18/2013] [Accepted: 09/25/2013] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Protein-protein docking, which aims to predict the structure of a protein-protein complex from its unbound components, remains an unresolved challenge in structural bioinformatics. An important step is the ranking of docked poses using a scoring function, for which many methods have been developed. There is a need to explore the differences and commonalities of these methods with each other, as well as with functions developed in the fields of molecular dynamics and homology modelling. RESULTS We present an evaluation of 115 scoring functions on an unbound docking decoy benchmark covering 118 complexes for which a near-native solution can be found, yielding top 10 success rates of up to 58%. Hierarchical clustering is performed, so as to group together functions which identify near-natives in similar subsets of complexes. Three set theoretic approaches are used to identify pairs of scoring functions capable of correctly scoring different complexes. This shows that functions in different clusters capture different aspects of binding and are likely to work together synergistically. CONCLUSIONS All functions designed specifically for docking perform well, indicating that functions are transferable between sampling methods. We also identify promising methods from the field of homology modelling. Further, differential success rates by docking difficulty and solution quality suggest a need for flexibility-dependent scoring. Investigating pairs of scoring functions, the set theoretic measures identify known scoring strategies as well as a number of novel approaches, indicating promising augmentations of traditional scoring methods. Such augmentation and parameter combination strategies are discussed in the context of the learning-to-rank paradigm.
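The "top 10 success rate" reported in this abstract is commonly understood as the fraction of complexes for which at least one near-native pose appears among the ten best-scored poses. A minimal sketch under that assumption follows; the function name `top_n_success_rate` and the toy data are hypothetical, and the benchmark's exact near-native criterion is not reproduced here.

```python
def top_n_success_rate(ranked_labels_by_complex, n=10):
    """Fraction of complexes with at least one near-native pose in the top n.

    Assumption: each value is a list of booleans, ordered from best to worst
    score, where True marks a near-native pose.
    """
    if not ranked_labels_by_complex:
        raise ValueError("no complexes given")
    hits = sum(
        1 for labels in ranked_labels_by_complex.values() if any(labels[:n])
    )
    return hits / len(ranked_labels_by_complex)


# Illustrative rankings for four made-up complexes.
example = {
    "complex_A": [False, True, False],    # near-native at rank 2
    "complex_B": [False] * 12 + [True],   # near-native only at rank 13
    "complex_C": [False] * 5,             # no near-native pose at all
    "complex_D": [True],                  # near-native at rank 1
}

print(top_n_success_rate(example, n=10))
```

Raising `n` can only keep or increase the rate, which is why success rates are usually quoted together with the cutoff used.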
Affiliation(s)
- Iain H Moal
- Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, Barcelona 08034, Spain
| | - Mieczyslaw Torchala
- Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London WC2A 3LY, UK
| | - Paul A Bates
- Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London WC2A 3LY, UK
| | - Juan Fernández-Recio
- Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, Barcelona 08034, Spain
|
22
|
Smadbeck J, Peterson MB, Khoury GA, Taylor MS, Floudas CA. Protein WISDOM: a workbench for in silico de novo design of biomolecules. J Vis Exp 2013. [PMID: 23912941 DOI: 10.3791/50476] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 10/31/2022] Open
Abstract
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity. To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Affiliation(s)
- James Smadbeck
- Department of Chemical and Biological Engineering, Princeton University, USA
|
23
|
Kauffman C, Karypis G. Coarse- and fine-grained models for proteins: Evaluation by decoy discrimination. Proteins 2013. [DOI: 10.1002/prot.24222] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/03/2023]
Affiliation(s)
- Chris Kauffman
- Department of Computer Science, George Mason University, Fairfax, Virginia 22030, USA.
|
24
|
Røgen P, Koehl P. Extracting knowledge from protein structure geometry. Proteins 2013; 81:841-51. [PMID: 23280479 DOI: 10.1002/prot.24242] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/29/2012] [Revised: 11/28/2012] [Accepted: 12/08/2012] [Indexed: 11/06/2022]
Abstract
Protein structure prediction techniques proceed in two steps, namely the generation of many structural models for the protein of interest, followed by an evaluation of all these models to identify those that are native-like. In theory, the second step is easy, as native structures correspond to minima of their free energy surfaces. It is well known however that the situation is more complicated as the current force fields used for molecular simulations fail to recognize native states from misfolded structures. In an attempt to solve this problem, we follow an alternate approach and derive a new potential from geometric knowledge extracted from native and misfolded conformers of protein structures. This new potential, Metric Protein Potential (MPP), has two main features that are key to its success. Firstly, it is composite in that it includes local and nonlocal geometric information on proteins. At the short range level, it captures and quantifies the mapping between the sequences and structures of short (7-mer) fragments of protein backbones through the introduction of a new local energy term. The local energy term is then augmented with a nonlocal residue-based pairwise potential, and a solvent potential. Secondly, it is optimized to yield a maximized correlation between the energy of a structural model and its root mean square (RMS) to the native structure of the corresponding protein. We have shown that MPP yields high correlation values between RMS and energy and that it is able to retrieve the native structure of a protein from a set of high-resolution decoys.
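The quantity this abstract says the potential is optimized for, the correlation between a model's energy and its RMS deviation from the native structure, can be sketched as a plain Pearson coefficient over a decoy set. The helper `pearson` and the toy numbers below are illustrative assumptions only; they do not reproduce the MPP potential or its optimization procedure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up decoy set: RMS deviation from native (Å) and an arbitrary energy
# that happens to rise linearly with RMS.
rms_to_native = [0.5, 1.0, 2.0, 4.0]
energy = [-10.0, -9.0, -7.0, -3.0]

r = pearson(rms_to_native, energy)
best = min(range(len(energy)), key=energy.__getitem__)  # lowest-energy decoy
print(round(r, 3), rms_to_native[best])
```

A coefficient close to 1 means that lower energies track lower RMS values, so the lowest-energy decoy is also the most native-like one in this toy set.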
Affiliation(s)
- Peter Røgen
- Department of Mathematics, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark.
|
25
|
Chakraborty S, Venkatramani R, Rao BJ, Asgeirsson B, Dandekar AM. The electrostatic profile of consecutive Cβ atoms applied to protein structure quality assessment. F1000Res 2013; 2:243. [PMID: 25506420 PMCID: PMC4257144 DOI: 10.12688/f1000research.2-243.v3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Accepted: 09/16/2014] [Indexed: 12/23/2022] Open
Abstract
The structure of a protein provides insight into its physiological interactions with other components of the cellular soup. Methods that predict putative structures from sequences typically yield multiple, closely-ranked possibilities. A critical component in the process is the model quality assessing program (MQAP), which selects the best candidate from this pool of structures. Here, we present a novel MQAP based on the physical properties of sidechain atoms. We propose a method for assessing the quality of protein structures based on the electrostatic potential difference (EPD) of Cβ atoms in consecutive residues. We demonstrate that the EPDs of Cβ atoms on consecutive residues provide unique signatures of the amino acid types. The EPD of Cβ atoms are learnt from a set of 1000 non-homologous protein structures with a resolution cutoff of 1.6 Å obtained from the PISCES database. Based on the Boltzmann hypothesis that lower energy conformations are proportionately sampled more, and on Anfinsen's thermodynamic hypothesis that the native structure of a protein is the minimum free energy state, we hypothesize that the deviation of observed EPD values from the mean values obtained in the learning phase is minimized in the native structure. We achieved an average specificity of 0.91, 0.94 and 0.93 on hg_structal, 4state_reduced and ig_structal decoy sets, respectively, taken from the Decoys 'R' Us database. The source code and manual is made available at https://github.com/sanchak/mqap and permanently available on 10.5281/zenodo.7134.
Affiliation(s)
- Sandeep Chakraborty
- Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
| | - Ravindra Venkatramani
- Department of Chemical Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
| | - Basuthkar J Rao
- Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
| | - Bjarni Asgeirsson
- Science Institute, Department of Biochemistry, University of Iceland, IS-107 Reykjavik, Iceland
| | - Abhaya M Dandekar
- Plant Sciences Department, University of California, Davis, CA, 95616, USA
|
26
|
Pape S, Hoffgaard F, Dür M, Hamacher K. Distance dependency and minimum amino acid alphabets for decoy scoring potentials. J Comput Chem 2012; 34:10-20. [DOI: 10.1002/jcc.23099] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 05/02/2012] [Revised: 07/12/2012] [Accepted: 07/26/2012] [Indexed: 11/09/2022]
|
27
|
Abstract
The prediction of loop structures is considered one of the main challenges in the protein folding problem. Regardless of the dependence of the overall algorithm on the protein data bank, the flexibility of loop regions dictates the need for special attention to their structures. In this article, we present algorithms for loop structure prediction with fixed stem and flexible stem geometry. In the flexible stem geometry problem, only the secondary structure of three stem residues on either side of the loop is known. In the fixed stem geometry problem, the structure of the three stem residues on either side of the loop is also known. Initial loop structures are generated using a probability database for the flexible stem geometry problem, and using torsion angle dynamics for the fixed stem geometry problem. Three rotamer optimization algorithms are introduced to alleviate steric clashes between the generated backbone structures and the side chain rotamers. The structures are optimized by energy minimization using an all-atom force field. The optimized structures are clustered using a traveling salesman problem-based clustering algorithm. The structures in the densest clusters are then utilized to refine dihedral angle bounds on all amino acids in the loop. The entire procedure is carried out for a number of iterations, leading to improved structure prediction and refined dihedral angle bounds. The algorithms presented in this article have been tested on 3190 loops from the PDBSelect25 data set and on targets from the recently concluded CASP9 community-wide experiment.
Affiliation(s)
- A. Subramani
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
| | - C. A. Floudas
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
|
28
|
Subramani A, Wei Y, Floudas CA. ASTRO-FOLD 2.0: an Enhanced Framework for Protein Structure Prediction. AIChE J 2012; 58:1619-1637. [PMID: 23049093 DOI: 10.1002/aic.12669] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 12/30/2022]
Abstract
The three-dimensional (3-D) structure prediction of proteins, given their amino acid sequence, is addressed using the first principles-based approach ASTRO-FOLD 2.0. The key features presented are: (1) Secondary structure prediction using a novel optimization-based consensus approach, (2) β-sheet topology prediction using mixed-integer linear optimization (MILP), (3) Residue-to-residue contact prediction using a high-resolution distance-dependent force field and MILP formulation, (4) Tight dihedral angle and distance bound generation for loop residues using dihedral angle clustering and non-linear optimization (NLP), (5) 3-D structure prediction using deterministic global optimization, stochastic conformational space annealing, and the full-atomistic ECEPP/3 potential, (6) Near-native structure selection using a traveling salesman problem-based clustering approach, ICON, and (7) Improved bound generation using chemical shifts of subsets of heavy atoms, generated by SPARTA and CS23D. Computational results of ASTRO-FOLD 2.0 on 47 blind targets of the recently concluded CASP9 experiment are presented.
Affiliation(s)
- A Subramani
- Dept. of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544
|
29
|
Tamamis P, de Victoria AL, Gorham RD, Bellows-Peterson ML, Pierou P, Floudas CA, Morikis D, Archontis G. Molecular dynamics in drug design: new generations of compstatin analogs. Chem Biol Drug Des 2012; 79:703-18. [PMID: 22233517 PMCID: PMC3319835 DOI: 10.1111/j.1747-0285.2012.01324.x] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 01/25/2023]
Abstract
We report the computational and rational design of new generations of potential peptide-based inhibitors of the complement protein C3 from the compstatin family. The binding efficacy of the peptides is tested by extensive molecular dynamics-based structural and physicochemical analysis, using 32 atomic detail trajectories in explicit water for 22 peptides bound to human, rat or mouse target protein C3, with a total of 257 ns. The criteria for the new design are: (i) optimization for C3 affinity and for the balance between hydrophobicity and polarity to improve solubility compared to known compstatin analogs; and (ii) development of dual specificity, human-rat/mouse C3 inhibitors, which could be used in animal disease models. Three of the new analogs are analyzed in more detail as they possess strong and novel binding characteristics and are promising candidates for further optimization. This work paves the way for the development of an improved therapeutic for age-related macular degeneration, and other complement system-mediated diseases, compared to known compstatin variants.
Affiliation(s)
- Phanourios Tamamis
- Department of Bioengineering, University of California, Riverside, California 92521, USA
- Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, USA
- Ronald D. Gorham
- Department of Bioengineering, University of California, Riverside, California 92521, USA
- Meghan L. Bellows-Peterson
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, USA
- Panayiota Pierou
- Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
- Christodoulos A. Floudas
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, USA
- Dimitrios Morikis
- Department of Bioengineering, University of California, Riverside, California 92521, USA
- Georgios Archontis
- Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
|
30
|
Koppole S, Schaefer M. A discriminative Ramachandran potential of mean force aimed at minimizing secondary structure bias. J Comput Chem 2012; 33:791-9. [DOI: 10.1002/jcc.22908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2011] [Revised: 10/24/2011] [Accepted: 11/20/2011] [Indexed: 11/12/2022]
|
31
|
Huang SY, Zou X. Statistical mechanics-based method to extract atomic distance-dependent potentials from protein structures. Proteins 2011; 79:2648-61. [PMID: 21732421 PMCID: PMC11108592 DOI: 10.1002/prot.23086] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 01/08/2011] [Revised: 04/21/2011] [Accepted: 05/09/2011] [Indexed: 12/25/2022]
Abstract
In this study, we have developed a statistical mechanics-based iterative method to extract statistical atomic interaction potentials from known, nonredundant protein structures. Our method circumvents the long-standing reference state problem in deriving traditional knowledge-based scoring functions, by using rapid iterations through a physical, global convergence function. The rapid convergence of this physics-based method, unlike other parameter optimization methods, warrants the feasibility of deriving distance-dependent, all-atom statistical potentials to keep the scoring accuracy. The derived potentials, referred to as ITScore/Pro, have been validated using three diverse benchmarks: the high-resolution decoy set, the AMBER benchmark decoy set, and the CASP8 decoy set. Significant improvement in performance has been achieved. Finally, comparisons between the potentials of our model and potentials of a knowledge-based scoring function with a randomized reference state have revealed the reason for the better performance of our scoring function, which could provide useful insight into the development of other physical scoring functions. The potentials developed in this study are generally applicable for structural selection in protein structure prediction.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
- Xiaoqin Zou
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
|
32
|
Tian L, Wu A, Cao Y, Dong X, Hu Y, Jiang T. NCACO-score: an effective main-chain dependent scoring function for structure modeling. BMC Bioinformatics 2011; 12:208. [PMID: 21612673 PMCID: PMC3123610 DOI: 10.1186/1471-2105-12-208] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/20/2011] [Accepted: 05/26/2011] [Indexed: 11/10/2022] Open
Abstract
Background Development of effective scoring functions is a critical component to the success of protein structure modeling. Previously, many efforts have been dedicated to the development of scoring functions. Despite these efforts, development of an effective scoring function that can achieve both good accuracy and fast speed still presents a grand challenge. Results Based on a coarse-grained representation of a protein structure by using only four main-chain atoms: N, Cα, C and O, we develop a knowledge-based scoring function, called NCACO-score, that integrates different structural information to rapidly model protein structure from sequence. In testing on the Decoys'R'Us sets, we found that NCACO-score can effectively recognize native conformers from their decoys. Furthermore, we demonstrate that NCACO-score can effectively guide fragment assembly for protein structure prediction, which has achieved a good performance in building the structure models for hard targets from CASP8 in terms of both accuracy and speed. Conclusions Although NCACO-score is developed based on a coarse-grained model, it is able to discriminate native conformers from decoy conformers with high accuracy. NCACO is a very effective scoring function for structure modeling.
Affiliation(s)
- Liqing Tian
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
|
33
|
Bellows ML, Taylor MS, Cole PA, Shen L, Siliciano RF, Fung HK, Floudas CA. Discovery of entry inhibitors for HIV-1 via a new de novo protein design framework. Biophys J 2011; 99:3445-53. [PMID: 21081094 DOI: 10.1016/j.bpj.2010.09.050] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Received: 02/17/2010] [Revised: 09/23/2010] [Accepted: 09/27/2010] [Indexed: 12/11/2022] Open
Abstract
A new (to our knowledge) de novo design framework with a ranking metric based on approximate binding affinity calculations is introduced and applied to the discovery of what we believe are novel HIV-1 entry inhibitors. The framework consists of two stages: a sequence selection stage and a validation stage. The sequence selection stage produces a rank-ordered list of amino-acid sequences by solving an integer programming sequence selection model. The validation stage consists of fold specificity and approximate binding affinity calculations. The designed peptidic inhibitors are 12-amino-acids-long and target the hydrophobic core of gp41. A number of the best-predicted sequences were synthesized and their inhibition of HIV-1 was tested in cell culture. All peptides examined showed inhibitory activity when compared with no drug present, and the novel peptide sequences outperformed the native template sequence used for the design. The best sequence showed micromolar inhibition, which is a 3-15-fold improvement over the native sequence, depending on the donor. In addition, the best sequence equally inhibited wild-type and Enfuvirtide-resistant virus strains.
Affiliation(s)
- M L Bellows
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ, USA
|
34
|
Pan SJ, Cheung WL, Fung HK, Floudas CA, Link AJ. Computational design of the lasso peptide antibiotic microcin J25. Protein Eng Des Sel 2011; 24:275-82. [PMID: 21106549 PMCID: PMC3038460 DOI: 10.1093/protein/gzq108] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 07/06/2010] [Revised: 10/04/2010] [Accepted: 10/26/2010] [Indexed: 11/12/2022] Open
Abstract
Microcin J25 (MccJ25) is a 21 amino acid (aa) ribosomally synthesized antimicrobial peptide with an unusual structure in which the eight N-terminal residues form a covalently cyclized macrolactam ring through which the remaining 13 aa tail is fed. An open question is the extent of sequence space that can occupy such an extraordinary, highly constrained peptide fold. To begin answering this question, we have undertaken a computational redesign of the MccJ25 peptide using a two-stage sequence selection procedure based on both energy minimization and fold specificity. Eight of the most highly ranked sequences from the design algorithm, each of which contained two or three amino acid substitutions, were expressed in Escherichia coli and tested for production and antimicrobial activity. Six of the eight variants were successfully produced by E. coli at production levels comparable with that of the wild-type peptide. Of these six variants, three retain detectable antimicrobial activity, although this activity is reduced relative to wild-type MccJ25. The results here build upon previous findings that even rigid, constrained structures like the lasso architecture are amenable to redesign. Furthermore, this work provides evidence that a large amount of amino acid variation is tolerated by the lasso peptide fold.
Affiliation(s)
- Si Jia Pan
- Departments of Chemical and Biological Engineering and
- Ho Ki Fung
- Departments of Chemical and Biological Engineering and
- A. James Link
- Departments of Chemical and Biological Engineering and
- Molecular Biology, Princeton University, Princeton, NJ 08544, USA
|
35
|
Dong Q, Zhou S. Novel nonlinear knowledge-based mean force potentials based on machine learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011; 8:476-486. [PMID: 20820079 DOI: 10.1109/tcbb.2010.86] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 05/29/2023]
Abstract
The prediction of 3D structures of proteins from amino acid sequences is one of the most challenging problems in molecular biology. An essential task for solving this problem with coarse-grained models is to deduce effective interaction potentials. The development and evaluation of new energy functions is critical to accurately modeling the properties of biological macromolecules. Knowledge-based mean force potentials are derived from statistical analysis of proteins of known structures. Current knowledge-based potentials are almost all in the form of a weighted linear sum of interaction pairs. In this study, a class of novel nonlinear knowledge-based mean force potentials is presented. The potential parameters are obtained by nonlinear classifiers, instead of relative frequencies of interaction pairs against a reference state or linear classifiers. The support vector machine is used to derive the potential parameters on data sets that contain both native structures and decoy structures. Five knowledge-based mean force Boltzmann-based or linear potentials are introduced and their corresponding nonlinear potentials are implemented. They are the DIH potential (single-body residue-level Boltzmann-based potential), the DFIRE-SCM potential (two-body residue-level Boltzmann-based potential), the FS potential (two-body atom-level Boltzmann-based potential), the HR potential (two-body residue-level linear potential), and the T32S3 potential (two-body atom-level linear potential). Experiments are performed on well-established decoy sets, including the LKF data set, the CASP7 data set, and the Decoys 'R' Us data set. The evaluation metrics include the energy Z score and the ability of each potential to discriminate native structures from a set of decoy structures. Experimental results show that all nonlinear potentials significantly outperform the corresponding Boltzmann-based or linear potentials, and the proposed discriminative framework is effective in developing knowledge-based mean force potentials. The nonlinear potentials can be widely used for ab initio protein structure prediction, model quality assessment, protein docking, and other challenging problems in computational biology.
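The two evaluation metrics named in this abstract, the energy Z score and native-versus-decoy discrimination, can be illustrated with a minimal Python sketch. The function names and the sign convention (a more negative Z score means the native structure is better separated from the decoys) are assumptions made for illustration; the paper's exact definitions are not given here.

```python
import statistics


def energy_z_score(native_energy, decoy_energies):
    """Z score of the native energy against the decoy energy distribution.

    Assumed convention: (E_native - mean_decoys) / std_decoys, so a strongly
    negative value means the native is well below the decoy ensemble.
    """
    mean = statistics.fmean(decoy_energies)
    std = statistics.pstdev(decoy_energies)  # population standard deviation
    if std == 0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean) / std


def native_rank(native_energy, decoy_energies):
    """1-based rank of the native structure; 1 means lowest energy overall."""
    return 1 + sum(e < native_energy for e in decoy_energies)


# Hypothetical energies for one decoy set (arbitrary units).
decoys = [-8.0, -6.0, -4.0, -2.0]
z = energy_z_score(-10.0, decoys)
rank = native_rank(-10.0, decoys)
```

A potential "discriminates" the native structure in this sketch when `rank` equals 1; averaging `z` over many decoy sets gives a single summary number per potential.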
Affiliation(s)
- Qiwen Dong
- Shanghai Key Lab of Intelligent Information Processing and the School of Computer Science, Fudan University, Old Yifu Building, Room 202-5, 220 Handan Road, Shanghai 200433, China.
|
36
|
Mittal A, Jayaram B. Backbones of Folded Proteins Reveal Novel Invariant Amino Acid Neighborhoods. J Biomol Struct Dyn 2011; 28:443-54. [DOI: 10.1080/073911011010524954] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 01/08/2023]
|
37
|
Abstract
We extend PRIME, an intermediate-resolution protein model previously used in simulations of the aggregation of polyalanine and polyglutamine, to the description of the geometry and energetics of peptides containing all 20 amino acid residues. The 20 amino acid side chains are classified into 14 groups according to their hydrophobicity, polarity, size, charge, and potential for side chain hydrogen bonding. The parameters for extended PRIME, called PRIME 20, include hydrogen-bonding energies, side chain interaction range and energy, and excluded volume. The parameters are obtained by applying a perceptron-learning algorithm and a modified stochastic learning algorithm that optimizes the energy gap between 711 known native states from the PDB and decoy structures generated by gapless threading. The number of independent pair interaction parameters is chosen to be small enough to be physically meaningful yet large enough to give reasonably accurate results in discriminating decoys from native structures. The most physically meaningful results are obtained with 19 energy parameters.
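The idea of learning energy parameters so that native states score below threaded decoys can be sketched with a plain perceptron. This is a generic illustration and not the PRIME 20 procedure: the linear energy form, the feature vectors (for example, counts of each interaction type), the margin, and the learning rate are all assumptions.

```python
def energy(weights, features):
    """Linear energy: sum of parameter * feature count (assumed form)."""
    return sum(w * f for w, f in zip(weights, features))


def train_perceptron(pairs, n_params, margin=1.0, lr=0.1, epochs=100):
    """Learn weights so that E(decoy) - E(native) >= margin for every pair.

    pairs: list of (native_features, decoy_features) tuples.
    """
    weights = [0.0] * n_params
    for _ in range(epochs):
        violations = 0
        for native, decoy in pairs:
            gap = energy(weights, decoy) - energy(weights, native)
            if gap < margin:
                violations += 1
                # Move the weights in the direction that widens the gap.
                for i in range(n_params):
                    weights[i] += lr * (decoy[i] - native[i])
        if violations == 0:
            break  # every native is separated from its decoy by the margin
    return weights


# Hypothetical two-parameter training set: (native features, decoy features).
training_pairs = [([3, 1], [1, 3]), ([2, 0], [0, 1])]
learned = train_perceptron(training_pairs, n_params=2)
```

Keeping `n_params` small mirrors the trade-off described in the abstract: fewer parameters are easier to interpret physically, while more parameters separate natives from decoys more reliably.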
Affiliation(s)
- Mookyung Cheon
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, North Carolina, USA
|
38
|
Zhang J, Zhang Y. A novel side-chain orientation dependent potential derived from random-walk reference state for protein fold selection and structure prediction. PLoS One 2010; 5:e15386. [PMID: 21060880 PMCID: PMC2965178 DOI: 10.1371/journal.pone.0015386] [Citation(s) in RCA: 173] [Impact Index Per Article: 12.4] [Received: 08/05/2010] [Accepted: 09/01/2010] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND An accurate potential function is essential to attack protein folding and structure prediction problems. The key to developing efficient knowledge-based potential functions is to design reference states that can appropriately counteract generic interactions. However, the reference states of many knowledge-based distance-dependent atomic potential functions were derived from non-interacting particles such as an ideal gas, which ignores the inherent sequence connectivity and entropic elasticity of proteins. METHODOLOGY We developed a new pair-wise distance-dependent, atomic statistical potential function (RW), using an ideal random-walk chain as reference state, which was optimized on CASP models and then benchmarked on nine structural decoy sets. Second, we incorporated a new side-chain orientation-dependent energy term into RW (RWplus) and found that the side-chain packing orientation specificity can further improve the decoy recognition ability of the statistical potential. SIGNIFICANCE RW and RWplus demonstrate a significantly better ability than the best performing pair-wise distance-dependent atomic potential functions in both native and near-native model selections. They have higher energy-RMSD and energy-TM-score correlations compared with other potentials of the same type in real-life structure assembly decoys. When benchmarked with a comprehensive list of publicly available potentials, RW and RWplus show comparable performance to the state-of-the-art scoring functions, including those combining terms from multiple resources. These data demonstrate the usefulness of a random-walk chain as the reference state, which correctly accounts for sequence connectivity and entropic elasticity of proteins. They show potential usefulness in structure recognition and protein folding simulations. The RW and RWplus potentials, as well as the newly generated I-TASSER decoys, are freely available at http://zhanglab.ccmb.med.umich.edu/RW.
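The role of the reference state in a distance-dependent statistical potential can be made concrete with a short Python sketch of the standard inverse-Boltzmann form, u(r) = -kT ln(N_obs(r) / N_exp(r)). The function name and the toy counts are assumptions; the expected counts are passed in by the caller, so the same code could be fed an ideal-gas or a random-walk reference. It does not reproduce the RW potential itself.

```python
import math


def statistical_potential(observed, expected, kT=1.0):
    """Per-bin potential u = -kT * ln(observed / expected).

    observed: pair counts per distance bin from a structure database.
    expected: pair counts per bin predicted by the chosen reference state.
    Bins with a zero count are set to 0.0 (a simplification: no information).
    """
    potential = []
    for obs, exp in zip(observed, expected):
        if obs == 0 or exp == 0:
            potential.append(0.0)
        else:
            potential.append(-kT * math.log(obs / exp))
    return potential


# Hypothetical counts for one atom-type pair in three distance bins.
u = statistical_potential([20, 10, 5], [10, 10, 10])
```

Bins observed more often than the reference predicts come out favorable (negative), and under-represented bins come out unfavorable (positive), which is why the choice of reference state directly shapes the potential.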
Affiliation(s)
- Jian Zhang
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Yang Zhang
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
|
39
|
Potapov V, Cohen M, Inbar Y, Schreiber G. Protein structure modelling and evaluation based on a 4-distance description of side-chain interactions. BMC Bioinformatics 2010; 11:374. [PMID: 20624289 PMCID: PMC2912888 DOI: 10.1186/1471-2105-11-374] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 08/21/2009] [Accepted: 07/12/2010] [Indexed: 11/11/2022] Open
Abstract
Background Accurate evaluation and modelling of residue-residue interactions within and between proteins is a key aspect of computational structure prediction including homology modelling, protein-protein docking, refinement of low-resolution structures, and computational protein design. Results Here we introduce a method for accurate protein structure modelling and evaluation based on a novel 4-distance description of residue-residue interaction geometry. Statistical 4-distance preferences were extracted from high-resolution protein structures and were used as a basis for a knowledge-based potential, called Hunter. We demonstrate that the 4-distance description of side chain interactions can be used reliably to discriminate the native structure from a set of decoys. Hunter ranked the native structure as the top one in 217 out of 220 high-resolution decoy sets, in 25 out of 28 "Decoys 'R' Us" decoy sets and in 24 out of 27 high-resolution CASP7/8 decoy sets. The same concept was applied to side chain modelling in protein structures. On a set of very high-resolution protein structures the average RMSD was 1.47 Å for all residues and 0.73 Å for buried residues, which is in the range of attainable accuracy for a model. Finally, we show that Hunter performs as well as or better than other top methods in homology modelling based on results from the CASP7 experiment. The supporting web site http://bioinfo.weizmann.ac.il/hunter/ was developed to enable the use of Hunter and for visualization and interactive exploration of 4-distance distributions. Conclusions Our results suggest that Hunter can be used as a tool for evaluation and for accurate modelling of residue-residue interactions in protein structures. The same methodology is applicable to other areas involving high-resolution modelling of biomolecules.
Affiliation(s)
- Vladimir Potapov
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
|
40
|
Rajgaria R, Wei Y, Floudas CA. Contact prediction for beta and alpha-beta proteins using integer linear optimization and its impact on the first principles 3D structure prediction method ASTRO-FOLD. Proteins 2010; 78:1825-46. [PMID: 20225257 PMCID: PMC2858251 DOI: 10.1002/prot.22696] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 12/24/2022]
Abstract
An integer linear optimization model is presented to predict residue contacts in beta, alpha + beta, and alpha/beta proteins. The total energy of a protein is expressed as the sum of a C(alpha)-C(alpha) distance-dependent contact energy contribution and a hydrophobic contribution. The model selects contacts that assign the lowest energy to the protein structure while satisfying a set of constraints that are included to enforce certain physically observed topological information. A new method based on hydrophobicity is proposed to find the beta-sheet alignments. These beta-sheet alignments are used as constraints for contacts between residues of beta-sheets. This model was tested on three independent protein test sets and CASP8 test proteins consisting of beta, alpha + beta, and alpha/beta proteins, and it was found to perform very well. The average accuracy of the predictions (separated by at least six residues) was approximately 61%. The average true positive and false positive distances were also calculated for each of the test sets and they are 7.58 Å and 15.88 Å, respectively. Residue contact prediction can be directly used to facilitate protein tertiary structure prediction. The proposed residue contact prediction model is incorporated into the first principles protein tertiary structure prediction approach, ASTRO-FOLD. The effectiveness of the contact prediction model was further demonstrated by the improvement in the quality of the protein structure ensemble generated using the predicted residue contacts for a test set of 10 proteins.
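The reported accuracy for contacts "separated by at least six residues" can be illustrated with a small Python sketch. It assumes accuracy means the fraction of predicted contacts that are also native contacts, and that contacts are unordered residue-index pairs; the abstract states neither definition explicitly, and the function name and example data are hypothetical.

```python
def contact_accuracy(predicted, native, min_separation=6):
    """Fraction of predicted contacts that appear in the native contact set.

    Only predicted pairs whose sequence separation |i - j| is at least
    min_separation are scored. Pairs are treated as unordered.
    """
    native_set = {tuple(sorted(pair)) for pair in native}
    scored = {
        tuple(sorted(pair))
        for pair in predicted
        if abs(pair[0] - pair[1]) >= min_separation
    }
    if not scored:
        return 0.0
    return len(scored & native_set) / len(scored)


# Hypothetical residue-index pairs.
predicted_contacts = [(1, 10), (2, 5), (3, 20), (30, 12)]
native_contacts = [(1, 10), (12, 30), (4, 40)]
accuracy = contact_accuracy(predicted_contacts, native_contacts)
```

Here the pair (2, 5) is ignored because its separation is below six, and two of the three remaining predictions match native contacts.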
Affiliation(s)
- R. Rajgaria
- Department of Chemical Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
- Y. Wei
- Department of Chemical Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
- C. A. Floudas
- Department of Chemical Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
|
41
|
Bordner AJ. Orientation-dependent backbone-only residue pair scoring functions for fixed backbone protein design. BMC Bioinformatics 2010; 11:192. [PMID: 20398384 PMCID: PMC2874805 DOI: 10.1186/1471-2105-11-192] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 10/30/2009] [Accepted: 04/16/2010] [Indexed: 11/24/2022] Open
Abstract
Background Empirical scoring functions have proven useful in protein structure modeling. Most such scoring functions depend on protein side chain conformations. However, backbone-only scoring functions do not require computationally intensive structure optimization and so are well suited to protein design, which requires fast score evaluation. Furthermore, scoring functions that account for the distinctive relative position and orientation preferences of residue pairs are expected to be more accurate than those that depend only on the separation distance. Results Residue pair scoring functions for fixed backbone protein design were derived using only backbone geometry. Unlike previous studies that used spherical harmonics to fit 2D angular distributions, Gaussian Mixture Models were used to fit the full 3D (position only) and 6D (position and orientation) distributions of residue pairs. The performance of the 1D (residue separation only), 3D, and 6D scoring functions was compared by their ability to identify correct threading solutions for a non-redundant benchmark set of protein backbone structures. The threading accuracy was found to steadily increase with increasing dimension, with the 6D scoring function achieving the highest accuracy. Furthermore, the 3D and 6D scoring functions were shown to outperform side chain-dependent empirical potentials from three other studies. Next, two computational methods that take advantage of the speed and pairwise form of these new backbone-only scoring functions were investigated. The first is a procedure that exploits available sequence data by averaging scores over threading solutions for homologs. This was evaluated by applying it to the challenging problem of identifying interacting transmembrane alpha-helices and found to further improve prediction accuracy. 
The second is a protein design method for determining the optimal sequence for a backbone structure by applying Belief Propagation optimization using the 6D scoring functions. The sensitivity of this method to backbone structure perturbations was compared with that of fixed-backbone all-atom modeling by determining the similarities between optimal sequences for two different backbone structures within the same protein family. The results showed that the design method using 6D scoring functions was more robust to small variations in backbone structure than the all-atom design method. Conclusions Backbone-only residue pair scoring functions that account for all six relative degrees of freedom are the most accurate and including the scores of homologs further improves the accuracy in threading applications. The 6D scoring function outperformed several side chain-dependent potentials while avoiding time-consuming and error prone side chain structure prediction. These scoring functions are particularly useful as an initial filter in protein design problems before applying all-atom modeling.
|
42
|
Rykunov D, Fiser A. New statistical potential for quality assessment of protein models and a survey of energy functions. BMC Bioinformatics 2010; 11:128. [PMID: 20226048 PMCID: PMC2853469 DOI: 10.1186/1471-2105-11-128] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Received: 10/06/2009] [Accepted: 03/12/2010] [Indexed: 11/30/2022] Open
Abstract
Background Scoring functions, such as molecular mechanics force fields and statistical potentials, are fundamentally important tools in protein structure modeling and quality assessment. Results The performances of a number of publicly available scoring functions are compared with statistical rigor, with an emphasis on knowledge-based potentials. We explored the effect on accuracy of alternative choices for representing interaction center types and other features of scoring functions, such as using information on solvent accessibility and on torsion angles, and accounting for secondary structure preferences and side chain orientation. Partially based on the observations made, we present a novel residue-based statistical potential, which employs a shuffled reference state definition and takes into account the mutual orientation of residue side chains. Atom- and residue-level statistical potentials and Linux executables to calculate the energy of a given protein proposed in this work can be downloaded from http://www.fiserlab.org/potentials. Conclusions Among the most influential terms we observed a critical role of a proper reference state definition and the benefits of including information about the microenvironment of interaction centers. Molecular mechanical potentials were also tested and found to be over-sensitive to small local imperfections in a structure, requiring unfeasibly long energy relaxation before energy scores started to correlate with model quality.
Affiliation(s)
- Dmitry Rykunov
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA
|
43
|
Arab S, Sadeghi M, Eslahchi C, Pezeshk H, Sheari A. A pairwise residue contact area-based mean force potential for discrimination of native protein structure. BMC Bioinformatics 2010; 11:16. [PMID: 20064218 PMCID: PMC2821318 DOI: 10.1186/1471-2105-11-16] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 06/13/2009] [Accepted: 01/09/2010] [Indexed: 11/21/2022] Open
Abstract
Background An energy function that can distinguish a correct protein fold from incorrect ones is very important for protein structure prediction and protein folding. Knowledge-based mean force potentials are certainly the most popular type of interaction function for protein threading. They are derived from statistical analyses of interacting groups in experimentally determined protein structures. These potentials are developed at the atom or the amino acid level. Based on orientation-dependent contact area, a new type of knowledge-based mean force potential has been developed. Results We developed a new approach to calculate a knowledge-based potential of mean force, using pairwise residue contact area. To test the performance of our approach, we applied it to several decoy sets to measure its ability to discriminate native structures from decoys. This potential was able to distinguish native structures from the decoys in most cases. Further, the calculated Z-scores were quite high for all protein datasets. Conclusions This knowledge-based potential of mean force can be used in protein structure prediction, fold recognition, comparative modelling and molecular recognition. The program is available at http://www.bioinf.cs.ipm.ac.ir/softwares/surfield
Affiliation(s)
- Shahriar Arab
- Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
|
44
|
Subramani A, DiMaggio PA, Floudas CA. Selecting high quality protein structures from diverse conformational ensembles. Biophys J 2009; 97:1728-36. [PMID: 19751678 DOI: 10.1016/j.bpj.2009.06.046] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 04/13/2009] [Revised: 06/15/2009] [Accepted: 06/30/2009] [Indexed: 01/01/2023] Open
Abstract
Protein structure prediction encompasses two major challenges: 1), the generation of a large ensemble of high resolution structures for a given amino-acid sequence; and 2), the identification of the structure closest to the native structure for a blind prediction. In this article, we address the second challenge, by proposing what is, to our knowledge, a novel iterative traveling-salesman problem-based clustering method to identify the structures of a protein, in a given ensemble, which are closest to the native structure. The method consists of an iterative procedure, which aims at eliminating clusters of structures at each iteration, which are unlikely to be of similar fold to the native, based on a statistical analysis of cluster density and average spherical radius. The method, denoted as ICON, has been tested on four data sets: 1), 1400 proteins with high resolution decoys; 2), medium-to-low resolution decoys from Decoys 'R' Us; 3), medium-to-low resolution decoys from the first-principles approach, ASTRO-FOLD; and 4), selected targets from CASP8. The extensive tests demonstrate that ICON can identify high-quality structures in each ensemble, regardless of the resolution of conformers. In a total of 1454 proteins, with an average of 1051 conformers per protein, the conformers selected by ICON are, on an average, in the top 3.5% of the conformers in the ensemble.
Affiliation(s)
- Ashwin Subramani
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey, USA
|
45
|
Ma J. Explicit orientation dependence in empirical potentials and its significance to side-chain modeling. Acc Chem Res 2009; 42:1087-96. [PMID: 19445451 DOI: 10.1021/ar900009e] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
Protein structure modeling and prediction have important applications throughout the biological sciences, from the design of pharmaceuticals to the elucidation of enzyme mechanisms. At the core of most protein modeling is an energy function, the minimum of which represents the free energy "cost" for forming a correct protein structure. The most commonly used energy functions are knowledge-based statistical potential functions; that is, they are empirically derived from statistical analysis of a set of high-resolution protein structures. When that kind of potential function is constructed, the anisotropic orientation dependence between the interacting groups is a critical component for accurately representing key molecular interactions, such as those involved in protein side-chain packing. In the literature, however, many potential functions are limited in their ability to describe orientation dependence. All-atom potentials typically ignore heterogeneous chemical-bond connectivity. In coarse-grained potentials, such as (semi)-residue-based potentials, the simplified representation of residues often reduces the sensitivity of the potential to side-chain orientation. Recently, in an effort to maximally capture the orientation dependence in side-chain interactions, a new type of all-atom statistical potential was developed: OPUS-PSP (potential derived from side-chain packing). The key feature of this potential is its explicit description of orientation dependence in molecular interactions, which is achieved with a basis set of 19 rigid-body blocks extracted from the chemical structures of 20 amino acid residues. This basis set is specifically designed to maximally capture the essential elements of orientation dependence in molecular packing interactions. The potential is constructed from the orientation-specific packing statistics of pairs of those blocks in a nonredundant structural database.
On decoy set tests, OPUS-PSP significantly outperforms most of the existing knowledge-based potentials in terms of both its ability to recognize native structures and its consistency in achieving high Z-scores across decoy sets. The application of OPUS-PSP to conformational modeling of side chains has led to another method, called OPUS-Rota. In terms of combined speed and accuracy, OPUS-Rota outperforms all of the other methods in modeling side-chain conformation. In this Account, we briefly outline the basic scheme of the OPUS-PSP potential and its application to side-chain modeling via OPUS-Rota. Future perspectives on the modeling of orientation dependence are also discussed. The computer programs for OPUS-PSP and OPUS-Rota can be downloaded at http://sigler.bioch.bcm.tmc.edu/MaLab. They are free for academic users.
Affiliation(s)
- Jianpeng Ma
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, and Department of Bioengineering, Rice University, Houston, Texas 77005
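The abstract above describes knowledge-based statistical potentials as energies derived empirically from statistics over known structures. As a minimal sketch of that general idea only, the following computes an isotropic residue-type contact potential with the common inverse-Boltzmann form, energy = -kT ln(f_obs / f_exp). This is not OPUS-PSP: it has no orientation dependence and no rigid-body blocks, and the function name `contact_potential`, the random-pairing reference state, and the toy contact list are all assumptions made for illustration.

```python
import math
from collections import Counter


def contact_potential(observed_pairs, kT=1.0):
    """Inverse-Boltzmann contact energies from a list of observed contacts.

    observed_pairs: iterable of (type_a, type_b) tuples, one per contact.
    Returns {sorted_pair: energy}, energy = -kT * ln(f_obs / f_exp).
    The reference state (an assumption) pairs types at random according
    to how often each type appears in any contact.
    """
    pairs = [tuple(sorted(p)) for p in observed_pairs]
    pair_counts = Counter(pairs)
    type_counts = Counter(t for p in pairs for t in p)
    total_pairs = sum(pair_counts.values())
    total_types = sum(type_counts.values())

    energies = {}
    for (a, b), n_obs in pair_counts.items():
        f_a = type_counts[a] / total_types
        f_b = type_counts[b] / total_types
        # Unordered pairs of two different types can form in two ways.
        f_exp = f_a * f_b if a == b else 2.0 * f_a * f_b
        f_obs = n_obs / total_pairs
        energies[(a, b)] = -kT * math.log(f_obs / f_exp)
    return energies


# Toy data (hypothetical): leucine-leucine contacts are over-represented.
toy_contacts = [("L", "L")] * 6 + [("L", "K")] * 2 + [("K", "K")] * 2
toy_energies = contact_potential(toy_contacts)
```

Pairs seen more often than the reference state predicts receive negative (favorable) energies, and under-represented pairs receive positive ones. Unobserved pairs are simply absent from the result; a real potential would need pseudocounts or a larger database to handle them.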
46
|
Gu J, Li H, Jiang H, Wang X. Optimizing energy potential for protein fold recognition with parametric evaluation function. J Comput Biol 2009; 16:427-42. [PMID: 19254182 DOI: 10.1089/cmb.2008.0128] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022]
Abstract
In this paper, a new optimization method is proposed to determine a simplified energy potential for protein fold recognition, which consists of the residue-residue contact, hydrophobicity, and pseudodihedral potentials. With a parametric evaluation function method, the Z-scores of all the proteins in a training set are optimized simultaneously to obtain the best parameter set of the potential. For this multi-objective and multi-constraint problem, the new optimization scheme is very effective. The derived potential is then tested on two high-quality decoy sets and compared with other classical fold recognition potentials. With the simplified energy potential, we achieve a high level of discrimination capability between correct and incorrect folds.
Affiliation(s)
- Junfeng Gu
- Department of Engineering Mechanics, State Key Laboratory of Structural Analysis for Industrial Equipment, Dalian University of Technology, Dalian, China
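Several abstracts in this list judge a potential by the Z-score of the native structure on a decoy set. A minimal sketch of that measure follows, assuming the common definition Z = (E_native - mean(E_decoys)) / std(E_decoys), under which a more negative value means the native structure is better separated from the decoys (some papers report the magnitude instead). The function name `native_z_score` and the energies are hypothetical.

```python
from statistics import mean, pstdev


def native_z_score(native_energy, decoy_energies):
    """Z-score of the native energy against the decoy energy distribution."""
    if len(decoy_energies) < 2:
        raise ValueError("need at least two decoy energies")
    spread = pstdev(decoy_energies)  # population standard deviation
    if spread == 0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean(decoy_energies)) / spread


# Hypothetical energies in arbitrary units.
decoys = [-10.0, -12.0, -8.0, -10.0]
z = native_z_score(-16.0, decoys)
```

The population standard deviation is used here for simplicity; with the hundreds of decoys per protein typical of such tests, the difference from the sample standard deviation is negligible.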
47
|
Rajgaria R, McAllister SR, Floudas CA. Towards accurate residue-residue hydrophobic contact prediction for alpha helical proteins via integer linear optimization. Proteins 2009; 74:929-47. [PMID: 18767158 DOI: 10.1002/prot.22202] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 12/22/2022]
Abstract
A new optimization-based method is presented to predict the hydrophobic residue contacts in alpha-helical proteins. The proposed approach uses a high-resolution, distance-dependent force field to calculate the interaction energy between different residues of a protein. The formulation predicts the hydrophobic contacts by minimizing the sum of these contact energies. These residue contacts are highly useful in narrowing down the conformational space searched by protein structure prediction algorithms. The proposed algorithm also offers the advantage of producing a rank-ordered list of the best contact sets. This model was tested on four independent alpha-helical protein test sets and was found to perform very well. The average accuracy of the predictions (for contacts separated by at least six residues) obtained using the presented method was approximately 66% for single-domain proteins. The average true positive and false positive distances were also calculated for each protein test set; they are 8.87 and 14.67 Å, respectively.
Affiliation(s)
- R Rajgaria
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey 08544-5263, USA
48
|
Gu J, Li H, Jiang H, Wang X. A simple Calpha-SC potential with higher accuracy for protein fold recognition. Biochem Biophys Res Commun 2009; 379:610-5. [PMID: 19121621 DOI: 10.1016/j.bbrc.2008.12.131] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/15/2008] [Accepted: 12/20/2008] [Indexed: 11/18/2022]
Abstract
In this paper, an improved C(alpha)-SC energy potential designed for protein fold recognition is reported. It consists of three extremely simple interaction terms that are assumed to be the dominant interactions in protein folding: residue-residue contact, hydrophobicity and pseudodihedral potentials. The potential function contains only 210 contact parameters, one hydrophobic parameter and one torsion parameter, which have been optimized using an interior point algorithm for linear programming. Tests of the derived potential function on commonly used decoy sets show that it outperforms most of the existing coarse-grained potentials, both in recognizing native structures and in consistently achieving high Z-scores across decoy sets, and that its performance is almost equivalent to that of potentials that consider complex intramolecular interactions. The results show that our scoring function is a promising potential for protein structure prediction and modeling with regard to its recognition and computational efficacy.
Affiliation(s)
- Junfeng Gu
- State Key Laboratory of Structural Analysis for Industrial Equipment, Department of Engineering Mechanics, Dalian University of Technology, Dalian 116024, China
49
|
Abstract
We present a new method for multiple sequence alignment (MSA), which we call MSACSA. The method is based on the direct application of a global optimization method, conformational space annealing (CSA), to a consistency-based score function constructed from pairwise sequence alignments between the constituent sequences. We applied MSACSA to two MSA databases, the 82 families from the BAliBASE reference set 1 and the 366 families from the HOMSTRAD set. In all 450 cases, we obtained well-optimized alignments that satisfy more pairwise constraints and are consequently more accurate on average than those of a recent alignment method, SPEM. One of the advantages of MSACSA is that it provides not just the global minimum alignment but also many distinct low-lying suboptimal alignments for a given objective function. This is because conformational space annealing can maintain conformational diversity while searching for conformations with low energies. This characteristic can help to alleviate the problem arising from the use of an inaccurate score function. The method was the key factor for our success in the recent blind protein structure prediction experiment.
50
|
Rajgaria R, McAllister SR, Floudas CA. Distance dependent centroid to centroid force fields using high resolution decoys. Proteins 2008; 70:950-70. [PMID: 17847088 DOI: 10.1002/prot.21561] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Indexed: 11/06/2022]
Abstract
Simplified force fields play an important role in protein structure prediction and de novo protein design by requiring less computational effort than detailed atomistic potentials. A side-chain-centroid-based, distance-dependent pairwise interaction potential has been developed. A linear-programming-based formulation was used in which non-native "decoy" conformers are forced to take a higher energy than the corresponding native structure. This model was trained on an enhanced and diverse protein set. High-quality decoy structures were generated for approximately 1400 nonhomologous proteins using torsion angle dynamics along with restricted variations of the hydrophobic cores of the native structure. The resulting decoy set was used to train the model, yielding two different side-chain-centroid-based force fields that differ in the way distance dependence has been used to calculate energy parameters. These force fields were tested on an independent set of 148 test proteins with 500 decoy structures for each protein. The side-chain centroid force fields correctly identified approximately 86% of the native structures. The Z-scores produced by the proposed centroid-centroid distance-dependent force fields improved compared with other distance-dependent C(alpha)-C(alpha) or side-chain-based force fields.
Affiliation(s)
- R Rajgaria
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey 08544-5263, USA
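The linear programming idea in the abstract above, forcing every decoy to a higher energy than its native structure, can be written in a generic form. The following is a sketch under stated assumptions, not the formulation published in the paper: the energy is assumed linear in the parameters \theta_k, n_k(X) counts the contacts of type and distance bin k in conformation X, \varepsilon > 0 is a chosen energy gap, and the slack variables s_{p,d} absorb constraints that cannot be met.

```latex
\begin{aligned}
E(X;\theta) &= \sum_{k} \theta_k \, n_k(X) \\
\min_{\theta,\, s} \quad & \sum_{p} \sum_{d} s_{p,d} \\
\text{s.t.} \quad & E\bigl(X^{\mathrm{decoy}}_{p,d};\theta\bigr) - E\bigl(X^{\mathrm{native}}_{p};\theta\bigr) \ge \varepsilon - s_{p,d}, \\
& s_{p,d} \ge 0 \quad \text{for every protein } p \text{ and decoy } d .
\end{aligned}
```

Because the energy is linear in \theta, each constraint is linear in the unknowns, so the problem is a linear program; in practice the parameters are also bounded so that the gap cannot be satisfied by rescaling \theta.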