1. Lyu N, Bergold P, Soley MB, Wang C, Batista VS. Holographic Gaussian Boson Sampling with Matrix Product States on 3D cQED Processors. J Chem Theory Comput 2024; 20:6402-6413. [PMID: 38968605 DOI: 10.1021/acs.jctc.4c00384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/07/2024]
Abstract
We introduce quantum circuits for simulations of multimode state vectors on 3D circuit quantum electrodynamics (cQED) processors using matrix product state representations. The circuits are demonstrated as applied to simulations of molecular docking based on holographic Gaussian boson sampling (GBS), as illustrated for the binding of a thiol-containing aryl sulfonamide ligand to the tumor necrosis factor-α converting enzyme receptor. We show that cQED devices with a modest number of modes could be employed to simulate multimode systems by repurposing working modes through measurement and reinitialization. We anticipate that a wide range of GBS applications could be implemented on compact 3D cQED processors analogously using the holographic approach. Simulations on qubit-based quantum computers could be implemented analogously using circuits that represent continuous variables in terms of truncated expansions of Fock states.
Affiliation(s)
- Ningyi Lyu
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
- Paul Bergold
  - Department of Mathematics, University of Surrey, Guildford GU2 7XH, U.K.
- Micheline B Soley
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
  - Yale Quantum Institute, Yale University, New Haven, Connecticut 06511, United States
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Chen Wang
  - Department of Physics, University of Massachusetts-Amherst, Amherst, Massachusetts 01003, United States
- Victor S Batista
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
  - Yale Quantum Institute, Yale University, New Haven, Connecticut 06511, United States
2. Roca-Martínez J, Dhondge H, Sattler M, Vranken WF. Deciphering the RRM-RNA recognition code: A computational analysis. PLoS Comput Biol 2023; 19:e1010859. [PMID: 36689472 PMCID: PMC9894542 DOI: 10.1371/journal.pcbi.1010859] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/12/2022] [Revised: 02/02/2023] [Accepted: 01/07/2023] [Indexed: 01/24/2023] Open
Abstract
RNA recognition motifs (RRM) are the most prevalent class of RNA binding domains in eucaryotes. Their RNA binding preferences have been investigated for almost two decades, and even though some RRM domains are now very well described, their RNA recognition code has remained elusive. An increasing number of experimental structures of RRM-RNA complexes has become available in recent years. Here, we perform an in-depth computational analysis to derive an RNA recognition code for canonical RRMs. We present and validate a computational scoring method to estimate the binding between an RRM and a single stranded RNA, based on structural data from a carefully curated multiple sequence alignment, which can predict RRM binding RNA sequence motifs based on the RRM protein sequence. Given the importance and prevalence of RRMs in humans and other species, this tool could help design RNA binding motifs with uses in medical or synthetic biology applications, leading towards the de novo design of RRMs with specific RNA recognition.
Affiliation(s)
- Joel Roca-Martínez
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Michael Sattler
  - Institute of Structural Biology, Molecular Targets and Therapeutics Center, Helmholtz Munich, Neuherberg, Germany
  - Bavarian NMR Center, Department of Bioscience, School of Natural Sciences, Technical University of Munich, Garching, Germany
- Wim F. Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
3. Delaunay M, Ha-Duong T. Computational Tools and Strategies to Develop Peptide-Based Inhibitors of Protein-Protein Interactions. Methods in Molecular Biology (Clifton, N.J.) 2022; 2405:205-230. [PMID: 35298816 DOI: 10.1007/978-1-0716-1855-4_11] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
Abstract
Protein-protein interactions play crucial and subtle roles in many biological processes and modifications of their fine mechanisms generally result in severe diseases. Peptide derivatives are very promising therapeutic agents for modulating protein-protein associations with sizes and specificities between those of small compounds and antibodies. For the same reasons, rational design of peptide-based inhibitors naturally borrows and combines computational methods from both protein-ligand and protein-protein research fields. In this chapter, we aim to provide an overview of computational tools and approaches used for identifying and optimizing peptides that target protein-protein interfaces with high affinity and specificity. We hope that this review will help to implement appropriate in silico strategies for peptide-based drug design that builds on available information for the systems of interest.
Affiliation(s)
- Tâp Ha-Duong
  - Université Paris-Saclay, CNRS, BioCIS, Châtenay-Malabry, France
4. Liang S, Li Z, Zhan J, Zhou Y. De novo protein design by an energy function based on series expansion in distance and orientation dependence. Bioinformatics 2021; 38:86-93. [PMID: 34406339 DOI: 10.1093/bioinformatics/btab598] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/05/2021] [Revised: 08/11/2021] [Accepted: 08/16/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Despite many successes, de novo protein design is not yet a solved problem as its success rate remains low. The low success rate is largely because we do not yet have an accurate energy function for describing the solvent-mediated interaction between amino acid residues in a protein chain. Previous studies showed that an energy function based on series expansions with its parameters optimized for side-chain and loop conformations can lead to one of the most accurate methods for side chain (OSCAR) and loop prediction (LEAP). Following the same strategy, we developed an energy function based on series expansions with the parameters optimized in four separate stages (recovering single-residue types without and with orientation dependence, selecting loop decoys and maintaining the composition of amino acids). We tested the energy function for de novo design by using Monte Carlo simulated annealing. RESULTS The method for protein design (OSCAR-Design) is found to be as accurate as OSCAR and LEAP for side-chain and loop prediction, respectively. In de novo design, it can recover native residue types ranging from 38% to 43% depending on test sets, conserve hydrophobic/hydrophilic residues at ∼75%, and yield the overall similarity in amino acid compositions at more than 90%. These performance measures are all statistically significantly better than several protein design programs compared. Moreover, the largest hydrophobic patch areas in designed proteins are near or smaller than those in native proteins. Thus, an energy function based on series expansion can be made useful for protein design. AVAILABILITY AND IMPLEMENTATION The Linux executable version is freely available for academic users at http://zhouyq-lab.szbl.ac.cn/resources/.
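The abstract above mentions Monte Carlo simulated annealing as the search procedure for de novo design. As a rough illustration only, the sketch below applies a Metropolis acceptance rule with a geometric cooling schedule to a toy sequence problem. The function `toy_energy`, the two-letter alphabet, and all parameter values are hypothetical stand-ins and are not the OSCAR-Design energy function or protocol.

```python
import math
import random


def toy_energy(seq):
    """Hypothetical energy: count adjacent identical letters (lower is better)."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a == b)


def simulated_annealing(energy, initial, alphabet, steps=2000,
                        t_start=2.0, t_end=0.01, seed=0):
    """Minimise `energy` over sequences by single-site mutations."""
    rng = random.Random(seed)
    current = list(initial)
    current_e = energy(current)
    best, best_e = current[:], current_e
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(len(current))
        old = current[pos]
        current[pos] = rng.choice(alphabet)
        new_e = energy(current)
        delta = new_e - current_e
        # Metropolis criterion: always accept downhill moves, and accept
        # uphill moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_e = new_e
            if current_e < best_e:
                best, best_e = current[:], current_e
        else:
            current[pos] = old  # reject the move and restore the old letter
    return "".join(best), best_e


designed, final_energy = simulated_annealing(toy_energy, "HHHHHHHH", "HP")
print(designed, final_energy)
```

Tracking the best state separately from the current state means the returned energy can never be worse than that of the starting sequence, regardless of the random seed.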
Affiliation(s)
- Shide Liang
  - Department of R & D, Bio-Thera Solutions, Guangzhou 510530, China
- Zhixiu Li
  - Institute of Health and Biomedical Innovation, Queensland University of Technology at Translational Research Institute, Woolloongabba, QLD 3001, Australia
- Jian Zhan
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast Campus, Southport, QLD 4222, Australia
  - Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Yaoqi Zhou
  - Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
  - Peking University Shenzhen Graduate School, Shenzhen 518055, China
5. Chen X, Song S, Ji J, Tang Z, Todo Y. Incorporating a multiobjective knowledge-based energy function into differential evolution for protein structure prediction. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.06.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/17/2023]
6.
Abstract
Atom pairwise potential functions make up an essential part of many scoring functions for protein decoy detection. With the development of machine learning (ML) tools, there are multiple ways to combine potential functions to create novel ML models and methods. Potential function parameters can be easily extracted; however, it is usually hard to directly obtain the calculated atom pairwise energies from scoring functions. Amber, as one of the most popular suites of modeling programs, has an extensive history and library of force field potential functions. In this work, we directly used the force field parameters in ff94 and ff14SB from Amber and encoded them to calculate atom pairwise energies for different interactions. Two sets of structures (a single amino acid set and a dipeptide set) were used to evaluate the performance of our encoded Amber potentials. From the comparison results between energy terms obtained from our encoding and Amber, we find energy differences within ±0.06 kcal/mol for all tested structures. Previously we have shown that the Random Forest (RF) model can help to emphasize more important atom pairwise interactions and ignore insignificant ones [Pei, J.; Zheng, Z.; Merz, K. M. J. Chem. Inf. Model. 2019, 59, 1919-1929]. Here, as an example of combining ML methods with traditional potential functions, we followed the same workflow to combine the RF models with force field potential functions from Amber. To determine the performance of our RF models with force field potential functions, 224 different protein native-decoy systems were used as our training and testing sets. We find that the RF models with ff94 and ff14SB force field parameters outperformed all other scoring functions (RF models with KECSA2, RWplus, DFIRE, dDFIRE, and GOAP) considered in this work for native structure detection, and they performed similarly in detecting the best decoy.
Through inclusion of best decoy to decoy comparisons in building our RF models, we were able to generate models that outperformed the score functions tested herein both on accuracy and best decoy detection, again showing the performance and flexibility of our RF models to tackle this problem. Finally, the importance of the RF algorithm and force field parameters were also tested and the comparison results suggest that both the RF algorithm and force field potentials are important with the ML scoring function achieving its best performance only by combining them together. All code and data used in this work are available at https://github.com/JunPei000/FFENCODER_for_Protein_Folding_Pose_Selection.
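The abstract above describes encoding force field parameters to obtain atom pairwise energies. The following is a minimal sketch of what a nonbonded pairwise term can look like, assuming the standard 12-6 Lennard-Jones form with Lorentz-Berthelot combining rules plus a Coulomb term. The function names and the parameter values in the usage lines are hypothetical, the constant is the value commonly quoted for Amber-style units, and 1-4 scaling and bonded terms are deliberately left out; it is not the authors' encoder.

```python
import math

# Coulomb prefactor in kcal*Angstrom/(mol*e^2); commonly quoted for Amber-style units.
COULOMB_CONSTANT = 332.0522


def lennard_jones(r, rmin_half_i, eps_i, rmin_half_j, eps_j):
    """12-6 Lennard-Jones energy (kcal/mol) for two atoms at distance r (Angstrom).

    Uses Lorentz-Berthelot combining rules: the pair minimum distance is the
    sum of the per-atom half-minimum radii, and the pair well depth is the
    geometric mean of the per-atom well depths.
    """
    rmin = rmin_half_i + rmin_half_j
    eps = math.sqrt(eps_i * eps_j)
    ratio6 = (rmin / r) ** 6
    return eps * (ratio6 * ratio6 - 2.0 * ratio6)


def coulomb(r, q_i, q_j, dielectric=1.0):
    """Coulomb energy (kcal/mol) for partial charges in units of e."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r)


def pairwise_nonbonded(r, atom_i, atom_j):
    """Sum of the two nonbonded terms for atoms given as (rmin_half, eps, charge)."""
    rmin_half_i, eps_i, q_i = atom_i
    rmin_half_j, eps_j, q_j = atom_j
    return (lennard_jones(r, rmin_half_i, eps_i, rmin_half_j, eps_j)
            + coulomb(r, q_i, q_j))


# Hypothetical parameters, chosen only to exercise the functions.
atom_a = (1.9, 0.10, -0.40)
atom_b = (1.5, 0.02, 0.25)
print(pairwise_nonbonded(3.4, atom_a, atom_b))
```

At the pair minimum distance the Lennard-Jones term equals the negative of the pair well depth, which gives a quick sanity check on any re-implementation of this form.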
Affiliation(s)
- Jun Pei
  - Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Lin Frank Song
  - Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Kenneth M Merz
  - Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
7. Sternke M, Tripp KW, Barrick D. The use of consensus sequence information to engineer stability and activity in proteins. Methods Enzymol 2020; 643:149-179. [PMID: 32896279 DOI: 10.1016/bs.mie.2020.06.001] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 12/18/2022]
Abstract
The goal of protein design is to create proteins that are stable, soluble, and active. Here we focus on one approach to protein design in which sequence information is used to create a "consensus" sequence. Such consensus sequences comprise the most common residue at each position in a multiple sequence alignment (MSA). After describing some general ideas that relate MSA and consensus sequences and presenting a statistical thermodynamic framework that relates consensus and non-consensus sequences to stability, we detail the process of designing a consensus sequence and survey reports of consensus design and characterization from the literature. Many of these consensus proteins retain native biological activities including ligand binding and enzyme activity. Remarkably, in most cases the consensus protein shows significantly higher stability than extant versions of the protein, as measured by thermal or chemical denaturation, consistent with the statistical thermodynamic model. To understand this stability increase, we compare various features of consensus sequences with the extant MSA sequences from which they were derived. Consensus sequences show enrichment in charged residues (most notably glutamate and lysine) and depletion of uncharged polar residues (glutamine, serine, and asparagine). Surprisingly, a survey of stability changes resulting from point substitutions shows little correlation with residue frequencies at the corresponding positions within the MSA, suggesting that the high stability of consensus proteins may result from interactions among residue pairs or higher-order clusters. Whatever the source, the large number of reported successes demonstrates that consensus design is a viable route to generating active and in many cases highly stabilized proteins.
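The definition given above, the most common residue at each position of an MSA, can be sketched in a few lines. The example below is illustrative only: the function name, the choice to ignore gap characters, the alphabetical tie-break, and the toy alignment are all assumptions rather than details taken from the chapter.

```python
from collections import Counter


def consensus_sequence(msa, gap="-"):
    """Return the most common non-gap residue at each alignment column."""
    if not msa:
        raise ValueError("alignment is empty")
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*msa):
        # Assumption: gaps do not vote; a column of only gaps stays a gap.
        counts = Counter(res for res in column if res != gap)
        if not counts:
            consensus.append(gap)
            continue
        # Assumption: ties are broken alphabetically so the result is deterministic.
        best = min(counts, key=lambda res: (-counts[res], res))
        consensus.append(best)
    return "".join(consensus)


toy_msa = ["MKVLE", "MKILE", "MRVLD", "MKV-E"]
print(consensus_sequence(toy_msa))  # prints MKVLE
```

A real pipeline would typically also filter or reweight redundant sequences before counting; that step is omitted here.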
Affiliation(s)
- Matt Sternke
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, MD, United States
  - Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD, United States
- Katherine W Tripp
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, MD, United States
- Doug Barrick
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, MD, United States
8. Banchi L, Fingerhuth M, Babej T, Ing C, Arrazola JM. Molecular docking with Gaussian Boson Sampling. Science Advances 2020; 6:eaax1950. [PMID: 32548251 PMCID: PMC7274809 DOI: 10.1126/sciadv.aax1950] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 03/01/2019] [Accepted: 04/03/2020] [Indexed: 05/28/2023]
Abstract
Gaussian Boson Samplers are photonic quantum devices with the potential to perform intractable tasks for classical systems. As with other near-term quantum technologies, an outstanding challenge is to identify specific problems of practical interest where these devices can prove useful. Here, we show that Gaussian Boson Samplers can be used to predict molecular docking configurations, a central problem for pharmaceutical drug design. We develop an approach where the problem is reduced to finding the maximum weighted clique in a graph, and show that Gaussian Boson Samplers can be programmed to sample large-weight cliques, i.e., stable docking configurations, with high probability, even with photon losses. We also describe how outputs from the device can be used to enhance the performance of classical algorithms. To benchmark our approach, we predict the binding mode of a ligand to the tumor necrosis factor-α converting enzyme, a target linked to immune system diseases and cancer.
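The abstract above reduces docking to finding the maximum weighted clique in a graph. To make that target problem concrete, here is a purely classical, brute-force sketch on a toy graph. It is exponential in the number of vertices and is not the Gaussian Boson Sampling procedure; the function name, vertex labels, weights, and edges are all hypothetical.

```python
from itertools import combinations


def max_weighted_clique(weights, edges):
    """Exhaustively find the clique with the largest total vertex weight.

    `weights` maps each vertex to a positive weight and `edges` is an
    iterable of vertex pairs. Only suitable for very small graphs.
    """
    adjacency = {frozenset(edge) for edge in edges}
    vertices = sorted(weights)
    best_clique, best_weight = (), 0.0
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            # A subset is a clique when every pair of its vertices is connected.
            is_clique = all(
                frozenset(pair) in adjacency for pair in combinations(subset, 2)
            )
            if is_clique:
                total = sum(weights[v] for v in subset)
                if total > best_weight:
                    best_clique, best_weight = subset, total
    return best_clique, best_weight


# Hypothetical toy graph: vertices could stand for compatible contact pairs.
toy_weights = {"a": 1.0, "b": 2.0, "c": 1.5, "d": 2.5}
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
print(max_weighted_clique(toy_weights, toy_edges))  # (('a', 'b', 'c'), 4.5)
```

In the toy graph the triangle a-b-c (weight 4.5) beats the heavier single vertex d and the pair c-d (weight 4.0), which shows why both weights and connectivity matter.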
Affiliation(s)
- Mark Fingerhuth
  - ProteinQure Inc., 192 Spadina Ave, Toronto, ON M5T 2C2, Canada
- Tomas Babej
  - ProteinQure Inc., 192 Spadina Ave, Toronto, ON M5T 2C2, Canada
- Christopher Ing
  - ProteinQure Inc., 192 Spadina Ave, Toronto, ON M5T 2C2, Canada
9. Chowdhury R, Maranas CD. From directed evolution to computational enzyme engineering—A review. AIChE J 2019. [DOI: 10.1002/aic.16847] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Indexed: 12/13/2022]
Affiliation(s)
- Ratul Chowdhury
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, Pennsylvania
- Costas D. Maranas
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, Pennsylvania
10. Wang X, Huang SY. Integrating Bonded and Nonbonded Potentials in the Knowledge-Based Scoring Function for Protein Structure Prediction. J Chem Inf Model 2019; 59:3080-3090. [DOI: 10.1021/acs.jcim.9b00057] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 01/06/2023]
Affiliation(s)
- Xinxiang Wang
  - Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
11. Xu G, Ma T, Wang Q, Ma J. OPUS-SSF: A side-chain-inclusive scoring function for ranking protein structural models. Protein Sci 2019; 28:1157-1162. [PMID: 30919509 DOI: 10.1002/pro.3608] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/26/2018] [Revised: 03/21/2019] [Accepted: 03/27/2019] [Indexed: 12/21/2022]
Abstract
We introduce a side-chain-inclusive scoring function, named OPUS-SSF, for ranking protein structural models. The method builds a scoring function based on the native distributions of the coordinate components of certain anchoring points in a local molecular system for peptide segments of 5, 7, 9, and 11 residues in length. Unlike our previous OPUS-CSF [Xu et al., Protein Sci. 2018; 27: 286-292], which exclusively uses main chain information, OPUS-SSF employs anchoring points on side chains so that the effect of side chains is taken into account. The performance of OPUS-SSF was tested on 15 decoy sets containing a total of 603 proteins, and 571 of them had their native structures recognized from their decoys. Similar to OPUS-CSF, OPUS-SSF does not employ the Boltzmann formula in constructing scoring functions. The results indicate that OPUS-SSF has achieved a significant improvement in decoy recognition and that it should be a very useful tool for protein structure prediction and modeling.
Affiliation(s)
- Gang Xu
  - School of Life Sciences, Tsinghua University, Beijing 100084, People's Republic of China
- Tianqi Ma
  - Applied Physics Program, Rice University, Houston, Texas 77005
  - Department of Bioengineering, Rice University, Houston, Texas 77005
- Qinghua Wang
  - Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030
- Jianpeng Ma
  - School of Life Sciences, Tsinghua University, Beijing 100084, People's Republic of China
  - Applied Physics Program, Rice University, Houston, Texas 77005
  - Department of Bioengineering, Rice University, Houston, Texas 77005
  - Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030
12. Pei J, Zheng Z, Merz KM. Random Forest Refinement of the KECSA2 Knowledge-Based Scoring Function for Protein Decoy Detection. J Chem Inf Model 2019; 59:1919-1929. [DOI: 10.1021/acs.jcim.8b00734] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Affiliation(s)
- Jun Pei
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
- Zheng Zheng
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
- Kenneth M. Merz
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
  - Institute for Cyber Enabled Research, Michigan State University, 567 Wilson Road, East Lansing, Michigan 48824, United States
13. López-Blanco JR, Chacón P. KORP: knowledge-based 6D potential for fast protein and loop modeling. Bioinformatics 2019; 35:3013-3019. [DOI: 10.1093/bioinformatics/btz026] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 08/03/2018] [Revised: 01/03/2019] [Accepted: 01/08/2019] [Indexed: 12/18/2022] Open
Abstract
Motivation
Knowledge-based statistical potentials constitute a simpler and easier alternative to physics-based potentials in many applications, including folding, docking and protein modeling. Here, to improve the effectiveness of the current approximations, we attempt to capture the six-dimensional nature of residue–residue interactions from known protein structures using a simple backbone-based representation.
Results
We have developed KORP, a knowledge-based pairwise potential for proteins that depends on the relative position and orientation between residues. Using a minimalist representation of only three backbone atoms per residue, KORP utilizes a six-dimensional joint probability distribution to outperform state-of-the-art statistical potentials for native structure recognition and best model selection in recent critical assessment of protein structure prediction and loop-modeling benchmarks. Compared with the existing methods, our side-chain independent potential has a lower complexity and better efficiency. The superior accuracy and robustness of KORP represent a promising advance for protein modeling and refinement applications that require a fast but highly discriminative energy function.
Availability and implementation
http://chaconlab.org/modeling/korp.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- José Ramón López-Blanco
  - Department of Biological Chemical Physics, Rocasolano Institute of Physical Chemistry C.S.I.C, Madrid, Spain
- Pablo Chacón
  - Department of Biological Chemical Physics, Rocasolano Institute of Physical Chemistry C.S.I.C, Madrid, Spain
14. Sormanni P, Aprile FA, Vendruscolo M. Third generation antibody discovery methods: in silico rational design. Chem Soc Rev 2018; 47:9137-9157. [PMID: 30298157 DOI: 10.1039/c8cs00523k] [Citation(s) in RCA: 76] [Impact Index Per Article: 12.7] [Indexed: 12/21/2022]
Abstract
Owing to their outstanding performances in molecular recognition, antibodies are extensively used in research and applications in molecular biology, biotechnology and medicine. Recent advances in experimental and computational methods are making it possible to complement well-established in vivo (first generation) and in vitro (second generation) methods of antibody discovery with novel in silico (third generation) approaches. Here we describe the principles of computational antibody design and review the state of the art in this field. We then present Modular, a method that implements the rational design of antibodies in a modular manner, and describe the opportunities offered by this approach.
Affiliation(s)
- Pietro Sormanni
  - Centre for Misfolding Diseases, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
15. Chu H, Liu H. TetraBASE: A Side Chain-Independent Statistical Energy for Designing Realistically Packed Protein Backbones. J Chem Inf Model 2018; 58:430-442. [PMID: 29314837 DOI: 10.1021/acs.jcim.7b00677] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
Abstract
To construct backbone structures of high designability is a primary aspect of computational protein design. We report here a side chain-independent statistical energy that aims at realistic modeling of through-space packing of polypeptide backbones. To mitigate the lack of explicit amino acid side chains, the model treats the interbackbone site packing as being dependent on peptide local conformation. In addition, new variables suitable for statistical analysis, one for relative orientation and another for distance, have been introduced to represent the intersite geometry based on the asymmetrical tetrahedron organization of distinct chemical groups surrounding the Cα-carbon atoms. The resulting tetrahedron-based backbone statistical energy (tetraBASE) model has been used to optimize the tertiary organizations of secondary structure elements (SSEs) of designated types with Monte Carlo simulated annealing, starting from artificial initial configurations. The tetraBASE minimum energy structures can reproduce SSE packing frequently observed in native proteins with atomic root-mean-square deviations of 1-2 Å. The model has also been tested by examining the stability of native SSE arrangements under tetraBASE. The results suggest that the tetraBASE model can be used to effectively represent interbackbone packing when designing backbone structures without explicitly knowing side chain types.
Affiliation(s)
- Huanyu Chu
  - School of Life Sciences, University of Science and Technology of China, 230027 Hefei, Anhui China
  - Hefei National Laboratory for Physical Sciences at the Microscales, 230027 Hefei, Anhui China
- Haiyan Liu
  - School of Life Sciences, University of Science and Technology of China, 230027 Hefei, Anhui China
  - Hefei National Laboratory for Physical Sciences at the Microscales, 230027 Hefei, Anhui China
  - Collaborative Innovation Center of Chemistry for Life Sciences, 230027 Hefei, Anhui China
16. Wang X, Zhang D, Huang SY. New Knowledge-Based Scoring Function with Inclusion of Backbone Conformational Entropies from Protein Structures. J Chem Inf Model 2018; 58:724-732. [DOI: 10.1021/acs.jcim.7b00601] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023]
Affiliation(s)
- Xinxiang Wang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Di Zhang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
17. Li B, Fooksa M, Heinze S, Meiler J. Finding the needle in the haystack: towards solving the protein-folding problem computationally. Crit Rev Biochem Mol Biol 2018; 53:1-28. [PMID: 28976219 PMCID: PMC6790072 DOI: 10.1080/10409238.2017.1380596] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 05/16/2017] [Revised: 08/22/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]
Abstract
Prediction of protein tertiary structures from amino acid sequence and understanding the mechanisms of how proteins fold, collectively known as "the protein folding problem," has been a grand challenge in molecular biology for over half a century. Theories have been developed that provide us with an unprecedented understanding of protein folding mechanisms. However, computational simulation of protein folding is still difficult, and prediction of protein tertiary structure from amino acid sequence is an unsolved problem. Progress toward a satisfying solution has been slow due to challenges in sampling the vast conformational space and deriving sufficiently accurate energy functions. Nevertheless, several techniques and algorithms have been adopted to overcome these challenges, and the last two decades have seen exciting advances in enhanced sampling algorithms, computational power and tertiary structure prediction methodologies. This review aims at summarizing these computational techniques, specifically conformational sampling algorithms and energy approximations that have been frequently used to study protein-folding mechanisms or to de novo predict protein tertiary structures. We hope that this review can serve as an overview on how the protein-folding problem can be studied computationally and, in cases where experimental approaches are prohibitive, help the researcher choose the most relevant computational approach for the problem at hand. We conclude with a summary of current challenges faced and an outlook on potential future directions.
Affiliation(s)
- Bian Li
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Michaela Fooksa
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
  - Chemical and Physical Biology Graduate Program, Vanderbilt University, Nashville, TN, USA
- Sten Heinze
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Jens Meiler
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
18. Xu G, Ma T, Zang T, Wang Q, Ma J. OPUS-CSF: A C-atom-based scoring function for ranking protein structural models. Protein Sci 2017; 27:286-292. [PMID: 29047165 PMCID: PMC5734313 DOI: 10.1002/pro.3327] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 07/12/2017] [Revised: 10/14/2017] [Accepted: 10/16/2017] [Indexed: 12/12/2022]
Abstract
We report a C-atom-based scoring function, named OPUS-CSF, for ranking protein structural models. Rather than using the traditional Boltzmann formula, we built a scoring function (CSF score) based on the native distributions (derived from the entire PDB) of coordinate components of mainchain C (carbonyl) atoms on selected residues of peptide segments of 5, 7, 9, and 11 residues in length. In testing OPUS-CSF on decoy recognition, it recognized at most 257 native structures out of 278 targets in 11 commonly used decoy sets, significantly outperforming other popular all-atom empirical potentials. The average correlation coefficient with TM-score was also comparable with those of other potentials. OPUS-CSF is a highly coarse-grained scoring function that requires only partial mainchain information as input and is very fast. Thus, it is suitable for applications at an early stage of structure building.
Affiliation(s)
- Gang Xu
- School of Life Sciences, Tsinghua University, Beijing, China
| | - Tianqi Ma
- Applied Physics Program, Rice University, Houston, Texas; Department of Bioengineering, Rice University, Houston, Texas
| | - Tianwu Zang
- Applied Physics Program, Rice University, Houston, Texas; Department of Bioengineering, Rice University, Houston, Texas
| | - Qinghua Wang
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas
| | - Jianpeng Ma
- School of Life Sciences, Tsinghua University, Beijing, China; Applied Physics Program, Rice University, Houston, Texas; Department of Bioengineering, Rice University, Houston, Texas; Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas
|
19
|
Xu G, Ma T, Zang T, Sun W, Wang Q, Ma J. OPUS-DOSP: A Distance- and Orientation-Dependent All-Atom Potential Derived from Side-Chain Packing. J Mol Biol 2017; 429:3113-3120. [PMID: 28864201 DOI: 10.1016/j.jmb.2017.08.013] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 03/04/2017] [Revised: 07/27/2017] [Accepted: 08/22/2017] [Indexed: 01/18/2023]
Abstract
We report a new distance- and orientation-dependent, all-atom statistical potential derived from side-chain packing, named OPUS-DOSP, for protein structure modeling. The framework of OPUS-DOSP is based on OPUS-PSP, previously developed by us [JMB (2008), 376, 288-301], with refinement and new features. In particular, the distance or orientation contribution is considered depending on the range of the contact distance. A new auxiliary function is also introduced into the energy function, in addition to the traditional Boltzmann term, in order to adjust the contributions of extreme cases. OPUS-DOSP was tested on 11 decoy sets commonly used for statistical potential benchmarking. Among 278 native structures, 239 and 249 were recognized by OPUS-DOSP without and with the auxiliary function, respectively. The results show that OPUS-DOSP has an increased decoy recognition capability compared with other relevant potentials to date.
Affiliation(s)
- Gang Xu
- School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Tianqi Ma
- Applied Physics Program, Rice University, Houston, TX 77005, United States; Department of Bioengineering, Rice University, Houston, TX 77005, United States
| | - Tianwu Zang
- Applied Physics Program, Rice University, Houston, TX 77005, United States; Department of Bioengineering, Rice University, Houston, TX 77005, United States
| | - Weitao Sun
- Zhou Pei-Yuan Center for Applied Mathematics, Tsinghua University, Beijing 100084, China
| | - Qinghua Wang
- Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, United States
| | - Jianpeng Ma
- School of Life Sciences, Tsinghua University, Beijing 100084, China; Applied Physics Program, Rice University, Houston, TX 77005, United States; Department of Bioengineering, Rice University, Houston, TX 77005, United States; Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, United States.
|
20
|
Zhou X, Xiong P, Wang M, Ma R, Zhang J, Chen Q, Liu H. Proteins of well-defined structures can be designed without backbone readjustment by a statistical model. J Struct Biol 2016; 196:350-357. [DOI: 10.1016/j.jsb.2016.08.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 04/21/2016] [Revised: 07/26/2016] [Accepted: 08/02/2016] [Indexed: 11/25/2022]
|
21
|
Maximova T, Moffatt R, Ma B, Nussinov R, Shehu A. Principles and Overview of Sampling Methods for Modeling Macromolecular Structure and Dynamics. PLoS Comput Biol 2016; 12:e1004619. [PMID: 27124275 PMCID: PMC4849799 DOI: 10.1371/journal.pcbi.1004619] [Citation(s) in RCA: 132] [Impact Index Per Article: 16.5] [Indexed: 01/08/2023]
Abstract
Investigation of macromolecular structure and dynamics is fundamental to understanding how macromolecules carry out their functions in the cell. Significant advances have been made toward this end in silico, with a growing number of computational methods proposed yearly to study and simulate various aspects of macromolecular structure and dynamics. This review aims to provide an overview of recent advances, focusing primarily on methods proposed for exploring the structure space of macromolecules in isolation and in assemblies for the purpose of characterizing equilibrium structure and dynamics. In addition to surveying recent applications that showcase current capabilities of computational methods, this review highlights state-of-the-art algorithmic techniques proposed to overcome challenges posed in silico by the disparate spatial and time scales accessed by dynamic macromolecules. This review is not meant to be exhaustive, as such an endeavor is impossible, but rather aims to balance breadth and depth of strategies for modeling macromolecular structure and dynamics for a broad audience of novices and experts.
Affiliation(s)
- Tatiana Maximova
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
| | - Ryan Moffatt
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
| | - Buyong Ma
- Basic Science Program, Leidos Biomedical Research, Inc. Cancer and Inflammation Program, National Cancer Institute, Frederick, Maryland, United States of America
| | - Ruth Nussinov
- Basic Science Program, Leidos Biomedical Research, Inc. Cancer and Inflammation Program, National Cancer Institute, Frederick, Maryland, United States of America
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Amarda Shehu
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
- Department of Bioengineering, George Mason University, Fairfax, Virginia, United States of America
- School of Systems Biology, George Mason University, Manassas, Virginia, United States of America
|
22
|
Mori M, Ichikawa M, Kiguchi Y, Miyazaki T, Hattori M, Nishikawa A, Tonozuka T. A Surface Loop in the N-Terminal Domain of Pedobacter heparinus Heparin Lyase II is Important for Activity. J Appl Glycosci (1999) 2016; 63:7-11. [PMID: 34354475 PMCID: PMC8056909 DOI: 10.5458/jag.jag.jag-2015_019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2015] [Accepted: 09/15/2015] [Indexed: 12/02/2022]
Abstract
Pedobacter heparinus heparin lyase II (PhHepII) is composed of N-terminal, central, and C-terminal domains. A long surface loop, designated loop-A, is in the N-terminal domain and is composed of amino acids 84-89. In this study, we deleted two, three, or four residues in loop-A to create Δ86-87, Δ85-87, and Δ84-87 PhHepII deletion mutants. We hypothesized that the deletions would increase PhHepII thermostability. After heating purified PhHepII enzymes at 45 °C for 5 min, 6.1 % of the enzyme activity remained in wild-type PhHepII, whereas 10.6 % of the enzyme activity remained in Δ86-87 PhHepII. The results indicated that the deletion caused a significant decrease in the activity, although Δ86-87 PhHepII is slightly more thermostable than wild-type PhHepII. In addition, Δ85-87 and Δ84-87 PhHepII had weak or no enzyme activity, even when unheated. Circular dichroism spectra showed that Δ85-87 PhHepII was properly folded. These results suggest that the flexibility of loop-A is important for PhHepII enzyme activity.
Affiliation(s)
- Marina Mori
- Department of Applied Biological Science, Tokyo University of Agriculture and Technology
| | - Megumi Ichikawa
- Department of Applied Biological Science, Tokyo University of Agriculture and Technology
| | - Yumiko Kiguchi
- Department of Applied Biological Science, Tokyo University of Agriculture and Technology
| | - Takatsugu Miyazaki
- Department of Applied Biological Science, Tokyo University of Agriculture and Technology
| | - Makoto Hattori
- Department of Applied Biological Science, Tokyo University of Agriculture and Technology
| | - Atsushi Nishikawa
- Department of Applied Biological Science, Tokyo University of Agriculture and Technology
| | - Takashi Tonozuka
- Department of Applied Biological Science, Tokyo University of Agriculture and Technology
|
23
|
Shultis D, Dodge G, Zhang Y. Crystal structure of designed PX domain from cytokine-independent survival kinase and implications on evolution-based protein engineering. J Struct Biol 2015; 191:197-206. [PMID: 26073968 DOI: 10.1016/j.jsb.2015.06.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 03/04/2015] [Revised: 05/13/2015] [Accepted: 06/10/2015] [Indexed: 01/03/2023]
Abstract
The Phox homology domain (PX domain) is a phosphoinositide-binding structural domain that is critical in mediating protein and cell membrane association and has been found in more than 100 eukaryotic proteins. The abundance of PX domains in nature offers an opportunity to redesign the protein using EvoDesign, a computational approach to design new sequences based on structure profiles of multiple evolutionarily related proteins. In this study, we report the X-ray crystallographic structure of a designed PX domain from the cytokine-independent survival kinase (CISK), which has been implicated as functioning in parallel with PKB/Akt in cell survival and insulin responses. Detailed data analysis of the designed CISK-PX protein demonstrates positive impacts of knowledge-based secondary structure and solvation predictions and structure-based sequence profiles on the efficiency of the evolutionary-based protein design method. The structure of the designed CISK-PX domain is close to the wild-type (1.54 Å in Cα RMSD), which was accurately predicted by I-TASSER based fragment assembly simulations (1.32 Å in Cα RMSD). This study represents the first successfully designed conditional peripheral membrane protein fold and has important implications in the examination and experimental validation of the evolution-based protein design approaches.
Affiliation(s)
- David Shultis
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
| | - Gregory Dodge
- Department of Biological Chemistry, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA; Department of Biological Chemistry, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA.
|
24
|
Bayer T, Milker S, Wiesinger T, Rudroff F, Mihovilovic MD. Designer Microorganisms for Optimized Redox Cascade Reactions - Challenges and Future Perspectives. Adv Synth Catal 2015. [DOI: 10.1002/adsc.201500202] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Indexed: 01/08/2023]
|
25
|
Computational tools for epitope vaccine design and evaluation. Curr Opin Virol 2015; 11:103-12. [PMID: 25837467 DOI: 10.1016/j.coviro.2015.03.013] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 10/26/2014] [Revised: 03/13/2015] [Accepted: 03/16/2015] [Indexed: 12/15/2022]
Abstract
Rational approaches will be required to develop universal vaccines for viral pathogens such as human immunodeficiency virus, hepatitis C virus, and influenza, for which empirical approaches have failed. The main objective of a rational vaccine strategy is to design novel immunogens that are capable of inducing long-term protective immunity. In practice, this requires structure-based engineering of the target neutralizing epitopes and a quantitative readout of vaccine-induced immune responses. Therefore, computational tools that can facilitate these two areas have played increasingly important roles in rational vaccine design in recent years. Here we review the computational techniques developed for protein structure prediction and antibody repertoire analysis, and demonstrate how they can be applied to the design and evaluation of epitope vaccines.
|
26
|
Xiong P, Wang M, Zhou X, Zhang T, Zhang J, Chen Q, Liu H. Protein design with a comprehensive statistical energy function and boosted by experimental selection for foldability. Nat Commun 2014; 5:5330. [PMID: 25345468 DOI: 10.1038/ncomms6330] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Received: 04/22/2014] [Accepted: 09/19/2014] [Indexed: 12/15/2022]
Abstract
The de novo design of amino acid sequences to fold into desired structures is a way to reach a more thorough understanding of how amino acid sequences encode protein structures and to supply methods for protein engineering. Notwithstanding significant breakthroughs, there are noteworthy limitations in current computational protein design. Overcoming them requires computational models to complement current ones and experimental tools to provide extensive feedback to theory. Here we develop a comprehensive statistical energy function for protein design with a new general strategy and verify that it can complement and rival current well-established models. We establish that an experimental approach can be used to efficiently assess or improve the foldability of designed proteins. We report four de novo proteins for different targets, all experimentally verified to be well-folded, with the solved solution structures of two in excellent agreement with their respective design targets.
Affiliation(s)
- Peng Xiong
- School of Life Sciences, University of Science and Technology of China, 443 Huangshan Road, Hefei, Anhui 230027, China
| | - Meng Wang
- School of Life Sciences, University of Science and Technology of China, 443 Huangshan Road, Hefei, Anhui 230027, China
| | - Xiaoqun Zhou
- School of Life Sciences, University of Science and Technology of China, 443 Huangshan Road, Hefei, Anhui 230027, China
| | - Tongchuan Zhang
- School of Life Sciences, University of Science and Technology of China, 443 Huangshan Road, Hefei, Anhui 230027, China
| | - Jiahai Zhang
- School of Life Sciences, University of Science and Technology of China, 443 Huangshan Road, Hefei, Anhui 230027, China
| | - Quan Chen
- School of Life Sciences, University of Science and Technology of China, 443 Huangshan Road, Hefei, Anhui 230027, China
| | - Haiyan Liu
- [1] School of Life Sciences, University of Science and Technology of China, 443 Huangshan Road, Hefei, Anhui 230027, China; [2] Hefei National Laboratory for Physical Sciences at the Microscales, Hefei, Anhui 230027, China; [3] Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui 230031, China
|
27
|
Pagan RF, Massey SE. A nonadaptive origin of a beneficial trait: in silico selection for free energy of folding leads to the neutral emergence of mutational robustness in single domain proteins. J Mol Evol 2013; 78:130-9. [PMID: 24362542 DOI: 10.1007/s00239-013-9606-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/06/2013] [Accepted: 12/04/2013] [Indexed: 10/25/2022]
Abstract
Proteins are regarded as being robust to the deleterious effects of mutations. Here, the neutral emergence of mutational robustness in a population of single domain proteins is explored using computer simulations. A pairwise contact model was used to calculate the ΔG of folding (ΔG folding) using the three dimensional protein structure of leech eglin C. A random amino acid sequence with low mutational robustness, defined as the average ΔΔG resulting from a point mutation (ΔΔG average), was threaded onto the structure. A population of 1,000 threaded sequences was evolved under selection for stability, using an upper and lower energy threshold. Under these conditions, mutational robustness increased over time in the most common sequence in the population. In contrast, when the wild type sequence was used it did not show an increase in robustness. This implies that the emergence of mutational robustness is sequence specific and that wild type sequences may be close to maximal robustness. In addition, an inverse relationship between ΔΔG average and protein stability is shown, resulting partly from a larger average effect of point mutations in more stable proteins. The emergence of mutational robustness was also observed in the Escherichia coli colE1 Rop and human CD59 proteins, implying that the property may be common in single domain proteins under certain simulation conditions. The results indicate that at least a portion of mutational robustness in small globular proteins might have arisen by a process of neutral emergence, and could be an example of a beneficial trait that has not been directly selected for, termed a "pseudaptation."
Affiliation(s)
- Rafael F Pagan
- Physics Department, University of Puerto Rico - Rio Piedras, San Juan, PR, USA
|
28
|
Compiani M, Capriotti E. Computational and theoretical methods for protein folding. Biochemistry 2013; 52:8601-24. [PMID: 24187909 DOI: 10.1021/bi4001529] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Indexed: 12/12/2022]
Abstract
A computational approach is essential whenever the complexity of the process under study is such that direct theoretical or experimental approaches are not viable. This is the case for protein folding, for which a significant amount of data are being collected. This paper reports on the essential role of in silico methods and the unprecedented interplay of computational and theoretical approaches, which is a defining point of the interdisciplinary investigations of the protein folding process. Besides giving an overview of the available computational methods and tools, we argue that computation plays not merely an ancillary role but has a more constructive function in that computational work may precede theory and experiments. More precisely, computation can provide the primary conceptual clues to inspire subsequent theoretical and experimental work even in a case where no preexisting evidence or theoretical frameworks are available. This is cogently manifested in the application of machine learning methods to come to grips with the folding dynamics. These close relationships suggested complementing the review of computational methods within the appropriate theoretical context to provide a self-contained outlook of the basic concepts that have converged into a unified description of folding and have grown in a synergic relationship with their computational counterpart. Finally, the advantages and limitations of current computational methodologies are discussed to show how the smart analysis of large amounts of data and the development of more effective algorithms can improve our understanding of protein folding.
Affiliation(s)
- Mario Compiani
- School of Sciences and Technology, University of Camerino, Camerino, Macerata 62032, Italy
|
29
|
Maccari G, Spampinato GL, Tozzini V. SecStAnT: secondary structure analysis tool for data selection, statistics and models building. Bioinformatics 2013; 30:668-74. [PMID: 24130306 DOI: 10.1093/bioinformatics/btt586] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022]
|
30
|
Figueroa M, Oliveira N, Lejeune A, Kaufmann KW, Dorr BM, Matagne A, Martial JA, Meiler J, Van de Weerdt C. Octarellin VI: using Rosetta to design a putative artificial (β/α)8 protein. PLoS One 2013; 8:e71858. [PMID: 23977165 PMCID: PMC3747059 DOI: 10.1371/journal.pone.0071858] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 07/04/2012] [Accepted: 07/10/2013] [Indexed: 11/22/2022]
Abstract
The computational protein design protocol Rosetta has been applied successfully to a wide variety of protein engineering problems. Here the aim was to test its ability to design de novo a protein adopting the TIM-barrel fold, whose formation requires about twice as many residues as in the largest proteins successfully designed de novo to date. The designed protein, Octarellin VI, contains 216 residues. Its amino acid composition is similar to that of natural TIM-barrel proteins. When produced and purified, it showed a far-UV circular dichroism spectrum characteristic of folded proteins, with α-helical and β-sheet secondary structure. Its stable tertiary structure was confirmed by both tryptophan fluorescence and circular dichroism in the near UV. It proved heat stable up to 70°C. Dynamic light scattering experiments revealed a unique population of particles averaging 4 nm in diameter, in good agreement with our model. Although these data suggest the successful creation of an artificial α/β protein of more than 200 amino acids, Octarellin VI shows an apparent noncooperative chemical unfolding and low solubility.
Affiliation(s)
- Maximiliano Figueroa
- GIGA-Research, Molecular Biology and Genetic Engineering Unit, University of Liège, Liège, Belgium
| | - Nicolas Oliveira
- GIGA-Research, Molecular Biology and Genetic Engineering Unit, University of Liège, Liège, Belgium
| | - Annabelle Lejeune
- GIGA-Research, Molecular Biology and Genetic Engineering Unit, University of Liège, Liège, Belgium
| | - Kristian W. Kaufmann
- Departments of Chemistry and Pharmacology, Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, United States of America
| | - Brent M. Dorr
- Departments of Chemistry and Pharmacology, Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, United States of America
| | - André Matagne
- Laboratoire d’Enzymologie et Repliement des Protéines, Centre for Protein Engineering, University of Liège, Liège, Belgium
| | - Joseph A. Martial
- GIGA-Research, Molecular Biology and Genetic Engineering Unit, University of Liège, Liège, Belgium
| | - Jens Meiler
- Departments of Chemistry and Pharmacology, Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, United States of America
| | - Cécile Van de Weerdt
- GIGA-Research, Molecular Biology and Genetic Engineering Unit, University of Liège, Liège, Belgium
|
31
|
Moal IH, Fernandez-Recio J. Intermolecular Contact Potentials for Protein-Protein Interactions Extracted from Binding Free Energy Changes upon Mutation. J Chem Theory Comput 2013; 9:3715-27. [PMID: 26584123 DOI: 10.1021/ct400295z] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 01/31/2023]
Abstract
Understanding and predicting the energetics of protein-protein interactions is fundamental to the structural modeling of protein complexes. Binding free energy can be approximated as a sum of pairwise atomic or residue contact energies, which are commonly inferred from contact frequencies observed in experimental protein structures. However, such statistically inferred potentials require certain assumptions and approximations. Here, we explore the possibility of deriving atomic and residue contact potentials directly from experimental binding free energy changes following mutation and present a number of such potentials. The first set of potentials is obtained by unweighted least-squares fitting and bootstrap aggregating. The second set is calculated using a weighting scheme optimized against absolute binding affinity data, so as to account for the over-representation of certain complexes, residues, and families of interactions. The congruence of the potentials with known physical chemistry is investigated. The potentials are further validated by ranking and clustering protein-protein docking poses.
Affiliation(s)
- Iain H Moal
- Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
| | - Juan Fernandez-Recio
- Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
|
32
|
Use of anion-aromatic interactions to position the general base in the ketosteroid isomerase active site. Proc Natl Acad Sci U S A 2013; 110:11308-13. [PMID: 23798413 DOI: 10.1073/pnas.1206710110] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Indexed: 11/18/2022]
Abstract
Although the cation-pi pair, formed between a side chain or substrate cation and the negative electrostatic potential of a pi system on the face of an aromatic ring, has been widely discussed and has been shown to be important in protein structure and protein-ligand interactions, there has been little discussion of the potential structural and functional importance in proteins of the related anion-aromatic pair (i.e., interaction of a negatively charged group with the positive electrostatic potential on the ring edge of an aromatic group). We posited, based on prior structural information, that anion-aromatic interactions between the anionic Asp general base and Phe54 and Phe116 might be used instead of a hydrogen-bond network to position the general base in the active site of ketosteroid isomerase from Comamonas testosteroni as there are no neighboring hydrogen-bonding groups. We have tested the role of the Phe residues using site-directed mutagenesis, double-mutant cycles, and high-resolution X-ray crystallography. These results indicate a catalytic role of these Phe residues. Extensive analysis of the Protein Data Bank provides strong support for a catalytic role of these and other Phe residues in providing anion-aromatic interactions that position anionic general bases within enzyme active sites. Our results further reveal a potential selective advantage of Phe in certain situations, relative to more traditional hydrogen-bonding groups, because it can simultaneously aid in the binding of hydrophobic substrates and positioning of a neighboring general base.
|
33
|
Overview of regulatory strategies and molecular elements in metabolic engineering of bacteria. Mol Biotechnol 2013; 52:300-8. [PMID: 22359157 DOI: 10.1007/s12033-012-9514-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 12/31/2022]
Abstract
From the viewpoint of biotechnology, metabolic engineering mainly aims to change the natural status of a pathway in a microorganism towards the overproduction of certain bioproducts. The biochemical nature of a pathway implies that a changed pathway is often the collective result of altered behavior of the metabolic enzymes encoded by the corresponding genes. By finely modulating the expression of these genes or the properties of the enzymes, we can gain efficient control over the pathway. In this article, we review the typical methods that have been applied to regulate the expression of genes in metabolic engineering. These methods are grouped according to the operation targets in a typical gene. The transcription of a gene is controlled by an indispensable promoter. By utilizing promoters with different strengths, expected levels of expression can be easily achieved, and screening a promoter library may find suitable mutant promoters that can provide tunable expression of a gene. An auto-responsive promoter (quorum sensing (QS)-based or oxygen-inducible) simplifies the induction process by driving the expression of a gene in an automated manner. A light-responsive promoter enables reversible and noninvasive control of gene activity, providing a promising method for controlling gene expression with temporal and spatial resolution in metabolic engineering involving complicated genetic circuits. Through directed evolution and/or rational design, the coding sequence of a gene can be altered, leading to possibly the most profound changes in the properties of a metabolic enzyme. Introducing an engineered riboswitch into an mRNA can make it a regulatory molecule at the same time; the ribosomal binding site is commonly engineered to be more attractive to a ribosome through design. The terminator of a gene affects the stability of an mRNA, and the intergenic region influences the expression of many related genes. Improving the performance of these elements is generally the main activity in metabolic engineering.
|
34
|
Li Z, Yang Y, Zhan J, Dai L, Zhou Y. Energy functions in de novo protein design: current challenges and future prospects. Annu Rev Biophys 2013; 42:315-35. [PMID: 23451890 DOI: 10.1146/annurev-biophys-083012-130315] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Indexed: 12/26/2022]
Abstract
In the past decade, a concerted effort to successfully capture specific tertiary packing interactions produced specific three-dimensional structures for many de novo designed proteins that are validated by nuclear magnetic resonance and/or X-ray crystallographic techniques. However, the success rate of computational design remains low. In this review, we provide an overview of experimentally validated, de novo designed proteins and compare four available programs, RosettaDesign, EGAD, Liang-Grishin, and RosettaDesign-SR, by assessing designed sequences computationally. Computational assessment includes the recovery of native sequences, the calculation of sizes of hydrophobic patches and total solvent-accessible surface area, and the prediction of structural properties such as intrinsic disorder, secondary structures, and three-dimensional structures. This computational assessment, together with a recent community-wide experiment in assessing scoring functions for interface design, suggests that the next-generation protein-design scoring function will come from the right balance of complementary interaction terms. Such balance may be found when more negative experimental data become available as part of a training set.
Affiliation(s)
- Zhixiu Li
- School of Informatics, Indiana University-Purdue University, Indianapolis, Indiana 46202, USA
|
35
|
Krueger B, Friedrich T, Förster F, Bernhardt J, Gross R, Dandekar T. Different evolutionary modifications as a guide to rewire two-component systems. Bioinform Biol Insights 2012; 6:97-128. [PMID: 22586357 PMCID: PMC3348925 DOI: 10.4137/bbi.s9356] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/16/2023] Open
Abstract
Two-component systems (TCS) are short signalling pathways generally occurring in prokaryotes. They frequently regulate prokaryotic stimulus responses and thus are also of interest for engineering in biotechnology and synthetic biology. The aim of this study is to better understand and describe rewiring of TCS while investigating different evolutionary scenarios. Based on large-scale screens of TCS in different organisms, this study gives detailed data, concrete alignments, and structure analysis on three general modification scenarios, where TCS were rewired for new responses and functions: (i) exchanges in the sequence within single TCS domains; (ii) exchange of whole TCS domains; (iii) addition of new components modulating TCS function. As a result, the replacement of stimulus and promoter cassettes to rewire TCS is well defined by exploiting the alignments given here. The diverged TCS examples are non-trivial and the design is challenging. Designed connector proteins may also be useful to modify TCS in selected cases.
Affiliation(s)
- Beate Krueger
- Dept of Bioinformatics, Biocenter, Am Hubland, University of Würzburg, D-97074 Würzburg, Germany
|
36
|
Mullins JGL. Structural modelling pipelines in next generation sequencing projects. Adv Protein Chem Struct Biol 2012; 89:117-67. [PMID: 23046884 DOI: 10.1016/b978-0-12-394287-6.00005-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 02/06/2023]
Abstract
Our capacity to reliably predict protein structure from sequence is steadily improving due to the increased numbers and better targeting of protein structures being experimentally determined by structural genomics projects, along with the development of better modeling methodologies. Template-based (homology) modeling and de novo modeling methods are being combined to fill in remaining gaps in template coverage, and powerful automated structural modeling pipelines are being applied to large data sets of protein sequences. The improved quality of 3D models of proteins has led to their routine use in assessing the functional impact of nonsynonymous single nucleotide polymorphisms (nsSNPs) in specific protein systems, with the development of approaches that may be applied in a predictive fashion to nsSNPs emerging from next-generation sequencing projects. The challenges encountered in deriving functionally meaningful deductions from structural modeling can be quite different for proteins of different protein functional classes. The specific challenges to the assessment of the structural and functional impact of nsSNPs in globular proteins such as binding and regulatory proteins, structural proteins, and enzymes are discussed, as well as membrane transport proteins and ion channels. The mapping of reliable predictions of the structural and functional impact of SNPs, generated from automated modeling pipelines, on to protein-protein interaction networks will facilitate new approaches to understanding complex polygenic disorders and predisposition to disease.
Affiliation(s)
- Jonathan G L Mullins
- Genome and Structural Bioinformatics, Institute of Life Science, College of Medicine, Swansea University, Singleton Park, Swansea, Wales, UK.
|
37
|
Sundaramurthy P, Sreenivasan R, Shameer K, Gakkhar S, Sowdhamini R. HORIBALFRE program: Higher Order Residue Interactions Based ALgorithm for Fold REcognition. Bioinformation 2011; 7:352-9. [PMID: 22355236 PMCID: PMC3280490 DOI: 10.6026/97320630007352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2011] [Accepted: 11/24/2011] [Indexed: 11/23/2022] Open
Abstract
Understanding the functional and structural implications of a protein encoded by a novel gene using function association or fold recognition approaches remains a challenging task in the current era of genomes, metagenomes and personal genomes. In an attempt to enhance potential-based fold-recognition methods in recognizing remote homology between proteins, we propose a new approach, "Higher Order Residue Interaction Based ALgorithm for Fold REcognition (HORIBALFRE)". Higher order residue interactions refer to a class of interactions in protein structures mediated by C(α) or C(β) atoms within a pre-defined distance cut-off. Higher order residue interactions (pairwise, triplet and quadruplet interactions) play a vital role in attaining the stable conformation of a protein structure. In HORIBALFRE, we incorporated the potential contributions from two-body (pairwise), three-body (triplet) and four-body (quadruplet) interactions to implement a new fold recognition algorithm. The core of the HORIBALFRE algorithm includes the potentials generated from a library of protein structures derived from the manually curated CAMPASS database of structure-based sequence alignments. We used Fischer's dataset, with 68 templates and 56 target sequences, derived from the SCOP database and performed one-against-all sequence alignment using TCoffee. Various potentials were derived using custom scripts and these potentials were incorporated in the HORIBALFRE algorithm. In this manuscript, we report an outline of a novel fold recognition algorithm and initial results. Our results show that inclusion of the quadruplet class of higher order residue interactions improves fold recognition.
Affiliation(s)
- Pandurangan Sundaramurthy
- National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore - 560065, India
- Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee -247667, India
- Raashi Sreenivasan
- National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore - 560065, India
- Centre for Biotechnology, Anna University, Chennai - 600025, India
- University of Wisconsin-Madison, Madison, WI 53706-1481, USA; Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN 55901, USA
- Khader Shameer
- National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore - 560065, India
- Authors contributed equally to this work
- Sunita Gakkhar
- Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee -247667, India
- Ramanathan Sowdhamini
- National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore - 560065, India
|
38
|
Moughon SE, Samudrala R. LoCo: a novel main chain scoring function for protein structure prediction based on local coordinates. BMC Bioinformatics 2011; 12:368. [PMID: 21920038 PMCID: PMC3184297 DOI: 10.1186/1471-2105-12-368] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/28/2011] [Accepted: 09/15/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Successful protein structure prediction requires accurate low-resolution scoring functions so that protein main chain conformations that are close to the native can be identified. Once that is accomplished, a more detailed and time-consuming treatment to produce all-atom models can be undertaken. The earliest low-resolution scoring used simple distance-based "contact potentials," but more recently, the relative orientations of interacting amino acids have been taken into account to improve performance.
RESULTS: We developed a new knowledge-based scoring function, LoCo, that locates the interaction partners of each individual residue within a local coordinate system based only on the position of its main chain N, Cα and C atoms. LoCo was trained on a large set of experimentally determined structures and optimized using standard sets of modeled structures, or "decoys." No structure used to train or optimize the function was included among those used to test it. When tested against 29 other published main chain functions on a group of 77 commonly used decoy sets, our function outperformed all others in Cα RMSD rank of the best-scoring decoy, with statistically significant p-values < 0.05 for 26 out of the 29 other functions considered. LoCo is fast, requiring on average less than 6 microseconds per residue for interaction and scoring on commonly-used computer hardware.
CONCLUSIONS: Our function demonstrates an unmatched combination of accuracy, speed, and simplicity and shows excellent promise for protein structure prediction. Broader applications may include protein-protein interactions and protein design.
Affiliation(s)
- Stewart E Moughon
- Department of Microbiology, University of Washington, Box 357735, Seattle, Washington 98195-7242, USA.
|
39
|
Huang SY, Zou X. Statistical mechanics-based method to extract atomic distance-dependent potentials from protein structures. Proteins 2011; 79:2648-61. [PMID: 21732421 PMCID: PMC11108592 DOI: 10.1002/prot.23086] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 01/08/2011] [Revised: 04/21/2011] [Accepted: 05/09/2011] [Indexed: 12/25/2022]
Abstract
In this study, we have developed a statistical mechanics-based iterative method to extract statistical atomic interaction potentials from known, nonredundant protein structures. Our method circumvents the long-standing reference state problem in deriving traditional knowledge-based scoring functions, by using rapid iterations through a physical, global convergence function. The rapid convergence of this physics-based method, unlike other parameter optimization methods, warrants the feasibility of deriving distance-dependent, all-atom statistical potentials to keep the scoring accuracy. The derived potentials, referred to as ITScore/Pro, have been validated using three diverse benchmarks: the high-resolution decoy set, the AMBER benchmark decoy set, and the CASP8 decoy set. Significant improvement in performance has been achieved. Finally, comparisons between the potentials of our model and potentials of a knowledge-based scoring function with a randomized reference state have revealed the reason for the better performance of our scoring function, which could provide useful insight into the development of other physical scoring functions. The potentials developed in this study are generally applicable for structural selection in protein structure prediction.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
- Xiaoqin Zou
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
|
40
|
Recent progress in protein drug design and discovery with a focus on novel approaches to the development of anti-cocaine medications. Future Med Chem 2011; 1:515-28. [PMID: 20161378 DOI: 10.4155/fmc.09.20] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 11/17/2022] Open
Abstract
Cocaine is highly addictive and no anti-cocaine medication is currently available. Accelerating cocaine metabolism, producing biologically inactive metabolites, is recognized as an ideal anti-cocaine medication strategy, especially for the treatment of acute cocaine toxicity. However, currently known wild-type enzymes have either too low a catalytic efficiency against the abused cocaine, in other words (-)-cocaine, or the in vivo half-life is too short. Novel computational strategies and design approaches have been developed recently to design and discover thermostable or high-activity mutants of enzymes based on detailed structures and catalytic/inactivation mechanisms. The structure- and mechanism-based computational design efforts have led to the discovery of high-activity mutants of butyrylcholinesterase and thermostable mutants of cocaine esterase as promising anti-cocaine therapeutics. The structure- and mechanism-based computational strategies and design approaches may be used to design high-activity and/or thermostable mutants of many other proteins that have clear therapeutic potentials and to design completely new therapeutic enzymes.
|
41
|
Verschueren E, Vanhee P, van der Sloot AM, Serrano L, Rousseau F, Schymkowitz J. Protein design with fragment databases. Curr Opin Struct Biol 2011; 21:452-9. [PMID: 21684149 DOI: 10.1016/j.sbi.2011.05.002] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 04/04/2011] [Accepted: 05/25/2011] [Indexed: 11/25/2022]
Abstract
Structure-based computational methods are popular tools for designing proteins and interactions between proteins because they provide the necessary insight and details required for rational engineering. Here, we first argue that large-scale databases of fragments contain a discrete but complete set of building blocks that can be used to design structures. We show that these structural alphabets can be saturated to provide conformational ensembles that sample the native structure space around energetic minima. Second, we show that catalogs of interaction patterns hold the key to overcome the lack of scaffolds when computationally designing protein interactions. Finally, we illustrate the power of database-driven computational protein design methods by recent successful applications and discuss what challenges remain to push this field forward.
Affiliation(s)
- Erik Verschueren
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG) and UPF, Barcelona, Spain
|
42
|
Maddipati S, Nandigam R, Kim S, Venkatasubramanian V. Learning patterns in combinatorial protein libraries by Support Vector Machines. Comput Chem Eng 2011. [DOI: 10.1016/j.compchemeng.2011.01.017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/18/2022]
|
43
|
Tian L, Wu A, Cao Y, Dong X, Hu Y, Jiang T. NCACO-score: an effective main-chain dependent scoring function for structure modeling. BMC Bioinformatics 2011; 12:208. [PMID: 21612673 PMCID: PMC3123610 DOI: 10.1186/1471-2105-12-208] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/20/2011] [Accepted: 05/26/2011] [Indexed: 11/10/2022] Open
Abstract
Background: Development of effective scoring functions is a critical component to the success of protein structure modeling. Previously, many efforts have been dedicated to the development of scoring functions. Despite these efforts, development of an effective scoring function that can achieve both good accuracy and fast speed still presents a grand challenge.
Results: Based on a coarse-grained representation of a protein structure by using only four main-chain atoms: N, Cα, C and O, we develop a knowledge-based scoring function, called NCACO-score, that integrates different structural information to rapidly model protein structure from sequence. In testing on the Decoys'R'Us sets, we found that NCACO-score can effectively recognize native conformers from their decoys. Furthermore, we demonstrate that NCACO-score can effectively guide fragment assembly for protein structure prediction, which has achieved a good performance in building the structure models for hard targets from CASP8 in terms of both accuracy and speed.
Conclusions: Although NCACO-score is developed based on a coarse-grained model, it is able to discriminate native conformers from decoy conformers with high accuracy. NCACO is a very effective scoring function for structure modeling.
Affiliation(s)
- Liqing Tian
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
|
44
|
Tuncbag N, Gursoy A, Keskin O. Prediction of protein-protein interactions: unifying evolution and structure at protein interfaces. Phys Biol 2011; 8:035006. [PMID: 21572173 DOI: 10.1088/1478-3975/8/3/035006] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Indexed: 01/02/2023]
Abstract
The vast majority of the chores in the living cell involve protein-protein interactions. Providing details of protein interactions at the residue level and incorporating them into protein interaction networks are crucial toward the elucidation of a dynamic picture of cells. Despite the rapid increase in the number of structurally known protein complexes, we are still far away from a complete network. Given experimental limitations, computational modeling of protein interactions is a prerequisite to proceed on the way to complete structural networks. In this work, we focus on the question 'how do proteins interact?' rather than 'which proteins interact?' and we review structure-based protein-protein interaction prediction approaches. As a sample approach for modeling protein interactions, PRISM is detailed which combines structural similarity and evolutionary conservation in protein interfaces to infer structures of complexes in the protein interaction network. This will ultimately help us to understand the role of protein interfaces in predicting bound conformations.
Affiliation(s)
- Nurcan Tuncbag
- Koc University, Center for Computational Biology and Bioinformatics, and College of Engineering, Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
|
45
|
Armenta-Medina D, Pérez-Rueda E, Segovia L. Identification of functional motions in the adenylate kinase (ADK) protein family by computational hybrid approaches. Proteins 2011; 79:1662-71. [PMID: 21365689 DOI: 10.1002/prot.22995] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 08/17/2010] [Revised: 12/03/2010] [Accepted: 12/07/2010] [Indexed: 02/02/2023]
Abstract
Based on integrative computational hybrid approaches that combined statistical coupling analysis (SCA), molecular dynamics (MD), and normal mode analysis (NMA), evolutionarily coupled residues involved in functionally relevant motion in the adenylate kinase protein family were identified. The hybrids identified four top-ranking site pairs that belong to a conserved hydrogen bond network that is involved in the enzyme's flexibility. A second group of top-ranking site pairs was identified in critical regions for functional dynamics, such as those related to enzymatic turnover. The high consistency of the results obtained by SCA with NMA (SCA.NMA) and by SCA.MD hybrid analyses suggests that suitable replacement of the matrix of cross-correlation analysis of atomic fluctuations (derived by using NMA) with those based on MD contributes to the identification of such sites by means of a fast computational calculation. The analysis presented here strongly supports the hypothesis that evolutionary forces, such as coevolution at the sequence level, have promoted functional dynamic properties of the adenylate kinase protein family. Finally, these hybrid approaches can be used to identify, at the residue level, protein motion coordination patterns not previously observed, such as in hinge regions.
Affiliation(s)
- Dagoberto Armenta-Medina
- Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México.
|
46
|
Wallnoefer HG, Lingott T, Gutiérrez JM, Merfort I, Liedl KR. Backbone flexibility controls the activity and specificity of a protein-protein interface: specificity in snake venom metalloproteases. J Am Chem Soc 2010; 132:10330-7. [PMID: 20617834 DOI: 10.1021/ja909908y] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Indexed: 12/19/2022]
Abstract
Protein-protein interfaces have crucial functions in many biological processes. The large interaction areas of such interfaces show complex interaction motifs. Even more challenging is the understanding of (multi)specificity in protein-protein binding. Many proteins can bind several partners to mediate their function. A perfect paradigm to study such multispecific protein-protein interfaces are snake venom metalloproteases (SVMPs). Inherently, they bind to a variety of basement membrane proteins of capillaries, hydrolyze them, and induce profuse bleeding. However, despite having a high sequence homology, some SVMPs show a strong hemorrhagic activity, while others are (almost) inactive. We present computer simulations indicating that the activity to induce hemorrhage, and thus the capability to bind the potential reaction partners, is related to the backbone flexibility in a certain surface region. A subtle interplay between flexibility and rigidity of two loops seems to be the prerequisite for the proteins to carry out their damaging function. Presumably, a significant alteration in the backbone dynamics makes the difference between SVMPs that induce hemorrhage and the inactive ones.
Affiliation(s)
- Hannes G Wallnoefer
- Institute of General, Inorganic and Theoretical Chemistry, Faculty of Chemistry and Pharmacy, University of Innsbruck, Innrain 52a, A-6020 Innsbruck, Austria
|
47
|
Hu X, Hu H, Beratan DN, Yang W. A gradient-directed Monte Carlo approach for protein design. J Comput Chem 2010; 31:2164-8. [PMID: 20186860 DOI: 10.1002/jcc.21506] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/07/2022]
Abstract
We develop a new global optimization strategy, gradient-directed Monte Carlo (GDMC) sampling, to optimize protein sequence for a target structure using RosettaDesign. GDMC significantly improves the sampling of sequence space, compared to the classical Monte Carlo search protocol, for a fixed backbone conformation as well as for the simultaneous optimization of sequence and structure. As such, GDMC sampling enhances the efficiency of protein design.
Affiliation(s)
- Xiangqian Hu
- Department of Chemistry, French Family Science Center, Duke University, Durham, North Carolina 27708-0346, USA
|
48
|
Abstract
Knowledge-based approaches frequently employ empirical relations to determine effective potentials for coarse-grained protein models directly from protein databank structures. Although these approaches have enjoyed considerable success and widespread popularity in computational protein science, their fundamental basis has been widely questioned. It is well established that conventional knowledge-based approaches do not correctly treat many-body correlations between amino acids. Moreover, the physical significance of potentials determined by using structural statistics from different proteins has remained obscure. In the present work, we address both of these concerns by introducing and demonstrating a theory for calculating transferable potentials directly from a databank of protein structures. This approach assumes that the databank structures correspond to representative configurations sampled from equilibrium solution ensembles for different proteins. Given this assumption, this physics-based theory exactly treats many-body structural correlations and directly determines the transferable potentials that provide a variationally optimized approximation to the free energy landscape for each protein. We illustrate this approach by first constructing a databank of protein structures using a model potential and then quantitatively recovering this potential from the structure databank. The proposed framework will clarify the assumptions and physical significance of knowledge-based potentials, allow for their systematic improvement, and provide new insight into many-body correlations and cooperativity in folded proteins.
|
49
|
Potapov V, Cohen M, Inbar Y, Schreiber G. Protein structure modelling and evaluation based on a 4-distance description of side-chain interactions. BMC Bioinformatics 2010; 11:374. [PMID: 20624289 PMCID: PMC2912888 DOI: 10.1186/1471-2105-11-374] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 08/21/2009] [Accepted: 07/12/2010] [Indexed: 11/11/2022] Open
Abstract
Background: Accurate evaluation and modelling of residue-residue interactions within and between proteins is a key aspect of computational structure prediction including homology modelling, protein-protein docking, refinement of low-resolution structures, and computational protein design.
Results: Here we introduce a method for accurate protein structure modelling and evaluation based on a novel 4-distance description of residue-residue interaction geometry. Statistical 4-distance preferences were extracted from high-resolution protein structures and were used as a basis for a knowledge-based potential, called Hunter. We demonstrate that the 4-distance description of side chain interactions can be used reliably to discriminate the native structure from a set of decoys. Hunter ranked the native structure as the top one in 217 out of 220 high-resolution decoy sets, in 25 out of 28 "Decoys 'R' Us" decoy sets and in 24 out of 27 high-resolution CASP7/8 decoy sets. The same concept was applied to side chain modelling in protein structures. On a set of very high-resolution protein structures the average RMSD was 1.47 Å for all residues and 0.73 Å for buried residues, which is in the range of attainable accuracy for a model. Finally, we show that Hunter performs as well as or better than other top methods in homology modelling based on results from the CASP7 experiment. The supporting web site http://bioinfo.weizmann.ac.il/hunter/ was developed to enable the use of Hunter and for visualization and interactive exploration of 4-distance distributions.
Conclusions: Our results suggest that Hunter can be used as a tool for evaluation and for accurate modelling of residue-residue interactions in protein structures. The same methodology is applicable to other areas involving high-resolution modelling of biomolecules.
Affiliation(s)
- Vladimir Potapov
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
|
50
|
Deng X, Lee J, Michael AJ, Tomchick DR, Goldsmith EJ, Phillips MA. Evolution of substrate specificity within a diverse family of beta/alpha-barrel-fold basic amino acid decarboxylases: X-ray structure determination of enzymes with specificity for L-arginine and carboxynorspermidine. J Biol Chem 2010; 285:25708-19. [PMID: 20534592 DOI: 10.1074/jbc.m110.121137] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 11/06/2022] Open
Abstract
Pyridoxal 5'-phosphate (PLP)-dependent basic amino acid decarboxylases from the beta/alpha-barrel-fold class (group IV) exist in most organisms and catalyze the decarboxylation of diverse substrates, essential for polyamine and lysine biosynthesis. Herein we describe the first X-ray structure determination of bacterial biosynthetic arginine decarboxylase (ADC) and carboxynorspermidine decarboxylase (CANSDC) to 2.3- and 2.0-Å resolution, solved as product complexes with agmatine and norspermidine. Despite low overall sequence identity, the monomeric and dimeric structures are similar to other enzymes in the family, with the active sites formed between the beta/alpha-barrel domain of one subunit and the beta-barrel of the other. ADC contains both a unique interdomain insertion (4-helical bundle) and a C-terminal extension (3-helical bundle) and it packs as a tetramer in the asymmetric unit with the insertions forming part of the dimer and tetramer interfaces. Analytical ultracentrifugation studies confirmed that the ADC solution structure is a tetramer. Specificity for different basic amino acids appears to arise primarily from changes in the position of, and amino acid replacements in, a helix in the beta-barrel domain we refer to as the "specificity helix." Additionally, in CANSDC a key acidic residue that interacts with the distal amino group of other substrates is replaced by Leu(314), which interacts with the aliphatic portion of norspermidine. Neither product, agmatine in ADC nor norspermidine in CANSDC, forms a Schiff base with pyridoxal 5'-phosphate, suggesting that the product complexes may promote product release by slowing the back reaction. These studies provide insight into the structural basis for the evolution of novel function within a common structural fold.
Affiliation(s)
- Xiaoyi Deng
- Department of Pharmacology, University of Texas Southwestern Medical Center, Dallas, Texas 75390-9041, USA
|