1
Ghoreyshi ZS, George JT. Quantitative approaches for decoding the specificity of the human T cell repertoire. Front Immunol 2023; 14:1228873. [PMID: 37781387 PMCID: PMC10539903 DOI: 10.3389/fimmu.2023.1228873] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/25/2023] [Accepted: 08/17/2023] [Indexed: 10/03/2023]
Abstract
T cell receptor (TCR)-peptide-major histocompatibility complex (pMHC) interactions play a vital role in initiating immune responses against pathogens, and the specificity of TCR-pMHC interactions is crucial for developing optimized therapeutic strategies. The advent of high-throughput immunological and structural evaluation of TCR and pMHC has provided an abundance of data for computational approaches that aim to predict favorable TCR-pMHC interactions. Current models are constructed using information on protein sequence, structure, or a combination of both, and utilize a variety of statistical learning-based approaches for identifying the rules governing specificity. This review examines the current theoretical, computational, and deep learning approaches for identifying TCR-pMHC recognition pairs, placing emphasis on each method's mathematical approach, predictive performance, and limitations.
Affiliation(s)
- Zahra S. Ghoreyshi
- Department of Biomedical Engineering, Texas A&M University, College Station, TX, United States
- Jason T. George
- Department of Biomedical Engineering, Texas A&M University, College Station, TX, United States
- Engineering Medicine Program, Texas A&M University, Houston, TX, United States
- Center for Theoretical Biological Physics, Rice University, Houston, TX, United States
2
Sykes J, Holland BR, Charleston MA. A review of visualisations of protein fold networks and their relationship with sequence and function. Biol Rev Camb Philos Soc 2023; 98:243-262. [PMID: 36210328 PMCID: PMC10092621 DOI: 10.1111/brv.12905] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/30/2021] [Revised: 09/08/2022] [Accepted: 09/09/2022] [Indexed: 01/12/2023]
Abstract
Proteins form arguably the most significant link between genotype and phenotype. Understanding the relationship between protein sequence and structure, and applying this knowledge to predict function, is difficult. One way to investigate these relationships is by considering the space of protein folds and how one might move from fold to fold through similarity, or potential evolutionary relationships. The many individual characterisations of fold space presented in the literature can tell us a lot about how well the current Protein Data Bank represents protein fold space, how convergence and divergence may affect protein evolution, how proteins affect the whole of which they are part, and how proteins themselves function. A synthesis of these different approaches and viewpoints seems the most likely way to further our knowledge of protein structure evolution and thus, facilitate improved protein structure design and prediction.
Affiliation(s)
- Janan Sykes
- School of Natural Sciences, University of Tasmania, Private Bag 37, Hobart, Tasmania, 7001, Australia
- Barbara R Holland
- School of Natural Sciences, University of Tasmania, Private Bag 37, Hobart, Tasmania, 7001, Australia
- Michael A Charleston
- School of Natural Sciences, University of Tasmania, Private Bag 37, Hobart, Tasmania, 7001, Australia
3
Bzówka M, Mitusińska K, Raczyńska A, Skalski T, Samol A, Bagrowska W, Magdziarz T, Góra A. Evolution of tunnels in α/β-hydrolase fold proteins—What can we learn from studying epoxide hydrolases? PLoS Comput Biol 2022; 18:e1010119. [PMID: 35580137 PMCID: PMC9140254 DOI: 10.1371/journal.pcbi.1010119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/25/2021] [Revised: 05/27/2022] [Accepted: 04/19/2022] [Indexed: 12/27/2022]
Abstract
The evolutionary variability of a protein’s residues is highly dependent on protein region and function. Solvent-exposed residues, excluding those at interaction interfaces, are more variable than buried residues, whereas active-site residues are considered to be conserved. These rules also apply to α/β-hydrolase fold proteins, one of the oldest and largest superfamilies of enzymes, which have buried active sites equipped with tunnels linking the reaction site with the exterior. We selected soluble epoxide hydrolases as representatives of this family to conduct the first systematic study of tunnel evolution. We hypothesised that tunnels are lined mostly by conserved residues and are equipped with a number of specific variable residues that can respond to evolutionary pressure. The hypothesis was confirmed, and we proposed a general and a detailed approach to analysing tunnel evolution, both based on entropy values calculated for the tunnels’ residues. We also found three different cases of entropy distribution among tunnel-lining residues. These observations can be applied to protein re-engineering that mimics the natural evolution process. We propose a ‘perforation’ mechanism for designing new tunnels via the merging of internal cavities or perforation of the protein surface. Based on literature data, such a strategy could significantly improve an enzyme’s performance and can be applied widely to enzymes with buried active sites. So far, very little is known about the evolution of protein tunnels. The goal of this study is to evaluate the evolution of tunnels in the family of soluble epoxide hydrolases, representatives of the numerous α/β-hydrolase fold enzymes. As a result, two types of tunnel evolution analysis were proposed (a general and a detailed approach), as well as a ‘perforation’ mechanism that can mimic native evolution in proteins and can be used as an additional strategy for enzyme redesign.
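As an illustration of the entropy values mentioned in this abstract, the sketch below computes Shannon entropy per column of a multiple sequence alignment for a chosen set of positions. It is a minimal example under stated assumptions and not the authors' pipeline: the function names, the toy alignment, the choice of tunnel-lining positions, and the decision to ignore gap characters are all hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy in bits of one alignment column (gaps '-' are ignored; an assumption)."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    # Sum of p * log2(1/p) over observed residue types; 0 for a fully conserved column.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def tunnel_entropies(alignment, tunnel_positions):
    """Map each (hypothetical) tunnel-lining column index to its entropy."""
    return {
        pos: column_entropy([seq[pos] for seq in alignment])
        for pos in tunnel_positions
    }


# Toy alignment of four equal-length sequences (made-up data).
alignment = ["ACDW", "ACEW", "ACFW", "AC-W"]
entropies = tunnel_entropies(alignment, [0, 2])
print(entropies)
```

In this toy case column 0 is fully conserved (entropy 0) and column 2 holds three different residues, so its entropy is log2(3) bits. Low values flag conserved positions and high values flag variable ones.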
Affiliation(s)
- Maria Bzówka
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Karolina Mitusińska
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Agata Raczyńska
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Tomasz Skalski
- Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Aleksandra Samol
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Weronika Bagrowska
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Tomasz Magdziarz
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Artur Góra
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
4
Demharter S, Knapp B, Deane C, Minary P. HLA-DM Stabilizes the Empty MHCII Binding Groove: A Model Using Customized Natural Move Monte Carlo. J Chem Inf Model 2019; 59:2894-2899. [PMID: 31070900 PMCID: PMC7007188 DOI: 10.1021/acs.jcim.9b00104] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/05/2019] [Indexed: 11/28/2022]
Abstract
MHC class II molecules bind peptides derived from extracellular proteins that have been ingested by antigen-presenting cells and display them to the immune system. Peptide loading occurs within the antigen-presenting cell and is facilitated by HLA-DM. HLA-DM stabilizes the open conformation of the MHCII binding groove when no peptide is bound. While a structure of the MHCII/HLA-DM complex exists, the mechanism of stabilization is still largely unknown. Here, we applied customized Natural Move Monte Carlo to investigate this interaction. We found a possible long-range mechanism that implicates the configuration of the membrane-proximal globular domains in stabilizing the open state of the empty MHCII binding groove.
Affiliation(s)
- Samuel Demharter
- Biotech Research and Innovation Centre, University of Copenhagen, Copenhagen 2200, Denmark
- Department of Computer Science, University of Oxford, Oxford OX1 3QD, United Kingdom
- Bernhard Knapp
- Bioinformatics and Immunoinformatics Research Group, Department of Basic Sciences, International University of Catalonia, 08195 Barcelona, Spain
- Charlotte Deane
- Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Peter Minary
- Department of Computer Science, University of Oxford, Oxford OX1 3QD, United Kingdom
5
Maruca A, Ambrosio FA, Lupia A, Romeo I, Rocca R, Moraca F, Talarico C, Bagetta D, Catalano R, Costa G, Artese A, Alcaro S. Computer-based techniques for lead identification and optimization I: Basics. Physical Sciences Reviews 2019. [DOI: 10.1515/psr-2018-0113] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/15/2022]
Abstract
This chapter focuses on computational techniques for identifying and optimizing lead molecules, with a special emphasis on natural compounds. A number of case studies are discussed, such as the naphthyridine scaffold, which was discovered through structure-based virtual screening (SBVS) and proposed as the starting point for further lead optimization to enhance its telomeric RNA selectivity. Another example is Liphagal, a tetracyclic meroterpenoid extracted from Aka coralliphaga and known as a PI3Kα inhibitor, which provides evidence for the design of new active congeners against PI3Kα using molecular dynamics (MD) simulations. These are only two of the numerous examples of the power of computational techniques in drug design and drug discovery. Finally, the design of drugs that can simultaneously interact with multiple targets is reported as a promising approach for treating complicated diseases. Examples of polypharmacological agents are the compounds extracted from mushrooms and identified by means of molecular docking experiments. This chapter may serve as a useful manual of the molecular modeling techniques used in lead identification and lead optimization.
6
Knapp B, Ospina L, Deane CM. Avoiding False Positive Conclusions in Molecular Simulation: The Importance of Replicas. J Chem Theory Comput 2018; 14:6127-6138. [PMID: 30354113 DOI: 10.1021/acs.jctc.8b00391] [Citation(s) in RCA: 170] [Impact Index Per Article: 28.3] [Indexed: 12/29/2022]
Abstract
Molecular simulations are a computational technique used to investigate the dynamics of proteins and other molecules. The free energy landscape of these simulations is often rugged, and minor differences in the initial velocities, floating-point precision, or underlying hardware can cause identical simulations (replicas) to take different paths in the landscape. In this study we investigated the magnitude of these effects based on 310 000 ns of simulation time. We performed 100 identically parametrized replicas of 3000 ns each for a small 10 amino acid system as well as 100 identically parametrized replicas of 100 ns each for an 827 residue T-cell receptor/MHC system. Comparing randomly chosen subgroups within these replica sets, we estimated the reproducibility and reliability that can be achieved by a given number of replicas at a given simulation time. These results demonstrate that conclusions drawn from single simulations are often not reproducible and that conclusions drawn from multiple shorter replicas are more reliable than those from a single longer simulation. The actual number of replicas needed will always depend on the question asked and the level of reliability sought. On the basis of our data, it appears that a good rule of thumb is to perform a minimum of five to 10 replicas.
Affiliation(s)
- Bernhard Knapp
- Bioinformatics and Immunoinformatics Research Group, Department of Basic Sciences, International University of Catalonia, 08195 Barcelona, Spain; Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Luis Ospina
- Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom; Alliance Manchester Business School, University of Manchester, Manchester M13 9SS, United Kingdom
- Charlotte M Deane
- Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
7
Knapp B, Dunbar J, Alcala M, Deane CM. Variable Regions of Antibodies and T-Cell Receptors May Not Be Sufficient in Molecular Simulations Investigating Binding. J Chem Theory Comput 2017; 13:3097-3105. [PMID: 28617587 DOI: 10.1021/acs.jctc.7b00080] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 01/01/2023]
Abstract
Antibodies and T-cell receptors are important proteins of the immune system that share similar structures. Both contain variable and constant regions. Insight into the dynamics of their binding can be provided by computational simulations. For these simulations, the constant regions are often removed to save runtime, as binding occurs in the variable regions. Here we present the first study to investigate the effect of removing the constant regions from antibodies and T-cell receptors on such simulations. We performed simulations of an antibody/antigen and a T-cell receptor/MHC system with and without constant regions, using 10 replicas of 100 ns for each of the four setups. We found that simulations without constant regions show significantly different behavior compared to simulations with constant regions. If the constant regions are not included in the simulations, alterations in the binding interface hydrogen bonds and even partial unbinding can occur. These results indicate that constant regions should be included in antibody and T-cell receptor simulations for reliable conclusions to be drawn.
Affiliation(s)
- Bernhard Knapp
- Department of Statistics, Protein Informatics Group, University of Oxford, Oxford OX1 3BD, U.K.; Department of Basic Sciences, Faculty of Medicine and Health Sciences, International University of Catalonia, 08195 Sant Cugat del Vallès, Barcelona, Spain
- James Dunbar
- Department of Statistics, Protein Informatics Group, University of Oxford, Oxford OX1 3BD, U.K.; Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, 82377 Penzberg, Germany
- Marta Alcala
- Department of Basic Sciences, Faculty of Medicine and Health Sciences, International University of Catalonia, 08195 Sant Cugat del Vallès, Barcelona, Spain
- Charlotte M Deane
- Department of Statistics, Protein Informatics Group, University of Oxford, Oxford OX1 3BD, U.K.
8
Demharter S, Knapp B, Deane CM, Minary P. Modeling Functional Motions of Biological Systems by Customized Natural Moves. Biophys J 2017; 111:710-721. [PMID: 27558715 PMCID: PMC5002067 DOI: 10.1016/j.bpj.2016.06.028] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/01/2016] [Revised: 06/20/2016] [Accepted: 06/22/2016] [Indexed: 11/30/2022]
Abstract
Simulating the functional motions of biomolecular systems requires large computational resources. We introduce a computationally inexpensive protocol for the systematic testing of hypotheses regarding the dynamic behavior of proteins and nucleic acids. The protocol is based on natural move Monte Carlo, a highly efficient conformational sampling method with built-in customization capabilities that allows researchers to design and perform a large number of simulations to investigate functional motions in biological systems. We demonstrate the use of this protocol on both a protein and a DNA case study. Firstly, we investigate the plasticity of a class II major histocompatibility complex in the absence of a bound peptide. Secondly, we study the effects of the epigenetic mark 5-hydroxymethyl on cytosine on the structure of the Dickerson-Drew dodecamer. We show how our customized natural moves protocol can be used to investigate causal relationships of functional motions in biological systems.
Affiliation(s)
- Samuel Demharter
- Department of Computer Science, University of Oxford, Oxford, UK
- Bernhard Knapp
- Department of Statistics, University of Oxford, Oxford, UK
- Peter Minary
- Department of Computer Science, University of Oxford, Oxford, UK
9
Bywater RP, Middleton JN. Melody discrimination and protein fold classification. Heliyon 2016; 2:e00175. [PMID: 27812548 PMCID: PMC5079661 DOI: 10.1016/j.heliyon.2016.e00175] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 07/18/2016] [Revised: 09/04/2016] [Accepted: 09/30/2016] [Indexed: 12/02/2022]
Abstract
One of the greatest challenges in theoretical biophysics and bioinformatics is the identification of protein folds from sequence data. This can be regarded as a pattern recognition problem. In this paper we report the use of a melody generation software where the inputs are derived from calculations of evolutionary information, secondary structure, flexibility, hydropathy and solvent accessibility from multiple sequence alignment data. The melodies so generated are derived from the sequence, and by inference, of the fold, in ways that give each fold a sound representation that may facilitate analysis, recognition, or comparison with other sequences.
Affiliation(s)
- Jonathan N Middleton
- Department of Music, Eastern Washington University, Cheney, WA 99004, USA; School of Information Sciences, University of Tampere, 33041, Finland
10
Knapp B, Demharter S, Deane CM, Minary P. Exploring peptide/MHC detachment processes using hierarchical natural move Monte Carlo. Bioinformatics 2016; 32:181-6. [PMID: 26395770 PMCID: PMC4708099 DOI: 10.1093/bioinformatics/btv502] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 04/10/2015] [Revised: 08/10/2015] [Accepted: 08/21/2015] [Indexed: 01/15/2023]
Abstract
MOTIVATION: The binding between a peptide and a major histocompatibility complex (MHC) is one of the most important processes for the induction of an adaptive immune response. Many algorithms have been developed to predict peptide/MHC (pMHC) binding. However, no approach has yet been able to give structural insight into how peptides detach from the MHC. RESULTS: In this study, we used a combination of coarse graining, hierarchical natural move Monte Carlo and stochastic conformational optimization to explore the detachment processes of 32 different peptides from HLA-A*02:01. We performed 100 independent repeats of each stochastic simulation and found that the presence of experimentally known anchor amino acids affects the detachment trajectories of our peptides. Comparison with experimental binding affinity data indicates the reliability of our approach (area under the receiver operating characteristic curve 0.85). We also compared to a 1000 ns molecular dynamics simulation of a non-binding peptide (AAAKTPVIV) and HLA-A*02:01. Even in this simulation, the longest published for pMHC, the peptide does not fully detach. Our approach is orders of magnitude faster and as such allows us to explore pMHC detachment processes in a way not possible with all-atom molecular dynamics simulations. AVAILABILITY AND IMPLEMENTATION: The source code is freely available for download at http://www.cs.ox.ac.uk/mosaics/. CONTACT: bernhard.knapp@stats.ox.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bernhard Knapp
- Department of Statistics, University of Oxford, 1 South Parks Road, Oxford, OX1 3TG, UK
- Samuel Demharter
- Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford, OX1 3QD, UK
- Charlotte M Deane
- Department of Statistics, University of Oxford, 1 South Parks Road, Oxford, OX1 3TG, UK
- Peter Minary
- Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford, OX1 3QD, UK
11
Machine Learnable Fold Space Representation based on Residue Cluster Classes. Comput Biol Chem 2015; 59 Pt A:1-7. [PMID: 26366526 DOI: 10.1016/j.compbiolchem.2015.07.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 11/12/2014] [Revised: 07/17/2015] [Accepted: 07/25/2015] [Indexed: 11/21/2022]
Abstract
MOTIVATION: Protein fold space is a conceptual framework where all possible protein folds exist and ideas about protein structure, function and evolution may be analyzed. Classification of protein folds in this space is commonly achieved by using similarity indexes and/or machine learning approaches, each with different limitations. RESULTS: We propose a method for constructing a compact vector space model of protein fold space by representing each protein structure by its residues' local contacts. We developed an efficient method to statistically test for the separability of points in a space and showed that our protein fold space representation is learnable by any machine-learning algorithm. AVAILABILITY: An API is freely available at https://code.google.com/p/pyrcc/.
12
Rackovsky S. Nonlinearities in protein space limit the utility of informatics in protein biophysics. Proteins 2015; 83:1923-8. [PMID: 26315852 DOI: 10.1002/prot.24916] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/26/2015] [Revised: 08/12/2015] [Accepted: 08/20/2015] [Indexed: 11/08/2022]
Abstract
We examine the utility of informatic-based methods in computational protein biophysics. To do so, we use newly developed metric functions to define completely independent sequence and structure spaces for a large database of proteins. By investigating the relationship between these spaces, we demonstrate quantitatively the limits of knowledge-based correlation between the sequences and structures of proteins. It is shown that there are well-defined, nonlinear regions of protein space in which dissimilar structures map onto similar sequences (the conformational switch), and dissimilar sequences map onto similar structures (remote homology). These nonlinearities are shown to be quite common: almost half the proteins in our database fall into one or the other of these two regions. They are not anomalies, but rather intrinsic properties of structural encoding in amino acid sequences. It follows that extreme care must be exercised in using bioinformatic data as a basis for computational structure prediction. The implications of these results for protein evolution are examined.
Affiliation(s)
- S Rackovsky
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York, 14853; Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, New York, New York, 10029
13
Sikosek T, Chan HS. Biophysics of protein evolution and evolutionary protein biophysics. J R Soc Interface 2015; 11:20140419. [PMID: 25165599 DOI: 10.1098/rsif.2014.0419] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Indexed: 01/05/2023]
Abstract
The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence-structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by 'hidden' conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution.
Affiliation(s)
- Tobias Sikosek
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
- Hue Sun Chan
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
14
Moraga I, Wernig G, Wilmes S, Gryshkova V, Richter CP, Hong WJ, Sinha R, Guo F, Fabionar H, Wehrman TS, Krutzik P, Demharter S, Plo I, Weissman IL, Minary P, Majeti R, Constantinescu SN, Piehler J, Garcia KC. Tuning cytokine receptor signaling by re-orienting dimer geometry with surrogate ligands. Cell 2015; 160:1196-208. [PMID: 25728669 PMCID: PMC4766813 DOI: 10.1016/j.cell.2015.02.011] [Citation(s) in RCA: 120] [Impact Index Per Article: 13.3] [Received: 12/04/2014] [Revised: 01/22/2015] [Accepted: 02/03/2015] [Indexed: 01/07/2023]
Abstract
Most cell-surface receptors for cytokines and growth factors signal as dimers, but it is unclear whether remodeling receptor dimer topology is a viable strategy to "tune" signaling output. We utilized diabodies (DA) as surrogate ligands in a prototypical dimeric receptor-ligand system, the cytokine Erythropoietin (EPO) and its receptor (EpoR), to dimerize EpoR ectodomains in non-native architectures. Diabody-induced signaling amplitudes varied from full to minimal agonism, and structures of these DA/EpoR complexes differed in EpoR dimer orientation and proximity. Diabodies also elicited biased or differential activation of signaling pathways and gene expression profiles compared to EPO. Non-signaling diabodies inhibited proliferation of erythroid precursors from patients with a myeloproliferative neoplasm due to a constitutively active JAK2V617F mutation. Thus, intracellular oncogenic mutations causing ligand-independent receptor activation can be counteracted by extracellular ligands that re-orient receptors into inactive dimer topologies. This approach has broad applications for tuning signaling output for many dimeric receptor systems.
Affiliation(s)
- Ignacio Moraga
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California, 94305-5345, USA; Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Stanford, California, 94305-5345, USA
- Gerlinde Wernig
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, California, 94305-5345, USA; Department of Pathology, Division of Hematopathology, Stanford University School of Medicine, Stanford, California, 94305-5345, USA
- Stephan Wilmes
- Division of Biophysics, Department of Biology, University of Osnabrück, 49076, Germany
- Vitalina Gryshkova
- Ludwig Institute For Cancer Research and de Duve Institute, Université catholique de Louvain, B-1200 Brussels, Belgium
- Wan-Jen Hong
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, California, 94305-5345, USA; Department of Internal Medicine, Division of Hematology, Stanford University School of Medicine, Stanford, California, 94305-5345, USA
- Rahul Sinha
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, California, 94305-5345, USA
- Feng Guo
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California, 94305-5345, USA; Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Stanford, California, 94305-5345, USA
- Hyna Fabionar
- DiscoveRx, 42501 Albrae St, Fremont, California, 94538, USA
- Tom S. Wehrman
- Primity Bio, 3350 Scott Blvd, Ste 6101, Santa Clara, CA 95054
- Peter Krutzik
- Primity Bio, 3350 Scott Blvd, Ste 6101, Santa Clara, CA 95054
- Samuel Demharter
- Department of Computer Science, Wolfson Building, University of Oxford, Oxford OX1 3QD, United Kingdom
- Isabelle Plo
- Institut Gustave Roussy, INSERM U1009, 94805, Villejuif, France
- Irving L. Weissman
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, California, 94305-5345, USA
- Peter Minary
- Department of Computer Science, Wolfson Building, University of Oxford, Oxford OX1 3QD, United Kingdom
- Ravindra Majeti
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, California, 94305-5345, USA; Department of Internal Medicine, Division of Hematology, Stanford University School of Medicine, Stanford, California, 94305-5345, USA
- Stefan N. Constantinescu
- Ludwig Institute For Cancer Research and de Duve Institute, Université catholique de Louvain, B-1200 Brussels, Belgium
- Jacob Piehler
- Division of Biophysics, Department of Biology, University of Osnabrück, 49076, Germany
- K. Christopher Garcia
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California, 94305-5345, USA; Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Stanford, California, 94305-5345, USA
15
Knapp B, Demharter S, Esmaielbeiki R, Deane CM. Current status and future challenges in T-cell receptor/peptide/MHC molecular dynamics simulations. Brief Bioinform 2015; 16:1035-44. [DOI: 10.1093/bib/bbv005] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 11/06/2014] [Indexed: 11/12/2022]
16
|
Effect of methanol on the phase-transition properties of glycerol-monopalmitate lipid bilayers investigated using molecular dynamics simulations: In quest of the biphasic effect. J Mol Graph Model 2015; 55:85-104. [DOI: 10.1016/j.jmgm.2014.10.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 07/15/2014] [Revised: 10/29/2014] [Accepted: 10/30/2014] [Indexed: 11/21/2022]
|
17
|
Yadahalli S, Hemanth Giri Rao VV, Gosavi S. Modeling Non-Native Interactions in Designed Proteins. Isr J Chem 2014. [DOI: 10.1002/ijch.201400035] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 11/10/2022]
|
18
|
Abstract
To explore protein space from a global perspective, we consider 9,710 SCOP (Structural Classification of Proteins) domains with up to 70% sequence identity and present all similarities among them as networks: In the "domain network," nodes represent domains, and edges connect domains that share "motifs," i.e., significantly sized segments of similar sequence and structure. We explore the dependence of the network on the thresholds that define the evolutionary relatedness of the domains. At excessively strict thresholds the network falls apart completely; for very lax thresholds, there are network paths between virtually all domains. Interestingly, at intermediate thresholds the network constitutes two regions that can be described as "continuous" versus "discrete." The continuous region comprises a large connected component, dominated by domains with alternating alpha and beta elements, and the discrete region includes the rest of the domains in isolated islands, each generally corresponding to a fold. We also construct the "motif network," in which nodes represent recurring motifs, and edges connect motifs that appear in the same domain. This network also features a large and highly connected component of motifs that originate from domains with alternating alpha/beta elements (and some all-alpha domains), and smaller isolated islands. Indeed, the motif network suggests that nature reuses such motifs extensively. The networks suggest evolutionary paths between domains and give hints about protein evolution and the underlying biophysics. They provide natural means of organizing protein space, and could be useful for the development of strategies for protein search and design.
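The threshold dependence described in this abstract can be illustrated with a minimal sketch. It assumes hypothetical domain names and similarity scores (none are taken from the paper) and simply counts connected components of the network at strict, intermediate, and lax thresholds using a union-find structure.

```python
def connected_components(nodes, scored_edges, threshold):
    """Group nodes into components using only edges with score >= threshold."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_edges:
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    # Largest component first, ties broken alphabetically.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


# Hypothetical toy data: five domains and pairwise similarity scores in [0, 1].
domains = ["d1", "d2", "d3", "d4", "d5"]
edges = [("d1", "d2", 0.9), ("d2", "d3", 0.6), ("d3", "d4", 0.3), ("d4", "d5", 0.8)]

strict = connected_components(domains, edges, 0.95)  # network falls apart
medium = connected_components(domains, edges, 0.5)   # one larger component plus an island
lax = connected_components(domains, edges, 0.2)      # everything is connected
```

In this toy case the intermediate threshold yields one three-domain component and one two-domain island, a small-scale analogue of the "continuous" versus "discrete" regions the abstract describes.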
|
19
|
Shi JY, Yiu SM, Zhang YN, Chin FYL. Effective moment feature vectors for protein domain structures. PLoS One 2014; 8:e83788. [PMID: 24391828 DOI: 10.1371/journal.pone.0083788] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/02/2013] [Accepted: 11/08/2013] [Indexed: 11/19/2022]
Abstract
Image processing techniques have been shown to be useful in studying protein domain structures. The idea is to represent the pairwise distances of any two residues of the structure in a 2D distance matrix (DM). Features and/or submatrices are extracted from this DM to represent a domain. Existing approaches, however, may involve a large number of features (100-400) or complicated mathematical operations. Finding fewer but more effective features is always desirable. In this paper, based on some key observations on DMs, we are able to decompose a DM image into four basic binary images, each representing the structural characteristics of a fundamental secondary structure element (SSE) or a motif in the domain. Using the concept of moments in image processing, we further derive 45 structural features based on the four binary images. Together with 4 features extracted from the basic images, we represent the structure of a domain using 49 features. We show that our feature vectors can represent domain structures effectively in terms of the following. (1) We show a higher accuracy for domain classification. (2) We show a clear and consistent distribution of domains using our proposed structural vector space. (3) We are able to cluster the domains according to our moment features and demonstrate a relationship between structural variation and functional diversity.
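As a rough illustration of the distance-matrix idea, the sketch below builds a DM from residue coordinates, thresholds it into a single binary image, and evaluates raw image moments. The coordinates, the 4.0 Å cutoff, and the single-threshold binarization are assumptions made for the example; the paper's four-image decomposition and its 49 features are not reproduced here.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def binarize(dm, cutoff):
    """Binary image: 1 where the distance is within the cutoff, else 0."""
    return [[1 if d <= cutoff else 0 for d in row] for row in dm]


def raw_moment(image, p, q):
    """Raw image moment M_pq = sum over pixels of i^p * j^q * value."""
    return sum(
        (i ** p) * (j ** q) * value
        for i, row in enumerate(image)
        for j, value in enumerate(row)
    )


# Hypothetical coordinates: four residues on a straight line, 3.8 apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
dm = distance_matrix(coords)
contact = binarize(dm, 4.0)       # assumed cutoff, for illustration only
area = raw_moment(contact, 0, 0)  # zeroth moment = number of "on" pixels
```

Higher-order moments (for example `raw_moment(contact, 1, 0) / area` for the row centroid) summarize where the "on" pixels sit in the image, which is the general sense in which moments turn a DM into a short feature vector.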
Affiliation(s)
- Jian-Yu Shi
- School of Life Science, Northwestern Polytechnical University, Xi'an, Shaanxi Province, China; Department of Computer Science, The University of Hong Kong, Hong Kong, China
| | - Siu-Ming Yiu
- Department of Computer Science, The University of Hong Kong, Hong Kong, China
| | - Yan-Ning Zhang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi Province, China
| | | |
|
20
|
Faraggi E, Kloczkowski A. A global machine learning based scoring function for protein structure prediction. Proteins 2013; 82:752-9. [DOI: 10.1002/prot.24454] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 08/05/2013] [Revised: 10/03/2013] [Accepted: 10/21/2013] [Indexed: 01/07/2023]
Affiliation(s)
- Eshel Faraggi
- Department of Biochemistry and Molecular Biology; Indiana University School of Medicine; Indianapolis Indiana 46202
- Battelle Center for Mathematical Medicine; Nationwide Children's Hospital; Columbus Ohio 43215
- Physics Division; Research and Information Systems, LLC; Carmel Indiana 46032
| | - Andrzej Kloczkowski
- Battelle Center for Mathematical Medicine; Nationwide Children's Hospital; Columbus Ohio 43215
- Department of Pediatrics; The Ohio State University; Columbus Ohio 43215
| |
|
21
|
Hodak H. The Nobel Prize in chemistry 2013 for the development of multiscale models of complex chemical systems: a tribute to Martin Karplus, Michael Levitt and Arieh Warshel. J Mol Biol 2013; 426:1-3. [PMID: 24184197 DOI: 10.1016/j.jmb.2013.10.037] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 10/26/2022]
Affiliation(s)
- Hélène Hodak
- Journal of Molecular Biology-Elsevier, 600 Technology Square, Cambridge, MA 02139, USA.
| |
|
22
|
Chellapa GD, Rose GD. Reducing the dimensionality of the protein-folding search problem. Protein Sci 2012; 21:1231-40. [PMID: 22692765 DOI: 10.1002/pro.2106] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/19/2012] [Revised: 06/04/2012] [Accepted: 06/05/2012] [Indexed: 11/10/2022]
Abstract
How does a folding protein negotiate a vast, featureless conformational landscape and adopt its native structure in biological real time? Motivated by this search problem, we developed a novel algorithm to compare protein structures. Procedures to identify structural analogs are typically conducted in three-dimensional space: the tertiary structure of a target protein is matched against each candidate in a database of structures, and goodness of fit is evaluated by a distance-based measure, such as the root-mean-square distance between target and candidate. This is an expensive approach because three-dimensional space is complex. Here, we transform the problem into a simpler one-dimensional procedure. Specifically, we identify and label the 11 most populated residue basins in a database of high-resolution protein structures. Using this 11-letter alphabet, any protein's three-dimensional structure can be transformed into a one-dimensional string by mapping each residue onto its corresponding basin. Similarity between the resultant basin strings can then be evaluated by conventional sequence-based comparison. The disorder → order folding transition is abridged on both sides. At the onset, folding conditions necessitate formation of hydrogen-bonded scaffold elements on which proteins are assembled, severely restricting the magnitude of accessible conformational space. Near the end, chain topology is established prior to emergence of the close-packed native state. At this latter stage of folding, the chain remains molten, and residues populate natural basins that are approximated by the 11 basins derived here. In essence, our algorithm reduces the protein-folding search problem to mapping the amino acid sequence onto a restricted basin string.
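The 3D-to-1D transformation described above can be sketched as follows. The three basin centers and their letters are hypothetical stand-ins (the paper derives 11 basins from high-resolution structures, and their definitions are not given in this abstract); each residue's backbone torsion pair is mapped to the nearest center, and the resulting strings are compared position by position.

```python
# Hypothetical basin centers as (phi, psi) in degrees; NOT the paper's 11 basins.
BASINS = {
    "A": (-60.0, -45.0),   # roughly alpha-helical
    "B": (-120.0, 130.0),  # roughly beta-strand
    "L": (60.0, 45.0),     # roughly left-handed helical
}


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def basin_string(torsions):
    """Map each (phi, psi) pair to the letter of the nearest basin center."""
    letters = []
    for phi, psi in torsions:
        nearest = min(
            BASINS,
            key=lambda k: angular_diff(phi, BASINS[k][0]) ** 2
            + angular_diff(psi, BASINS[k][1]) ** 2,
        )
        letters.append(nearest)
    return "".join(letters)


def identity(s, t):
    """Fraction of matching positions between two equal-length basin strings."""
    if len(s) != len(t):
        raise ValueError("strings must have equal length")
    return sum(x == y for x, y in zip(s, t)) / len(s)


# Made-up torsion angles for two short fragments.
helix = [(-62.0, -41.0), (-58.0, -47.0), (-65.0, -40.0)]
strand = [(-119.0, 135.0), (-125.0, 128.0), (-60.0, -45.0)]

helix_string = basin_string(helix)
strand_string = basin_string(strand)
similarity = identity(helix_string, strand_string)
```

Position-wise identity is the simplest possible comparison; once structures are strings, any conventional sequence-alignment method could be substituted, which is the point the abstract makes.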
Affiliation(s)
- George D Chellapa
- TC Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | | |
|
23
|
Zhang J, Minary P, Levitt M. Multiscale natural moves refine macromolecules using single-particle electron microscopy projection images. Proc Natl Acad Sci U S A 2012; 109:9845-50. [PMID: 22665770 PMCID: PMC3382478 DOI: 10.1073/pnas.1205945109] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022]
Abstract
The method presented here refines molecular conformations directly against projections of single particles measured by electron microscopy. By optimizing the orientation of the projection at the same time as the conformation, the method is well-suited to two-dimensional class averages from cryoelectron microscopy. Such direct use of two-dimensional images circumvents the need for a three-dimensional density map, which may be difficult to reconstruct from projections due to structural heterogeneity or preferred orientations of the sample on the grid. Our refinement protocol exploits Natural Move Monte Carlo to model a macromolecule as a small number of segments connected by flexible loops, on multiple scales. After tests on artificial data from lysozyme, we applied the method to the Methanococcus maripaludis chaperonin. We successfully refined its conformation from a closed-state initial model to an open-state final model using just one class-averaged projection. We also used Natural Moves to iteratively refine against heterogeneous projection images of Methanococcus maripaludis chaperonin in a mix of open and closed states. Our results suggest a general method for electron microscopy refinement especially suited to macromolecules with significant conformational flexibility. The algorithm is available in the program Methodologies for Optimization and Sampling In Computational Studies.
Affiliation(s)
- Junjie Zhang
- Department of Structural Biology, Stanford University School of Medicine, D100 Fairchild Building, Stanford, CA 94305
| | - Peter Minary
- Department of Structural Biology, Stanford University School of Medicine, D100 Fairchild Building, Stanford, CA 94305
| | - Michael Levitt
- Department of Structural Biology, Stanford University School of Medicine, D100 Fairchild Building, Stanford, CA 94305
| |
|
24
|
Abstract
We develop a unique algorithm implemented in the program MOSAICS (Methodologies for Optimization and Sampling in Computational Studies) that is capable of nanoscale modeling without compromising the resolution of interest. This is achieved by modeling with customizable hierarchical degrees of freedom, thereby circumventing major limitations of conventional molecular modeling. With the emergence of RNA-based nanotechnology, large RNAs in all-atom representation are used here to benchmark our algorithm. Our method locates all favorable structural states of a model RNA of significant complexity while improving sampling accuracy and increasing speed many fold over existing all-atom RNA modeling methods. We also modeled the effects of sequence mutations on the structural building blocks of tRNA-based nanotechnology. With its flexibility in choosing arbitrary degrees of freedom as well as in allowing different all-atom energy functions, MOSAICS is an ideal tool to model and design biomolecules of the nanoscale.
|
25
|
Salvado B, Karathia H, Chimenos AU, Vilaprinyo E, Omholt S, Sorribas A, Alves R. Methods for and results from the study of design principles in molecular systems. Math Biosci 2011; 231:3-18. [DOI: 10.1016/j.mbs.2011.02.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 10/25/2010] [Revised: 01/24/2011] [Accepted: 02/10/2011] [Indexed: 12/27/2022]
|
26
|
Minary P, Levitt M. Conformational optimization with natural degrees of freedom: a novel stochastic chain closure algorithm. J Comput Biol 2010; 17:993-1010. [PMID: 20726792 DOI: 10.1089/cmb.2010.0016] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022]
Abstract
The present article introduces a set of novel methods that facilitate the use of "natural moves" or arbitrary degrees of freedom that can give rise to collective rearrangements in the structure of biological macromolecules. While such "natural moves" may spoil the stereochemistry and even break the bonded chain at multiple locations, our new method restores the correct chain geometry by adjusting bond and torsion angles in an arbitrarily defined molten zone. This is done by successive stages of partial closure that propagate the location of the chain break backwards along the chain. At the end of these stages, the size of the chain break is generally reduced so much that it can be repaired by adjusting the position of a single atom. Our chain closure method is efficient with a computational complexity of O(N_d), where N_d is the number of degrees of freedom used to repair the chain break. The new method facilitates the use of arbitrary degrees of freedom including the "natural" degrees of freedom inferred from analyzing experimental (X-ray crystallography and nuclear magnetic resonance [NMR]) structures of nucleic acids and proteins. In terms of its ability to generate large conformational moves and its effectiveness in locating low energy states, the new method is robust and computationally efficient.
Affiliation(s)
- Peter Minary
- Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, USA.
| | | |
|
27
|
Aloy P, Oliva B. Splitting statistical potentials into meaningful scoring functions: testing the prediction of near-native structures from decoy conformations. BMC Struct Biol 2009; 9:71. [PMID: 19917096 PMCID: PMC2783033 DOI: 10.1186/1472-6807-9-71] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 03/24/2009] [Accepted: 11/16/2009] [Indexed: 11/20/2022]
Abstract
Background: Recent advances in high-throughput technologies have produced a vast number of protein sequences, while the number of high-resolution structures has seen a limited increase. This has prompted many strategies to build protein structures from their sequences, generating a considerable number of alternative models. Selecting the model closest to the native conformation has thus become crucial for structure prediction. Several methods have been developed to score protein models by energies, knowledge-based potentials, and combinations of both. Results: Here, we present and demonstrate a theory to split knowledge-based potentials into biologically meaningful scoring terms and to combine them into new scores to predict near-native structures. Our strategy circumvents the problem of defining the reference state. In this approach we give the proof for a simple, linear application that can be further improved by optimizing the combination of Z-scores. Using the simplest composite score () we obtained predictions similar to state-of-the-art methods. Besides, our approach has the advantage of identifying the most relevant terms involved in the stability of the protein structure. Finally, we also use the composite Z-scores to assess the conformation of models and to detect local errors. Conclusion: We have introduced a method to split knowledge-based potentials and to solve the problem of defining a reference state. The new scores detected near-native structures as accurately as state-of-the-art methods and successfully identified wrongly modeled regions of many near-native conformations.
Affiliation(s)
- Patrick Aloy
- Institut de Recerca Biomèdica and Barcelona Supercomputing Center, 10-12 08028 Barcelona, Catalonia, Spain.
| | | |
|
28
|
Májek P, Elber R. A coarse-grained potential for fold recognition and molecular dynamics simulations of proteins. Proteins 2009; 76:822-36. [PMID: 19291741 DOI: 10.1002/prot.22388] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Indexed: 11/11/2022]
Abstract
A coarse-grained potential for protein simulations and fold ranking is presented. The potential is based on a two-point model of individual amino acids and a specific implementation of hydrogen bonding. Parameters are determined for distance dependent pair interactions, pseudo bonds, angles, and torsions. A scaling factor for a hydrogen bonding term is also determined. Iterative sampling for 4867 proteins reproduces distributions of internal coordinates and distances observed in the Protein Data Bank. The adjustment of the potential and resampling are in the spirit of the generalized ensemble approach. No native structure information (e.g., secondary structure) is used in the calculation of the potential or in the simulation of a particular protein. The potential is subject to two tests as follows: (i) simulations of 956 globular proteins in the neighborhood of their native folds (these proteins were not used in the training set) and (ii) discrimination between native and decoy structures for 2470 proteins with 305,000 decoys and the "Decoys 'R' Us" dataset. In the first test, 58% of tested proteins stay within 5 Å from the native fold in Molecular Dynamics simulations of more than 20 nanoseconds using the new potential. The potential is also useful in differentiating between correct and approximate folds providing significant signal for structure prediction algorithms. Sampling with the potential consistently regenerates the distribution of distances and internal coordinates it learned. Nevertheless, during Molecular Dynamics simulations structures are found that reproduce the learned distributions but are far from the native fold.
Affiliation(s)
- Peter Májek
- Department of Computer Science, Upson Hall 4130, Cornell University, Ithaca, New York 14853-7501, USA
| | | |
|
29
|
Wells SA, Jimenez-Roldan JE, Römer RA. Comparative analysis of rigidity across protein families. Phys Biol 2009; 6:046005. [DOI: 10.1088/1478-3975/6/4/046005] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022]
|
30
|
Peterson ME, Chen F, Saven JG, Roos DS, Babbitt PC, Sali A. Evolutionary constraints on structural similarity in orthologs and paralogs. Protein Sci 2009; 18:1306-15. [PMID: 19472362 PMCID: PMC2774440 DOI: 10.1002/pro.143] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Received: 12/09/2008] [Revised: 03/29/2009] [Accepted: 03/30/2009] [Indexed: 11/10/2022]
Abstract
Although a quantitative relationship between sequence similarity and structural similarity has long been established, little is known about the impact of orthology on the relationship between protein sequence and structure. Among homologs, orthologs (derived by speciation) more frequently have similar functions than paralogs (derived by duplication). Here, we hypothesize that an orthologous pair will tend to exhibit greater structural similarity than a paralogous pair at the same level of sequence similarity. To test this hypothesis, we used 284,459 pairwise structure-based alignments of 12,634 unique domains from SCOP as well as orthology and paralogy assignments from OrthoMCL DB. We divided the comparisons by sequence identity and determined whether the sequence-structure relationship differed between the orthologs and paralogs. We found that at levels of sequence identity between 30 and 70%, orthologous domain pairs indeed tend to be significantly more structurally similar than paralogous pairs at the same level of sequence identity. An even larger difference is found when comparing ligand binding residues instead of whole domains. These differences between orthologs and paralogs are expected to be useful for selecting template structures in comparative modeling and target proteins in structural genomics.
Affiliation(s)
- Mark E Peterson
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California 94158
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158
- California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, California 94158
| | - Feng Chen
- Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104
- Department of Biology and Penn Genomics Institute, University of Pennsylvania, Philadelphia, PA 19104
| | - Jeffery G Saven
- Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104
| | - David S Roos
- Department of Biology and Penn Genomics Institute, University of Pennsylvania, Philadelphia, PA 19104
| | - Patricia C Babbitt
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California 94158
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158
- California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, California 94158
| | - Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California 94158
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158
- California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, California 94158
| |
|
31
|
Gu J, Li H, Jiang H, Wang X. Optimizing energy potential for protein fold recognition with parametric evaluation function. J Comput Biol 2009; 16:427-42. [PMID: 19254182 DOI: 10.1089/cmb.2008.0128] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022]
Abstract
In this paper, a new optimization method is proposed to determine a simplified energy potential for protein fold recognition, which consists of the residue-residue contact, hydrophobicity, and pseudodihedral potentials. With a parametric evaluation function method, the Z-scores of all the proteins in a training set are optimized simultaneously to obtain the best parameter set of the potential. For this multi-objective and multi-constraint problem, the new optimization scheme is very effective. The derived potential is then tested on two high-quality decoy sets and compared with other classical fold recognition potentials. With the simplified energy potential, we achieve a high level of discrimination capability between correct and incorrect folds.
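For readers unfamiliar with the Z-score used as the optimization target here, the following minimal sketch shows one common convention: the native structure's energy is compared with the mean and standard deviation of the decoy energies. The energies are invented for illustration, and the choice of population standard deviation over decoys only is an assumption, not necessarily the paper's definition.

```python
import statistics


def z_score(native_energy, decoy_energies):
    """Z-score of the native energy relative to the decoy energy distribution.

    A more negative value means the native structure is better separated
    (lower in energy) from the decoys. Uses the population standard
    deviation of the decoys; this convention is an assumption.
    """
    mean = statistics.fmean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies have zero variance")
    return (native_energy - mean) / spread


# Hypothetical energies in arbitrary units.
decoys = [-10.0, -12.0, -8.0, -10.0]
z = z_score(-16.0, decoys)
```

Optimizing a potential's parameters so that this quantity is strongly negative for every protein in a training set at once is the multi-objective problem the abstract refers to.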
Affiliation(s)
- Junfeng Gu
- Department of Engineering Mechanics, State Key Laboratory of Structural Analysis for Industrial Equipment, Dalian University of Technology, Dalian, China
| | | | | | | |
|
32
|
Gu J, Li H, Jiang H, Wang X. A simple Calpha-SC potential with higher accuracy for protein fold recognition. Biochem Biophys Res Commun 2009; 379:610-5. [PMID: 19121621 DOI: 10.1016/j.bbrc.2008.12.131] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/15/2008] [Accepted: 12/20/2008] [Indexed: 11/18/2022]
Abstract
In this paper, an improved Cα-SC energy potential designed for protein fold recognition is reported. It consists of three extremely simple interaction terms that are assumed to be the dominant interactions in protein folding: residue-residue contact, hydrophobicity, and pseudodihedral potentials. The potential function contains only 210 contact parameters, one hydrophobic parameter, and one torsion parameter, which have been optimized using an interior point algorithm of linear programming. Tests of the derived potential function on commonly used decoy sets illustrate that it outperforms most existing coarse-grained potentials in recognizing native structures and in consistently achieving high Z-scores across decoy sets, and that it performs almost as well as potentials that consider complex intramolecular interactions. The results show that our scoring function is a generally promising potential for protein structure prediction and modeling with regard to its recognition and computational efficacy.
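The figure of 210 contact parameters matches the number of unordered pairs (including self-pairs) that can be formed from the 20 standard amino acid types, 20 × 21 / 2 = 210. The sketch below enumerates those pairs and assigns each a parameter index; that the paper indexes its contact terms this way is an assumption, and the helper name `contact_index` is hypothetical.

```python
from itertools import combinations_with_replacement

# One-letter codes of the 20 standard amino acids, in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Map each unordered residue-type pair (a <= b) to a parameter index.
PAIR_INDEX = {
    pair: index
    for index, pair in enumerate(combinations_with_replacement(AMINO_ACIDS, 2))
}


def contact_index(a, b):
    """Index of the contact parameter for residue types a and b (order-free)."""
    return PAIR_INDEX[tuple(sorted((a, b)))]


n_contact_parameters = len(PAIR_INDEX)
```

Because the lookup sorts its arguments, `contact_index("W", "A")` and `contact_index("A", "W")` return the same index, reflecting that a contact between two residue types is symmetric.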
Affiliation(s)
- Junfeng Gu
- State Key Laboratory of Structural Analysis for Industrial Equipment, Department of Engineering Mechanics, Dalian University of Technology, Dalian 116024, China
| | | | | | | |
|